Other Parts Discussed in Thread: PROCESSOR-SDK-OMAPL138
I am trying to find out how long a matrix multiplication takes on different processors. I found this page, which gives some basic timing numbers for the 16x16 case:
https://www.ti.com/processors/digital-signal-processors/core-benchmarks/core-benchmarks.html
However, I am more interested in multiplying larger matrices. Do you have any benchmark data for bigger sizes that you can share?
I currently have access to a C6748 with DSPLIB. I did some timing analysis, and the execution time scaled badly as the matrix size grew. I have a couple of suspicions about why that is, but do you have a good explanation for it?
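For reference when reading the numbers, a naive N x N by N x N multiply needs N^3 multiply-accumulates, so even ideal scaling is cubic: a 64x64 product is 64 times the work of a 16x16 one. The sketch below is only a host-side Python illustration of that baseline, not the C6748 benchmark code and not a DSPLIB call; the names `mac_count`, `naive_matmul`, and `time_matmul` are hypothetical. Timings that grow faster than the MAC ratio would point at something other than arithmetic.

```python
import time


def mac_count(n):
    """Multiply-accumulates in a naive n x n by n x n product."""
    return n ** 3


def naive_matmul(a, b):
    """Textbook triple-loop product of two matrices given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def time_matmul(n, repeats=3):
    """Best-of-`repeats` wall-clock time in seconds for one n x n multiply."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        naive_matmul(a, b)
        best = min(best, time.perf_counter() - t0)
    return best


if __name__ == "__main__":
    base_n = 16
    base_t = time_matmul(base_n)
    for n in (16, 32, 64, 128):
        t = time_matmul(n)
        ideal = mac_count(n) // mac_count(base_n)
        # Compare the measured growth against the ideal cubic growth.
        print(f"n={n:4d}  ideal ratio={ideal:5d}  measured ratio={t / base_t:8.1f}")
```

The same comparison can be made on the target by dividing each measured cycle count by the 16x16 cycle count and setting it against the ideal ratios 1, 8, 64, 512.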
Here are the timings that I have collected with different dimensions: