Hi
We are trying to profile UDMA transfers to understand whether we can hide them behind the computation time on the C7x.
I am counting cycles using the TSC on the C7x between the trigger and the completion of a UDMA transfer. The numbers I observe look very large.
Even for the simple UDMA block copy example, on average I'm getting about 14 cycles per byte transferred. For example, a transfer of 1024 bytes takes around 14900 cycles.
These numbers don't look right to me. The places where I'm measuring the cycle count are shown below.

Is this right?