66AK2H12: IDMA transfer vs. memcpy() for on-chip data transfer

Part Number: 66AK2H12

Hi,

I am trying to use IDMA channel 1 to transfer data between L2 SRAM and L1 SRAM in a Keystone 2 project. Sometimes the IDMA transfer takes longer than memcpy(), or even longer than a manual copy using for loops. Is there a reason for this? The IDMA config function is inlined and writes 16 bytes to the configuration registers. memcpy() involves a few stack operations, which should offset the IDMA configuration cost. How is memcpy() implemented internally? Does it use IDMA? I am using the -O3 optimization level and my stack is in L2 SRAM.

Thanks

- Gopal

  • Hi Gopal,

    I've notified the design team. Their feedback will be posted here.

    Note that answers may be delayed because of the Christmas holidays.

    Best Regards,
    Yordan
  • Dear Gopal

    I have two comments to make.

    It is possible for IDMA to be slower than memcpy. In fact, you already hinted that there is overhead associated with the IDMA operation. To see how the run-time library implements memcpy, write a small program, open the disassembly window, and step through the code. You may see 64-bit loads and 64-bit stores issued in parallel. (See the CPU and instruction set guide at http://www.ti.com/lit/ug/sprugh7/sprugh7.pdf to understand the instruction set.)
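A small program of the kind described above might look like the following sketch. The buffer names, the size `N_BYTES`, and the function `copy_and_check` are hypothetical, chosen only for illustration. The length is read from a `volatile` variable because, at -O3, a compiler may expand a fixed-size copy inline rather than call the library routine, which would leave nothing to step into.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_BYTES 1024  /* hypothetical transfer size, for illustration only */

static uint8_t src_buf[N_BYTES];
static uint8_t dst_buf[N_BYTES];

/* volatile: the optimizer cannot treat the length as a compile-time
 * constant, so the copy is less likely to be expanded inline. */
static volatile size_t copy_len = N_BYTES;

/* Fills the source, copies it with memcpy(), and returns 1 if the
 * destination matches the source, 0 otherwise. */
int copy_and_check(void)
{
    size_t i;

    for (i = 0; i < N_BYTES; ++i)
        src_buf[i] = (uint8_t)(i * 7u);

    /* Set a breakpoint on this call and step into it in the
     * disassembly window to see which instructions do the copy. */
    memcpy(dst_buf, src_buf, copy_len);

    return memcmp(dst_buf, src_buf, N_BYTES) == 0;
}
```

Calling `copy_and_check()` from `main()` and stepping into the `memcpy` call shows the loads and stores the library actually uses for that size and alignment.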

    The second comment is more important. The purpose of the IDMA that moves data between L1D and L2 (in either direction) is to move data while the CPU does something else, in a ping-pong fashion (see, for example, http://encyclopedia2.thefreedictionary.com/Ping-pong+buffer). The IDMA thus hides the cycles spent moving data from L2 to L1D (and back), because the transfer takes place while the CPU is processing the other buffer.
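The ping-pong pattern can be sketched as below. This is a host-runnable illustration under stated assumptions, not device code: `xfer_start` and `xfer_wait` are hypothetical stand-ins for submitting an IDMA channel 1 transfer and polling for its completion, and here they simply wrap `memcpy`. `BLOCK_WORDS`, `process_block`, and `pingpong_sum` are likewise invented names. On the device, the transfer started for block n+1 would proceed while the CPU processes block n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 256  /* hypothetical block size, for illustration only */

/* Hypothetical stand-in for submitting an IDMA transfer. On the host it
 * copies synchronously; on the device it would return immediately. */
static void xfer_start(int32_t *dst, const int32_t *src, size_t words)
{
    memcpy(dst, src, words * sizeof *dst);
}

/* Hypothetical stand-in for waiting on transfer completion.
 * Device code would poll the channel status here. */
static void xfer_wait(void)
{
}

/* Placeholder for the real per-block computation. */
static int64_t process_block(const int32_t *buf, size_t words)
{
    int64_t sum = 0;
    size_t i;

    for (i = 0; i < words; ++i)
        sum += buf[i];
    return sum;
}

/* Sums `words` values from `src` (think L2) through two small buffers
 * (think L1D), fetching block n+1 while block n is processed. */
int64_t pingpong_sum(const int32_t *src, size_t words)
{
    static int32_t buf[2][BLOCK_WORDS];
    size_t blocks = words / BLOCK_WORDS;
    int64_t total = 0;
    size_t n;

    assert(words % BLOCK_WORDS == 0);
    if (blocks == 0)
        return 0;

    xfer_start(buf[0], src, BLOCK_WORDS);        /* prime the first buffer */
    for (n = 0; n < blocks; ++n) {
        xfer_wait();                             /* block n has arrived */
        if (n + 1 < blocks)                      /* start fetching block n+1 */
            xfer_start(buf[(n + 1) & 1],
                       src + (n + 1) * BLOCK_WORDS, BLOCK_WORDS);
        total += process_block(buf[n & 1], BLOCK_WORDS);  /* work on block n */
    }
    return total;
}
```

The buffer being refilled in iteration n is the one that was processed in iteration n-1, so the transfer never overwrites data the CPU is still reading. With this structure the copy time matters only if it exceeds the processing time per block.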

    Does this answer your question? If so, please close the thread.

    Ran