DMA transfer issue of AM3874

Hi,

I have a question regarding DMA transfers on the AM3874.

At a customer site, strange behavior has been observed. The CPU load appears to become heavy and CPU performance decreases.

Symptom:

When a DMA transfer is executed, the following behavior occurs:

1. USB memory recognition is delayed.

2. The execution time of an application that does not use DMA transfers becomes longer.

3. CPU utilization is 15% higher than in the "no DMA transfer" case (CPU memory transfer).

Are you aware of these symptoms? If not, how can I investigate this issue with a debugger?

Please advise me.

I appreciate your quick reply.

Best regards,

Michi

  • Here is some additional information.
    The DMA transfer is set up as follows:
    Source address: SRAM (connected to the GPMC of the AM3874)
    Destination address: DDR3 (in a non-cacheable area)

    Michi
  • Michi,

    The purpose of the DMA is to offload memory transfers from the CPU (the Cortex-A8 ARM). You are using the EDMA, right? Maybe each DMA transfer is too short in terms of bytes, so the DMA interrupts the CPU too often (to signal that a transfer has completed). What is the size of each transfer, that is, how much data does the DMA move per transfer? If you increase the amount of data per transfer, does the CPU load improve?
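
    To make the "too many interrupts" hypothesis concrete, here is a minimal sketch that estimates how many completion interrupts per second a given transfer size produces. The function name and the data rates in the usage note are hypothetical; the thread does not state the real throughput or transfer size.

    ```c
    #include <stdint.h>

    /* Estimated completion interrupts per second for a steady data stream.
     * bytes_per_second   : sustained data rate (hypothetical input)
     * bytes_per_transfer : bytes moved before one completion interrupt fires
     * Rounds up, because a partial final transfer still ends with an interrupt.
     * Returns 0 for an invalid (zero) transfer size. */
    uint32_t completion_irqs_per_second(uint32_t bytes_per_second,
                                        uint32_t bytes_per_transfer)
    {
        if (bytes_per_transfer == 0u) {
            return 0u;
        }
        /* 64-bit intermediate avoids overflow in the round-up addition. */
        uint64_t sum = (uint64_t)bytes_per_second + bytes_per_transfer - 1u;
        return (uint32_t)(sum / bytes_per_transfer);
    }
    ```

    For example, with an assumed 1,000,000 bytes/s stream, 64-byte transfers give 15625 interrupts per second, while 4096-byte transfers give 245. If the measured CPU load drops in a similar proportion when the transfer size grows, interrupt overhead is a likely contributor.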

    BR
    Pavel
  • Dear Pavel-san,

    Thank you for your support.

    The customer analyzed this issue further.
    According to the analysis, the CPU executes for only one cycle during each DMA transfer.
    One DMA transfer takes approximately 500 cycles (93 us).
    If a sequence of instructions needs 10 cycles, it therefore takes 930 us to execute while DMA transfers are running.
    They believe this is why CPU execution is delayed.
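
    The arithmetic behind this estimate can be sketched as follows. This assumes the customer's model (the CPU retires a fixed number of cycles per DMA transfer and is otherwise stalled); the function name is illustrative and not part of any TI API.

    ```c
    #include <stdint.h>

    /* Wall-clock time, in microseconds, for the CPU to retire
     * `cpu_cycles_needed` cycles when it only gets `cpu_cycles_per_dma`
     * cycles during each DMA transfer and each DMA transfer lasts
     * `dma_transfer_us` microseconds.
     * Returns UINT64_MAX if the CPU never gets to run. */
    uint64_t stalled_time_us(uint64_t cpu_cycles_needed,
                             uint64_t cpu_cycles_per_dma,
                             uint64_t dma_transfer_us)
    {
        if (cpu_cycles_per_dma == 0u) {
            return UINT64_MAX;
        }
        /* Number of DMA transfers that must elapse, rounded up. */
        uint64_t transfers =
            (cpu_cycles_needed + cpu_cycles_per_dma - 1u) / cpu_cycles_per_dma;
        return transfers * dma_transfer_us;
    }
    ```

    With the figures above, stalled_time_us(10, 1, 93) gives 930 us, which matches the customer's estimate.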

    How can CPU execution be inserted within a DMA transfer? Is this possible through a register setting? Please let me know.

    Best regards,
    Michi
  • Michi,

    The primary function of DMA is to move data without direct CPU involvement. After DMA executes the transfer it can notify the CPU (interrupt) or start another transfer (chaining/linking).

    You have several options when you need to move a block of memory. From a high level, you can either use the CPU (like a standard memcpy()) or use the provided EDMA3 peripheral. memcpy() uses load/store instructions to move the data, which can tax your CPU and keep it from doing more important jobs like executing an algorithm.

    The EDMA3 peripheral performs this load/store using its own buses and therefore takes no CPU cycles away from your application.

    Can you try a DDR3 to DDR3 DMA transfer and see whether the CPU load is normal? If it is, we can focus on the GPMC timing (async/sync) and DMA settings.

    Regards,
    Pavel

  • Dear Pavel-san,

    Thank you for your support.

    Regarding the DDR3 to DDR3 DMA transfer, I will ask my customer to try it.

    However, the customer has little time. This failure is occurring in the field, so they would like to solve it ASAP.
    Could you give me some advice on DMA transfer settings?

    The customer would like the CPU to execute instructions, with priority over the DMA, between one block transfer and the next.
    Do you know how to give the CPU higher priority than the DMA?

    I appreciate your quick reply.

    Best regards,
    Michi
  • Michi,

    Michi Yama said:
    Could you give me some advice on DMA transfer settings?

    Regarding DMA transfer settings, see AM387x TRM, chapter 8 EDMA and the wikis below:

    Michi Yama said:
    Do you know how to give the CPU higher priority than the DMA?

    See the AM387x TRM, section 8.4.11.4 Performance Considerations, and the PDF file below.

    DM814x_DM810x_Performance.pdf

    Regards,
    Pavel

  • Dear Pavel-san,

    Thank you for your continuous support.

    I reported "pressure control for interconnect" and "priority control for EMIF" to the customer, using the document that you gave me.
    But my understanding was wrong.
    The customer would like to pause during the DMA transfer from GPMC to DRAM. Concretely, they would like to insert a pause between A-CNT and B-CNT (after A-CNT and before B-CNT), and they would like the CPU to execute during this pause.
    To do this, they tried using the RDRATE register (Read Rate Register).

    According to the AM3874 TRM page 1366,
    8.6.2.5 Read Rate Register (RDRATE)
    The EDMA3 transfer controller issues read commands at a rate controlled by the read rate register
    (RDRATE). The RDRATE defines the number of idle cycles that the read controller must wait before
    issuing subsequent commands.

    The customer configured this register, but only one cycle was inserted between A-CNT and B-CNT.
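
    For reference, a minimal sketch of decoding the RDRATE field. The encoding used here (0 = as fast as possible, 1 to 4 = 4, 8, 16, 32 idle cycles, 5 to 7 reserved, field in bits [2:0]) is an assumption based on common EDMA3 TC documentation and should be checked against section 8.6.2.5 of the AM387x TRM. The function name is hypothetical.

    ```c
    #include <stdint.h>

    /* Decode the idle-cycle count selected by RDRATE[2:0].
     * ASSUMED encoding (verify in the TRM):
     *   0      -> reads issued as fast as possible (0 idle cycles)
     *   1..4   -> 4, 8, 16, 32 idle cycles between read commands
     *   5..7   -> reserved
     * Returns the idle-cycle count, or -1 for a reserved encoding. */
    int rdrate_idle_cycles(uint32_t rdrate_reg)
    {
        uint32_t field = rdrate_reg & 0x7u;
        if (field == 0u) {
            return 0;
        }
        if (field > 4u) {
            return -1;
        }
        return (int)(2u << field);   /* 1 -> 4, 2 -> 8, 3 -> 16, 4 -> 32 */
    }
    ```

    If this encoding holds, the largest gap RDRATE can insert is 32 cycles between read commands, which is small compared with a transfer of roughly 500 cycles. It is also worth confirming that the value was written to the transfer controller that actually services the channel's queue.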

    Also, on TRM page 1303, "Figure 8-39. Smaller Packet Data Transfers Example" contains the following note:
    "Time gaps allow other transfers on the same priority level to be performed"
    They would like to use these "time gaps". Do you know how to configure them?
    Currently the customer does not use chaining.

    I appreciate your quick reply.

    Best regards,
    Michi
  • Michi,

    Michi Yama said:
    The customer would like to pause during the DMA transfer from GPMC to DRAM. Concretely, they would like to insert a pause between A-CNT and B-CNT (after A-CNT and before B-CNT), and they would like the CPU to execute during this pause.

    The A8 CPU executes in parallel with the EDMA transfer. During the EDMA transfer, the CPU can run some other application. When the EDMA transfer completes, the EDMA notifies the CPU by sending an interrupt.
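
    A minimal sketch of this pattern, assuming a completion flag shared between an interrupt handler and the main code. The names edma_completion_isr and dma_poll_and_clear are hypothetical, and the hardware-specific parts (registering the handler, clearing the EDMA interrupt pending bit) are omitted.

    ```c
    #include <stdbool.h>

    /* Set by the interrupt handler, read by the main code.
     * volatile keeps the compiler from caching the value in a register. */
    static volatile bool dma_done = false;

    /* Hypothetical handler for the EDMA completion interrupt.
     * A real handler would also clear the pending bit in the EDMA. */
    void edma_completion_isr(void)
    {
        dma_done = true;
    }

    /* Called from the main loop between units of other work.
     * Returns true exactly once per completed transfer. */
    bool dma_poll_and_clear(void)
    {
        if (!dma_done) {
            return false;
        }
        dma_done = false;
        return true;
    }
    ```

    The main loop keeps running its own work and only handles the received data when dma_poll_and_clear() returns true, so no CPU cycles are spent copying or busy-waiting on the transfer itself.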

    BR
    Pavel

  • Pavel-san,

    Thank you for your quick reply.

    > The A8 CPU will execute in parallel with the EDMA transfer. During EDMA transfer, the CPU will execute some other application,

    In this case, the A8 CPU is trying to access the same address area that is used for the DMA transfer. Because that area is in use by the DMA, the CPU has to wait until the DMA transfer completes.

    So the customer would like to pause the transfer to let the CPU access this area. Otherwise the CPU waits for a long time.

    Do you have any ideas for this?

    Best regards,

    Michi

  • Michi,

    Michi Yama said:

    In this case, the A8 CPU is trying to access the same address area that is used for the DMA transfer. Because that area is in use by the DMA, the CPU has to wait until the DMA transfer completes.

    One option is to use different DDR3 addresses for the EDMA and the CPU, so that both can access data at the same time.
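
    One common way to keep the EDMA and the CPU on different addresses is a pair of ping-pong buffers. The sketch below only shows the bookkeeping. The type, the function names, and the buffer size are hypothetical, and programming the EDMA destination address is omitted. In a real system both buffers would also need to be in a non-cacheable area (as the destination is here) or be handled with cache maintenance.

    ```c
    #include <stdint.h>

    #define PP_BUF_WORDS 256u   /* hypothetical buffer size in 32-bit words */

    typedef struct {
        uint32_t buf[2][PP_BUF_WORDS];
        unsigned dma_idx;       /* index of the buffer the DMA writes to */
    } ping_pong_t;

    /* Start with the DMA writing to buffer 0. */
    void ping_pong_init(ping_pong_t *pp)
    {
        pp->dma_idx = 0u;
    }

    /* Buffer to program as the next DMA destination. */
    uint32_t *ping_pong_dma_target(ping_pong_t *pp)
    {
        return pp->buf[pp->dma_idx];
    }

    /* Call when a DMA transfer completes: returns the buffer that was just
     * filled (now owned by the CPU) and points the DMA at the other one. */
    uint32_t *ping_pong_swap(ping_pong_t *pp)
    {
        uint32_t *ready = pp->buf[pp->dma_idx];
        pp->dma_idx ^= 1u;
        return ready;
    }
    ```

    After each completion the CPU processes the buffer returned by ping_pong_swap() while the next transfer is directed at ping_pong_dma_target(), so the two never touch the same buffer at the same time.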

    You can also adjust the priorities in the L3 interconnect and the DMM/EMIF so that the Cortex-A8 has the highest priority. The A8 CPU should then not have to wait until the DMA transfer completes.

    Please adjust the A8 CPU settings in:

    1. INIT_PRIORITY_0[1:0] HOST_ARM = 0x3

    2. DMM_PEG_PRIO0[2:0] P0 = 0x0
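
    The two settings above are bit-field updates, which are usually done as a read-modify-write so that the other fields in the register are preserved. A minimal sketch of the field update follows. The helper name is hypothetical, and the register addresses are deliberately not shown because they must be taken from the TRM.

    ```c
    #include <stdint.h>

    /* Return `reg` with bits [msb:lsb] replaced by `value`.
     * Bits of `value` that do not fit in the field are discarded. */
    uint32_t set_field(uint32_t reg, unsigned msb, unsigned lsb, uint32_t value)
    {
        unsigned width = msb - lsb + 1u;
        uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu
                                       : (((uint32_t)1u << width) - 1u);
        return (reg & ~(mask << lsb)) | ((value & mask) << lsb);
    }

    /* Usage sketch (addresses are placeholders, look them up in the TRM):
     *
     *   volatile uint32_t *init_prio0 = (volatile uint32_t *)INIT_PRIORITY_0_ADDR;
     *   *init_prio0 = set_field(*init_prio0, 1, 0, 0x3);   // HOST_ARM = 0x3
     *
     *   volatile uint32_t *peg_prio0 = (volatile uint32_t *)DMM_PEG_PRIO0_ADDR;
     *   *peg_prio0 = set_field(*peg_prio0, 2, 0, 0x0);     // P0 = 0x0
     */
    ```

    Some priority registers pair each field with a separate write-enable bit, so it is worth checking the register descriptions in the TRM and reading the value back after writing.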

    Also check whether the following sections of the TRM help:

    8.5.4.4 Ping-Pong Buffering
    8.5.4.4.1 Synchronization with the CPU
    8.5.4.5.2 Breaking Up Large Transfers with Intermediate Chaining
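
    As a rough illustration of the intermediate chaining idea, the sketch below builds a PaRAM OPT value for a channel that chains to itself after every intermediate sub-transfer. The bit positions follow the commonly documented EDMA3 OPT layout and should be verified against the AM387x TRM. The function names and the channel number used in the usage note are hypothetical.

    ```c
    #include <stdint.h>

    /* EDMA3 PaRAM OPT fields (ASSUMED layout, verify in the TRM). */
    #define OPT_SYNCDIM    (1u << 2)    /* 0 = A-synchronized, 1 = AB-synchronized */
    #define OPT_TCC_SHIFT  12u          /* transfer completion code, bits [17:12] */
    #define OPT_TCC_MASK   (0x3Fu << OPT_TCC_SHIFT)
    #define OPT_TCINTEN    (1u << 20)   /* interrupt on final completion */
    #define OPT_ITCCHEN    (1u << 23)   /* chain on intermediate completion */

    /* OPT value for an A-synchronized transfer on `channel` that re-triggers
     * itself after each intermediate sub-transfer of ACNT bytes. */
    uint32_t opt_self_chained(uint32_t channel)
    {
        uint32_t opt = 0u;
        opt &= ~OPT_SYNCDIM;                               /* one array per trigger */
        opt |= (channel << OPT_TCC_SHIFT) & OPT_TCC_MASK;  /* chain back to itself */
        opt |= OPT_ITCCHEN;                                /* after each sub-transfer */
        opt |= OPT_TCINTEN;                                /* interrupt only at the end */
        return opt;
    }

    /* Number of separate transfer requests an A-synchronized transfer is
     * split into: one per array, each moving ACNT bytes. */
    uint32_t sub_transfer_count(uint32_t bcnt, uint32_t ccnt)
    {
        return bcnt * ccnt;
    }
    ```

    For example, opt_self_chained(25) yields 0x00919000. With BCNT = 16 and CCNT = 1 the transfer is submitted as 16 separate requests, and the gaps between those requests are where other accesses can be serviced. Smaller ACNT values give more frequent gaps at the cost of more per-request overhead.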

    Regards,
    Pavel


  • Dear Pavel-san,

    Thank you for your support.

    My customer succeeded in breaking up the EDMA transfer by using chaining.

    As you advised, the customer referred to "8.5.4.5.2 Breaking Up Large Transfers with Intermediate Chaining".

    Please see the attached file:

     DMAtransfer_E.ppt

    Are there any tips or cautions for using intermediate chaining?

    Please advise me.

    Best regards,

    Michi

  • Michi,

    Michi Yama said:
    Are there any tips or cautions for using intermediate chaining?

    You can find information about intermediate chaining in the following resource:

    AM387x TRM, chapter EDMA

    BR
    Pavel