PROCESSOR-SDK-AM335X: Heavy EDMA on GPMC Causes Other EDMA Channels to Stall

Part Number: PROCESSOR-SDK-AM335X

Greetings, everyone! I'm acquiring A-to-D conversion data from an FPGA over the GPMC, with the EDMA driving the GPMC to retrieve a constant flow of data. Four times a second I get 65536 A-to-D samples, each 64 bits wide. The EDMA transfers are requested using 3 linked PaRAM blocks, with PaRAM[52] kicking things off (since the GPMC peripheral is channel 52).

That actually works just fine: I get all my samples continually, and I write the data into two ping-pong buffers without losing data and without duplicates.

My problem is that while the GPMC is retrieving the constant data, I can't use any other channel, such as channel 42, which is a UART transmit channel. While the GPMC is receiving data, any EDMA transfer of a frame of data that I attempt on channel 42 stalls: it takes 45 seconds to transmit 4 bytes.

I had hoped that the solution would be to use a different Transfer Controller, so that the Transfer Requests issued for the UART would go to Transfer Controller 1 while the GPMC's Transfer Requests would go to Transfer Controller 0. I did that by assigning a different queue depending on whether I'm setting up channel 52 or not:

mpst_edmaccChannel->queueNum = (men_channelID == 52 ? EDMA_QUEUE : EDMA_QUEUE + 1);
(void)EDMAChConfig(mpst_edma3cc->base, mpst_edma3cc->channelType, mpst_edma3cc->channelNum, mpst_edmaccChannel);

That did not make a bit of difference: the UART EDMA transfer is stalled badly because the GPMC data is flooding in through the EDMA controller. If I disable the GPMC data flow, my EDMA transfers to the UART work without a problem; they are not stalled or slowed down at all.

So what's the solution? Section 11.3.11.1, "DMA Channel To Event Queue Mapping", of SPRUH73P (revised March 2017) should have solved it for me; however, simply picking a different TC(n) and Queue(n) did not.

That made me look at section 11.4.1.8, the Queue Priority Register, inasmuch as I have all Queue(n)s set to the same priority (their default power-up values). But that should not matter, am I correct? The TC should get its TR in a timely way since the GPMC is using TC0 and everything else uses TC1, so there should not be a bottleneck stopping my UART's TRs from getting to its TC, right?
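
For reference, here is a minimal sketch of how I understand the channel-to-queue mapping to be laid out. The 0x240 base offset and the eight-channels-per-register grouping match what the Starterware code uses; the 4-bit slot per channel with the queue number in its low 3 bits is my assumption from reading the TRM, and the helper names are hypothetical, not TI's API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8 channels per DMAQNUM register, one 4-bit slot per
 * channel, with the queue number held in the low 3 bits of the slot. */
#define DMAQNUM_BASE_OFFSET 0x240U
#define CHANNELS_PER_REG    8U
#define BITS_PER_CHANNEL    4U
#define QUEUE_FIELD_MASK    0x7U

/* Hypothetical helper: byte offset of the DMAQNUM register for a channel. */
uint32_t dmaqnum_reg_offset(uint32_t ch)
{
    return DMAQNUM_BASE_OFFSET + ((ch / CHANNELS_PER_REG) * 4U);
}

/* Hypothetical helper: bit position of the channel's slot in that register. */
uint32_t dmaqnum_field_shift(uint32_t ch)
{
    return (ch % CHANNELS_PER_REG) * BITS_PER_CHANNEL;
}

/* Hypothetical helper: return regValue with the channel's queue field
 * replaced, leaving the other seven channels' fields untouched. */
uint32_t dmaqnum_set_queue(uint32_t regValue, uint32_t ch, uint32_t queue)
{
    uint32_t shift = dmaqnum_field_shift(ch);

    regValue &= ~(QUEUE_FIELD_MASK << shift);
    regValue |= (queue & QUEUE_FIELD_MASK) << shift;
    return regValue;
}
```

Under those assumptions, channel 52 lives in the register at offset 0x258 and channel 42 in the register at offset 0x254, so moving the UART to queue 1 should only ever touch the second of those.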

So I'm unable to explain or fix why my UART transmits are stalled and I have some hopes that someone may point me to a section of the SPRUH73P or other manual which gives a clue or a reason (or a solution) to configuring the EDMA engine so that my UART Channel 42 transfer request is processed in a timely way without adversely impacting the GPMC data flow using Channel 52.

Would someone who might have a solution or a clue as to where my code could use a fix let me know, please?

Thanks!

  • Hi,

    Please post which software you are using, and which version.
  • Hi Fredric,

    Have you tried changing the interconnect priority levels for your different transfer controllers? See section 7.2.3.3.1 ("Initiator Priority Control for Interconnect") and the CTRL_INIT_PRIORITY_0/1 registers in the AM437x TRM.

    Regards,
    Melissa
  • Fredric,

    I just realized that your question was about AM335x, not AM437x. The AM335x init_priority_0/1 registers have similar functionality to AM437x's CTRL_INIT_PRIORITY_0/1 registers.
  • I'm not sure what you mean; the software I'm using is my own. :) Do you mean which version of Starterware? If so, I believe I'm using 2.0.1.1, though I'm not certain. I'm using the EDMA.C library offered by TI's Starterware, and I'm not sure which version it is.

  • That's an interesting idea. In my user manual that's section 9.2.4.3. However, it would not solve the problem, inasmuch as the DMA's priority relative to the other interconnected buses isn't the problem: the DMA is getting serviced in a timely way. It's the three Transfer Controllers within the DMA that need to be nudged so that TC1 is serviced frequently despite TC0 being constantly in use. For this design, Transfer Controller 0 within the DMA is constantly, solidly in use, so other transfer requests sent to the DMA are stalled, even for several minutes, until the DMA engine decides it's okay to process another pending transfer request.

    There is another register, QUEPRI (offset = 284h, reset = 777h), which changes the priority of the 3 queues associated with the 3 Transfer Controllers. I can alter it to see if it makes a bit of difference, but I doubt that will solve it: the DMA is simply busy handling a high volume of data, and requests for other transfers between other devices are stalled.

    Still, that was a good suggestion, and it looks like I should play with that also, to see if I should elevate the DMA interconnect's priority above all else just because I'm using DMA heavily. Interesting! Thank you.
  • :) Not a problem, I understood what you were noting and located it in the manual for my CPU. :) The cores are much alike, TI does a very good job carrying functionality over from one of their CPUs to others, it makes our software development efforts a bit easier.

    Thanks!
  • I found the problem, and it is actually something that TI's Starterware does and that TI should fix if they have not already done so.

    In TI's edma.c module, the function EDMAChToEvtQueueMap() contains code that looks like this:

    HW_WR_FIELD32(baseAddr + EDMA_TPCC_DMAQNUM(chNum >> 3U),
    EDMA_TPCC_DMAQNUM_E0, queueNum);

    The macro EDMA_TPCC_DMAQNUM is defined like this:

    #define EDMA_TPCC_DMAQNUM(n) ((uint32_t)0x240U + (n * 0x4U))

    In the real world, a proper macro would do this:

    #define EDMA_TPCC_DMAQNUM(n) ((uint32_t)0x240U + ((n) * 0x4U))

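    Argument n needed to be wrapped in parentheses. With the argument "chNum >> 3U", the macro expands to "0x240U + (chNum >> 3U * 0x4U)", and because * binds more tightly than >>, the compiler evaluates that, correctly, as "0x240U + (chNum >> 12U)", which is not what the TI programmer (or whoever wrote the macro) intended. For any valid channel number the shift yields zero, so the resulting address is always baseAddr + 0x240U, the first DMAQNUM register, and the write never reaches the register that holds channels such as 42 or 52. So in edma.c there is no way to change the Queue, and thus the Transfer Controller, for those channels to anything other than Queue / TC zero.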
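
    To make the precedence issue concrete, here is a small sketch that puts the two macro definitions side by side. The macro bodies are copied from the ones quoted above; the shortened macro names and the two helper functions are hypothetical names I made up for illustration and are not part of edma.c.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Body as quoted from edma.c: argument n is not parenthesized. */
    #define DMAQNUM_BROKEN(n) ((uint32_t)0x240U + (n * 0x4U))

    /* Corrected body: argument n is parenthesized. */
    #define DMAQNUM_FIXED(n)  ((uint32_t)0x240U + ((n) * 0x4U))

    /* Hypothetical helper: offset computed the way edma.c currently does it.
     * Expands to 0x240U + (chNum >> 3U * 0x4U), which is chNum >> 12U. */
    uint32_t dmaqnum_offset_broken(uint32_t chNum)
    {
        return DMAQNUM_BROKEN(chNum >> 3U);
    }

    /* Hypothetical helper: offset computed with the corrected macro.
     * Expands to 0x240U + ((chNum >> 3U) * 0x4U). */
    uint32_t dmaqnum_offset_fixed(uint32_t chNum)
    {
        return DMAQNUM_FIXED(chNum >> 3U);
    }
    ```

    Calling both helpers with 42 and 52 shows the difference: the uncorrected macro gives the same offset for both channels, while the corrected one gives each channel its own register.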

    Now the question is whether this has been fixed in later versions of Starterware. The question for the company I work for is whether we should go through all Starterware macros and wrap all arguments in parentheses, which is considered the best-practice thing to do anyway.

    I won't mark this as fixed yet because I'm curious what others might say about this, about macro arguments that contain expressions for the pre-processor to substitute while the macro itself "forgot" to wrap them in parentheses. Some compilers produce desired code, some do not.

  • Hi Fredric,

    Thanks for finding this out! Starterware is obsolete, but parts of that code are maintained in Processor SDK RTOS. For AM335x, the code is under pdk_am335x_1_0_xx\packages\ti\starterware.

    The dal\edma.c file you mentioned still has the function EDMAChToEvtQueueMap(), and it still uses the macro:
    #define EDMA_TPCC_DMAQNUM(n) ((uint32_t)0x240U + (n * 0x4U))

    There are multiple macro definitions like this. E.g:
    #define EDMA_TPCC_QRAE(n) ((uint32_t)0x380U + (n * 0x4U))

    Those need to be corrected. I filed a bug ticket for this!

    Regards, Eric
  • Sounds good, thanks! Yes, it resolved the issue here, though we will track down the remaining macro expansion problems and wrap their arguments in (). Thanks!