TMS320C6657: EDMA3 With UART Problem

Part Number: TMS320C6657

Customer reporting issue with EDMA3 UART

    1. Set up an EDMA3 circular buffer to receive UART input. The EDMA3 channel is set up to auto-reload.
    2. Set up an EDMA3 channel for EMIF16 input of 16K of audio data. The data arrives in 1.33 ms. When the EDMA3 transfer completes, its ISR submits another transfer, and this repeats continuously.

Observations:

    1. The UART can receive a small number of bytes without problem, around 10 bytes or fewer.
    2. When the UART receives more than 30 bytes, it shuts down, meaning the UART will not receive any more data. They can see on a scope that data is arriving on the UART line, but the holding register shows the previous value without change.
    3. If they stop the EMIF16 EDMA3, the UART can receive 50K of continuous data without problem. The UART is running at 915000 baud, and its FIFO is enabled.
    4. In the UART line status register, there is no error indication.
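
As a rough sanity check on the numbers above, the following sketch estimates how long the EDMA3 can leave a UART receive event unserviced before the FIFO overruns. It assumes 8N1 framing (10 bits per character, which the thread does not state) and uses the 16-byte FIFO depth mentioned later in the thread. The function names are illustrative and not part of CSL.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character: 1 start + 8 data + 1 stop (8N1 is an assumption). */
#define UART_BITS_PER_CHAR 10u

/* Time for one character to arrive, in nanoseconds (rounded down). */
static uint32_t uart_char_time_ns(uint32_t baud)
{
    assert(baud != 0u);
    return (uint32_t)((1000000000ull * UART_BITS_PER_CHAR) / baud);
}

/* Time until a FIFO with `free_slots` empty entries fills up if nothing drains it. */
static uint32_t uart_fifo_budget_ns(uint32_t baud, uint32_t free_slots)
{
    return uart_char_time_ns(baud) * free_slots;
}
```

At 915000 baud this gives about 10.9 us per byte and about 175 us for an empty 16-byte FIFO to fill, so any stall of the UART receive transfers longer than that would lose data.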

UART and EDMA3 setup:

  • Hi Lawrence,

    If they stop EMIF16 EDMA3, UART can receive 50K of continuous data without problem. UART running at 915000 baud, and its FIFO is enabled.


    Is this an OS or a bare metal application? If OS, are you using Processor SDK RTOS? Which version?

    Best Regards,
    Yordan
  • They are using bios_6_45_01_29.

  • Hi Lawrence
    To rule out EDMA bandwidth/real-time contention issues, can you get more details on the following?

    1) SRC/DST for UART and EMIF traffic on C665x (I can reverse engineer this from the PaRAM), but I would like to make sure that, for test vs. long-term application, we understand where the UART and EMIF payloads need to reside. It may not be a bad idea to keep the UART and EMIF buffers in different memories to further avoid resource contention.
    2) Please help confirm that the traffic for EMIF and UART is mapped to separate queues/TCs. This is likely the most important thing to cross-check/rule out first.
    3) It would additionally be good to understand the TC/queue priority for the EMIF vs. UART traffic.

    Based on the experiments, does the customer suspect bandwidth/real-time issues or some software ISR/chomping type issues?
  • Mukul,

    From the customer:

    1) SRC/DST for UART and EMIF traffic on c665x but would like to make sure for test vs long term application TI understand where does UART and EMIF payload need to reside. It may not be a bad idea to keep the UART vs EMIF on different memories to further avoid resource contention.
    >>: All EDMA3 parameters are sent you. UART and EMIF16 receive buffers are in L2 internal memory. I tried to each out and that causes problem.

    2) Please help confirm that the traffic for EMIF and UART is mapped to separate queues/TCs - this is likely most important to first cross check/rule out.
    >>: Please elaborate on this and provide more details. It seems that CSL only supports CSL_TPCC_2 for EDMA3.

    3) It would be additionally good to understand TC/Que priority for the EMIF vs UART traffic
    >>: love to.


    Regards,

    Lawrence
  • Lawrence
    Can you please confirm which version of McSDK you are using so that we know what CSL RL functions you are using etc?

    Can you also have them share the DMA initialization code? We want to see how they are mapping the DMA channels for UART and EMIF16 to queues/TCs. If they don't want to share the code, please share the entire CC register dump (code snippets preferred).

    Also ask them to provide the EDMA_CC QUEPRI values. This is the register that manages queue/TC priority with respect to each other (assuming they are on separate TCs), and we should give the traffic that is more critical/real-time the higher priority.

    Note that, as per the datasheet, UART traffic can only be submitted to Queue 0/TC0 and Queue 3/TC3. EMIF16 traffic can be read/written from all 4 TCs in the EDMA.

    Regards
    Mukul
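
To make the QUEPRI request above concrete, here is a minimal sketch of how a QUEPRI value could be composed. It assumes the field layout as I read the EDMA3 CC register description (a 3-bit PRIQn field at bit position 4*n for queues 0 to 3, with 0 the highest priority and 7 the lowest); please verify this against the EDMA3 user guide. `quepri_pack` is a hypothetical helper, not a CSL function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: PRIQ0 = bits 2:0, PRIQ1 = bits 6:4, PRIQ2 = bits 10:8, PRIQ3 = bits 14:12. */
#define QUEPRI_FIELD_MASK   0x7u
#define QUEPRI_FIELD_STRIDE 4u
#define QUEPRI_NUM_QUEUES   4u

/* Pack one priority per queue (0 = highest, 7 = lowest) into a QUEPRI value. */
static uint32_t quepri_pack(const uint8_t pri[QUEPRI_NUM_QUEUES])
{
    uint32_t reg = 0u;
    uint32_t q;

    for (q = 0u; q < QUEPRI_NUM_QUEUES; ++q) {
        assert(pri[q] <= QUEPRI_FIELD_MASK);
        reg |= ((uint32_t)pri[q] & QUEPRI_FIELD_MASK) << (q * QUEPRI_FIELD_STRIDE);
    }
    return reg;
}
```

For example, keeping queue 0 (UART) at priority 0 while demoting queue 3 (EMIF16) to an illustrative priority of 4 would give 0x4000, while all zeros leaves every queue at equal and highest priority.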
  • The customer is using McSDK 2.1.2.6. Company policy doesn’t allow releasing code, but they can provide register dumps if needed. If specific addresses are needed, let me know.

     

    UART uses EDMA3 queue 0 and EMIF16 uses queue 3.


    Here are the calls to open the EDMA3 channels.

    UART:
    chParamBRx.regionNum = CSL_EDMA3_REGION_GLOBAL;
    chParamBRx.chaNum = 14;
    hEdmaUartBRevt = CSL_edma3ChannelOpen(&edmaUartBRcvObj, 2, &chParamBRx, &uartStatus);

    EMIF16:
    chAttr.regionNum = CSL_EDMA3_REGION_GLOBAL;
    chAttr.chaNum    = 1;
    EMIF16hChannel = CSL_edma3ChannelOpen(&EMIF16chObj, 2, &chAttr, &status);

  • Hi Lawrence

    As discussed offline

    The above info does not clarify whether UART and EMIF16 are indeed on separate queues/TCs. Similar to the init code above, you can share the DMAQNUM setup or a register dump to double-check.

    The CC QUEPRI register could be set differently for Q0/TC0 vs. Q3/TC3. Was this tried?

    Additionally in previous post you had the following response

    1) SRC/DST for UART and EMIF traffic on c665x but would like to make sure for test vs long term application TI understand where does UART and EMIF payload need to reside. It may not be a bad idea to keep the UART vs EMIF on different memories to further avoid resource contention.

    >>: All EDMA3 parameters are sent you. UART and EMIF16 receive buffers are in L2 internal memory. I tried to each out and that causes problem.

    What does the highlighted section imply? Can they try to have the UART buffer in L2 memory and the EMIF16 payload in DDR memory, just for test purposes?

    It is important to see whether this is an EDMA or source/destination bottleneck, or something wrong with the way the interrupts, auto-reload, or initialization of UART vs. EMIF16 is being handled.

    When they say that the UART receive is ongoing but there are no updates in the UART registers, did they check the state of CC registers such as ER, EMR, SER, etc.?

    As previously requested, if they cannot share code, they may need to share the CC register dump.

  • Mukul,

    They tried different combinations of priorities, with no change in the issue. Their conclusion is that the existing setting is optimal.

    Here is the DMAQNUM register dump.

  • Hi Lawrence
    I am looking for the DMAQNUM register values, which are at offset 0x240.
    As you can also see, QUEPRI is set to all 0s, which means all TCs are at equal and highest priority.
    What you are showing is QUETCMAP, which we know defaults to Q0 to TC0, Q1 to TC1, etc.

    Regards
    Mukul
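
For reading a DMAQNUM dump, a small decoding sketch may help. It assumes the layout as I understand it from the EDMA3 CC register description: DMAQNUMn registers start at offset 0x240, each covers 8 channels, and each channel has a 4-bit slot whose low 3 bits hold the queue number. The helper names are hypothetical, and the layout should be checked against the EDMA3 user guide.

```c
#include <assert.h>
#include <stdint.h>

#define DMAQNUM_BASE_OFFSET   0x240u  /* offset of DMAQNUM0 within the CC */
#define DMAQNUM_CH_PER_REG    8u      /* assumed: 8 channels per register */
#define DMAQNUM_BITS_PER_CH   4u      /* assumed: 4-bit slot per channel */
#define DMAQNUM_QUEUE_MASK    0x7u    /* assumed: queue number in low 3 bits */

/* CC offset of the DMAQNUM register that holds the given DMA channel. */
static uint32_t dmaqnum_reg_offset(uint32_t channel)
{
    assert(channel < 64u);
    return DMAQNUM_BASE_OFFSET + 4u * (channel / DMAQNUM_CH_PER_REG);
}

/* Queue number for `channel`, given the value read from its DMAQNUM register. */
static uint32_t dmaqnum_queue_of(uint32_t reg_value, uint32_t channel)
{
    uint32_t shift = DMAQNUM_BITS_PER_CH * (channel % DMAQNUM_CH_PER_REG);
    return (reg_value >> shift) & DMAQNUM_QUEUE_MASK;
}
```

With the channels from this thread, channel 1 (EMIF16) is decoded from the register at 0x240 and channel 14 (UART) from the register at 0x244.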
  • Here is the correct register dump

    I also reconfirmed that they tried different priority schemes, but having them all the same worked best overall.

  • Hi Lawrence
    Thanks. This does confirm DMA to QNUM mapping.

    The only other thing I can suggest is to try putting the UART and EMIF16 buffers in L2 vs. some other memory, just for test.

    If that does not help, it is likely not an EDMA or TeraNet blocking/bottleneck issue.

    It may be that somewhere in the completion ISRs or elsewhere there is some resource chomping or registers being cleared, or maybe it is not a software issue at all and is something on their board. Hard to tell.

    It would be good to get a register dump of the entire CC space when the UART transfers are stuck.

    >>If they stop EMIF16 EDMA3, UART can receive 50K of continuous data without problem. UART running at 915000 baud, and its FIFO is enabled.
    Can you elaborate on this? I am assuming that stopping the EMIF16 EDMA3 is an initialization-time step to test the UART alone, and that it does not imply that a stalled UART resumes transferring data once the EMIF16 EDMA3 traffic is stopped. Please confirm.

    How are they writing the event handlers/ISRs for EMIF16 vs. UART transfer completion?
    Have they ensured there is no accidental clear of latched events (ER clear, IPR clear, etc.)?
  • Mukul,

    The customer already has the UART and EMIF16 receive buffers in L2 internal memory.

    I will ask them to provide a register dump for the UART when the transfers are stuck, and to answer your last two questions.
  • Hi Lawrence
    What I meant was to further separate out the traffic to different endpoints.
    Keep the UART RX buffers in L2 and the EMIF16 buffers in external memory, or vice versa.

    Regards
    Mukul
  •  UART Register dump:

    Answer to your questions:

    How are you writing the event handlers / ISR for EMIF16 vs UART transfer completion? 

    >>EMIF16 DMA is single shot and UART Rx DMA is continuous.

    They also indicated that there is no accidental clearing of latched events.

    Just a reminder of the issue they are seeing:

    2. When the UART receives more than 30 bytes, it shuts down, meaning the UART will not receive any more data. I can see on a scope that data is arriving on the UART line, but the holding register shows the previous value without change.

    3. If I stop the EMIF16 EDMA3, the UART can receive 50K of continuous data without problem. Our UART is running at 915000 baud, and its FIFO is enabled.

    Is there anything else you want me to ask them to try besides separating the EMIF16 and UART receive buffer to a different memory?

  • Thanks Lawrence.
    I believe Raja asked for the EMIF16 PaRAM, but I pointed out to him that we have that info from the related thread that SGQ posted.

    So the suggestions on the table for debugging are:
    1) Put the EMIF16 and UART buffers in different memories.
    2) See if it is possible to break the EMIF16 transfers into smaller chunks per transfer request. I believe the EMIF16 transfers are important and real-time, as they have something to do with audio. However, if they are not performance critical, they can also try to increase the time between TC reads by programming the RDRATE register in the TC.

    If these suggestions don't make any difference, then I am at a loss as to what is going on, and I would be skeptical that it is a throughput/bottleneck type issue.

    Regards
    Mukul
  • Mukul,

    Response from the customer:

    1. Put EMIF16 and UART receive buffers in different memory (not both in L2) 

    >> I put the EMIF16 buffer in MSMC and vice versa. Doesn’t work. Both buffers have to be in L2.

    2) See if it possible to break the EMIF16 transfers into smaller chunks per transfer request - I believe the EMIF16 transfers are important and real time as they had something to do with audio - however if it is not performance critical , they can also try to increase the time between every transfer by increasing the value for time between TC reads by programming the RDRATE register in TC.

    >> Due to real-time constraints, we can neither break up the transfers nor wait longer between them.

    Regards,

    Lawrence

  • >> I put the EMIF16 buffer in MSMC and vice versa. Doesn’t work. Both buffers have to be in L2.

    What does "doesn't work" mean? Did they try it and it didn't alleviate the issue, or did they not try it because for some reason the payloads have to be in L2?

    Sorry, it looks like we are running out of ideas.
  • Mukul,

    Could the problem be a DDR bandwidth issue? Is the EDMA set up for only a single transfer at a time? Should they be doing a burst transfer to improve DDR throughput and efficiency? How should the EDMA be set up for that?
  • Hi Lawrence
    It can potentially be a peak instantaneous bandwidth/latency tolerance issue for the UART transfers, as the EMIF transfers are big chunks of data being transferred contiguously. The overall DDR bandwidth is much higher than what a single EMIF16 TC (EMIF16 being much slower than on-chip memory or DDR for memory read/write DMAs) can throw at the system.

    To rule out DDR bandwidth arbitration issues, I recommend quickly trying to change the DDR controller VBUSM_CONFIG register PR_OLD_COUNT value from the default 0xFF to something like 0x20.
    If that does not help, they can also read the sections on command starvation and arbitration, with quality-of-service programming for the various masters, in the DDR TRM:

    www.ti.com/.../sprugv8e.pdf

    Condensed details are also summarized on the following wiki:
    processors.wiki.ti.com/.../Keystone_SoC_Level_Optimizations

    To rule out any software issues, can they try lowering the DDR speed, if possible? If it is a DDR bandwidth issue, it should become more prominent.

    Regards
    Mukul
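
The PR_OLD_COUNT change suggested above is a read-modify-write of one field. The sketch below only computes the new register value. It assumes PR_OLD_COUNT occupies the low 8 bits of VBUSM_CONFIG, which should be confirmed against the DDR3 controller user guide before use. `vbusm_set_pr_old_count` is a hypothetical helper, not a CSL call.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: PR_OLD_COUNT is bits 7:0 of VBUSM_CONFIG; verify in the DDR3 user guide. */
#define PR_OLD_COUNT_SHIFT 0u
#define PR_OLD_COUNT_MASK  (0xFFu << PR_OLD_COUNT_SHIFT)

/* Return `reg_value` with only the PR_OLD_COUNT field replaced by `count`. */
static uint32_t vbusm_set_pr_old_count(uint32_t reg_value, uint32_t count)
{
    assert(count <= 0xFFu);
    return (reg_value & ~PR_OLD_COUNT_MASK) | (count << PR_OLD_COUNT_SHIFT);
}
```

The caller would read VBUSM_CONFIG, pass the value through this helper with 0x20, and write the result back, leaving the other fields untouched.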
  • How is the UART RX DMA triggered? As per the above configuration, the SRC is UART0 RBR and the destination is L2. Once the channel using PaRAM set 9 is triggered, it will trigger the channel using PaRAM set 10 continuously, as it is linked to itself.

    Can you try having the EDMA channel triggered by the RX event when the FIFO reaches a certain level?
    Table 6-25. EDMA3_CC Events for C665x

    EVENT NUMBER   EVENT          EVENT DESCRIPTION
    0              TCP3D_AREVT0   TCP3D_A receive event0
    1              TCP3D_AREVT1   TCP3D_A receive event1
    2              TINT2L         Timer2 interrupt low
    3              TINT2H         Timer2 interrupt high
    4              URXEVT         UART0 receive event
    5              UTXEVT         UART0 transmit event
  • ACNT = 1 and BCNT = 0x8000 with A-sync, SRC in incremental mode, DST in incremental mode. Given that the SRC is a FIFO with 16-byte depth, I do not understand the use of incremental mode. Why BCNT = 0x8000?
  • Raja,

    Answers to your questions:

    How is the UART RX DMA triggered? As per your register configuration, the SRC is UART0 RBR and the destination is L2. Once the channel using PaRAM set 9 is triggered, it will trigger the channel using PaRAM set 10 continuously, as it is linked to itself.

    >The customer continuously triggers the DMA using link reload and a circular buffer.

    Can you try having the EDMA channel triggered by the RX event when the FIFO reaches a certain level?

    >Yes, the FIFO is triggered at the 1-byte level.

    ACNT = 1 and BCNT = 0x8000 with A-sync, SRC in incremental mode, DST in incremental mode. Given that the SRC is a FIFO with 16-byte depth, the use of incremental mode is not clear.

    >For the UART receive register address, INC and CONST work the same.

    Why BCNT = 0x8000?

    >This is the size of the circular buffer.
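
The configuration discussed in this exchange (ACNT = 1, BCNT = buffer size, A-sync, a reload set linked to itself) can be sketched as a PaRAM image as follows. This is an illustration under stated assumptions, not the customer's code: the struct mirrors the standard eight-word PaRAM layout, the OPT word is passed in untouched, PaRAM is assumed to start at CC offset 0x4000 with 32 bytes per set, and `ParamSet`, `param_link_offset` and `uart_rx_ring_param` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Eight-word PaRAM set image. */
typedef struct {
    uint32_t opt;
    uint32_t src;
    uint32_t a_b_cnt;       /* BCNT[31:16] | ACNT[15:0] */
    uint32_t dst;
    uint32_t src_dst_bidx;  /* DSTBIDX[31:16] | SRCBIDX[15:0] */
    uint32_t link_bcntrld;  /* BCNTRLD[31:16] | LINK[15:0] */
    uint32_t src_dst_cidx;  /* DSTCIDX[31:16] | SRCCIDX[15:0] */
    uint32_t ccnt;
} ParamSet;

/* LINK field value for a PaRAM set (assumed: PaRAM base 0x4000, 32 bytes per set). */
static uint16_t param_link_offset(uint32_t set_index)
{
    assert(set_index < 512u);
    return (uint16_t)(0x4000u + 32u * set_index);
}

/* One byte per receive event from a fixed register into a ring of buf_bytes bytes. */
static ParamSet uart_rx_ring_param(uint32_t opt, uint32_t rbr_addr, uint32_t buf_addr,
                                   uint16_t buf_bytes, uint16_t reload_link)
{
    ParamSet p;

    p.opt          = opt;                                 /* caller supplies A-sync, TCC, etc. */
    p.src          = rbr_addr;                            /* UART receive register */
    p.a_b_cnt      = ((uint32_t)buf_bytes << 16) | 1u;    /* ACNT = 1, BCNT = buffer size */
    p.dst          = buf_addr;
    p.src_dst_bidx = (1u << 16) | 0u;                     /* dst advances 1 byte, src stays put */
    p.link_bcntrld = (uint32_t)reload_link;               /* reload set restarts the ring */
    p.src_dst_cidx = 0u;
    p.ccnt         = 1u;
    return p;
}
```

With SRCBIDX = 0 the source address stays on the receive register after every event, which is one way to see why the INC versus CONST source setting makes no practical difference when ACNT = 1. Writing the same image to both the active set and the reload set, with the reload set linking to itself, gives the continuous circular behavior described above.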