6614 UDMA Support

Hi All, 

We are working with SC-MCSDK 2.0.0.9 on the TI 6614 EVM. We are using the syslib queue ring channel for communication between the DSP and the ARM. Syslib uses a UDMA channel underneath for this. On the ARM we sometimes see the following when the DSP sends two messages to the ARM, for example:

DSP sends A

DSP sends B

On the ARM:

ARM receives A

ARM receives a corrupt message instead of B: a message that the DSP sent to the ARM long ago. It is the message that was in the UDMA buffer the last time that buffer was used.

This happens randomly, and always on the ARM side. On the DSP side we have verified that the message and the Pktbuffer are sent correctly on the queue ring channel.

Could you please suggest how to move forward on this?

Regards,

Parshant

  • Hi,

    Is there a particular reason you are using this version of SC-MCSDK?

     

    Thanks

    Cesar

  • Hi Cesar,

    No specific reason. We started with this version and have made many changes of our own on top of it. We do not see any related fix
    in the UDMA user app, kernel, etc. Please let us know if this problem is addressed in a later version of syslib or the kernel.

    Regards,
    Parshant
  • I think this version is several years old. It is always possible that there are bug fixes in later versions.

    Syslib comes with a set of unit tests. Are you able to run these tests successfully?

    Thanks
    Cesar
  • Hi Cesar,

    Yes, the syslib tests work fine. Even our application runs fine. The problem occurs after many messages have been exchanged between the ARM and the DSP. There is a small protocol running between the ARM and the DSP on top of the queue ring channel:

    The ARM sends a request and puts a sequence number in the message, say 0 in the first packet.

    The DSP receives the packet, processes it, and sends a response carrying the same sequence number it received in the request, in this case 0.

    The ARM increments the sequence number on each request. Multiple requests may queue up at the DSP, and the DSP then processes all of them in one go and sends the responses back to the ARM. On the ARM side, the response receiver wakes up and starts matching each request with its corresponding response.

    In some scenarios we have observed that when the DSP submits two packets on the queue ring channel, the ARM receives two packets (Msgcom_queueRingGetBuffer) but the contents are not what was expected.

    One packet has the correct contents, while the other holds the contents from the last time that buffer was used, not what the DSP sent.

    We have also looked at the kernel UDMA driver and found that the contents are already corrupted at kernel level. It seems that something goes wrong in the DMA or in QMSS. We need some pointers on where to look. Why do the buffer contents change between the DSP and the ARM?

    On the DSP side, we have verified the packet before submission on the queue ring channel, as well as the packet contents in cache/DDR, and found them correct.

    Regards,

    Parshant
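A minimal sketch, in C, of the ARM-side request/response matching described in the post above. It is an illustration only, not code from syslib or MSGCOM: `seq_tracker_t`, `tracker_on_request`, `tracker_on_response` and `MAX_PENDING` are hypothetical names, and it assumes no more than `MAX_PENDING` requests are outstanding at once. The point is that a response carrying a sequence number that is not currently awaited (for example, one read from a buffer that still holds old contents) is flagged instead of being matched silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 16u  /* hypothetical limit on outstanding requests */

typedef enum { RSP_OK, RSP_STALE_OR_DUPLICATE } rsp_status_t;

typedef struct {
    uint32_t next_seq;                  /* sequence number for the next request */
    bool     pending[MAX_PENDING];      /* true while a reply is awaited in this slot */
    uint32_t pending_seq[MAX_PENDING];  /* sequence number awaited in this slot */
} seq_tracker_t;

static void tracker_init(seq_tracker_t *t)
{
    memset(t, 0, sizeof *t);
}

/* Call before sending a request; returns the sequence number to put in it. */
static uint32_t tracker_on_request(seq_tracker_t *t)
{
    uint32_t seq  = t->next_seq++;
    uint32_t slot = seq % MAX_PENDING;

    assert(!t->pending[slot]);  /* assumption: at most MAX_PENDING in flight */
    t->pending[slot]     = true;
    t->pending_seq[slot] = seq;
    return seq;
}

/* Call for every received response with the sequence number it carries. */
static rsp_status_t tracker_on_response(seq_tracker_t *t, uint32_t seq)
{
    uint32_t slot = seq % MAX_PENDING;

    if (!t->pending[slot] || t->pending_seq[slot] != seq)
        return RSP_STALE_OR_DUPLICATE;  /* not awaited: old or repeated contents */

    t->pending[slot] = false;
    return RSP_OK;
}
```

Logging every `RSP_STALE_OR_DUPLICATE` together with the buffer address would show whether the bad responses always come from buffers that were recycled recently.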

  • Hello All,
    Can anybody explain the root cause of this issue? I face the same issue on K2K too. I use MCSDK 3.0.3.15 with Syslib 3.0.1.0.

    I see two issues while running a continuous loopback IPC test (DSP<->ARM), a simple test to check reliability:
    1) Messages are randomly missed.
    2) On the ARM side, the bufferLen fetched by Msgcom_getMessage (ptrChannel->queueRingInfo.rxBuffer.bufferLen) randomly contains corrupted/stale data.

    Any clues on the root cause?
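One way to narrow down stale or corrupted contents like this is to make every message self-checking. The sketch below is a debugging aid under stated assumptions, not part of MSGCOM or syslib: the header layout, `MSG_MAGIC`, `POISON_BYTE` and the `msg_*` functions are all hypothetical, the checksum is a plain 32-bit FNV-1a, and both cores are assumed to use the same byte order. The sender seals each buffer with a magic word, sequence number, length and checksum; the receiver verifies them against the sequence number it expects; and the receiver poisons a buffer before recycling it, so a buffer that arrives still poisoned indicates the new data never became visible to the reader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_MAGIC   0x4D534743u  /* hypothetical marker word */
#define POISON_BYTE 0xDEu        /* hypothetical fill pattern for recycled buffers */

typedef struct {
    uint32_t magic;
    uint32_t seq;
    uint32_t payload_len;
    uint32_t checksum;           /* covers seq, payload_len and the payload */
} msg_hdr_t;

/* One FNV-1a pass over len bytes, continuing from hash state h. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

static uint32_t msg_checksum(uint32_t seq, const uint8_t *payload, uint32_t len)
{
    uint32_t h = 2166136261u;    /* FNV-1a 32-bit offset basis */
    h = fnv1a(h, &seq, sizeof seq);
    h = fnv1a(h, &len, sizeof len);
    return fnv1a(h, payload, len);
}

/* Sender: write header and payload into buf. Returns false if it does not fit. */
static bool msg_seal(uint8_t *buf, size_t buf_size, uint32_t seq,
                     const uint8_t *payload, uint32_t len)
{
    assert(buf != NULL && payload != NULL);
    if (buf_size < sizeof(msg_hdr_t) || len > buf_size - sizeof(msg_hdr_t))
        return false;

    msg_hdr_t hdr = { MSG_MAGIC, seq, len, msg_checksum(seq, payload, len) };
    memcpy(buf, &hdr, sizeof hdr);
    memcpy(buf + sizeof hdr, payload, len);
    return true;
}

/* Receiver: true only if the buffer holds an intact message with expected_seq. */
static bool msg_verify(const uint8_t *buf, size_t buf_size, uint32_t expected_seq)
{
    msg_hdr_t hdr;

    assert(buf != NULL);
    if (buf_size < sizeof hdr)
        return false;
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.magic != MSG_MAGIC || hdr.seq != expected_seq)
        return false;            /* poisoned, foreign, or stale contents */
    if (hdr.payload_len > buf_size - sizeof hdr)
        return false;            /* corrupted length field */
    return hdr.checksum == msg_checksum(hdr.seq, buf + sizeof hdr, hdr.payload_len);
}

/* Receiver: overwrite a buffer before returning it to the free pool. */
static void msg_poison(uint8_t *buf, size_t buf_size)
{
    assert(buf != NULL);
    memset(buf, POISON_BYTE, buf_size);
}
```

A stale buffer still carries a valid checksum for its old message, so the comparison against `expected_seq` is what catches it. On a cached core the poison pattern would also have to be written back to memory before the buffer is reused, otherwise the check says nothing about what the DMA wrote.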
  • Hi,

    Would it be possible to get your test case so we can reproduce the issue on our side?

    Thanks

    Cesar

  • Hello Cesar,

    We have implemented a communication mechanism using UDMA queueRing type channels.

    I am also seeing this kind of issue in our application. Specifically, I have observed that sometimes when the DSP sends two back-to-back packets to the ARM, the ARM receives two packets but their contents are exact replicas of each other. So only one packet is received correctly, while the other is corrupted.

    This behaviour is never observed on the DSP side, however. The DSP always receives the correct packet from the ARM.

    Can you please suggest how I can debug this issue further?

    Regards
    Naveen
  • Hi Cesar,

    I tried to reproduce this in the syslib unit test project, but couldn't.

    I observe this in a complex environment, where many more resources are in use and all cores are active. It is difficult to isolate the exact combination of resource usage that triggers the issue. How would you handle the case where the MSGCOM direct interrupt trigger is missed on the ARM side, and the case where the received queue ring data is stale/corrupt?

  • Hi,

    Are you using msgcom in a "data burst" environment?

     

    If the system is running out of buffers on the ARM side, you will see messages dropped. This is expected.

     

    Could you try to increase the number of buffers and check if the behavior changes?

     

    Thanks

    Cesar

  • Hello Cesar,

    On the DSP side we queue at most two packets for the ARM at a time.

    We do not suspect that the system is running out of descriptors/packets on the ARM (we have also tried with an increased number of packets).
    Also, we are not seeing any missing packets on the ARM; instead, the packet contents are corrupted. If the system were running out of packets, we would see missed packets.

    Regards
    Naveen
  • Hi Cesar,

    I am running a loopback test, where the next message is triggered only after receiving an ack from the ARM. Also, direct interrupt mode is configured, with one reader and one writer. The heap is also allocated with a large number of packets.

    I believe this rules out the "data burst" environment and the lack-of-buffers issue.

    I have another question: is it possible to make packet allocation from the heap round-robin? I understand it's not the optimal way, but I would like to know whether it can be achieved.
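To make the round-robin question concrete, here is a small sketch in C of what such an allocation policy means. It is not the syslib heap API and says nothing about whether the real heap can be configured this way: `rr_pool_t`, `rr_pool_alloc`, `rr_pool_free`, `POOL_SIZE` and `BUF_SIZE` are hypothetical. Free buffers are kept in a FIFO ring, so a freed buffer goes to the back and is handed out again only after every other free buffer has been used. That maximizes the time before any buffer is reused, which can help show whether the corruption is tied to quick reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4u   /* hypothetical number of buffers */
#define BUF_SIZE  64u  /* hypothetical buffer size in bytes */

typedef struct {
    uint8_t  storage[POOL_SIZE][BUF_SIZE];
    uint8_t *free_ring[POOL_SIZE];  /* FIFO of free buffers */
    uint32_t head;                  /* index of the next buffer to hand out */
    uint32_t count;                 /* number of free buffers in the ring */
} rr_pool_t;

static void rr_pool_init(rr_pool_t *p)
{
    for (uint32_t i = 0; i < POOL_SIZE; i++)
        p->free_ring[i] = p->storage[i];
    p->head  = 0;
    p->count = POOL_SIZE;
}

/* Take the oldest free buffer, or NULL if the pool is empty. */
static uint8_t *rr_pool_alloc(rr_pool_t *p)
{
    if (p->count == 0)
        return NULL;

    uint8_t *buf = p->free_ring[p->head];
    p->head = (p->head + 1u) % POOL_SIZE;
    p->count--;
    return buf;
}

/* Return a buffer to the back of the ring so it is reused last. */
static void rr_pool_free(rr_pool_t *p, uint8_t *buf)
{
    assert(buf != NULL);
    if (p->count == POOL_SIZE)
        return;  /* ring already full: ignore a spurious extra free */

    uint32_t tail = (p->head + p->count) % POOL_SIZE;
    p->free_ring[tail] = buf;
    p->count++;
}
```

With a last-in, first-out free list the buffer just released would be the next one handed out; with the FIFO ring above, allocating and freeing one buffer at a time cycles through all `POOL_SIZE` buffers before repeating.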