PROCESSOR-SDK-AM437X: Randomly occurring Exceptions relating to Null Pointer in DMA Callback Function

Part Number: PROCESSOR-SDK-AM437X

Hi All,

We are using PDK1.0.4 with selected files from PDK1.0.7 in order to add support for SPI DMA receive, which the PDK1.0.4 release did not support. (TI-RTOS, not Linux.)

We have not adopted a later PDK version in its entirety because we made significant custom modifications to the PDK1.0.4 release, which means that for the most part we are stuck with it.

We have the SPI DMA transmit on one processor working nicely, and now have the SPI DMA receive working on the other processor connected via SPI.

Everything seems fine for an apparently random time, between a few seconds and 70 minutes (the longest we have seen the system run before crashing). The DMA callback routine executes many times (the normal rate in our test code is 10 Hz), so a failure after 70 minutes comes after 42,000 successful calls.

No other events are occurring which could be affecting this.

Are there any known problems with PDK1.0.7 which might be causing this behaviour?

Any pointers or suggestions would be most welcome.

regards

Paul Jacomb

 

  • Paul,

    Could you please indicate whether the driver is set up in SPI master mode or slave mode? Also, does the issue occur at a particular SPI speed? Please describe the master/slave setup and the device/hardware used in it.

    I am checking for known issues with SPI DMA in the version you indicated. I recommend that you look at the commits for SPI_v1.c in the git repo to see whether your version needs one of these fixes.

    git.ti.com/.../SPI_v1.c

    Hope this helps.

    Regards,
    Rahul
  • Hi Rahul,

    Thanks for showing an interest! Both the master and slave ends have been observed crashing in this way. We have varied the SPI clock rate from 3 MHz up to 16 MHz, and this does not appear to affect the outcome.


    We are using AM4377BZDNA80 processors on each end. Further investigation shows that the crash only seems to occur when both SPI0 and SPI1 are active, with one processor as master on SPI0 and the other processor as master on SPI1.

    Hope this helps.


    regards


    Paul Jacomb
  • Paul,

    I am working with the developer to evaluate whether this behavior has previously been reported. While I didn't come across an issue with the SPI driver, it does appear that such an issue was recently reported with our UART driver:

    e2e.ti.com/.../2814893

    This makes me think that there may be a similar issue that we need to check for in other drivers. I will let you know if this is confirmed. In the meantime, can you look at the fix provided for the UART driver and see if it works for you with MCSPI? The fix required the driver to record the TCC being used with each instance.

    Regards,
    Rahul
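
The idea behind the UART fix mentioned above, recording the TCC used by each driver instance, can be sketched roughly as follows. This is a minimal illustration and not the PDK's code: `spi_instance_t`, `spi_bind_rx_tcc`, `spi_dispatch_completion` and `count_done` are hypothetical names, and the only hardware fact assumed is that an EDMA3 completion code is a 6-bit value (0 to 63). Each instance remembers its own completion code, and the dispatcher refuses to call through a missing or mismatched entry instead of dereferencing a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TCC 64u   /* assumption: EDMA3 completion codes are 6 bits wide (0..63) */

typedef void (*dma_done_fn)(void *arg);

/* Hypothetical per-instance driver state; not the PDK's actual structure. */
typedef struct {
    uint8_t     rx_tcc;   /* completion code recorded when the channel was set up */
    dma_done_fn rx_done;  /* completion handler for this instance */
    void       *rx_arg;   /* argument passed back to the handler */
} spi_instance_t;

/* One slot per completion code, so each completion is routed to its owner. */
static spi_instance_t *tcc_owner[MAX_TCC];

/* Record which instance owns a completion code. Returns 0 on success, -1 on error. */
int spi_bind_rx_tcc(spi_instance_t *inst, uint8_t tcc, dma_done_fn fn, void *arg)
{
    if (inst == NULL || fn == NULL || tcc >= MAX_TCC) {
        return -1;
    }
    if (tcc_owner[tcc] != NULL && tcc_owner[tcc] != inst) {
        return -1;  /* code already owned by another instance */
    }
    inst->rx_tcc  = tcc;
    inst->rx_done = fn;
    inst->rx_arg  = arg;
    tcc_owner[tcc] = inst;
    return 0;
}

/* Called from the completion interrupt with the code that fired.
 * Returns 0 if a handler ran, -1 if the completion was dropped. */
int spi_dispatch_completion(uint8_t tcc)
{
    if (tcc >= MAX_TCC) {
        return -1;
    }
    spi_instance_t *inst = tcc_owner[tcc];
    if (inst == NULL || inst->rx_done == NULL || inst->rx_tcc != tcc) {
        return -1;  /* never call through a null or stale entry */
    }
    inst->rx_done(inst->rx_arg);
    return 0;
}

/* Example handler for illustration: counts completions via its argument. */
void count_done(void *arg)
{
    ++*(int *)arg;
}
```

With two instances bound to different codes, a completion for one code only ever reaches the instance that recorded it, and an unbound code is reported as dropped rather than dispatched.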
  • Hi Rahul,

    Thanks for your contribution. We are still experiencing the problem, which still occurs randomly after between 0 and about 90 minutes. Further investigation has shown that the problem always occurs on the receive side rather than the transmit side.

    We were using SPI0 and SPI1 simultaneously, with CPU-A being master on SPI0 and slave on SPI1, and vice versa for CPU-B. While experimenting with both SPIs active, we saw these exceptions occurring on both ends. However, when we restrict activity to one SPI only and disable the other, we find that the problem only occurs on the receiving end.


    Can you please expand on what the TCC is?


    Many thanks


    Paul Jacomb
  • Paul,

    TCC is the transfer completion code used in the EDMA: the value in a transfer's parameter set that selects which completion-interrupt bit is set when that transfer finishes.

    Regards,
    Rahul
  • Rahul,


    Thanks for your contribution to this problem.


    We have resolved the issue. The cause was a circular buffer (used for supplying messages to the SPI) that was not being handled correctly. It occasionally produced messages of illegal length, which caused the DMA to write to areas of memory vital to the normal operation of the system.


    Our board has been running now for nearly 18 hours without any fault.


    Many thanks


    Paul Jacomb
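
The thread does not show the circular buffer or how it was fixed, so the following is only a sketch of the general safeguard the resolution points to: validate a message's length both when it is queued and again when it is taken out, before that length is handed to the DMA. Everything here is hypothetical, including the names `ring_t`, `ring_put_msg`, `ring_get_msg`, the sizes `RING_SIZE` and `MSG_MAX_LEN`, and the one-byte length header. The code is single-context and would need protection if used from both a task and an interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE   256u  /* hypothetical capacity; must be a power of two */
#define MSG_MAX_LEN 64u   /* hypothetical largest legal message and DMA buffer size */

typedef struct {
    uint8_t  data[RING_SIZE];
    uint32_t head;  /* total bytes written (free-running counter) */
    uint32_t tail;  /* total bytes read (free-running counter) */
} ring_t;

static uint32_t ring_used(const ring_t *r)
{
    return r->head - r->tail;
}

/* Queue one message as [len][payload]. Rejects illegal lengths and
 * messages that would not fit whole, so no partial message is stored. */
bool ring_put_msg(ring_t *r, const uint8_t *msg, uint32_t len)
{
    if (len == 0u || len > MSG_MAX_LEN) {
        return false;
    }
    if (RING_SIZE - ring_used(r) < len + 1u) {
        return false;
    }
    r->data[r->head++ % RING_SIZE] = (uint8_t)len;
    for (uint32_t i = 0; i < len; i++) {
        r->data[r->head++ % RING_SIZE] = msg[i];
    }
    return true;
}

/* Copy the next message into a DMA buffer of MSG_MAX_LEN bytes.
 * Returns the length to program into the DMA, or 0 if nothing valid is queued. */
uint32_t ring_get_msg(ring_t *r, uint8_t out[MSG_MAX_LEN])
{
    if (ring_used(r) == 0u) {
        return 0u;
    }
    uint32_t len = r->data[r->tail % RING_SIZE];
    /* Re-check the stored length: a bad header must never reach the DMA. */
    if (len == 0u || len > MSG_MAX_LEN || len + 1u > ring_used(r)) {
        r->tail = r->head;  /* discard the queued bytes and resynchronise */
        return 0u;
    }
    r->tail++;
    for (uint32_t i = 0; i < len; i++) {
        out[i] = r->data[r->tail++ % RING_SIZE];
    }
    return len;
}
```

Because the returned length can never exceed `MSG_MAX_LEN`, a transfer sized from it cannot run past a buffer of that size, even if the stored header has been corrupted.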