AM625: SPI with DMA and bpw != 8: stuck in 'ioctl' indefinitely

Part Number: AM625

When using SPI with DMA and spidev, ioctl never returns under certain conditions.

The easiest way to replicate this is to use a transfer size equal to the spidev max transfer size (4096 by default).

If you instead use a smaller transfer size in order to avoid this deadlock, it will just deadlock some time later (after some random number of transfers).

Steps to replicate (this example uses spi0):

1. Add DMA to SPI in device tree, if it isn't already:

	dmas = <&main_pktdma 0xc300 0>, <&main_pktdma 0x4300 0>;
	dma-names = "tx0", "rx0";

2. Create a zero-filled file of 4096 bytes to use as the test transfer:

dd if=/dev/zero of=zeros bs=4096 count=1

3. Run spidev_test:

spidev_test -D /dev/spidev0.0 -i zeros -b 32

After this, you never get back to the shell; the process is stuck somewhere in the ioctl call. My guess is an issue in the SPI/DMA/spidev driver interrupt handling, such as a timing problem or race condition.
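A minimal sketch for scripting step 3 so that a hang is reported instead of blocking the shell. The helper names `proc_state` and `run_with_deadline`, the 10-second deadline, and the return code 124 are assumptions for illustration and are not part of spidev_test. The stuck process is deliberately not killed, since a task blocked inside the kernel may not react to signals.

```shell
#!/bin/sh
# Sketch: run a command in the background and report whether it is still
# running after a deadline. Helper names and values are hypothetical.

# Print the one-letter scheduler state of a PID (R, S, D, Z, ...) from /proc.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# run_with_deadline SECONDS CMD [ARGS...]
# Returns the command's exit status if it finishes in time, 124 otherwise.
run_with_deadline() {
    deadline=$1; shift
    "$@" &
    pid=$!
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$deadline" ]; then
            echo "pid $pid still running after ${deadline}s, state: $(proc_state "$pid")" >&2
            # Where the task is blocked in the kernel; may need root or be unavailable.
            cat "/proc/$pid/wchan" 2>/dev/null >&2; echo >&2
            cat "/proc/$pid/stack" 2>/dev/null >&2
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid"
}

# Usage on the target (commented out so sourcing this file has no side effects):
# run_with_deadline 10 spidev_test -D /dev/spidev0.0 -i zeros -b 32
```

A return value of 124 together with a state of `D` or `R` would suggest the process is stuck inside the kernel rather than waiting in user space.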

  • Sounds like the issue happens when DMA is NOT used?

    What happens when you use a size less than 4096?

    Regards, Andreas

  • Andreas,

    The issue doesn't happen when the DMA is not present in the SPI node of the device tree.

    If you use a size less than 4096, the issue does not appear in a single transfer.

    In my application, which has many transfers happening all the time, the issue appears after some random amount of time has passed. None of the transfers in my application are 4096 bytes, as that would make the issue appear in the very first transfer.

    The issue appears much sooner when using RT Linux. This makes me think the problem is a timing issue (race condition) involving SPI/DMA/IRQ.

    But, regardless, you can replicate the issue as I described above.

  • Ok thanks for the extra background, looks like something is not going well here.

    What exact Kernel version (tag, or associated SDK version) are you using?

    Have you made any changes to the Kernel besides the usual DTS or CONFIG changes?

    Regards, Andreas

  • Using VAR-SOM-AM62, am62-yocto-kirkstone-6.1.83_09.02.01.10-v1.0, which is based on TI SDK 09_02_01_10.

    The only change I make for this test is adding the DMA to the SPI in the device tree. Otherwise it is a very basic image (var-thin-image) for the VAR-SOM-AM62. No other changes whatsoever, in order to provide the cleanest test bed possible.

    I think if you take your EVK and make the changes I described (add DMA to the SPI, and add a spidev device if not already present), and perform the test as above, you can easily replicate the issue.

  • I think if you take your EVK and make the changes I described (add DMA to the SPI, and add a spidev device if not already present), and perform the test as above, you can easily replicate the issue.

    Yes that's what I was thinking, I'll give this a shot, probably tomorrow Friday or Monday. If I can find a way to re-create this the analysis will be a lot easier.

    Regards, Andreas

  • Andreas,

    Have you replicated the issue on the TI EVK, following the steps outlined?

  • Andreas,

    I get daily lockups with this SPI ioctl call issue, which require a hardware reset to resolve.
    Have you replicated the issue using the method outlined in my original post?

    Using word sizes other than 8 bits, I seem to be able to re-create what you were seeing, such as the hang with `-b 32` that you reported. With `-b 16` it also runs into some (different) issue...

    root@am62xx-evm:~# ./spidev_test -D /dev/spidev1.0 -i zeros -b 16
    spi mode: 0x0
    bits per word: 16
    max speed: 500000 Hz (500 kHz)
    [  172.775441] spidev spi1.0: DMA RX last word empty
    
    (...hangs...)

    It could be the issue so far has gone undetected because most people use 8-bit transfers. I can also see the McSPI peripheral driver has quite a bit of special handling for splitting/managing 16 and 32-bit wide words in the context of the DMA; I wonder if something in the logic there isn't 100% correct.

    Why are you using 32-bit wide words, and how critical is this for you to resolve? If it is to boost effective throughput (by minimizing inter-byte gaps) you can also have a look at this E2E FAQ here: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1356551/faq-am6x-optimizing-spi-transfer-inter-byte-gaps-using-the-dma-in-linux for some other methods to do that.

    I can file an internal bug report to have this investigated, but those things usually don't go very quickly.

    Regards, Andreas

  • It's pretty important for me. But, I think it should work regardless, or just go ahead and say it is not supported since it definitely doesn't work.

  • It's pretty important for me. 

    Can you please explain why (your exact use case)? A better understanding of what's behind this may help to prioritize a fix.

    Regards, Andreas

  • Throughput and less CPU usage. But, let's drop that for now.

    The same problem occurs with 8 bit transfers. It isn't so easily reproducible as the 16/32 bit case as shown in my original post. But in my application, after some hours, etc. the ioctl call never returns.

    I will try to create a single-file C application to replicate the issue. Then I will post the binary and source here for you to try on the TI EVK.

  • Throughput and less CPU usage. But, let's drop that for now.

    If DMA is used it shouldn't really matter; I'm curious if you collected any data to back this up.

    The same problem occurs with 8 bit transfers. It isn't so easily reproducible as the 16/32 bit case as shown in my original post. But in my application, after some hours, etc. the ioctl call never returns.

    Ok, that's not something we have heard of I think. We'll need to investigate. There's quite a few folks using DMA & 8-bit transfer without apparent issues.

    I will try to create a single-file C application to replicate the issue. Then I will post the binary and source here for you to try on the TI EVK.

    Thank you, that'll be critical to be able to analyze/debug this. Also please keep an eye on the Kernel log at the moment of the "crash" / hang, in case there's anything of interest.

    Regards, Andreas
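One way to keep the kernel log in view is to stream it to a file and filter it for relevant lines after a hang. This is only a sketch: `filter_spi_dma_lines` is a hypothetical helper, the log path is a placeholder, and the keyword list is an assumption rather than a complete set of SPI or DMA driver messages.

```shell
#!/bin/sh
# Sketch: keep only kernel log lines that mention SPI, DMA, or a blocked task.
# The keyword list is an assumption and will likely need tuning.
filter_spi_dma_lines() {
    grep -iE 'spi|dma|blocked for more than'
}

# Usage on the target (commented out; needs root and a dmesg with --follow support):
# dmesg --follow > /tmp/kernel.log &
# ... reproduce the hang ...
# filter_spi_dma_lines < /tmp/kernel.log
# echo w > /proc/sysrq-trigger   # if SysRq is enabled, logs tasks in uninterruptible sleep
```

Applied to the log shown earlier in this thread, the filter would keep the `DMA RX last word empty` line and drop unrelated messages.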

  • Andreas,

    I have figured out how to replicate the issue with a very simple application.

    The key, which took a while to realize, is that it requires an Ethernet load to show up.

    Since my test application was just a loop doing SPI reads, it was not revealing any issue. But, while comparing to my own application, I noticed it would happen when I was connected to the HTTP server.

    So, in order to replicate the issue, you need to run some ethernet load alongside the test application. The easiest way to do this is with iperf3:

    iperf3 -c IP_ADDRESS -b 100000000 -t 9999

    Where IP_ADDRESS is the IP address of a machine running iperf3 as a server.

    So, the complete test setup looks like this:
    - SSH session with htop open
    - SSH session with iperf3 running
    - Serial terminal open, launch the test application

    And then observe the test application hit 100% CPU load in htop. At this point the hardware is in an unrecoverable state, and requires a reboot.
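The 100% CPU observation can also be scripted rather than read from htop. A rough sketch, assuming a Linux /proc filesystem: `cpu_ticks` and `cpu_percent` are hypothetical helper names, the fallback of 100 ticks per second is an assumption, and the binary name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: estimate a process's CPU usage by sampling /proc/PID/stat twice.

# cpu_ticks PID: user + system CPU time consumed so far, in clock ticks.
cpu_ticks() {
    [ -r "/proc/$1/stat" ] || return 1
    # Drop everything up to the closing ')' of the comm field, which may
    # contain spaces. State is then field 1, so utime and stime are 12 and 13.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ printf "%d\n", $12 + $13 }'
}

# cpu_percent PID [SECONDS]: CPU usage over the interval (whole seconds only).
# Multi-threaded processes can report more than 100.
cpu_percent() {
    pid=$1
    interval=${2:-1}
    hz=$(getconf CLK_TCK 2>/dev/null || echo 100)   # assumed fallback value
    before=$(cpu_ticks "$pid") || return 1
    sleep "$interval"
    after=$(cpu_ticks "$pid") || return 1
    echo $(( (after - before) * 100 / (hz * interval) ))
}

# Usage on the target (commented out; the binary name is a placeholder):
# ./spi_dma_test &
# while sleep 5; do echo "cpu: $(cpu_percent $! 1)%"; done
```

A value that jumps to roughly 100 and stays there would match the stuck state described above.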

    I've attached the source code for the minimal test application, along with the binary.
    It may or may not be necessary to run RT Linux for this issue to show up (or to show up more quickly).

    spi_dma_test.zip

  • Hi,

    Thanks for the additional info and debug work. What you are going through sounds very laborious, so thank you for the energy and patience you have invested in this. I'll review your materials tomorrow, see if I can re-create the issue, and report back.

    At this point the hardware is in an unrecoverable state, and requires a reboot.

    The system should never get into this state, we need to analyze and fix this.

    Regards, Andreas

  • To be clear, by unrecoverable I mean you cannot stop the execution of the application using SPI that has become stuck.

    No signal will stop that thread/app. You can still "do other things".

    I have this set up now like you said and it's running. Let me see how long it runs...

    I'm using the regular (non-RT) Kernel at the moment.

    Regards, Andreas

  • So I had it run overnight with no issues, it's still going strong. So let me build/deploy an RT kernel to this setup and try again to see if that has any impact.

    Regards, Andreas

  • I have your test app running now using an RT kernel I built based on the SDK v10.0 release. And doing the iperf3 client/server, as well as htop -- all same as during the previous test. Let me run this overnight and report back.

    Regards, Andreas