CC1310 SPI slave - hangs

Other Parts Discussed in Thread: CC1310, EK-TM4C1294XL, EK-TM4C129EXL

I am using the TI-RTOS (2_21_00_06) SPI driver with the CC1310 configured as an SPI slave.

void appLanTaskFunction (UArg arg0, UArg arg1)
{
    SPI_Handle handle;
    SPI_Params params;
    SPI_Transaction transaction;
    uint8_t buf[6];

    // Init SPI and specify non-default parameters
    SPI_Params_init(&params);
    params.bitRate             = 1000000;
    params.frameFormat         = SPI_POL0_PHA1;
    params.mode                = SPI_SLAVE;
    params.transferMode        = SPI_MODE_BLOCKING;
    params.transferTimeout     = SPI_WAIT_FOREVER;

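    handle = SPI_open(Board_SPI0, &params);
    if (handle == NULL)
    {
        return;    // SPI_open failed, nothing more to do in this task
    }
    // Return early from SPI_transfer when the master releases chip select
    SPI_control(handle, SPICC26XXDMA_RETURN_PARTIAL_ENABLE, NULL);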

    while (1)
    {
        memset(buf, 0, sizeof(buf));
        transaction.count = sizeof(buf);
        transaction.txBuf = NULL;    // receive only
        transaction.rxBuf = buf;

        SPI_transfer(handle, &transaction);

        asm("nop");
    }
}

When the SPI master sends more than sizeof(buf) bytes, the CC1310 hangs.
The hang occurs only if RETURN_PARTIAL is enabled (I need this option).
The hang also occurs with transferMode = SPI_MODE_CALLBACK.

I set a breakpoint on asm("nop"). When the master sends fewer than 6 bytes, the program reaches the breakpoint.
If the master sends more than 6 bytes, SPI_transfer never unblocks my task and the program never reaches the breakpoint, even if the master's next transaction is shorter than 6 bytes.

Please help me protect against this hang.

  • Ilya,

    My understanding is that your SPI frame size is 8-bits and the transfer is configured for 6 frames (i.e. 6 bytes). The SPI master should not send more than 6 frames on any given transfer. It sounds to me that the SPI master is sending a larger transfer.

    Make sure that your SPI master makes transfers no larger than 6 frames at a time and allows some time between transfers for the SPI slave to prepare the next transfer buffer.

    ~Ramsey

  • Ramsey,

    Yes, the hang occurs when the SPI master sends more than 6 frames in one transfer.
    I know that the SPI master should not send more than 6 frames in any given transfer.

    Do you think a hang is normal? A simple buffer overflow in the TI-RTOS drivers crashes the system.
    My device must be stable regardless of the number of frames sent by the SPI master.
    Please help me protect against this hang.
  • Ilya,

    Okay, now I understand. You are testing the robustness of the SPI driver by having the master intentionally send a larger transfer than is expected by the slave. Have I got this correct?

    I will have to investigate this issue. I would expect the slave to experience a FIFO overflow if the master sent sufficient data. But if the master sent only one extra frame, that data would probably be in the SPI FIFO and not cause any issue. I'm not sure how the slave will detect this and recover.

    I will set up a test case and try to reproduce your failure. Certainly, a system hang would be considered a bug. But I'm not sure what protocol to use between the master and slave to recover from this. Do you have any suggestions? Something simple, just for testing purposes.

    ~Ramsey

  • Ramsey, you understand correctly: the master intentionally sends a larger transfer than the slave expects.
    A single oversized transfer from the master is enough to hang the slave.
    I do not use any protocol between the master and slave; in my test the master sends the buffer {0x00, 0x01, 0x02, 0x03, 0x04, 0x06, 0x07}.
  • Ilya,

    I want to give you an update. I have reproduced the same behavior you are reporting. I don't have the same device as you (CC1310). I am using EK-TM4C1294XL as master and EK-TM4C129EXL as slave. But when I send an SPI transfer from master to slave with 1 frame more than expected, the slave locks up in the Hwi dispatcher.

    I will try to reproduce on CC1310 as well. I will keep you posted.

    ~Ramsey

  • Ilya,

    Here is another update. I have been able to reproduce the issue using two CC1310 LaunchPads. We are studying the results.

    ~Ramsey

  • Subscribing to this, and very interested in the resolution. I've been fighting something like this for a while on the Tiva eval board. My impression is that something changed in the SPI driver, or perhaps more specifically in the SPITivaDMA code, between 2.12 and 2.16. It has turned an application that could deal gracefully with overruns into something much more fragile and very difficult to debug.

  • Ilya,

    Here are the results of our failure analysis. As you already know, the SPI driver uses the uDMA controller to assist with the data transfer. The uDMA controller uses the transaction count to know when the transaction has completed. As such, when a short transaction is issued by the SPI Master, the uDMA controller would wait forever.

    When SPICC26XXDMA_RETURN_PARTIAL_ENABLE is used, the SPI driver enables an interrupt on the SPI nCS pin. When the SPI Master releases the chip select signal, the SPI Slave will receive an interrupt which the SPI driver uses to cancel the current DMA transaction and notify the application that the SPI transaction has completed.

    The problem occurs when the SPI Master sends a full transaction. Right after the SPI Master has sent the last frame, it releases the chip select signal. On the SPI Slave, the last frame is still in the Rx FIFO when the nCS pin interrupt is taken. This interrupt is handled by the PIN driver ISR which schedules a Swi. (All ISRs registered with the PIN driver are executed in Swi context.) While this is happening, the uDMA controller has transferred out the last frame from the Rx FIFO and raised its completion signal to the SPI Slave, which then raises its own interrupt. As the PIN ISR unwinds, the Hwi dispatcher invokes the SPI ISR (SPICC26XXDMA_hwiFxn), which also schedules a Swi.

    With both interrupts serviced, the SYS/BIOS scheduler starts running the pending Swi objects. The first Swi to run is the PIN Swi, which ultimately calls SPICC26XXDMA_csnDeassertCallback. This cancels the current uDMA transfer (which has already completed by this time) and modifies the SPI object state by setting the current transfer pointer to NULL. Then the next Swi runs, which calls SPICC26XXDMA_swiFxn. This function has an unguarded de-reference through the current transfer pointer, which is NULL at this time. This is a posted-write operation which ultimately causes a bus fault.

    I also investigated your test case, as I understand it. The sequence is slightly different but with the same end result. In your test case, you are sending short-transfers from master to slave. At the end of the short transfer, the SPI Slave uDMA controller is still waiting for the full transfer. However, the nCS pin interrupt is raised when the SPI Master releases the chip select. This causes the SPI Slave driver to cancel the uDMA transaction and notify the application that the SPI transaction has completed. All is well.

    However, when the SPI Master sends an overrun transaction (i.e. more frames than expected), the following happens. Once the SPI Slave has received the expected frame count, the uDMA controller counter has counted down to zero, which raises the completion signal to the SPI Slave. This raises the SPI interrupt, which posts the SPI Swi (SPICC26XXDMA_swiFxn). The Swi starts running. While this is happening, the SPI Master continues to send the extra frames (3 in my test). This causes the SPI Slave controller to detect an overrun and raise the SPI interrupt again. At this point, the SPI ISR will preempt the running SPI Swi. The exact timing will vary depending on your frame size. For me, the overrun interrupt preempts the SPI Swi very early. The SPI ISR detects that an overrun has occurred and cancels the uDMA transaction. It also drains the Rx FIFO and clears the current transfer pointer in the SPI object. When the SPI ISR completes, execution unwinds back into the preempted SPI Swi. The Swi then de-references the current transfer pointer, which is now NULL, and that causes a bus fault, same as before.

    I will report this bug to the driver team for resolution. I'm also trying to find a "quick fix" for you. Unfortunately, everything I've tried so far does not work very well. I'll keep you posted.

    ~Ramsey
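
The race described in the analysis above (one context clears the current transfer pointer while the completion Swi is about to de-reference it) can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the TI driver code and not necessarily what any fix does: the types `Xfer` and `DriverObj` and the functions `cancel_transfer`, `complete_transfer`, `enter_critical` and `exit_critical` are hypothetical names invented for illustration. On target, the critical section would have to really mask interrupts, for example with SYS/BIOS `Hwi_disable()`/`Hwi_restore()`.

```c
#include <stddef.h>

/* Hypothetical stand-ins. On target these would mask and restore interrupts,
 * e.g. with Hwi_disable()/Hwi_restore(); here they are no-ops for the model. */
static unsigned enter_critical(void) { return 0; }
static void exit_critical(unsigned key) { (void)key; }

typedef enum { XFER_PENDING, XFER_DONE, XFER_CANCELLED } XferStatus;

typedef struct {
    size_t     count;    /* frames requested */
    XferStatus status;
} Xfer;

typedef struct {
    Xfer *volatile current;   /* NULL when no transfer is in flight */
} DriverObj;

/* Models the chip-select / overrun path: cancel the transfer and clear the pointer. */
static void cancel_transfer(DriverObj *obj)
{
    unsigned key = enter_critical();
    if (obj->current != NULL) {
        obj->current->status = XFER_CANCELLED;
        obj->current = NULL;
    }
    exit_critical(key);
}

/* Models the completion Swi: claim the pointer atomically, then touch only the
 * local copy. Returns 1 if a transfer was completed, 0 if there was none. */
static int complete_transfer(DriverObj *obj)
{
    unsigned key = enter_critical();
    Xfer *xfer = obj->current;   /* snapshot */
    obj->current = NULL;         /* claim it so nobody else finishes it */
    exit_critical(key);

    if (xfer == NULL) {
        return 0;                /* already cancelled elsewhere: nothing to do */
    }
    xfer->status = XFER_DONE;
    return 1;
}
```

In this model, whichever path runs first takes ownership of the transfer and the other path sees NULL and backs off, so neither de-references a cleared pointer.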

  • Ramsey, thanks for the information. I will wait for the decision of this problem.
  • Robert,

    Some drivers, including SPI, had significant changes between TI-RTOS 2.12 and TI-RTOS 2.16. To identify the exact changes, I would need to know the exact software releases you are using, which drivers, and which Tiva device. From what I've learned by working this thread, the Tiva and CC1310 SPI drivers are quite different. Although I was able to reproduce the same failure on both devices, the respective bugs turned out to be completely unrelated.

    ~Ramsey

  • Ilya,

    I have a "quick fix" for you regarding your particular use-case. Knowing that the issue is a race condition between two interrupts and Swis, I've modified the driver slightly to guard against concurrency issues. My testing is yielding good results, but not perfect. My test case mixes up short-, full-, and overrun-transactions from Master to Slave (I have not done any such testing in the opposite direction). 

    For short-transactions, the slave receives the data correctly.

    For overrun-transactions, the slave receives the data as if it was a full-transaction. In other words, the slave will return from SPI_transfer with a full buffer of data. The slave is unaware that the master had sent more data. The SPI driver discards the overrun data and prepares the SPI controller for the next transaction. You will need to pass some data (maybe buffer size) in your payload in order to detect this situation in your application.

    For full-transactions, the results are mixed. I've observed about a 50% failure rate. In the failure case, the driver drops the last frame. This comes about from the race condition between the uDMA completion signal and the SPI nCS signal. Both of these signals are asserted very close to each other making the timing very sensitive.

    My suggestion is to avoid using full-transfers. It sounds like you require RETURN_PARTIAL to be enabled. Maybe you can configure your system with a little headroom in your transaction buffer size. This would allow you to use short-transactions reliably. The buffer size check would give you robustness in the case of overrun-transactions.

    I've attached my modified copy of the SPI driver. I've added some more instrumentation. I've also commented my changes with "QUICK FIX" for easy identification. Add this file to your CCS project which builds your executable and it will be used instead of the one from the library. Or you can review the TI-RTOS User Guide Section 8 Rebuilding TI-RTOS if you want to rebuild the drivers.

    5582.SPICC26XXDMA.c

    I hope this helps you make progress. Good luck.

    ~Ramsey
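
The suggestion above (leave headroom in the receive buffer and carry a size field in the payload) could look roughly like the following check on the slave side. This is a minimal sketch under assumptions that are not part of the thread: the first byte of every message is assumed to hold the payload length, `MAX_PAYLOAD` and `RX_BUF_SIZE` are illustrative values, and `check_frame` and `FrameStatus` are hypothetical names. The `received` argument is the number of bytes the slave actually got for the transaction, however the application obtains it.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD  6u                        /* largest legal payload (assumed) */
#define RX_BUF_SIZE  (1u + MAX_PAYLOAD + 1u)   /* length byte + payload + 1 byte headroom */

typedef enum { FRAME_OK, FRAME_EMPTY, FRAME_OVERRUN, FRAME_BAD_LENGTH } FrameStatus;

/* Validate one received message. Because every legal message is shorter than
 * the buffer, a completely filled buffer means the master sent too much. */
static FrameStatus check_frame(const uint8_t *buf, size_t received)
{
    if (received == 0) {
        return FRAME_EMPTY;
    }
    if (received >= RX_BUF_SIZE) {
        return FRAME_OVERRUN;
    }
    if (buf[0] > MAX_PAYLOAD || (size_t)buf[0] + 1u != received) {
        return FRAME_BAD_LENGTH;   /* header disagrees with what arrived */
    }
    return FRAME_OK;
}
```

With this layout, every valid message is a short transaction from the driver's point of view, which is the case reported above as working reliably, and an overrun shows up as a full buffer that the application can reject.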

  • Ilya,

    One more important item I forgot to mention. In your code fragment above, you are calling SPI_control to enable partial transaction return. You make this call before your main loop. However, this control setting is *not* persistent. It is activated only for the next transaction. If you want all your transactions to have this enabled, you must move the call to SPI_control inside your loop and call it before *every* call to SPI_transfer.

    I believe this is a bug. I think the setting should be persistent. Also, why have a disable function if the setting is not persistent? However, in the current implementation, the partial return is disabled in the nCS callback. So, you must re-enable it before every transfer.

    ~Ramsey
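
The one-shot behaviour described above can be modelled on a host to show why the enable call has to precede every transfer. The code below is a simplified model, not the TI driver: `MockSpi`, `mock_enable_return_partial`, `mock_transfer` and `transfer_with_partial` are hypothetical names, and the model simply clears the setting after each transfer, where the real driver is said to clear it in the nCS callback. A return value of -1 stands for a transfer that would never complete.

```c
#include <stdbool.h>
#include <stddef.h>

/* Host-side model only: mimics the non-persistent setting described above. */
typedef struct {
    bool return_partial;   /* consumed by the next transfer */
} MockSpi;

static void mock_enable_return_partial(MockSpi *spi)
{
    spi->return_partial = true;
}

/* The slave expects 'expected' frames and the master sends 'sent' frames.
 * Returns the frames delivered, or -1 if the transfer would block forever. */
static int mock_transfer(MockSpi *spi, size_t expected, size_t sent)
{
    bool partial = spi->return_partial;
    spi->return_partial = false;          /* setting does not persist */
    if (sent >= expected) {
        return (int)expected;
    }
    return partial ? (int)sent : -1;
}

/* Pattern under discussion: re-arm partial return before every transfer. */
static int transfer_with_partial(MockSpi *spi, size_t expected, size_t sent)
{
    mock_enable_return_partial(spi);
    return mock_transfer(spi, expected, sent);
}
```

On target, the equivalent of `transfer_with_partial` would be calling `SPI_control(handle, SPICC26XXDMA_RETURN_PARTIAL_ENABLE, NULL)` immediately before each `SPI_transfer(handle, &transaction)` inside the task loop.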

  • Ramsey,

    "I believe this is a bug. I think the setting should be persistent."

    Yes, this bug was introduced in TI-RTOS 2.21 by TIDRIVERS-556, and I asked for it to be re-opened last week.

  • Please tell me, when will a TI-RTOS release with a fix for this bug be available?

  • Does anyone know when this hang will be fixed?
  • Ilya,

    I apologize for abandoning this thread.

    The driver bug regarding this issue has been fixed. Unfortunately, the fix is not yet available in a product release. Furthermore, the new product will be incompatible with your existing TI-RTOS CC13xx/CC26xx 2.21.00.06 release. Therefore, I have requested a patch for this fix which you will be able to use with your existing TI-RTOS release. I hope it will be only a few days until this patch is available. Once the patch is available, I will post it to this thread.

    ~Ramsey

  • Ramsey, thanks for the info, I'll wait for the patch.
    Will the new product be completely incompatible with the 2.21.00.06 release, or will a migration path be available?
    When is the new release expected?
  • Hi,

    Sorry, I lost track of this thread. In this post I did a quick port of the SPI driver from the upcoming SDK release back to TI-RTOS 2.21. Can you give it a try?

  • Richard, this link does not work. When I click on it, it says: "Unfortunately, the page you've requested no longer exists. Please use the search form above to locate the information you're interested in".
  • Richard, I included the modified driver files in my IAR project, but it did not help (after including them, I did a full rebuild of the project).

    The "quick fix" made by Ramsey (Oct 13) also does not help me. I moved the call to SPI_control that enables partial transaction return inside the loop, before every call to SPI_transfer.
  • Ok, I'll have a look into this one on Friday. Since the issue was posted in the general TI-RTOS forum (which is totally fine), I didn't pay attention.

  • Hello Richard,

    Can you update the status of this question? As I understand it, you were going to check it on November 18.
    There is still no solution.
  • Hi,

    I have had it on my to-do list for a long time, but I was not able to investigate due to other high-priority tasks. Sorry, I can't provide a date when I can look at it. I'll try to prioritize it though. Sorry for this vague answer.