SPI with uDMA causes delay

Other Parts Discussed in Thread: CC1310, SYSBIOS, CC3200

I'm using a CC1310 with TI-RTOS (tirtos_cc13xx_cc26xx_2_15_00_17).

The system is configured as an SPI master using the SPI driver (SPICC26XXDMA.h).

It works, but there is a delay of about 125 us between the DMA transactions. I activated the logger and found four delays of about 30 us each.

Line  rel. Time  Time
1     0          114685058  0  Cortex_M3_0  SPI:(@40000000) DMA transfer enabled
2     0          114685058  0  Cortex_M3_0  SPI:(@40000000) DMA transaction: @200037e8, rxBuf: @0; txBuf: @200008f1; Count: 2
3     30518      114715576  0  Cortex_M3_0  SPI:(@40000000) transfer pending on transferComplete semaphore
4     0          114715576  0  Cortex_M3_0  SPI:(@40000000) hwi interrupt context start
5     30517      114746093  0  Cortex_M3_0  SPI:(@40000000) hwi interrupt context end
6     0          114746093  0  Cortex_M3_0  SPI:(@40000000) hwi interrupt context start
7     0          114746093  0  Cortex_M3_0  SPI:(@40000000) hwi interrupt context end
8     30518      114776611  0  Cortex_M3_0  SPI:(@40000000) swi interrupt context start
9     0          114776611  0  Cortex_M3_0  SPI:(@40000000) DMA transaction: @200037e8 complete
10    0          114776611  0  Cortex_M3_0  SPI DMA:(@40000000) posting transferComplete semaphore
11    30517      114807128  0  Cortex_M3_0  SPI:(@40000000) swi interrupt context end

Looking at the driver code, I found the following:

Line 2->3: SSIIntEnable() is called.

Line 10->11: Semaphore_post() is called.

Where do these delays come from?

Regards Armin

  • Hi Armin,

    How are you measuring the time - are these direct reads of the timer register? What is the timer frequency? It looks peculiar to me that many of the operations took 0 time, while others seem to take the same time delta. I am wondering if the timer has really low resolution relative to the speed at which things are happening, so effectively you are just seeing an increment in time whenever the timer increments by 1 tick?

    Best regards,
    Vincent
  • Those logs are from the analysis dashboard in CCS, which shows the Log_printX function output from the driver. I'm not sure where that gets its timestamps from, though.
  • These times are in ns, according to the documentation, and that matches the real world.

    This is one SPI transfer.

    The following transfers are separated by a delay of about 128 us, as shown in the next picture.

    The logger shows 4 delays of about 30.5 us each, so 122 us altogether. I think these 4 delays are what prevents fast SPI communication.

    Update:

    For illustration I set a GPIO (violet) just before SPI_transfer() (BLOCKING) and cleared it right afterwards.

  • Hi Robert,

    Could you go over what the yellow and green traces represent?

    From the logger output, I do agree that there is roughly 122us between lines 1 and 11. My point is that you cannot necessarily conclude you have 4 delays between lines 2&3, 4&5, 7&8, 10&11, but rather the delays are spread across all lines. Because the timestamps are based off the 32kHz RTC counter, it does not have the resolution beyond roughly 31 us, since that is the amount of time between each counter increment. It just so happened that the counter was incremented on lines 3, 5, 8, and 11, so that is when you see the jump in time.

    Best regards,
    Vincent
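
The resolution argument above can be checked with a little arithmetic. The following is a minimal sketch, assuming the timestamp source is a nominal 32768 Hz RTC (the thread only says "32kHz") and that the logged values are nanoseconds; the function names are hypothetical and not part of TI-RTOS.

```c
#include <assert.h>
#include <stdint.h>

#define RTC_HZ     32768ULL        /* assumed nominal RTC rate */
#define NS_PER_SEC 1000000000ULL

/* Period of one RTC tick, truncated to whole nanoseconds. */
uint64_t rtc_tick_ns(void)
{
    return NS_PER_SEC / RTC_HZ;
}

/* Number of RTC ticks that best explains a logged delta (round to nearest). */
uint32_t rtc_ticks_in(uint64_t delta_ns)
{
    return (uint32_t)((delta_ns * RTC_HZ + NS_PER_SEC / 2) / NS_PER_SEC);
}

/* Ticks between two absolute log timestamps given in nanoseconds. */
uint32_t log_span_ticks(uint64_t first_ns, uint64_t last_ns)
{
    assert(last_ns >= first_ns);   /* log timestamps are expected to be monotonic */
    return rtc_ticks_in(last_ns - first_ns);
}
```

Under these assumptions, each logged delta of 30517 or 30518 corresponds to exactly one tick, and the span from log line 1 (114685058) to log line 11 (114807128) corresponds to four ticks. A delta of 0 therefore only means "less than one tick elapsed", not "no time elapsed".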
  • Hi Vincent

    Yellow is SPI data

    Green is SPI clock.

    The 32 kHz timebase of the timestamps also explains why the offset is always ~30.5 us.

    But the violet pin shows that the delay happens after the SPI transmission, and I don't know why.

  • Hi Armin,

    From the log it looks like the DMA was initiated just before line 3 and was completed by line 6. So at least 30 us is attributable to the DMA transfer and to invoking the ISR. The rest of the time is used to handle the interrupt through the Hwi and the Swi, and to configure the DMA initially.

    What is your BIOS.libType set to? Make sure you are not using the debug library as that is the slowest. BIOS.LibType_Custom would be a better choice. Another thought is you may want to turn off the logger to see if it reduces the delay.

    Best regards,
    Vincent
  • Hi Vincent

    I only used the instrumented library because I had first noticed the delay with the default one.

    With BIOS.LibType_Custom, no instrumentation, and the logger off, I still have the delay.

    Could it be caused by an error I found in ROV -> BIOS -> Scan for errors:

    ti.sysbios.knl.Semaphore : Basic : (0x20002ed0) : pendElems : Error: Problem scanning pend Queue: JavaException: java.lang.Exception: Target memory read failed at address: 0xbebebebe, length: 8 This read is at an INVALID address according to the application's section map. The application is likely either uninitialized or corrupt.

    Regards

    Armin

  • Hi Armin,

    How fast is your M3 running? Depending on your clock speed, maybe it does take this long to do the transfer and handle the interrupt?

    Have you thought of increasing your transfer size? Each DMA transfer has a fixed overhead regardless of size, so you'd be more efficient with larger transfer sizes (I think you are transferring 2 bytes at a time right now).

    As for the error, it is possible you have simply run the scan at a point when the semaphore queue is not initialized, similar to what this user encountered: https://e2e.ti.com/support/wireless_connectivity/bluetooth_low_energy/f/538/p/449119/1614856#pi239031350=2

    Best regards,

    Vincent
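
To illustrate why larger transfers amortize the fixed per-transfer overhead, here is a rough model in C. It is only a sketch: the 125 us overhead is the figure reported earlier in this thread, while the 4 MHz bit rate used in the note below is a hypothetical value chosen for illustration, and the function names are not part of any TI API.

```c
#include <assert.h>
#include <stdint.h>

/* Time for one blocking transfer: fixed driver overhead plus time on the wire. */
uint64_t transfer_time_ns(uint32_t n_bytes, uint64_t overhead_ns, uint32_t bit_rate_hz)
{
    assert(bit_rate_hz > 0);
    uint64_t wire_ns = ((uint64_t)n_bytes * 8u * 1000000000ULL) / bit_rate_hz;
    return overhead_ns + wire_ns;
}

/* Effective payload rate in bytes per second for back-to-back transfers. */
uint64_t effective_bytes_per_sec(uint32_t n_bytes, uint64_t overhead_ns, uint32_t bit_rate_hz)
{
    uint64_t t = transfer_time_ns(n_bytes, overhead_ns, bit_rate_hz);
    return t ? ((uint64_t)n_bytes * 1000000000ULL) / t : 0;
}
```

With a 125000 ns overhead and the assumed 4 MHz clock, a 2-byte transfer spends 4000 ns on the wire out of 129000 ns in total, so the overhead dominates. Calling effective_bytes_per_sec() with 64 bytes instead of 2 gives a payload rate more than ten times higher in this model.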

  • Also if you can pass along your .cfg file, I could take a peek to see if there is anything else that could help speed this up.
  • Hi Vincent

    I did merge my transfers to make things faster, but I still have a problem when the slave raises an interrupt. In that case I have to read a register and write the same data back to another register of the slave, and with the delayed SPI communication I'm too slow.

    This would be my .cfg file.

    0677.custom.cfg

    To reproduce my problem, I investigated your sample application "lcdSmartRF06EB_CC1310DK_7XD_TI_CC1310F128".

    I made a simple modification to send data to the SPI very quickly:

    /* Write welcome message to buffer and send it to the LCD */
    LCD_bufferClear(lcdHandle, 1);
    LCD_bufferPrintString(lcdHandle, 1, "Hello SmartRF06EB!", 0, LCD_PAGE0);
    LCD_bufferPrintString(lcdHandle, 1, "Low Priority Task", 0, LCD_PAGE1);
    LCD_bufferPrintString(lcdHandle, 1, "Writing to Buffer 1", 0, LCD_PAGE2);
    while (1) {
        LCD_update(lcdHandle, 1);
    }

    while (1) {
        /* Toggle Board_LED1 */
        PIN_setOutputValue(pinHandle, Board_LED1, !PIN_getOutputValue(Board_LED1));

    I added an endless while loop around LCD_update(). Looking at the code, LCD_update() mainly runs a for loop over LCD_PAGE_COUNT and calls LCD_gotoXY() and LCD_sendData(), which are 2 SPI transfers.

    Here you can see the packets sent to the display (yellow = /CS, green = SPI clk, violet = SPI MOSI):

    Zooming in, we see that a transfer of 3 bytes for LCD_gotoXY() also takes 87.7 us.

    It would be nice if we could speed up the SPI-DMA driver so that it can communicate with slave devices in "real time".

    Best regards Armin

  • Hi Armin,

    I took a look at your .cfg file, but nothing jumps out.

    Given you have such a small transfer size, you may benefit if the driver were to do the transfers without using the DMA, so that the system won't have to deal with the resulting latency. I inquired with the driver team, but unfortunately there isn't a way today to improve the performance of the driver for small transfers on the CC1310. However, it turns out they did add a polling transfer mode to the SPI driver on the CC3200 to speed up small transfers in tirtos_cc32xx_2_16_00_08\products\tidrivers_cc32xx_2_16_00_08\packages\ti\drivers\spi\SPICC3200DMA.c (look for 'minDmaTransferSize', which is used in the SPI*_transfer() function to determine if a transfer is too small to be worthwhile using the DMA). Maybe if you study the code and compare it with the CC1310 driver implementation, you can add that feature to the driver on the CC1310.

    Best regards,
    Vincent
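
The idea of falling back to polling for small transfers can be sketched as follows. This is a host-side illustration only, not the CC3200 or CC1310 driver code: apart from the name minDmaTransferSize, which is mentioned above, every type and function here (spi_port_t, choose_mode, spi_poll_transfer, the loopback helpers) is hypothetical, and the exact comparison used by the real driver is not quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { XFER_POLLING, XFER_DMA } xfer_mode_t;

/* Hypothetical hardware hooks; a real driver would access SSI registers here. */
typedef struct {
    void    (*write_byte)(void *ctx, uint8_t b); /* push one byte to the TX side    */
    uint8_t (*read_byte)(void *ctx);             /* wait for and return one RX byte */
    void     *ctx;
} spi_port_t;

/* Transfers shorter than the threshold skip the DMA setup and interrupt path. */
xfer_mode_t choose_mode(size_t count, size_t min_dma_transfer_size)
{
    return (count < min_dma_transfer_size) ? XFER_POLLING : XFER_DMA;
}

/* Polled full-duplex transfer; tx or rx may be NULL. */
void spi_poll_transfer(const spi_port_t *port, const uint8_t *tx,
                       uint8_t *rx, size_t count)
{
    assert(port && port->write_byte && port->read_byte);
    for (size_t i = 0; i < count; i++) {
        port->write_byte(port->ctx, tx ? tx[i] : 0x00);
        uint8_t in = port->read_byte(port->ctx);
        if (rx) {
            rx[i] = in;
        }
    }
}

/* Loopback stand-in for the hardware, for host-side testing only. */
typedef struct { uint8_t last; } loopback_t;

void loopback_write(void *ctx, uint8_t b) { ((loopback_t *)ctx)->last = b; }
uint8_t loopback_read(void *ctx) { return ((loopback_t *)ctx)->last; }
```

In a real port, the transfer function would call choose_mode() with the transaction count and a configurable threshold, run the polled loop for short transfers such as the 2- and 3-byte ones discussed in this thread, and keep the existing DMA path for everything else. The polled path trades CPU time for latency, since the task busy-waits instead of pending on a semaphore.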