Hello everyone,
I'm trying to implement communication with a slave over SPI. To improve performance by shortening the communication time, I want to use the DMA. I noticed I was getting some erroneous data back from the slave, so I've enabled loopback mode in the QSSI to isolate and debug the problem. If I write to and read from the QSSI in software, my application works correctly, even when talking to the slave. I've also tried the basic mode of the uDMA transfer, first on the Tx and then the Rx channel, and again I haven't encountered any problems.
The next step was implementing the Scatter-Gather mode, and this is where I've run into strange behaviour I can't wrap my head around. The idea is to use two uDMA task lists, one for the Rx and one for the Tx channel. The transmit task list first sends 2 bytes that represent the address used by the slave. It is followed by a task that serves as a delay; it does so by transferring some dummy values from one place to another within the memory of the microcontroller. Then there's a task that sends 14 bytes of data. The final task in the list stores a value into a variable that is used as a semaphore in the program, so I can check when the Tx channel of the uDMA has finished.
The Rx task list is quite similar. It reads two bytes and stores them into a buffer. The next task reads 14 bytes and stores them into the same buffer, starting at the third position, right after the first two bytes. Finally, a different variable serves as a semaphore for the Rx channel.
Using a Loopback mode I can see what is the result of such a DMA transfer. It seems that the first two bytes are received correctly, as are the next 7. Then there is one byte that's missing before the sequence continues with another 6 bytes. So in total 2 + 7 + 6 = 15. The Tx channel of the uDMA is finished as the return value of the uDMAChannelIsEnabled() is zero. The Rx channel is still enabled as it's waiting for another byte. If I send this byte manually in the program then the uDMA finishes but that one byte in the middle is lost.
I've been trying to make sense of this, but I just can't seem to locate the problem. My code is added below for further clarification. Any kind of help, ideas or suggestions will be really appreciated.
tDMAControlTable dmaTaskSpiWrite[] =
{
    // Address write
    uDMATaskStructEntry(2, UDMA_SIZE_8, UDMA_SRC_INC_8, &debugTxBuf[0],
                        UDMA_DST_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),
    // Delay
    uDMATaskStructEntry(40, UDMA_SIZE_8, UDMA_SRC_INC_NONE, &value0x00,
                        UDMA_DST_INC_NONE, debugDelayBuf,
                        UDMA_ARB_1, UDMA_MODE_MEM_SCATTER_GATHER),
    // Write the rest
    uDMATaskStructEntry(14, UDMA_SIZE_8, UDMA_SRC_INC_8, &debugTxBuf[2],
                        UDMA_DST_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),
    // Finish with the semaphore
    uDMATaskStructEntry(1, UDMA_SIZE_8, UDMA_SRC_INC_NONE, &value0xFF,
                        UDMA_DST_INC_NONE, &semaphoreTx,
                        UDMA_ARB_1, UDMA_MODE_AUTO)
};

tDMAControlTable dmaTaskSpiRead[] =
{
    // Address read
    uDMATaskStructEntry(2, UDMA_SIZE_8, UDMA_SRC_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_DST_INC_8, &debugRxBuf[0],
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),
    // Read the rest
    uDMATaskStructEntry(14, UDMA_SIZE_8, UDMA_SRC_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_DST_INC_8, &debugRxBuf[2],
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),
    // Set the semaphore
    uDMATaskStructEntry(1, UDMA_SIZE_8, UDMA_SRC_INC_NONE, &value0xFF,
                        UDMA_DST_INC_NONE, &semaphoreRx,
                        UDMA_ARB_1, UDMA_MODE_AUTO)
};

int main(void)
{
    my_freqency = SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ | SYSCTL_OSC_MAIN |
                                      SYSCTL_USE_PLL | SYSCTL_CFG_VCO_480),
                                     120000000);
    ConfigureUART();

    SysCtlPeripheralEnable(SYSCTL_PERIPH_UDMA);
    uDMAEnable();
    uDMAControlBaseSet(pui8ControlTable);

    SysCtlPeripheralEnable(SYSCTL_PERIPH_SSI0);
    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOA);

    GPIOPinTypeGPIOOutput(GPIO_PORTA_BASE, GPIO_PIN_3);    // GPIO_DIR_MODE_OUT
    GPIOPinWrite(GPIO_PORTA_BASE, GPIO_PIN_3, GPIO_PIN_3); // Deselect PDI

    GPIOPinConfigure(GPIO_PA2_SSI0CLK);
    GPIOPinConfigure(GPIO_PA4_SSI0XDAT0);  // GPIO_PA4_SSI0RX
    GPIOPinConfigure(GPIO_PA5_SSI0XDAT1);  // GPIO_PA5_SSI0TX
    GPIOPadConfigSet(GPIO_PORTA_BASE, GPIO_PIN_2 | GPIO_PIN_4,
                     GPIO_STRENGTH_8MA, GPIO_PIN_TYPE_STD);
    GPIOPinTypeSSI(GPIO_PORTA_BASE, GPIO_PIN_2 | GPIO_PIN_4 | GPIO_PIN_5);

    SSIDisable(SSI0_BASE);
    SSIConfigSetExpClk(SSI0_BASE, my_freqency, SSI_FRF_MOTO_MODE_0,
                       SSI_MODE_MASTER, 1000000, 8);
    SSIEnable(SSI0_BASE);
    SSIDMAEnable(SSI0_BASE, SSI_DMA_RX | SSI_DMA_TX);

    uDMAChannelAssign(UDMA_CH10_SSI0RX);
    uDMAChannelAttributeDisable(UDMA_CHANNEL_SSI0RX,
                                UDMA_ATTR_ALTSELECT | UDMA_ATTR_USEBURST |
                                UDMA_ATTR_HIGH_PRIORITY | UDMA_ATTR_REQMASK);
    uDMAChannelAssign(UDMA_CH11_SSI0TX);
    uDMAChannelAttributeDisable(UDMA_CHANNEL_SSI0TX,
                                UDMA_ATTR_ALTSELECT | UDMA_ATTR_USEBURST |
                                UDMA_ATTR_HIGH_PRIORITY | UDMA_ATTR_REQMASK);

    HWREG(SSI0_BASE + SSI_O_CR1) |= SSI_CR1_LBM;

    //
    // *I test the correct behaviour of the Loopback mode here by writing and
    //  then reading to/from the QSSI*
    //

    //
    // Check SSI FIFO status --> They're both empty prior to enabling the uDMA
    //
    if(HWREG(SSI0_BASE + SSI_O_SR) & SSI_SR_RNE)
        UARTprintf("Rx FIFO not empty\r\n");
    else
        UARTprintf("Rx FIFO empty\r\n");
    if(HWREG(SSI0_BASE + SSI_O_SR) & SSI_SR_TFE)
        UARTprintf("Tx FIFO empty\r\n");
    else
        UARTprintf("Tx FIFO not empty\r\n");
    UARTprintf("\r\n");

    //
    // Buffer initialisation
    //
    for(cnt = 0; cnt < DEBUG_BUF_SIZE; cnt++)
    {
        debugRxBuf[cnt] = 0xFF;
        debugTxBuf[cnt] = 0;
    }
    debugTxBuf[0]  = 0xAA;  debugTxBuf[1]  = 0xBB;
    debugTxBuf[2]  = 0x12;  debugTxBuf[3]  = 0x34;
    debugTxBuf[4]  = 0x56;  debugTxBuf[5]  = 0x78;
    debugTxBuf[6]  = 0x9A;  debugTxBuf[7]  = 0xBC;
    debugTxBuf[8]  = 0xDE;  debugTxBuf[9]  = 0x12;
    debugTxBuf[10] = 0x34;  debugTxBuf[11] = 0x56;
    debugTxBuf[12] = 0x78;  debugTxBuf[13] = 0x9A;
    debugTxBuf[14] = 0xBC;  debugTxBuf[15] = 0xDE;

    semaphoreRx = 0;
    semaphoreTx = 0;

    //
    // uDMA Control Table setup
    //
    uDMAChannelScatterGatherSet(UDMA_CHANNEL_SSI0RX,
                                sizeof(dmaTaskSpiRead) / sizeof(tDMAControlTable),
                                dmaTaskSpiRead, true);
    uDMAChannelScatterGatherSet(UDMA_CHANNEL_SSI0TX,
                                sizeof(dmaTaskSpiWrite) / sizeof(tDMAControlTable),
                                dmaTaskSpiWrite, true);

    uDMAChannelEnable(UDMA_CHANNEL_SSI0RX);
    uDMAChannelEnable(UDMA_CHANNEL_SSI0TX);

    while(semaphoreRx == 0 || semaphoreTx == 0)
    {
        // It is stuck in here
    }

    //
    // *Printout goes here*
    //

    while(true)
    {
    }
}
I haven't tried this particular combination, but I've seen symptoms like this (overrun) when the Tx and Rx channel priorities are the same. Does it act differently if you give the Rx channel higher priority? Something like
> uDMAChannelAttributeEnable(UDMA_CH10_SSI0RX, UDMA_ATTR_HIGH_PRIORITY);
Hello Aljaz,
Nothing sticks out to me immediately, so I will want to try and run this on my hardware to see what is happening. Is this something you ran on a TM4C LaunchPad, and if so, could you provide the headers? I will try to debug tomorrow afternoon and see what I can uncover regarding the behavior you are seeing.
Bruce, thanks for the suggestion! Unfortunately there doesn't seem to be any difference with the high priority of the SSI0RX channel enabled or disabled.
Ralph, I'm running this on custom made boards, all of which are using TM4C1290NCPDT. I'm attaching a list of used header files included in this basic project I'm using for the sole purpose of testing the DMA operations over the SPI.
Funnily enough, the two different boards I've tested this on both exhibit incorrect behaviour, each in its own way. The second board I've tried running this code on correctly stores 2 + 14 bytes into debugRxBuf, but in this case the uDMA is still waiting at the task
// Read the rest
uDMATaskStructEntry(14, UDMA_SIZE_8, UDMA_SRC_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                    UDMA_DST_INC_8, &debugRxBuf[2],
                    UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER)
If I add a short delay inside the while(semaphoreRx == 0 || semaphoreTx == 0) loop and then manually send a random value to the QSSI inside this loop, the uDMA finishes and both semaphores are set, but of course the last byte that was sent is left inside the Rx FIFO.
Some further details that might be useful: if I use UDMA_ARB_4 instead of UDMA_ARB_1 for the task responsible for the 14 bytes, then the behaviour is again different. I've also observed that if I call SSIDMADisable(SSI0_BASE, SSI_DMA_RX | SSI_DMA_TX) before, e.g., uDMAChannelScatterGatherSet(), and then call SSIDMAEnable(SSI0_BASE, SSI_DMA_RX | SSI_DMA_TX) after enabling both channels with uDMAChannelEnable(), that's when I get the behaviour mentioned in the OP. If I don't use SSIDMADisable/SSIDMAEnable, then the first byte read has some garbage value.
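For reference, the enable ordering I'm describing looks roughly like this when sketched with TivaWare calls (a sketch only, assuming SSI0, the uDMA controller and the task lists are already set up as in my first post):

```c
// Sketch of the ordering that reproduces the behaviour from the OP
// (TivaWare calls; not a complete program).

// 1. Hold off the SSI's uDMA requests while the task lists are loaded.
SSIDMADisable(SSI0_BASE, SSI_DMA_RX | SSI_DMA_TX);

// 2. Load the scatter-gather task lists and arm both channels.
uDMAChannelScatterGatherSet(UDMA_CHANNEL_SSI0RX,
                            sizeof(dmaTaskSpiRead) / sizeof(tDMAControlTable),
                            dmaTaskSpiRead, true);
uDMAChannelScatterGatherSet(UDMA_CHANNEL_SSI0TX,
                            sizeof(dmaTaskSpiWrite) / sizeof(tDMAControlTable),
                            dmaTaskSpiWrite, true);
uDMAChannelEnable(UDMA_CHANNEL_SSI0RX);
uDMAChannelEnable(UDMA_CHANNEL_SSI0TX);

// 3. Only now let the SSI raise Tx/Rx requests to the uDMA.
SSIDMAEnable(SSI0_BASE, SSI_DMA_RX | SSI_DMA_TX);
```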
Here's a list of header files I currently have in the project:
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include "inc/hw_memmap.h"
#include "inc/hw_flash.h"
#include "inc/hw_gpio.h"
#include "inc/hw_ssi.h"
#include "inc/hw_sysctl.h"
#include "inc/hw_types.h"
#include "inc/hw_uart.h"
#include "inc/hw_udma.h"
#include "driverlib/gpio.h"
#include "driverlib/flash.h"
#include "driverlib/pin_map.h"
#include "driverlib/rom.h"
#include "driverlib/ssi.h"
#include "driverlib/sysctl.h"
#include "driverlib/uart.h"
#include "driverlib/udma.h"
#include "utils/uartstdio.h"
Hello Aljaz,
Thanks for the headers and the added information, it took a bit longer to get the project up and running than I had hoped.
I was looking through your post and noticed you said:
Aljaz Prislan said:Using a Loopback mode I can see what is the result of such a DMA transfer. It seems that the first two bytes are received correctly, as are the next 7. Then there is one byte that's missing before the sequence continues with another 6 bytes. So in total 2 + 7 + 6 = 15. The Tx channel of the uDMA is finished as the return value of the uDMAChannelIsEnabled() is zero. The Rx channel is still enabled as it's waiting for another byte. If I send this byte manually in the program then the uDMA finishes but that one byte in the middle is lost.
The data I see is as follows:
[0] | 0x000000AA (Hex) |
[1] | 0x000000FF (Hex) |
[2] | 0x00000012 (Hex) |
[3] | 0x56000034 (Hex) |
[4] | 0x78000000 (Hex) |
[5] | 0x00000000 (Hex) |
[6] | 0x000000FF (Hex) |
[7] | 0x000000FF (Hex) |
Is that correct then? That is how the data should be received? If so, can you describe what should be in array numbers 4, 5, and 6?
If the data isn't correct, maybe I am using the wrong definitions for the variables. This is what I did to get everything to compile from a global variable standpoint:
#define DEBUG_BUF_SIZE 64
uint32_t debugTxBuf[DEBUG_BUF_SIZE];
uint32_t debugRxBuf[DEBUG_BUF_SIZE];
uint32_t semaphoreTx, semaphoreRx;
uint32_t g_ui32SysClock;
uint32_t value0x00[40] = {0};
uint32_t value0xFF = 0xFF;
uint32_t debugDelayBuf[DEBUG_BUF_SIZE];
Hello Ralph and thank you for the reply.
Sorry for not including the variables in my original post. You're quite right; I'll post the variable types and their initialisation values so we're on the same page. The buffers all have type uint8_t, as this is the word size used for the SPI communication. The variables value0x00 and value0xFF are just names for memory locations holding the values 0 and 255, which can then be used by the uDMA. So here's what I have:
unsigned char debugTxBuf[DEBUG_BUF_SIZE];
unsigned char debugRxBuf[DEBUG_BUF_SIZE];
unsigned char debugDelayBuf[DEBUG_BUF_SIZE];
unsigned int semaphoreRx;
unsigned int semaphoreTx;
unsigned int value0x00 = 0x00;
unsigned int value0xFF = 0xFF;
The idea is to send 16 bytes (2 + 14) from debugTxBuf; once the whole operation is finished, you get an identical copy in debugRxBuf (at least in the first 16 bytes):
debugTxBuf[0]  = 0xAA;  -->  debugRxBuf[0]  == 0xAA;
debugTxBuf[1]  = 0xBB;  -->  debugRxBuf[1]  == 0xBB;
debugTxBuf[2]  = 0x12;  -->  debugRxBuf[2]  == 0x12;
debugTxBuf[3]  = 0x34;  -->  debugRxBuf[3]  == 0x34;
debugTxBuf[4]  = 0x56;  -->  debugRxBuf[4]  == 0x56;
debugTxBuf[5]  = 0x78;  -->  debugRxBuf[5]  == 0x78;
debugTxBuf[6]  = 0x9A;  -->  debugRxBuf[6]  == 0x9A;
debugTxBuf[7]  = 0xBC;  -->  debugRxBuf[7]  == 0xBC;
debugTxBuf[8]  = 0xDE;  -->  debugRxBuf[8]  == 0xDE;
debugTxBuf[9]  = 0x12;  -->  debugRxBuf[9]  == 0x12;
debugTxBuf[10] = 0x34;  -->  debugRxBuf[10] == 0x34;
debugTxBuf[11] = 0x56;  -->  debugRxBuf[11] == 0x56;
debugTxBuf[12] = 0x78;  -->  debugRxBuf[12] == 0x78;
debugTxBuf[13] = 0x9A;  -->  debugRxBuf[13] == 0x9A;
debugTxBuf[14] = 0xBC;  -->  debugRxBuf[14] == 0xBC;
debugTxBuf[15] = 0xDE;  -->  debugRxBuf[15] == 0xDE;
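Expressed as a small standalone helper (illustrative only; loopbackMatches is a name I'm introducing here, not part of the project), the success criterion is simply:

```c
#include <string.h>

/* Success criterion for the loopback run sketched above: the first
 * 2 + 14 bytes clocked out of debugTxBuf must come back verbatim in
 * debugRxBuf. Illustrative helper, not part of the original project. */
static int loopbackMatches(const unsigned char *txBuf,
                           const unsigned char *rxBuf,
                           unsigned int len)
{
    return memcmp(txBuf, rxBuf, len) == 0;
}
```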
These are the values that I get in the debugRxBuf:
[0]  0xaa
[1]  0xbb
[2]  0x12
[3]  0x34
[4]  0x56
[5]  0x78
[6]  0x9a
[7]  0xbc
[8]  0xde
[9]  0x34
[10] 0x56
[11] 0x78
[12] 0x9a
[13] 0xbc
[14] 0xde
[15] 0xff  // This index never gets written to (debugRxBuf is initialised to all 0xFF at the beginning) and semaphoreRx is still not set
I've done some further testing. I followed the initialisation instructions in the datasheet and, using direct register access (with the help of the HWREG() macro), I configured the uDMA and SPI. I tested the BASIC mode of uDMA operation on the SSI0 TX channel by setting it to write two bytes to the TX FIFO. I then enable the uDMA SSI0 TX channel, and after the operation is finished I read the content of the RX FIFO. What I get is not two bytes but three:
[0] 0xFF [1] 0xAA [2] 0xBB
which of course isn't correct, and I suspect that because of this fundamental problem my original Scatter-Gather operation doesn't work either. I'm attaching the code for this program; I assumed that if I followed what's in the datasheet I would get the expected behaviour. I really don't see what else I'm missing. This code should compile:
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include "inc/hw_memmap.h"
#include "inc/hw_flash.h"
#include "inc/hw_gpio.h"
#include "inc/hw_ssi.h"
#include "inc/hw_sysctl.h"
#include "inc/hw_types.h"
#include "inc/hw_uart.h"
#include "inc/hw_udma.h"
#include "driverlib/gpio.h"
#include "driverlib/flash.h"
#include "driverlib/pin_map.h"
#include "driverlib/rom.h"
#include "driverlib/ssi.h"
#include "driverlib/sysctl.h"
#include "driverlib/uart.h"
#include "driverlib/udma.h"
#include "utils/uartstdio.h"

// uDMA Control Table
#pragma DATA_ALIGN(pui8ControlTable, 1024)
tDMAControlTable pui8ControlTable[64];

//
// Variables
//
#define DEBUG_BUF_SIZE 256
unsigned char debugTxBuf[DEBUG_BUF_SIZE];
unsigned char debugRxBuf[DEBUG_BUF_SIZE];
unsigned char debugDelayBuf[DEBUG_BUF_SIZE];
unsigned int semaphoreRx;
unsigned int semaphoreTx;
unsigned int value0x00 = 0x00;
unsigned int value0xFF = 0xFF;
uint32_t my_freqency;

void ConfigureUART(void)
{
    //
    // Enable the GPIO Peripheral used by the UART.
    //
    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOA);

    //
    // Enable UART0
    //
    SysCtlPeripheralEnable(SYSCTL_PERIPH_UART0);

    //
    // Configure GPIO Pins for UART mode.
    //
    GPIOPinConfigure(GPIO_PA0_U0RX);
    GPIOPinConfigure(GPIO_PA1_U0TX);
    GPIOPinTypeUART(GPIO_PORTA_BASE, GPIO_PIN_0 | GPIO_PIN_1);
    // ROM_GPIOPinTypeGPIOOutput(GPIO_PORTA_BASE, GPIO_PIN_0 | GPIO_PIN_1); // SYNC/PD in CS
    // ROM_GPIOPinWrite(GPIO_PORTE_BASE, GPIO_PIN_0 | GPIO_PIN_1, GPIO_PIN_0 | GPIO_PIN_1); // SYNC/PD=0 CS4=0

    //
    // Use the internal 16MHz oscillator as the UART clock source.
    //
    UARTClockSourceSet(UART0_BASE, UART_CLOCK_PIOSC);
    UARTFIFOEnable(UART0_BASE);

    //
    // Initialize the UART for console I/O.
    //
    UARTStdioConfig(0, 115200, 16000000);
}

// Configure the STG command UART and its pins.
#define STG_UART_BAUDRATE 38400 // default UART baudrate

int main(void)
{
    volatile unsigned int cnt;

    my_freqency = SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ | SYSCTL_OSC_MAIN |
                                      SYSCTL_USE_PLL | SYSCTL_CFG_VCO_480),
                                     120000000);
    ConfigureUART();

    // Buffers initialisation
    for(cnt = 0; cnt < DEBUG_BUF_SIZE; cnt++)
    {
        debugRxBuf[cnt] = 0xFF;
        debugTxBuf[cnt] = 0x00;
    }
    debugTxBuf[0]  = 0xAA;  debugTxBuf[1]  = 0xBB;
    debugTxBuf[2]  = 0x12;  debugTxBuf[3]  = 0x34;
    debugTxBuf[4]  = 0x56;  debugTxBuf[5]  = 0x78;
    debugTxBuf[6]  = 0x9A;  debugTxBuf[7]  = 0xBC;
    debugTxBuf[8]  = 0xDE;  debugTxBuf[9]  = 0x12;
    debugTxBuf[10] = 0x34;  debugTxBuf[11] = 0x56;
    debugTxBuf[12] = 0x78;  debugTxBuf[13] = 0x9A;
    debugTxBuf[14] = 0xBC;  debugTxBuf[15] = 0xDE;

    //
    // Config uDMA
    //
    // Enable clock to the uDMA
    HWREG(SYSCTL_RCGCDMA) = SYSCTL_RCGCDMA_R0;
    // Enable uDMA controller
    HWREG(UDMA_CFG) = UDMA_CFG_MASTEN;
    // Set the location of the channel control table
    HWREG(UDMA_CTLBASE) = (uint32_t)pui8ControlTable;

    // Tx channel: Configure attributes
    // Set Normal priority
    HWREG(UDMA_PRIOCLR) = UDMA_CHANNEL_SSI0TX;
    // Select primary channel of the control table
    HWREG(UDMA_ALTCLR) = UDMA_CHANNEL_SSI0TX;
    // Use single and burst requests
    HWREG(UDMA_USEBURSTCLR) = UDMA_CHANNEL_SSI0TX;
    // Allow the uDMA controller to respond to Ch11
    HWREG(UDMA_REQMASKCLR) = UDMA_CHANNEL_SSI0TX;

    // Tx channel: Configure the channel control structure
    /* 2 + 14 ping pong */
    pui8ControlTable[UDMA_CHANNEL_SSI0TX].pvSrcEndAddr = &debugTxBuf[1];
    pui8ControlTable[UDMA_CHANNEL_SSI0TX].pvDstEndAddr = (void *)(SSI0_BASE + SSI_O_DR);
    pui8ControlTable[UDMA_CHANNEL_SSI0TX | UDMA_ALT_SELECT].pvSrcEndAddr = &debugTxBuf[15];
    pui8ControlTable[UDMA_CHANNEL_SSI0TX | UDMA_ALT_SELECT].pvDstEndAddr = (void *)(SSI0_BASE + SSI_O_DR);
    pui8ControlTable[UDMA_CHANNEL_SSI0TX].ui32Control =
        UDMA_CHCTL_DSTINC_NONE | UDMA_CHCTL_DSTSIZE_8 |
        UDMA_CHCTL_SRCINC_8 | UDMA_CHCTL_SRCSIZE_8 |
        UDMA_CHCTL_ARBSIZE_4 |
        (2 << UDMA_CHCTL_XFERSIZE_S) |
        UDMA_CHCTL_XFERMODE_BASIC;
    pui8ControlTable[UDMA_CHANNEL_SSI0TX | UDMA_ALT_SELECT].ui32Control =
        UDMA_CHCTL_DSTINC_NONE | UDMA_CHCTL_DSTSIZE_8 |
        UDMA_CHCTL_SRCINC_8 | UDMA_CHCTL_SRCSIZE_8 |
        UDMA_CHCTL_ARBSIZE_4 |
        (14 << UDMA_CHCTL_XFERSIZE_S) |
        UDMA_CHCTL_XFERMODE_BASIC;

    //
    // Configure SPI (datasheet p. 1219)
    //
    // Enable clock for SSI0
    HWREG(SYSCTL_RCGCSSI) |= SYSCTL_RCGCSSI_R0;
    // Enable clock to the GPIOA which contains all the pins of the SSI0
    HWREG(SYSCTL_RCGCGPIO) |= SYSCTL_RCGCGPIO_R0;
    // Select Alternate function for pins PA2, PA4, PA5
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_AFSEL) |= GPIO_PIN_2 | GPIO_PIN_4 | GPIO_PIN_5;
    // Configure PA3 as CS
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_DIR) |= GPIO_PIN_3;  // PA3 as output
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_DR8R) |= GPIO_PIN_3; // 8 mA output
    // Select peripheral mux to the SSI functions
    uint32_t shiftSize = (GPIO_PA2_SSI0CLK >> 8) & 0xFFu;
    uint32_t portMux = GPIO_PA2_SSI0CLK & 0x0Fu;
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_PCTL) |= portMux << shiftSize;
    shiftSize = (GPIO_PA4_SSI0XDAT0 >> 8) & 0xFFu;
    portMux = GPIO_PA4_SSI0XDAT0 & 0x0Fu;
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_PCTL) |= portMux << shiftSize;
    shiftSize = (GPIO_PA5_SSI0XDAT1 >> 8) & 0xFFu;
    portMux = GPIO_PA5_SSI0XDAT1 & 0x0Fu;
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_PCTL) |= portMux << shiftSize;
    // Digital enable
    HWREG(GPIO_PORTA_AHB_BASE + GPIO_O_DEN) |= GPIO_PIN_2 | GPIO_PIN_3 | GPIO_PIN_4 | GPIO_PIN_5;

    //
    // SPI Frame format configuration
    //
    // SSE must be disabled during any config changes
    HWREG(SSI0_BASE + SSI_O_CR1) &= ~SSI_CR1_SSE;
    // For Master mode set SSI CR1 reg to 0x0000.0000
    HWREG(SSI0_BASE + SSI_O_CR1) = 0u;
    // Config clock source to system clock
    HWREG(SSI0_BASE + SSI_O_CC) |= SSI_CC_CS_SYSPLL;
    // Configure clock prescale divisor
    HWREG(SSI0_BASE + SSI_O_CPSR) = 120u; // CR0.SCR will be set to 0 -->
                                          // SSI0Clk = 120 MHz / (120 * (1 + 0)) = 1 MHz
    // Set Serial Clock Rate
    // CR0 reset value = 0 --> CR0.SCR = 0
    // HWREG(SSI0_BASE + SSI_O_CR0) |= 0 << SSI_CR0_SCR_S;
    // SPI Mode 0: SPO = 0, SPH = 0
    //HWREG(SSI0_BASE + SSI_O_CR0) &= ~(SSI_CR0_SPO | SSI_CR0_SPH);
    // Freescale SPI format
    HWREG(SSI0_BASE + SSI_O_CR0) |= SSI_CR0_FRF_MOTO;
    // 8-bit data size
    HWREG(SSI0_BASE + SSI_O_CR0) |= SSI_CR0_DSS_8;
    // Enable Loopback mode
    HWREG(SSI0_BASE + SSI_O_CR1) |= SSI_CR1_LBM;
    // Enable SSI0
    HWREG(SSI0_BASE + SSI_O_CR1) |= SSI_CR1_SSE;

    // Test loopback mode
    HWREG(SSI0_BASE + SSI_O_DR) = 0x11;
    HWREG(SSI0_BASE + SSI_O_DR) = 0x22;
    while(HWREG(SSI0_BASE + SSI_O_SR) & SSI_SR_BSY);
    UARTprintf("LBM[0]: 0x%x\r\n", HWREG(SSI0_BASE + SSI_O_DR));
    UARTprintf("LBM[1]: 0x%x\r\n", HWREG(SSI0_BASE + SSI_O_DR));
    UARTprintf("\r\n");

    //
    // Configure SSI for uDMA use (datasheet p. 1219)
    //
    // We won't configure the uDMA completion interrupt because only the primary
    // channel will operate in ping-pong mode. The alternate channel will operate
    // in auto mode, which makes the uDMA stop after one cycle of the "ping-pong"
    // transfer.
    // Enable TX FIFO and RX FIFO triggers to the uDMA
    HWREG(SSI0_BASE + SSI_O_ICR) = SSI_ICR_DMARXIC | SSI_ICR_DMATXIC;
    HWREG(SSI0_BASE + SSI_O_DMACTL) |= SSI_DMACTL_TXDMAE;

    // Finally we enable UDMA_CH10_SSI0RX and UDMA_CH11_SSI0TX
    HWREG(UDMA_ENASET) |= 1U << UDMA_CH11_SSI0TX;
    // Wait
    while(HWREG(UDMA_ENASET) & (1U << UDMA_CH11_SSI0TX));

    UARTprintf("TX channel enable: %d\r\n", HWREG(UDMA_ENASET) & UDMA_CH11_SSI0TX ? 1 : 0);
    UARTprintf("RX channel enable: %d\r\n", HWREG(UDMA_ENASET) & UDMA_CH10_SSI0RX ? 1 : 0);
    UARTprintf("TX mode (primary struct): 0x%x\r\n",
               pui8ControlTable[UDMA_CHANNEL_SSI0TX].ui32Control & UDMA_CHCTL_XFERMODE_M);
    UARTprintf("TX mode (alt struct): 0x%x\r\n",
               pui8ControlTable[UDMA_CHANNEL_SSI0TX | UDMA_ALT_SELECT].ui32Control & UDMA_CHCTL_XFERMODE_M);
    UARTprintf("RX mode (primary struct): 0x%x\r\n",
               pui8ControlTable[UDMA_CHANNEL_SSI0RX].ui32Control & UDMA_CHCTL_XFERMODE_M);
    UARTprintf("RX mode (alt struct): 0x%x\r\n",
               pui8ControlTable[UDMA_CHANNEL_SSI0RX | UDMA_ALT_SELECT].ui32Control & UDMA_CHCTL_XFERMODE_M);

    // Print the Rx buffer
    cnt = 0;
    while(HWREG(SSI0_BASE + SSI_O_SR) & SSI_SR_RNE)
    {
        UARTprintf("[%d] 0x%x\r\n", cnt++, HWREG(SSI0_BASE + SSI_O_DR));
    }

    // End of the program
    while(1)
    {
    }
}
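One register-level detail worth double-checking in the code above (an observation from the editor, not something confirmed in this thread): in the TM4C129x uDMA, the DMACHCTL XFERSIZE field is encoded as the number of transfers minus one. Writing the raw count, as in (2 << UDMA_CHCTL_XFERSIZE_S), would therefore program one extra item, which would be consistent with three bytes appearing instead of two. A minimal sketch of the encoding:

```c
#include <stdint.h>

/* The DMACHCTL XFERSIZE field (bits 13:4) holds N - 1, where N is the
 * number of items to transfer. UDMA_CHCTL_XFERSIZE_S is 4 in TivaWare's
 * inc/hw_udma.h; redefined locally so this sketch is self-contained. */
#define XFERSIZE_SHIFT 4u

static uint32_t xfersizeField(uint32_t itemCount)
{
    return (itemCount - 1u) << XFERSIZE_SHIFT;  /* encode N as N - 1 */
}
```

So a 2-item transfer would be programmed as xfersizeField(2), i.e. 1 << 4, and the 14-item transfer as 13 << 4.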
Hello Aljaz,
Thanks for providing the correct variable definitions.
Regarding your latest code with all DRM calls, we don't really support detailed debug with DRM and I get that you are trying other means but that moves into territory that I cannot help with much.
For the DriverLib code, what is the purpose of the delay you have in there? I ask because:
So now my RX buffer is:
[0] | 0xAA '\xaa' (Hex) |
[1] | 0xBB '\xbb' (Hex) |
[2] | 0x12 '\x12' (Hex) |
[3] | 0x34 '4' (Hex) |
[4] | 0x56 'V' (Hex) |
[5] | 0x78 'x' (Hex) |
[6] | 0x9A '\x9a' (Hex) |
[7] | 0xBC '\xbc' (Hex) |
[8] | 0xDE '\xde' (Hex) |
[9] | 0x12 '\x12' (Hex) |
[10] | 0x34 '4' (Hex) |
[11] | 0x56 'V' (Hex) |
[12] | 0x78 'x' (Hex) |
[13] | 0x9A '\x9a' (Hex) |
[14] | 0xBC '\xbc' (Hex) |
[15] | 0xDE '\xde' (Hex) |
[16] | 0xFF '\xff' (Hex) |
However, the semaphore still isn't triggering. Honestly, looking at your code I am a bit unclear about the semaphore process you've put together here; it's not something I've seen before with DMA. Can you elaborate on what you are trying to do with that? From a code standpoint I don't see any issues with it... I thought maybe compiler optimizations were involved, but changing them didn't fix it.
Also, one other change I made was defining the ControlTable as 1024 bytes rather than 64:
//*****************************************************************************
//
// The control table used by the uDMA controller. This table must be aligned
// to a 1024 byte boundary.
//
//*****************************************************************************
#pragma DATA_ALIGN(pui8ControlTable, 1024)
uint8_t pui8ControlTable[1024];
Hello Ralph,
I'll be away until Tuesday, so these are my last findings before the weekend. I understand your position on DRM and no worries; it was just another way for me to test the behaviour and hopefully get some new ideas from it.
I've managed to analyse what's going on on the SPI lines myself, and I've also observed that the delay task doesn't seem to have an effect. But my final implementation will run SSI clock speeds tens of times faster than this example. At higher clock speeds the delay does become apparent, and it is important because the slave needs a certain amount of time before I can start reading data from it.
If I remove the delay task, then it works for me too, but again only up to the semaphore. This sparked the idea that the memory-to-memory transfer somehow causes the problems, or some combination of mem-to-periph/periph-to-mem and mem-to-mem, although I don't seem to have problems getting the semaphore set on the TX channel. The reason I'm using a semaphore is that I want to use the two uDMA channels sequentially, i.e. have the TX channel enable the other one while I do some other work, plus set up the TX channel again with UDMA_ALTCLR.
Finally, if I could get the semaphore on the RX channel set, that would be a major step towards successfully implementing this whole feature in my project. Could you just check, after you receive all 2 + 14 bytes (without the delay task, of course), whether uDMAChannelIsEnabled(UDMA_CHANNEL_SSI0RX) returns true?
Hello Aljaz,
Rather than trying to add in a check for that, I just looked at the register directly, and the bit is set for the SSI0RX channel. So it looks like the channel is still enabled when it hangs without semaphoreRx being set.
I am wondering about the swapping between memory-to-memory and memory-to-peripheral as well. Reviewing our Scatter Gather example, it was coded with memory and peripheral sequences being in separate task lists.
Honestly if you want a delay of sorts I feel that is something you should do via software more than trying to do this memory-to-memory process. How long of a delay are you aiming for when you are running at the high speed mode? Are you looking for the 'delay' to happen without CPU cycles being spent on it? Would having an ISR that triggers the next part work?
Hello Ralph,
and thanks for the ideas. I've tried a very simple Scatter-Gather setup with only two tasks on each of the SSI0 channels (first write (read) 2 bytes, then write (read) 14 bytes), and uDMAChannelIsEnabled(UDMA_CHANNEL_SSI0RX) still returns true even after a long time. This means I can run into problems the next time I use the SSI, because the uDMA will pick the first byte out. One idea would be to manually disable UDMA_CHANNEL_SSI0RX, but I don't like this at all (I've tried setting the Clear Alternate structure bit of this channel (HWREG(UDMA_ALTCLR) = 1 << UDMA_CHANNEL_SSI0RX), but it doesn't help; the channel is still enabled).
The delay needs to be at least 250 ns. But even if I try to implement this in a dedicated ISR, the uDMA still hangs at that RX channel.
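As a software fallback for that gap (a sketch, assuming the 120 MHz system clock used in this thread): TivaWare's SysCtlDelay() busy-waits for roughly 3 CPU cycles per loop count, so 250 ns at 120 MHz is 30 cycles, i.e. a count of 10. The arithmetic as a self-contained helper (sysCtlDelayCount is an illustrative name, not a TivaWare function):

```c
#include <stdint.h>

/* Convert a delay in nanoseconds into a SysCtlDelay() loop count.
 * SysCtlDelay() consumes about 3 CPU cycles per count; both divisions
 * round up so the resulting delay is never shorter than requested. */
static uint32_t sysCtlDelayCount(uint32_t ns, uint32_t sysClkHz)
{
    /* CPU cycles needed, rounded up */
    uint64_t cycles = ((uint64_t)ns * sysClkHz + 999999999u) / 1000000000u;
    /* 3 cycles per SysCtlDelay() count, rounded up */
    return (uint32_t)((cycles + 2u) / 3u);
}
```

Then something like SysCtlDelay(sysCtlDelayCount(250, 120000000)) between the 2-byte and 14-byte transfers would cover the minimum gap without a memory-to-memory uDMA task.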
The last argument of uDMAChannelScatterGatherSet() takes true/false and selects either peripheral Scatter-Gather or memory-to-memory Scatter-Gather. Looking through the TivaWare code, I noticed that this sets the control word of the primary control structure. The primary control structure contains the list of all tasks to be performed, which are then copied to the alternate control structure one by one. I would imagine that the peripheral/memory Scatter-Gather selection matters most in the alternate structure, because the trigger from the peripheral starts a particular task, while the primary side could (the way I imagine this process works) simply be copied to the alternate structure and would thus always be memory Scatter-Gather. I must admit that even after reading the documentation I don't quite understand this. If I change the last argument of uDMAChannelScatterGatherSet() for both the TX and RX channels, then I don't receive anything, but the TX and RX channels both finish. What difference does this parameter actually make, and why is it important on the primary side? Could it affect what I'm trying to achieve?
Hello Aljaz,
Sorry I did not catch onto this sooner. I don't have much experience with scatter-gather and had never tried to use it in such a manner, so it didn't occur to me before now that the way the channel is set up vs. your structure is the issue here. After reviewing the datasheet again and looking back at our udma_scatter_gather example, it's clearer.
Simply put, when you configure the channel, you can pick either peripheral scatter-gather or memory scatter-gather, but not both. Basically, each channel set command is limited to one specific uDMA mode (Ping-Pong, Memory Scatter-Gather, Peripheral Scatter-Gather, etc.), and to use another mode you'll need a new channel set call. So for uDMAChannelScatterGatherSet, the true/false toggles between Memory and Peripheral Scatter-Gather operation, as only one can be selected.
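To make that concrete, here is a sketch of what the TX task list might look like with the memory-to-memory delay task removed, so that every entry matches the peripheral scatter-gather mode selected by the final true argument (names reused from the earlier posts; the delay would then be handled outside the uDMA):

```c
// Every task is a peripheral scatter-gather task, matching the mode the
// channel is configured for. Sketch only, under the thread's setup.
tDMAControlTable dmaTaskSpiWrite[] =
{
    // Address write (2 bytes)
    uDMATaskStructEntry(2, UDMA_SIZE_8, UDMA_SRC_INC_8, &debugTxBuf[0],
                        UDMA_DST_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),
    // Data write (14 bytes)
    uDMATaskStructEntry(14, UDMA_SIZE_8, UDMA_SRC_INC_8, &debugTxBuf[2],
                        UDMA_DST_INC_NONE, (void *)(SSI0_BASE + SSI_O_DR),
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER)
};

// true = Peripheral Scatter-Gather; false would select Memory Scatter-Gather.
uDMAChannelScatterGatherSet(UDMA_CHANNEL_SSI0TX,
                            sizeof(dmaTaskSpiWrite) / sizeof(tDMAControlTable),
                            dmaTaskSpiWrite, true);
```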
I'm not sure why the TX side ended up working in the end; it's probably more of an undefined behavior, as these modes shouldn't be crossed over. But removing the memory transfers has everything working as I expect. I am not sure about the issue with the RX channel staying enabled; when I remove the memory transfers, I get all the data bytes from the uDMA channel, so it is working as intended here.
I don't think this has anything to do with Primary and Alternate structures either, and I'm honestly not fully following what you are trying to express with that, but perhaps with my above comments about why the last variable of uDMAChannelScatterGatherSet is important gives you the details you need? I think that is what you were really seeking to understand here.
Ralph Jacobi said:I don't think this has anything to do with Primary and Alternate structures either, and I'm honestly not fully following what you are trying to express with that, but perhaps with my above comments about why the last variable of uDMAChannelScatterGatherSet is important gives you the details you need? I think that is what you were really seeking to understand here.
Yes, this was the piece of a puzzle that was missing in my mind. I haven't conceptualized the idea that if I set Peripheral Scatter-Gather I can't mix it with the Memory Scatter-Gather tasks in between because it was working in some cases. Thank you Ralph for the effort, I've learned something new.
Hi Aljaz,
Aljaz Prislan said:I haven't conceptualized the idea that if I set Peripheral Scatter-Gather I can't mix it with the Memory Scatter-Gather tasks in between because it was working in some cases.
I'm not sure why it is working either; there isn't anything in the datasheet that says it should work, so it really falls under the 'undefined behavior' bucket as something we haven't spec'd and don't have details on. That would explain the strange quirks seen. My guess right now, and it's just me taking a wild stab at it, is that when the uDMA is called again to handle the RX, that somehow lets it finish the memory transfer on the TX side. A possible test would be to send one dummy byte over TX after you receive the data, to see if that triggers your semaphore. Even then I wouldn't rely on it always working, as that type of operation isn't spec'd, but maybe the experiment could help with some of the confusion.
Hello Ralph,
I did a quick test as you suggested. I added one more task at the end of the TX channel's task list, and both semaphores were indeed set in the end.
But I also found the following paragraph in the datasheet:
9.2.6.6 Peripheral Scatter-Gather
Peripheral Scatter-Gather mode is very similar to Memory Scatter-Gather, except that the transfers
are controlled by a peripheral making a μDMA request. Upon detecting a request from the peripheral,
the μDMA controller uses the primary control structure to copy one entry from the list to the alternate
control structure and then performs the transfer. At the end of this transfer, the primary control
structure will copy the next task to the alternate control structure. If the next task is a
memory-to-memory transfer, execution will start immediately and run to completion; if the next task
is a peripheral-type transfer, the μDMA will wait for a peripheral request to begin.
which seems to indicate that such mixing should be possible? But of course only if there is a trigger in the first place (which this last example, based on your idea, had).
Hello Aljaz,
I haven't quite figured out what is going on here, honestly, because the datasheet seems a bit contradictory about how the memory transfer should work versus what we observed.
Presuming it's true that you can do both, as the datasheet says, then the purpose of setting the peripheral bit in XFERMODE would be to distinguish between primary and alternate, and to indicate that, for Peripheral Scatter-Gather, the uDMA is waiting for a peripheral trigger.
Then for the peripheral mode, it says that any tasks which are memory-to-memory transfers will start executing automatically.
For a moment after reading that, I wondered if maybe, instead of AUTO mode, we needed to use Memory Scatter-Gather mode for that task to trigger it (which originally didn't work for the delay, by the way), but again it did not solve the issue with the RX semaphore.
So right now it feels like something needs to trigger the memory-to-memory transfer, similar to how the uDMA waits for peripheral triggers when XFERMODE is set to Peripheral Scatter-Gather.
The weird thing is that this then leads back to the original issue with the 40-byte transfer used as a delay, where it corrupted your TX operation even though that operation had been triggered and was being processed...
I tried a few different settings again while pondering all of this but I still haven't gotten the RX semaphore to trigger without another task.
You mentioned you added a task at the end of TX... did you not need one for RX then? Or was it RX you added the task for?
Hello Ralph,
Ralph Jacobi said:You mentioned you added a task at the end of TX... did you not need one for RX then? Or was it RX you added the task for?
if I add a single task at the end of the TX channel, then the RX tasks run to completion, but one byte is left in the RX FIFO afterwards. If I also add a task at the end of the RX channel to read that byte out, then both uDMA channels run to completion and all the data is read. But then we're left with a situation where you have to split the data if you want a Memory SG task somewhere inside your task list.
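In case it helps anyone else reading this, the drain task I appended to the RX list is just the following sketch (g_rxScratch and SSI2_BASE are placeholders from my setup):

```c
// Sketch of the drain task appended at the end of the RX task list; it
// reads the leftover byte out of the RX FIFO into a scratch variable.
// g_rxScratch and SSI2_BASE are placeholders for my setup.
static uint8_t g_rxScratch;

tDMAControlTable g_rxDrainTask =
    uDMATaskStructEntry(1, UDMA_SIZE_8,
                        UDMA_SRC_INC_NONE, (void *)(SSI2_BASE + SSI_O_DR),
                        UDMA_DST_INC_NONE, &g_rxScratch,
                        UDMA_ARB_1, UDMA_MODE_BASIC);
```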
Ralph Jacobi said:Presuming it's true that you can do both, as the datasheet says, then the purpose of setting the peripheral bit in XFERMODE would be to distinguish between primary and alternate, and to indicate that, for Peripheral Scatter-Gather, the uDMA is waiting for a peripheral trigger.
Then for the peripheral mode, it says that any tasks which are memory-to-memory transfers will start executing automatically.
This is how I interpreted it too. However, if you want the uDMA to copy the next task right away and not wait for a peripheral trigger, then based on my interpretation you would set the last argument of uDMAChannelScatterGatherSet() to false. But if I do this, I get some strange behaviour and even less of the data ends up stored correctly in debugRxBuf[].
Ralph Jacobi said:So right now it feels like something needs to help trigger a memory-to-memory transfer in spirit of how the uDMA is waiting for peripheral triggers when in Peripheral Scatter-Gather mode from XFERMODE.
Based on my testing with my final project as well as the simple project I posted in the OP, that was my understanding too. I've tried many things to make it work, but the data led me to suspect there must be some problem with the way the uDMA is triggered, which is why the whole thing isn't working as expected. Of course I can't look inside the microcontroller, so I asked a question here after all my other ideas failed.
One last thing that also supports this idea: for my final project I had to decrease the arbitration size from UDMA_ARB_4 to UDMA_ARB_1 for the task that sends those 14 bytes (in the TX task list). With UDMA_ARB_4 I wasn't receiving the last byte. With UDMA_ARB_1 I got the following:
- all 14 bytes OK
- next transmission: 13 bytes OK, the four most significant bits of the last byte OK, the four least significant bits were 0
- then repeat
So selecting UDMA_ARB_1 instead of UDMA_ARB_4 was an improvement, and I assume it's because the uDMA spends a little more time performing the tasks, as it arbitrates more often. The faulty trigger then doesn't cause as many problems, because there is time for the data to arrive: when the uDMA reads from the RX FIFO, the data is already there, even if the trigger was set earlier.
Of course there was still the problem with the last four bits on every second transmission, so I needed to somehow increase the duration of the uDMA operation further. I then found that if I split the read tasks into:
1. task: read 2 bytes
2. task: read 6 bytes
3. task: read 8 bytes
I get correct operation and receive all 14 bytes correctly every time.
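For completeness, the split RX task list that works for me now looks roughly like this. Again a sketch: SSI2_BASE is a placeholder for the module I use, and I've left the semaphore task out:

```c
// Sketch of the RX task list split that finally worked: 2 + 6 + 8 bytes
// into debugRxBuf[] (2 address bytes + 14 data bytes = 16 total).
// SSI2_BASE is a placeholder; the semaphore task is omitted here.
static uint8_t debugRxBuf[16];

tDMAControlTable g_rxTaskList[3] =
{
    // Task 1: the 2 address bytes.
    uDMATaskStructEntry(2, UDMA_SIZE_8,
                        UDMA_SRC_INC_NONE, (void *)(SSI2_BASE + SSI_O_DR),
                        UDMA_DST_INC_8,    &debugRxBuf[0],
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),

    // Task 2: the next 6 data bytes.
    uDMATaskStructEntry(6, UDMA_SIZE_8,
                        UDMA_SRC_INC_NONE, (void *)(SSI2_BASE + SSI_O_DR),
                        UDMA_DST_INC_8,    &debugRxBuf[2],
                        UDMA_ARB_1, UDMA_MODE_PER_SCATTER_GATHER),

    // Task 3: the last 8 data bytes (final task, so BASIC mode).
    uDMATaskStructEntry(8, UDMA_SIZE_8,
                        UDMA_SRC_INC_NONE, (void *)(SSI2_BASE + SSI_O_DR),
                        UDMA_DST_INC_8,    &debugRxBuf[8],
                        UDMA_ARB_1, UDMA_MODE_BASIC)
};
```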
Hello Aljaz,
I wonder if it would be possible for you to do some tests where you monitor the SPI bus and toggle a GPIO to see how long it takes for the data to arrive. It really sounds like the DMA is trying to read data too quickly more than anything. Another test would be to see whether anything changes when you adjust the speed of the SPI bus.