
Delayed Start in eUSCI SPI

Other Parts Discussed in Thread: MSP430FR5969

While I have SPI working correctly on the MSP430FR5969, there are a few intervals that seem much larger than expected. The following logic analyzer capture is for 3 wire SPI with a software controlled CS pin. It is transmitting 0x4F, 0x00, and 0x25 (correctly). For this test, MISO is left floating. SMCLK = 1 MHz.

I believe the issue is two-fold. First, after the CS pin is toggled and UCA1TXBUF is written to, there is a very long delay before data transmission starts. I know that I am not using DMA, but should the delay really be this large? Second, the delay between bytes also seems unusually large.

// Init CS pin (active low)
P4DIR |= BIT3;
P4OUT |= BIT3;

// Configure SPI pins
P2SEL1 |= BIT4;                           // USCI_A1 operation
P2SEL1 |= BIT5 + BIT6;                    // USCI_A1 operation

// Configure USCI_A1 for SPI operation
UCA1CTLW0 = UCSWRST;                      // **Put state machine in reset**
UCA1CTLW0 |= UCMST + UCSYNC + UCMSB; // 3-pin, 8-bit SPI master, MSB
UCA1CTLW0 |= UCSSEL__SMCLK;               // SMCLK
UCA1BR0 = 0x02;                           // /2
UCA1BR1 = 0;                              //
UCA1MCTLW = 0;                            // No modulation
UCA1CTLW0 &= ~UCSWRST;                    // **Initialize USCI state machine**
UCA1IE |= UCRXIE;                         // Enable USCI_A1 RX interrupt

The above code initializes the eUSCI module.

*SPI_CS_PORT &= ~SPI_CS_PIN;
while (!(UCA1IFG & UCTXIFG));   // USCI_A1 TX buffer ready?
UCA1TXBUF = buffer_read(&SPI_tx_buffer);

The above code is used to start SPI transmission.

#pragma vector=USCI_A1_VECTOR
__interrupt void USCI_A1_ISR(void)
{
	switch(__even_in_range(UCA1IV, USCI_SPI_UCTXIFG))
	{
		case USCI_NONE: break;
		case USCI_SPI_UCRXIFG:
			if (buffer_count(&SPI_tx_buffer))                        // Check TX byte counter
			{
				UCA1TXBUF = buffer_read(&SPI_tx_buffer);
			}
			else
			{
				*SPI_CS_PORT |= SPI_CS_PIN;
				SPI_busy = 0;
				__bic_SR_register_on_exit(LPM0_bits); // Wake-up CPU
			}
			SPI_rx_byte = UCA1RXBUF;
			UCA1IFG &= ~UCRXIFG;

			break;
		case USCI_SPI_UCTXIFG:
			break;
		default: break;
	}
}

The above code is the interrupt routine.

The delay seems much longer than what it would take to execute the code in the service routines and transfer the data to and from the eUSCI buffer registers. Does anyone have any suggestions? Have I set up the eUSCI incorrectly? Even if it did take that long between bytes, why does it take so long for the first byte to be sent after writing to UCA1TXBUF?

Thank you for your help!

  • What does buffer_read() do exactly? Have you timed it?

    Just for some perspective: With UCA1BR=2, each byte (burst of SCK on your analyzer) is 16 clocks. I estimate that your post-/CS delay is about 3x of these, or about 50 clocks. That time is spent almost entirely in buffer_read(). It's not that difficult to burn up 50 clocks in even a simple function. Just getting there and back might cost 20 clocks.

    More generally, it's not easy to achieve maximum throughput over the SPI without DMA. It's harder if you have to make a non-trivial decision about each byte. It's even harder if you feed it with an ISR, which has its own overhead. (An ISR does make good sense if you're running the SPI very slowly, but that's not your case.)
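    To sanity-check those numbers with plain arithmetic, here is a host-side sketch (not MSP430 code; the "3x" figure is simply read off the analyzer capture, and SMCLK = 1 MHz is taken from the original post):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* One SPI byte takes (bit-clock divider) x (bits per byte) SMCLK clocks. */
    static unsigned clocks_per_byte(unsigned divider, unsigned bits)
    {
        return divider * bits;
    }

    int main(void)
    {
        unsigned byte_clks = clocks_per_byte(2, 8);  /* UCA1BR = 2, 8-bit  */
        unsigned gap_clks  = 3 * byte_clks;          /* gap ~= 3 byte times */

        /* At SMCLK = 1 MHz, one clock is 1 us. */
        printf("byte: %u clocks (%u us), post-/CS gap: ~%u clocks\n",
               byte_clks, byte_clks, gap_clks);

        assert(byte_clks == 16);
        assert(gap_clks == 48);   /* "about 50 clocks" */
        return 0;
    }
    ```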
  • In addition to Bruce's reply, you're losing some time by not parallelising the load of TXBUF with the eUSCI transmission. Waiting for RXIFG to be set before loading the next byte for transmission forces the CPU and eUSCI to "take turns", with each waiting for the other to finish before continuing.

    TXIFG goes high at the start of a byte transfer, as soon as TXBUF has been copied into the TX shift register. Loading TXBUF when TXIFG goes high means the CPU can safely start work while the eUSCI is still busy shifting the current byte out onto the line.
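    To see what the two strategies cost, here is a hypothetical host-side timing model (the 20-clock ISR cost is an assumption for illustration, not a measured value; the 16-clock byte time follows from the /2 divider):

    ```c
    #include <assert.h>
    #include <stdio.h>

    enum { SHIFT_CLKS = 16,   /* /2 divider, 8 bits         */
           ISR_CLKS   = 20 }; /* assumed cost of the refill */

    static unsigned turn_taking(unsigned nbytes)
    {
        /* Wait for RXIFG, then run the ISR to load the next byte:
         * the eUSCI idles while the CPU works, and vice versa. */
        return nbytes * (SHIFT_CLKS + ISR_CLKS);
    }

    static unsigned overlapped(unsigned nbytes)
    {
        /* Reload TXBUF on TXIFG: the ISR work overlaps the ongoing
         * shift, so each byte costs only the longer of the two. */
        unsigned per_byte = ISR_CLKS > SHIFT_CLKS ? ISR_CLKS : SHIFT_CLKS;
        return nbytes * per_byte;
    }

    int main(void)
    {
        printf("10 bytes: turn-taking %u clocks, overlapped %u clocks\n",
               turn_taking(10), overlapped(10));
        assert(turn_taking(10) == 360);
        assert(overlapped(10) == 200);
        return 0;
    }
    ```

    Under these assumed numbers the overlapped scheme nearly halves the total time, even though the per-byte ISR work is unchanged.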
  • Thank you both for your help. Both answers helped me understand what was causing the delay. When only sending fixed values and simplifying the ISR, things speed up significantly. I do have two followup questions.

    1. Is there a more efficient way to do the following?

    while (!(UCA1IFG & UCTXIFG));   // USCI_A1 TX buffer ready?

    2. When I have only the TX interrupt enabled, the delay is significantly less than when I have both TX and RX interrupts enabled. The first image is when only the TX interrupt is enabled.

    The next image is when both TX and RX interrupts are enabled.

    Is this increase really due to the interrupt being called twice per byte sent? It seems strange that the delay is increased every second byte. I had thought that since I am sending the next byte in the TX interrupt, the RX interrupt would not greatly slow things down. I am not doing anything in the RX interrupt (see code below). Is this because of the switch statement? If so, would it be better to use two if statements since I only care about two interrupts (RX and TX)? Is there anything else that can be done to remove this extra delay when an RX interrupt is needed? I know that I am nit-picking here, but I would definitely like to understand what is going on.

    While I understand not using interrupts is the fastest way to go, my application requires many things to happen "at once", so I will be relying on interrupts to handle all communications. I will definitely be looking into using DMA to make things happen as fast as possible.

    Thank you again for helping me clarify things!


    Also, for anyone else who may stumble across this, here is an example of code with minimal delay between transmissions.

    while (!(UCA1IFG & UCTXIFG));   // USCI_A1 TX buffer ready?
    UCA1IE |= UCTXIE;
    UCA1TXBUF = 0x55;              // Send fixed data

    Sending a fixed value removes any additional instructions that would otherwise need to run.

    #pragma vector=USCI_A1_VECTOR
    __interrupt void USCI_A1_ISR(void)
    {
    	switch(__even_in_range(UCA1IV, USCI_SPI_UCTXIFG))
    	{
    		case USCI_NONE: break;
    		case USCI_SPI_UCRXIFG:
    			break;
    		case USCI_SPI_UCTXIFG:
    			UCA1TXBUF = 0x52;
    			break;
    		default: break;
    	}
    }

    The above interrupt routine does nothing except feed another value into the TX buffer. This code is for demo purposes only; it doesn't do anything useful.

  • Yes, the increased delay between pairs of bytes does appear to be caused by the RXIFG being processed.

    What's happening is that the first byte is loaded, then TXIFG fires almost immediately and the ISR loads the next byte. When the first byte has been sent/received you get RXIFG and the ISR starts handling that. While the ISR is running TXIFG goes high again, but is left pending until the ISR has finished from the last RXIFG. Unfortunately, before you get to the line where the ISR reads UCA1IV another byte is received. RXIFG has higher priority than TXIFG, so it gets processed first and delays the next transmit.

    In other words, your interrupts are forced into a pattern of: "TX, TX, rx, rx, TX, TX, rx, rx..." rather than: "TX, TX, rx, TX, rx, TX, rx, TX, rx...". Having two rx in a row means the TXBUF is already empty, so the USCI needs to stall until the second rx and the subsequent TX complete before it can transmit again.

    Just to give you some idea of where the time goes, here's a fragment of the disassembly with cycle counts:

    naken430util - by Michael Kohn
        Web: http://www.mikekohn.net/
    
                                                            6 (interrupt entry)
    0x441a: 0x92a2 cmp.w #4, &0x05fe                        4
    0x441c: 0x05fe
    0x441e: 0x2003 jne 0x4426  (offset: 6)                  2
    0x4420: 0x40b2 mov.w #0x0052, &0x05ee                   5
    0x4422: 0x0052
    0x4424: 0x05ee
    0x4426: 0x1300 reti                                     5 (interrupt exit)

    That works out as 22 MCLK cycles per TX interrupt and 17 per RX interrupt. In each case there are 11 cycles of interrupt handling overhead. With the /2 divider selected the USCI transfers a byte every 16 MCLK cycles, so the CPU can't keep up.
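    Putting that budget into arithmetic (host-side check using the cycle counts from the listing above):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* CPU cycles spent servicing interrupts for one transferred byte:
     * with both interrupts enabled, each ISR fires once per byte. */
    static unsigned cpu_cycles_per_byte(unsigned tx_isr, unsigned rx_isr)
    {
        return tx_isr + rx_isr;
    }

    int main(void)
    {
        const unsigned byte_clks = 16;              /* /2 divider, 8 bits     */
        unsigned cpu = cpu_cycles_per_byte(22, 17); /* counts from the listing */

        printf("CPU: %u cycles of ISR work per %u-cycle byte\n", cpu, byte_clks);
        assert(cpu == 39);
        assert(cpu > 2 * byte_clks); /* more work than even two byte times */
        return 0;
    }
    ```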

  • You can put the whole switch statement into a while(1) loop and return in case 0.
    If another interrupt is pending while the ISR handles the first, it will then be processed instantly, without exiting and re-entering the ISR. If no more interrupts are pending, case 0 will do the return and leave the ISR.
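    A host-side model of that ISR shape, with UCA1IV mocked as a queue of pending vector values (the real register similarly returns the highest-priority pending source on each read, or 0 when nothing is pending):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* One simulated ISR entry: keep reading the (mocked) IV register until
     * it returns 0, servicing every pending source, then leave. Returns the
     * number of sources serviced in this single entry. */
    static unsigned usci_isr(const unsigned *iv_queue,
                             unsigned *rx_handled, unsigned *tx_handled)
    {
        unsigned serviced = 0;
        while (1)
        {
            switch (*iv_queue++)                 /* mock read of UCA1IV   */
            {
                case 0:  return serviced;        /* nothing pending: exit */
                case 2:  (*rx_handled)++; serviced++; break; /* UCRXIFG   */
                case 4:  (*tx_handled)++; serviced++; break; /* UCTXIFG   */
                default: break;
            }
        }
    }

    int main(void)
    {
        /* RXIFG pending on entry; TXIFG becomes pending during servicing. */
        const unsigned pending[] = { 2, 4, 0 };
        unsigned rx = 0, tx = 0;

        unsigned n = usci_isr(pending, &rx, &tx);
        printf("one entry serviced %u sources (rx=%u, tx=%u)\n", n, rx, tx);
        assert(n == 2 && rx == 1 && tx == 1);  /* no second entry needed */
        return 0;
    }
    ```

    Each re-entry avoided this way saves the 11 cycles of entry/exit overhead measured in the disassembly earlier in the thread.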
  • Also note that when clearing SWRST, TXIFG is instantly set. So when setting TXIE, an interrupt will be triggered (if GIE is set too), which means your write to TXBUF after setting TXIE will happen after the ISR has already written to TXBUF. Either keep GIE clear, let the ISR do all writes to TXBUF, or clear TXIFG right before setting TXIE (in this case, a 'starting' write to TXBUF is required to get TXIFG ever set again for subsequent transfers).

  • Do you need a delay after you assert nCS and before you start sending data? Yes.
    The SPI slave's datasheet will specify it; for example, the CC3000 needs 50 µs.

    It now comes down on how to write your SPI driver.

    I do NOT write the first byte to TXBUF to start things off; I let the IRQ write byte[0]..[n].
    Instead I just enable its IRQ permission (line 3 below).
    At boot up I {bis.b #UCA0TXIFG,&IFG2   ; set IRQ flag}, but leave the permission off so it's ready to go.

          bic.b #nCSpin,&nCSport     ; chip select low
          delay 50, R15              ; 50uS delay macro use r15 (plain dec.b R15 loop)
          bis.b #UCA0TXIE,&IE2       ; Enable USCI_A0 TX interrupt, IFG is already on.
          ...
    IRQ   mov.b @SpiPnt+,&UCA0TXBUF  ; SpiPnt = R8, a write to BUF clears UCA0TXIFG
          dec.w &spiLen              ; decrement the number of bytes left to send
          if_z bic.b #UCA0TXIE,&IE2  ; Disable interrupt, IFG flag will still be set but on hold
          reti


