I2C Master (DMA with no looping) to I2C Slave. Intermittent failures.



 

I have I2C errors that I haven't been able to resolve, and have been wondering if it may be a silicon issue.

I'm using an F5529 (master) communicating with an F2132 (slave).  During an exchange, it appears as though the slave fails to load data into the receive buffer and the master continues to clock SCL for an extended period.  Usually it does not recover until another transmission from the master.  Intervening with a UCS reset or sending a stop bit does not produce a recovery.

I have already implemented and tested the workaround for the F2132, as described in the errata USCI26.
I have accommodated the USCI30 extra byte receive issue, as it is not possible to circumvent this without the STP and STT IRQs working in master mode.

I2C is running at 390kHz on UCB (also tried at 300kHz), with send/receive packets occurring 300..400 times per second.  Comms failures are occurring multiple times per second.

 

http://www.eclipze.com.au/temp/ti/scope_0.bmp

scope_0.bmp: SDA goes high and remains high, while SCL continuously clocks without regard to the requested receive quantity.  Yellow trace shows the point of a forced timeout.  It does not matter if I issue a STOP or do a SWRST.  However, issuing a SWRST during this failed period causes a glitch on UCA (issuing a SWRST at any other time doesn't cause a problem on the other channel).

 

http://www.eclipze.com.au/temp/ti/scope_1.bmp

scope_1.bmp: Shows a magnified view of the start of the capture.

 

http://www.eclipze.com.au/temp/ti/scope_2.bmp

scope_2.bmp: Shows the point where comms fails.  SDA transitions during SCL high.

 


http://www.eclipze.com.au/temp/ti/scope_3.bmp

scope_3.bmp: Shows that in some cases both SCL and SDA go low and successive timeouts occur, without immediate recovery.

 

 

Code segments...

http://www.eclipze.com.au/temp/ti/master.s43

http://www.eclipze.com.au/temp/ti/slave.s43

 

I've used I2C in applications before and have the expectation of 100% data throughput.

This application uses DMA on the master for transmit and receive.  It's a real time application that will have a high CPU load, so I cannot introduce any delay loops to test for start/stop flags etc...  I did try removing the DMA and going to IRQ transmit/receive on the master, however the problem still existed.

I've spent nearly a week stressing over this I2C link trying to resolve the problem, and I really don't want to consider removing both MSP430s from the design over it.  I've already got stock of both these micros on the shelf for the pre-production run.

Does anyone have any ideas on the cause of the faults?  I would really appreciate any help with this. 

 

Cheers,
Tony

  • The timeout happens when nothing has been written to TXBUF on the slave. The slave then holds SCL low until you write something to TXBUF. Likewise, if you haven't read RXBUF yet and the 7th bit of the next byte has already been received, the slave will also hold SCL low until you read RXBUF. The USCI has no notion of a timeout, and I2C itself has none either: a low SCL stalls the bus, and nobody except the device holding it low can end this. Defining a timeout is pointless. It's like a train that is stuck on the track. No timeout of any kind can make another train pass on this track. That's I2C.
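
    The two stretch conditions above can be written down as a small host-side model in C. This is only a sketch of the rule as described in this post, not USCI register code; the struct and its field names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the slave's buffer state (not real registers). */
typedef struct {
    bool transmitter;   /* true: slave is sending, false: slave is receiving */
    bool txbuf_loaded;  /* software has written the next byte to TXBUF */
    bool rxbuf_unread;  /* previous byte is still sitting in RXBUF */
} slave_state_t;

/* Returns true when, per the rule above, the slave keeps SCL low:
 * a transmitter with nothing to send, or a receiver whose buffer is full. */
static bool slave_stretches_scl(const slave_state_t *s)
{
    assert(s != NULL);
    if (s->transmitter) {
        return !s->txbuf_loaded;
    }
    return s->rxbuf_unread;
}
```

    In this model the stall only ends when the slave's own software loads TXBUF or reads RXBUF, which is why a timeout on the master side cannot release the bus.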

    I don't have the time to go through all of your assembly code, but here are some things I noticed:

    bic.b #UCB0TXIFG, &IFG2
    It is not necessary to clear the IFG bits manually. TXIFG is cleared automatically when TXBUF is written, and RXIFG when RXBUF is read.

    mov.b 0(R10), &UCB0TXBUF
    Using mov.b @R10, &UCB0TXBUF is 2 bytes shorter and one cycle faster. The 0(Rn) variant is only needed for the destination operand, as the @ mode isn't available there.

    bit.b #UCSTPIFG, &UCB0STAT // Clear I2C stop IFG
    You should read &UCB0STAT into a register and test the bits from there. It is a bit shorter and faster. Also, one of the bits you test later may well be cleared by a new I2C condition in the meantime: STOP and START both clear NACK. So if you test for STOP/START when neither is set, then a START is detected, and then you test for NACK, you won't detect either of them, because START was set after you checked and NACK was cleared before you checked. That's a race condition that will probably kill your program logic.

    Also, you should only clear the bit you detected and handled, not all of them, unless you are sure you handled all. I guess this is the source of your problem: a race condition with the status flags.
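
    As a rough illustration of the "read once, then decide" idea, here is a host-side sketch in C rather than the poster's assembly. The status register is passed in as a plain byte pointer so the logic can be run on a PC; the mask values and the `i2c_events_t` type are assumptions made for the example, so check them against the device header before reusing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag masks, assumed here for illustration (verify against the header). */
#define UCSTTIFG  0x02u
#define UCSTPIFG  0x04u
#define UCNACKIFG 0x08u

/* Hypothetical record of what a single service pass observed. */
typedef struct {
    int start;
    int stop;
    int nack;
} i2c_events_t;

/* Take ONE snapshot of the status register, decide everything from that
 * snapshot, and clear only the flags that were actually seen and handled. */
static i2c_events_t service_status(volatile uint8_t *stat)
{
    assert(stat != NULL);
    uint8_t snap = *stat;                    /* single read of the register */
    uint8_t handled = (uint8_t)(snap & (UCSTTIFG | UCSTPIFG | UCNACKIFG));
    i2c_events_t ev;

    ev.start = (snap & UCSTTIFG) != 0;
    ev.stop  = (snap & UCSTPIFG) != 0;
    ev.nack  = (snap & UCNACKIFG) != 0;

    *stat = (uint8_t)(*stat & (uint8_t)~handled);  /* leave other bits alone */
    return ev;
}
```

    Because every test uses the same snapshot, a START that arrives between two checks cannot make both the START and the NACK go unnoticed in that pass; it is simply picked up on the next pass.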

    bis.b #UCB0RXIE, &IE2 // Enable I2C Rx Interrupt
    bic.b #UCB0TXIE, &IE2

    There's no need to disable the IE bits. A TX interrupt only happens in TX mode, and an RX interrupt only happens in RX mode, so you can leave both enabled all the time.
    If it causes problems, it indicates a problem in your program logic.

  •  

    Thank you so much for your clear comments.  Admittedly, I have gone overboard in an effort to resolve the issue.  With intermittent failures, I lost confidence that any function was operating 100%.  The clarification is welcome, as I don't like bloated code... and I've probably gone far enough that I may be causing new problems.

    I will go through and refine the code.

    I see how the duration of the comms failure is a result of the slave, and that it's the slave's timeout function that permits the recovery to take place.  I'm not sure how else I could get the slave to detect that the fault has occurred.   It's not practical to monitor the SCLLOW bit, as I could miss level changes when sampling.

    I do not believe the flag race conditions contribute to the problem shown in the oscilloscope traces.  This comms error occurs during a transmission, where the master is already engaged in receiving bytes via DMA and the slave is streaming a byte to the Tx buffer on successive IRQ calls.

     

  • One more thing I noticed...

    Eclipze said:
    I have already implemented and tested the workaround for the F2132, as described in the errata USCI26.

    ??? USCI26 is about the time between a stop and a following start, which is a master timing issue. It does not affect slave mode and is of no importance if there is only one master. The delay is only necessary to allow a different master to grab the bus between two transmissions.

    Or did you mean USCI28? Here, workaround 3 as printed maybe isn't enough, especially if you're dynamically disabling and enabling the IE bits, so that the one you are checking might just be disabled.

    Eclipze said:
    I see how the duration of the comms failing is a result of the slave

    Most likely, yes. A continued low clock is caused by the slave. One cause is that you didn't write anything into TXBUF, so the slave holds SCL low until you do. On eUSCI with multiple-address support, this also happens after an address is received and the software has to decide whether to pick up the request or drop it. But on normal USCI it is either a full RXBUF or an empty TXBUF that holds the clock low.

    However, the 'endless' clocking on your first screenshot is a problem with the master. Only the master will clock SCL. The slave can stretch a single clock cycle, but cannot cause additional clock cycles.

    One suggestion: try reducing your transmission rate significantly. At 400kHz, each byte (8 data bits plus ACK, so 9 SCL clocks) takes 22.5 microseconds, which means you have only 22.5 clock cycles per MHz of MCLK for each byte. That isn't much. With 1MHz MCLK this wouldn't be enough to handle the traffic at all, even if there were no other code running.

    Try going down to a 40kHz I2C clock and see whether the comm errors still happen. If not, it is a clear sign that you have a timing/CPU load problem. If the ratio of errors to total transmissions is still the same, then it must be something else.
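
    The cycle budget can be worked out for any clock combination with a few lines of C. This is just the arithmetic from above (9 SCL clocks per byte), written as a host-side helper; the function name is made up for the example.

```c
#include <assert.h>

/* SCL clocks per transferred byte: 8 data bits plus the ACK/NACK bit. */
#define I2C_CLOCKS_PER_BYTE 9.0

/* MCLK cycles that elapse while one byte (including ACK) is on the wire.
 * Both frequencies are in Hz. */
static double mclk_cycles_per_byte(double mclk_hz, double scl_hz)
{
    assert(mclk_hz > 0.0 && scl_hz > 0.0);
    return I2C_CLOCKS_PER_BYTE * mclk_hz / scl_hz;
}
```

    For example, mclk_cycles_per_byte(1e6, 400e3) gives 22.5, while an 8MHz slave at 390kHz gets roughly 185 cycles per byte and the same slave at 40kHz gets 1800.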


  • You are correct... I did mean USCI28.  I do check for all the IE bits for the reset condition.

    I've tried ~300kHz, but it made no difference.  The master is running at 25MHz from XTAL, the slave at 8MHz from DCO.  I can see that the slave does a clock stretch before responding with the first byte, which is due to the routine preparing the data for the response.  However, the slave has a very light CPU load.  It only has two timer IRQs to compete with, both very quick... one sets a flag, the other starts an A/D sequence, plus another IRQ on DTC complete (not much in there).  The master's CPU load doesn't affect the endless SCL clocking, as this occurs during a period where the DMA is receiving the data.  It is as if the DMASZ register gets changed to a larger value.

    The curious part is that the slave stops transmitting before it has completed the expected count... and SDA goes high while SCL is high (seen in scope_2.bmp).  Could it be that the master fails to fully ACK the slave's byte, and the slave treats it as a stop bit?

  • Eclipze said:
    Could it be that the Master fails to fully ACK the slaves byte, and the slave treats it as a stop bit?

    Indeed, SDA going high during SCL high IS a stop condition, no matter when it occurs (mid-byte or after an ACK/NACK bit). It's the only way a master can issue a stop when in transmit mode, as the slave will pull and hold SDA low (ACK) at the end of a cycle. So the master starts another byte with a low bit and releases SDA while SCL is high after this first bit. And the slave will detect a stop.

    So the question is: why does your master send a stop before things are done, and why does it continue receiving more than it should? Maybe the DMA is misprogrammed. You should post the (commented) code where you set up the DMA and where you handle it when done.
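
    The stop rule above (SDA rising while SCL stays high) is easy to check on a captured trace. Here is a small host-side sketch in C that scans sampled line levels for START and STOP conditions; the sample arrays and function names are hypothetical, and it assumes the capture is sampled fast enough to see every edge.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BUS_NONE, BUS_START, BUS_STOP } bus_event_t;

/* Classify one pair of consecutive (SCL, SDA) samples. During a normal data
 * transfer SDA only changes while SCL is low; a change while SCL stays high
 * is a START (SDA falling) or a STOP (SDA rising). */
static bus_event_t classify(int scl_prev, int sda_prev, int scl_now, int sda_now)
{
    if (scl_prev && scl_now) {
        if (sda_prev && !sda_now) return BUS_START;
        if (!sda_prev && sda_now) return BUS_STOP;
    }
    return BUS_NONE;
}

/* Returns the index of the first sample at which a STOP is seen, or n if
 * the trace contains none. */
static size_t find_first_stop(const unsigned char *scl,
                              const unsigned char *sda, size_t n)
{
    assert(scl != NULL && sda != NULL);
    for (size_t i = 1; i < n; i++) {
        if (classify(scl[i - 1], sda[i - 1], scl[i], sda[i]) == BUS_STOP) {
            return i;
        }
    }
    return n;
}
```

    Run against an exported capture like scope_2.bmp, a STOP index that lands before the expected byte count would confirm that the master itself ends the transfer early.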

  •  

    You certainly have a good grasp on I2C!

    All the master software associated with the I2C is in this file... http://www.eclipze.com.au/temp/ti/master.s43

    See first post for slave code.

     

    The first routine does the initialisation... sets I2C registers, configures DMA, timeout, NACK IRQ and finally sets the start bit.

    Next routine is all the ISR for the DMA.

    The routines for the NACKIFG and Timeout are also at the end of the file.

     

     
