AM335x I2C multi-master/slave mode

Other Parts Discussed in Thread: TMP275, TLV320AIC3106, AM3358, DM3730

Is there any application code, or are there any notes, on how to use the AM3358's I2C in multi-master mode or slave mode?

We're using QNX.  The product design is broadly based on the beaglebone black.

We have it working as a master - we just can't seem to get it to respond as a slave.

For the moment we are simply polling it.

  • A couple clarifications:

    1. In order to see if there was a stop condition on the bus, I need to see both SCL and SDA.
    2. I wanted to look at signal integrity so I need to see a zoomed in transaction.

    In the past when I have debugged I2C issues on a scope I've used the "zoom" capability of the scope in order to capture a huge buffer of samples similar to what you show in your screenshot, but then to be able to zoom into the capture and pan back and forth to closely examine the failing sequence. For example, you might be picking up some kind of noise on the line at some point that "looks" like a stop condition to the slave and is causing things to go out of whack. It's hard to know without examining closely.
  • I've looked up my notes on the i2c peripheral (made with the help of an i2c debugger I wrote on an stm32f100 microcontroller that was sharing the bus and which used clock stretching to slow down traffic sufficiently to allow accurate datalogging, and even permitted single-stepping through the transaction). These notes were actually taken on a DM814x (running baremetal code) but afaik the i2c peripheral of the AM335x is the exact same.

    [UPDATE: fixed name of SBLOCK register]

    First, I strongly recommend using the SBLOCK register to ensure any i2c transaction addressing you as slave is stalled until you are aware of it. The i2c peripheral will stall the transaction using clock stretching. You release it by pulsing the appropriate bit of the SBLOCK register low once you get an AAS event.

    The purpose of this is to provide flow control of i2c slave transactions. Without it the only flow control is on the fifos, but if your i2c event handling is too slow for whatever reason (e.g. debug logging) you can lose track of i2c transaction boundaries since these are not recorded in the fifos.

    Another issue is that in slave transmitter mode the master may end the transaction at any time, potentially causing data to be left in the tx fifo. If stalling is not used then a next slave transmitter transaction could start before you have a chance to clear the tx fifo, causing this data to be delivered to the master unintentionally.

    Even with stalling you need to be prepared for the possibility that while in the middle of a transaction you already see AAS for the next one. This is not a problem as long as you're aware of this. Clear the AAS but remember it occurred. It is important to note you will not see ARDY in this case since the next slave transaction is already in progress. How to proceed depends on whether you're currently a transmitter or receiver:

    If currently transmitter, then the current transaction has already terminated. If you care about the actual amount of data transferred, take note of the fifo stats. Then, clear the tx fifo if non-empty.

    If currently receiver, drain the rx fifo by reading until RRDY clears.

    In both cases, both fifos are now empty. Release the slave transaction by pulsing the SBLOCK bit low. Wait for XRDY (slave transmit), RRDY (slave receive data), or ARDY (empty slave receive transaction).
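
    A minimal host-side sketch of the bookkeeping described above, not a driver. All names here (`slave_ctx`, `on_aas`, `on_released` and the enums) are hypothetical. On real hardware the returned cleanup value would map to clearing the tx fifo or reading the rx fifo until RRDY clears, after which the SBLOCK bit is pulsed low.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical names throughout: this models the bookkeeping only. */
    typedef enum { DIR_IDLE, DIR_SLAVE_TX, DIR_SLAVE_RX } slave_dir;
    typedef enum { EV_XRDY, EV_RRDY, EV_ARDY } first_event;
    typedef enum { CLEANUP_NONE, CLEANUP_CLEAR_TX_FIFO, CLEANUP_DRAIN_RX_FIFO } cleanup;

    typedef struct {
        slave_dir dir;        /* direction of the transaction in progress */
        unsigned  aas_count;  /* how many times we were addressed as slave */
    } slave_ctx;

    /* AAS seen: remember it and report which fifo must be emptied
     * before the stalled transaction may be released via SBLOCK. */
    cleanup on_aas(slave_ctx *ctx)
    {
        assert(ctx != NULL);
        ctx->aas_count++;
        switch (ctx->dir) {
        case DIR_SLAVE_TX: return CLEANUP_CLEAR_TX_FIFO;  /* master may have ended early */
        case DIR_SLAVE_RX: return CLEANUP_DRAIN_RX_FIFO;  /* read until RRDY clears */
        default:           return CLEANUP_NONE;
        }
    }

    /* After the SBLOCK bit was pulsed low, the first event gives the direction. */
    void on_released(slave_ctx *ctx, first_event ev)
    {
        assert(ctx != NULL);
        switch (ev) {
        case EV_XRDY: ctx->dir = DIR_SLAVE_TX; break;
        case EV_RRDY: ctx->dir = DIR_SLAVE_RX; break;
        case EV_ARDY: ctx->dir = DIR_IDLE;     break;  /* empty slave receive */
        }
    }
    ```

    Keeping the direction in software is the point of the sketch: the fifos do not record transaction boundaries, so the handler has to.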

    Second, beware of the irq bits of the i2c controller, they behave in all kinds of weird ways. My notes say:

    AL, NACK, ARDY, AERR, BF, AAS are events. AAS also auto-clears at stop or repeated start, but I get the impression this isn't reliable?

    RRDY is "sticky": it is forced high by a level status, needs manual clearing but this will not succeed unless the level is low. XRDY is either sticky or a normal event depending on circumstances.

    BB is level and cannot be enabled. It is nevertheless shown in irqstatus.

    XUDF and ROVR are level but also have sticky versions. If they are enabled, the sticky version is visible in both irqrawstatus and irqstatus. If they are disabled, the level is visible in both irqrawstatus and irqstatus. Manual set and clear acts on the sticky event even if not enabled (but clearing will only succeed if the level is low).
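
    To make the "sticky" wording concrete, here is a small software model of one such bit, written from the notes above rather than from the TRM. The type and function names are hypothetical, and the model covers only the RRDY-style behaviour (level forces the bit high, manual clear fails while the level is high).

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model of one "sticky" irq bit such as RRDY. */
    typedef struct {
        bool level;   /* underlying level status, e.g. data waiting in the rx fifo */
        bool status;  /* what irqstatus shows */
    } sticky_irq;

    /* Hardware side: a high level forces the status bit high. */
    void sticky_set_level(sticky_irq *irq, bool level)
    {
        assert(irq != NULL);
        irq->level = level;
        if (level)
            irq->status = true;
    }

    /* Software side: manual clear. Returns true only if the bit really cleared. */
    bool sticky_try_clear(sticky_irq *irq)
    {
        assert(irq != NULL);
        if (irq->level)
            return false;   /* level still high: the clear has no effect */
        irq->status = false;
        return true;
    }
    ```

    In driver terms this is why draining the rx fifo has to come before clearing RRDY: a clear issued while data is still pending is silently lost.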

    Here are some transcripts I made of how exactly the irq bits behave in various slave scenarios. I hope they are sufficiently legible and of some use:

    Notation for I²C bus:
            < = SDA falling with SCL high
            > = SDA rising with SCL high
            . = SCL falling
            0 = SCL rising with SDA low
            1 = SCL rising with SDA high
            x = 0 or 1
    hence
            <.  = start condition
            1<. = repeated start condition
            x.  = data bit
            0.  = data bit: 0
            1.  = data bit: 1
            0>  = stop condition
    
    
    Notation for irqs/events:
            +FOO = level status set (sticky event set and can't be cleared)
            -FOO = level status unset (sticky event can now be cleared)
            ^FOO = event pulsed (set but can be cleared)
            +addr/-addr refers to a bit being set/cleared in ACTOA register
    
    
    slave transmitter:
            <       +BB
            .x.x.x.x.x.x.x.
            1       +addr ^AAS
                    +XUDF   (if tx fifo empty)
            .0.     ^XRDY   (if tx fifo empty)
                            (waits until stall bit is cleared)
                    +XUDF   (waits until tx fifo non-empty)
                    -XUDF   data written, tx-bufstat decremented (mod 64)
            x.x.x.x.x.x.x.x
            .       +XUDF
            0.      ^XRDY
                    -XUDF   data written, tx-bufstat decremented (mod 64)
            x.x.x.x.x.x.x.x
            .       +XUDF
            1.      ^NACK
            0>      -XUDF ^ARDY     tx-bufstat is reset to count
                    -addr
                    -BB ^BF
    
    slave transmitter with stall bit set and stray data in tx fifo:
            <       +BB
            .x.x.x.x.x.x.x.
            1       +addr ^AAS
            .0.             (stalling begins)
                            fifo cleared
                    +XUDF   stall bit cleared (may immediately be set again)
                    -XUDF   data written
            (continues as before)
    
    restart:
            1<      -XUDF ^ARDY     tx-bufstat reset to count
            .x.x.x.x.x.x.x.
            1       ±addr ^AAS
    
    slave receiver:
            <       +BB
            .x.x.x.x.x.x.x.
            0       +addr ^AAS
            .0.             (waits until stall bit is cleared)
            x.x.x.x.x.x.x.
            x       +RRDY   rx-bufstat incremented
                    -RRDY   data read, bufstats decremented
            .0.x.x.x.x.x.x.x.
            x       +RRDY   rx-bufstat incremented
            .0.x.x.x.x.x.x.x.
            x       +RRDY   rx-bufstat incremented
            .0.0>   -addr -BB ^BF
                            data read, bufstats decremented
                    -RRDY   data read,
                    ^ARDY   bufstats reset
    
    filling the rx fifo:
            x               rx-bufstat incremented to FIFO size - 1
            .0.x.x.x.x.x.x.x.
            x               rx-bufstat incremented (mod FIFO size) to 0
            .0.x.x.x.x.x.x.x.
            x               +ROVR
            .0.             (stalling begins)
                            -ROVR   data read, tx-bufstat decremented
                                    data read, bufstats decremented
    
    slave receiver + restart to transmitter:
            x       +RRDY   rx-bufstat incremented
            .0.
            1<      (no ARDY due to data in rx fifo)
            .x.x.x.x.x.x.x.
            1       ^AAS +XUDF
            .0.     ^XRDY   (stalling)
                    -RRDY   data read, bufstats decremented
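
    The bus notation used in these transcripts can also be decoded mechanically, which may help when comparing against analyzer output. The sketch below is an illustration only: `decode_trace` and `i2c_byte` are hypothetical names, and it handles only traces with concrete 0/1 bits (the `x` placeholder is not supported).

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        uint8_t value;  /* the 8 bits clocked before the ack slot */
        bool    acked;  /* true if the 9th bit was 0 */
    } i2c_byte;

    /* Decode a trace such as "<.0.0.1.0.0.0.1.0.0.0>" into bytes.
     * Returns the number of complete bytes written to out (at most max). */
    size_t decode_trace(const char *trace, i2c_byte *out, size_t max)
    {
        assert(trace != NULL && out != NULL);
        size_t   count = 0;
        unsigned nbits = 0;
        uint16_t shift = 0;

        for (size_t i = 0; trace[i] != '\0'; i++) {
            char c = trace[i];
            if (c == '<' || c == '>') {          /* (re)start or stop: resync */
                nbits = 0;
                shift = 0;
            } else if (c == '0' || c == '1') {
                char next = trace[i + 1];
                if (next == '<' || next == '>')
                    continue;                    /* SDA setup for restart/stop, not data */
                shift = (uint16_t)((shift << 1) | (unsigned)(c - '0'));
                if (++nbits == 9) {              /* 8 data bits + ack bit */
                    if (count < max) {
                        out[count].value = (uint8_t)(shift >> 1);
                        out[count].acked = (shift & 1u) == 0;
                        count++;
                    }
                    nbits = 0;
                    shift = 0;
                }
            }                                    /* '.' and whitespace carry no data */
        }
        return count;
    }
    ```

    For an address byte, `value >> 1` is the 7-bit slave address and `value & 1` is the read/write bit.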

  • Thank you Matthijs, this is exactly the kind of info we had been missing. Although I'm not sure where the AM335x's stall capability is from a quick look at the TRM, I will try to x-ref with the DM814x

  • Ah it looks like I just misremembered the register name, it's SBLOCK, the last register.

  • Re zooming on scope traces: tried that, but because of the delay between RD starting and the FAKE pulse event, the capture needed is beyond what the scope's sampling rate allows.
    Also it will not be possible to slow down the clock speed on the gateway and try that.

    What I will check next is the bits behaviour as per Matthijs' comments. Having some kind of recipe gives a chance of getting this to work. I will do this on the interrupt based driver, and leave this polling driver work for now.
  • Further to the telco re: my action d):-

    -          I tried to reduce the detection latency by cutting the timeout time by an order of magnitude, but the delay between the trigger pulse and the preceding entry into the loop did not scale linearly, i.e. there was still a large ~80ms delay between our detection and the preceding SCL/SDA activity, which will prevent an accurate zoomed capture on the scope. Not sure that can be ameliorated.

     We decided to see if we could find any glitches using the scope:

    -          Tried to trigger scope with pulses less than 5us (1/2 clk cycle) on SCL and SDA – none found

    -          Also, we checked for short STOP conditions and didn’t find any

    -          We noticed that when the problem occurs, it’s in the address phase on the Beagle. We increased the sample rate on the Beagle analyzer to 50 MHz. The traces show the same as before.

    -          Tried with different trigger levels above

    -->  Scope did not trigger during the “fake” scenario (SLV TX bits/BB during SLV RX). There is some ringing on the lines, but nothing too untoward. This is with a single card and a gateway running the polling driver.

    -          Theoretically, in the polling driver it would be possible to hook up an interrupt to detect the rogue Bus Free, which would then trigger the GPIO. This could reduce the latency, but that’s not a quick job for today.

    thanks

  • Hi Paul,

    Just to let you know that some updates on the I2C slave example on BBB will be posted here on the forum.

    A.
  • Thanks AnBer.
    Just cross-linking again to the interrupt based driver since I've had some progress applying Matthijs' excellent information there. However, there are still some hurdles: e2e.ti.com/.../1965505
  • Just to conclude, we have now abandoned trying to develop a MMi2c driver using the Sitara's I2C block, and are looking into another solution.

  • Matthijs- I have a question if you are still following this thread. We are also trying to get multimaster code working for the AM335x but having slightly different (but perhaps related) problems. Your suggestion about SBLOCK may solve one of them.

    Perhaps you can help with another:

    The (rare) situation is this: master A thinks it wants to begin a transmit. So it sets slave addr, DCOUNT, and enters MST_TX. However, nearly simultaneously master B starts a transaction. Master A gets AAS and then RRDY and begins receiving (in slave receive) but seems to be stuck also getting some number of XRDY interrupts. At some point during the receives the XRDYs stop, as if the master xmit has been halted somehow (we do NOT get AL). After the receive finishes, some, but not all, of the transmitted bytes pop out in a new bus transaction, and the transmitter is messed up in a way that it does not provide more XRDY interrupts, but the bus remains busy (stuck?). Sometimes we get an XUDF at this point.

    I presume that we are not getting AL because the peripheral is noticing that the bus is busy at the last moment and deferring the transmit via the FIFO....

    Is there something we should be doing when we have begun a master transmit, but are interrupted by slave receive? What is the correct way to clean up? (To ensure we can safely re-start the transmission)
  • Jeff Senn said:
    Is there something we should be doing when we have begun a master transmit, but are interrupted by slave receive? What is the correct way to clean up? (To ensure we can safely re-start the transmission)

    ¯\_(ツ)_/¯

    I recall now I've seen some pretty fascinating things happen in such corner cases, but I had given up already on multimaster so I never really explored them to make notes.

    Have you tried clearing the fifos, dcount, MST, and START before you release the slave transaction using SBLOCK ?

  • Well... interesting notion. I'm trying some of that now. BUT... AFAICT there is no way to actually clear DCOUNT, nor START or STOP. They can only be cleared by the peripheral (when the bus event occurs). I can't reset the whole peripheral (obviously - since I'm in the midst of a read). If I actually attempt to clear MST then really odd things happen depending on exactly where I do it... (clearing MST does not seem to reliably get STT/STP to clear as one might expect) more later (testing is slow as I don't have a good test rig to actually cause this problem immediately -- have to wait for it to happen by jamming the bus full of messages)

    Aside: I really wish this peripheral were documented more carefully....
  • Jeff,

    Are you already in production with your hardware? If not, an idea came to mind that would conceivably simplify your software tremendously. In short, can you connect TWO of the I2C peripherals to the same bus? You would dedicate one as "master only" and the other as "slave only".

    I have some discussions going with the design team to see about improving this peripheral in future devices, but that's clearly not going to solve your immediate issues. Using a pair of peripherals on the other hand would probably go a long way!

    Brad
  • Brad - good suggestion. I had thought of that. Our hardware is in production (but at fairly low quantity currently). If it comes to that I'll check if we can do that.

    In the meantime: do you (or someone else there) know enough to offer me advice about this particular situation? Most of our I2C software works so far. Only in this rare case of a master transmit being interrupted by the receiver before the bus start happens, I cannot figure out how to get the peripheral back to a usable state. The HW itself is sometimes (but not always) clearing MST mode (and apparently also the xmit FIFO). But then, after the receive, when I retry/resume the interrupted transmit (I've tried various sequences for this, some of which actually manage to emit the correct message), the bus becomes stuck in the BUSY state and will NOT issue a STOP (the STP bit is 1 and stuck there).

    I can offer more details on what I've tried if it helps....

    Somewhat desperately,
    Jeff
  • Are you able to capture the bus with a protocol analyzer? For example, you can get inexpensive USB analyzers from companies like Saleae. It would be very helpful to capture what happened on the bus leading up to the issue. I think a register dump at the end when things are "stuck" would be useful too.
  • Yes - I have an analyzer - there is nothing unusual on the bus until the final state, where the bus is busy and the processor in question has output either 0, 1, or N data bytes of an N-byte message and is then just stuck there.

    The 0/1/N bytes corresponds to how I handle the interruption when the receive starts:

    If I do nothing - that is let the MST+TRX state persist while the receive happens[*] - then everything is a mess after the receive finishes and the master starts. After the bus start (and address ack) either 0 or 1 data bytes are emitted on the bus and I don't get the right number of XRDY interrupts and the bus is stuck in BUSY (MST=1, TRX=1, STT=0, STP=1). Usually I get a XUDF state here...

    ([*] Note: sometimes in the above I believe (but am not 100% positive) that MST is cleared internally, and then it is somewhat more like the next case.)

    If, when the receive starts I set MST=0 TRX=0 and clear the TX FIFO, then when the receive is done, I can restart the transmit MST=1,TRX=1,STT=1,STP=1 (Note: I set STT and STP here, BUT they are already =1 in the register!!). STT will -> 0, and I can fill the FIFO at XRDY state and the complete message will go out and be correctly ACKED. BUT then the master will NOT issue a STOP on the bus. And in this case CON still has STP->1. And BB=1. An explicit attempt to stop at this point (setting STP=1) seems to have no effect.

    I can do a more complete dump of all the registers when I get back to the office...

    I hope I'm not missing some documentation somewhere.... I have also not been able to determine what is the correct way to recover from an AL(arb lost) or XUDF condition -- if there *is* any way... short of resetting the whole peripheral.
  • Jeff Senn said:
    Yes - I have an analyzer - there is nothing unusual on the bus until the final state, where the bus is busy and the processor in question has output either 0, 1, or N data bytes of an N-byte message and is then just stuck there.

    What is the level of SCL and SDA?

    Jeff Senn said:
    I hope I'm not missing some documentation somewhere.... I have also not been able to determine what is the correct way to recover from an AL(arb lost) or XUDF condition -- if there *is* any way... short of resetting the whole peripheral.

    I'm a little confused as your earlier post stated "we do NOT get AL", but above you ask how you should recover from AL.

    Jeff Senn said:
    I can do a more complete dump of all the registers when I get back to the office...

    Please do.  Perhaps for both cases you mention if they are different.

  • Note: asking about AL because (although we do NOT get it in this case) we do get it in a more rare case...

    In this case I wind up with SCL=0 SDA=1. I assume this is the processor either attempting clock stretching -- or just busted?

    registers wind up like this (although just prior to this XUDF interrupts occur - which I currently ignore for this test)

    CON=0x8602
    IRQr=0x1400
    IRQ=0x1400
    CNT=0x9
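
    For anyone reading along, the dump above can be decoded with a few masks. The bit positions below are my reading of the AM335x TRM and should be checked against your TRM revision; the macro and function names are hypothetical. With those positions, CON=0x8602 reads as I2C_EN + MST + TRX + STP (STT clear) and IRQ=0x1400 as BB + XUDF.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Bit positions as read from the AM335x TRM (assumption: verify them). */
    #define CON_STT   (1u << 0)
    #define CON_STP   (1u << 1)
    #define CON_TRX   (1u << 9)
    #define CON_MST   (1u << 10)
    #define CON_EN    (1u << 15)
    #define IRQ_XUDF  (1u << 10)
    #define IRQ_BB    (1u << 12)

    /* True for the stuck pattern described in this thread: master transmitter,
     * start already consumed, stop requested but not issued, bus still busy. */
    bool looks_stuck_after_tx(uint32_t con, uint32_t irq)
    {
        uint32_t mode = con & (CON_MST | CON_TRX | CON_STP | CON_STT);
        return mode == (CON_MST | CON_TRX | CON_STP) && (irq & IRQ_BB) != 0;
    }

    /* Sanity check against the values posted above. */
    void check_posted_dump(void)
    {
        assert(0x8602u == (CON_EN | CON_MST | CON_TRX | CON_STP));
        assert(0x1400u == (IRQ_BB | IRQ_XUDF));
        assert(looks_stuck_after_tx(0x8602u, 0x1400u));
    }
    ```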

    I'm going to post again with a screen dump from the i2c analyzer -- you will see an incoming prior packet from another master succeeds to this processor (slave addr 0x11), then the processor in question starts a 9 byte packet to (0x14) that works, but does not STOP (analyzer notes a restart) and then a new packet starts, is address ACKed but then just sits there (and "times out" in the analyzer)

    In particular note that STP=1 (has been throughout the whole prior packet) - so I should think the peripheral believes it should stop, yes? Or should have even at the prior packet end.

    I'll also try to catch it a few more times (and post) as it sometimes differs slightly depending on when the incoming AAS occurs relative to the outgoing transmit request.
  • So... I've developed a (rather non-satisfying) workaround for this problem. I'm mentioning it in case it helps anyone else who happens upon this thread.

    I would still very much like to know how to recover from this situation as the workaround seems like a hack.

    Please note: this workaround still has a horrible race condition if you cannot make sure that everyone on your i2c bus is robust to the processor in question resetting its peripheral and thus being not-present on the bus for some period.

    Here is workaround:
    When you notice a receive interrupting a master transmit (that is, AAS happens *after* a set of DCOUNT=N; MST=1,TRX=1,STT=1,STP=1, but before the xmit begins):
    1) As soon as you see the AAS, clear the transmit FIFO and set I2CCON=0 (MST=0,TRX=0,STT=0,STP=0), i.e. disable the master tx
    2) Wait for the receive to finish
    3) When the bus is free (BF=1,BB=0) restart the transmit and fill the xmit FIFO as normal
    4) If another receive interrupts, repeat the process until the xmit succeeds in starting.
    5) As soon as the xmit finishes (or proceeds past the address ack) change the destination slave address to one that is NOT present on the bus (you will NOT get a stop on the bus - but rather a situation that looks like a restart)
    6) Notice the NACK condition, and reset the I2C peripheral as fast as you can!
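
    Steps 1 to 6 above can be sketched as a small state machine. This is only an outline of the control flow, with hypothetical names (`wa_ctx`, `wa_step` and the enums); the register writes themselves are left to the caller, and the `interrupted` flag is an assumption that the dummy-address trick is only needed after a transmit was cancelled at least once.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef enum {
        ST_TX_PENDING,  /* DCOUNT/MST/TRX/STT/STP written, start not yet on the bus */
        ST_SLAVE_RX,    /* transmit cancelled, receiving as slave */
        ST_TX_RUNNING,  /* our start made it onto the bus */
        ST_DUMMY_ADDR,  /* slave address swapped to an unused one, waiting for NACK */
        ST_RESET,       /* peripheral must be reset now */
        ST_DONE         /* transmit ended without ever being interrupted */
    } wa_state;

    typedef enum { EV_AAS, EV_BUS_FREE, EV_TX_STARTED, EV_TX_DONE, EV_NACK } wa_event;

    typedef enum {
        ACT_NONE,
        ACT_CANCEL_TX,        /* step 1: clear tx fifo, write CON = 0 */
        ACT_RESTART_TX,       /* step 3: set up the transmit again */
        ACT_SET_DUMMY_ADDR,   /* step 5 */
        ACT_RESET_PERIPHERAL  /* step 6 */
    } wa_action;

    typedef struct { wa_state st; bool interrupted; } wa_ctx;

    wa_action wa_step(wa_ctx *c, wa_event ev)
    {
        assert(c != NULL);
        switch (c->st) {
        case ST_TX_PENDING:
            if (ev == EV_AAS) { c->interrupted = true; c->st = ST_SLAVE_RX; return ACT_CANCEL_TX; }
            if (ev == EV_TX_STARTED) c->st = ST_TX_RUNNING;
            break;
        case ST_SLAVE_RX:   /* step 2: wait; step 4 is the loop back to TX_PENDING */
            if (ev == EV_BUS_FREE) { c->st = ST_TX_PENDING; return ACT_RESTART_TX; }
            break;
        case ST_TX_RUNNING:
            if (ev == EV_TX_DONE) {
                if (!c->interrupted) { c->st = ST_DONE; break; }
                c->st = ST_DUMMY_ADDR;
                return ACT_SET_DUMMY_ADDR;
            }
            break;
        case ST_DUMMY_ADDR:
            if (ev == EV_NACK) { c->st = ST_RESET; return ACT_RESET_PERIPHERAL; }
            break;
        case ST_RESET:
        case ST_DONE:
            break;
        }
        return ACT_NONE;
    }
    ```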
  • I've been out of the office this week.  I'm glad to see you've made some progress (albeit not an ideal solution).  Does that seem to be working reliably, or has that just further reduced the number of problems?

  • It does seem to resolve the current test case (which in itself was somewhat rare).

    But it is definitely not elegant.  

    I would still like to know what the intended/prescribed procedure is to recover from this case (as well as AL, and possibly NACK -- which I'm just guessing at) -- if there is such a procedure other than resetting the peripheral....

  • Jeff Senn said:
    1) as soon as you see the AAS, clear the transmit FIFO, and set I2CCON=0 (MST=0,TRX=0,STT=0,STP=0) i.e. Disable the master tx
    2) Wait for the receive to finish
    3) When the bus is free (BF=1,BB=0) restart the transmit and fill the xmit FIFO as normal
    4) If another receive interrupts, repeat the process until the xmit succeeds in starting.

    These steps seem normal.  So in the case where you were attempting to be a master transmitter, you notice that someone else had already addressed you as a slave, and so you proceed to receive that data as a slave.  Afterward you retry your master transmit.  Sounds good!

    Jeff Senn said:
    5) As soon as the xmit finishes (or proceeds past the address ack) change the destination slave address to one that is NOT present on the bus (you will NOT get a stop on the bus - but rather a situation that looks like a restart)
    6) Notice the NACK condition, and reset the I2C peripheral as fast as you can!

    This is the surprising part.  So it sounds as if you have succeeded in both the slave receive as well as the master transmit operation.  Is that correct?  It seems like everything is working properly, so I'm not sure why there are these final steps.  Why send data to a dummy address at all?  Why not just reset the peripheral immediately in that scenario?  And what happens if you don't reset the peripheral?

  • Jeff Senn
    5) As soon as the xmit finishes (or proceeds past the address ack) change the destination slave address to one that is NOT present on the bus (you will NOT get a stop on the bus - but rather a situation that looks like a restart)
    6) Notice the NACK condition, and reset the I2C peripheral as fast as you can!

    This is the surprising part.  So it sounds as if you have succeeded in both the slave receive as well as the master transmit operation.  Is that correct?  It seems like everything is working properly, so I'm not sure why there are these final steps.  Why send data to a dummy address at all?  Why not just reset the peripheral immediately in that scenario?  And what happens if you don't reset the peripheral?

    Ah - A stop does not occur after my transmit.  And there seems to be no way to force it to occur -- setting the STP bit does not help (it is already 1 - has been 1 since the very beginning of this).  Instead the master does a "restart" and immediately addresses whatever slave is in the slave address.  Note: I don't ask for this restart, and the STP is set so it should not occur.  

    If I do not address someone else, then a XUDF will occur and things get really weird if I try to reset after that (possibly involving state in the "other guy").  So I address someone not there -- get the NACK, and then everyone is in a stable-ish state. HOWEVER there is STILL no way (that I know of) to cause the peripheral to cause the STOP to happen and get that STP bit cleared (and the bus freed up). So I have to reset the whole peripheral.

  • Thanks for explaining that.  I appreciate your patience.  Ok, that connects this issue nicely with your previous comments and data points.  As far as I can tell from the state of the registers and the bus, it looks to me like the I2C peripheral thinks it should still be sending more data.  In particular, you have XUDF set as if you've "run out" of data (despite the fact that you have sent the full payload).  Also, holding SCL low would be expected if you were still sending more data, e.g. like the hardware is waiting for a 10th byte or something.

    So I'm wondering if there's something not working correctly with respect to DCOUNT.  Are you able to adjust your code slightly such that when you get to this case that you keep stuffing some extra bytes into the transmit fifo? I'd like to know how many bytes the I2C "thinks" it is supposed to be sending.  Perhaps we can correlate that with something else.

    Does your code re-write the DCOUNT register after you have successfully completed the slave receive operation?  If not, please do so.

  • Yes, I can re-write DCOUNT. And I have. I can change the length of that packet -- and can successfully send one that is of a different size than originally requested. But then no stop, and another start begins. Since there is no way to set DCOUNT to 0 (0 means 64K on this part I believe), there is no way via DCOUNT to signal we want to cancel a send. I think stuffing extra bytes basically works -- the packet goes out. I do have mixed success with that (due to being interrupt driven and not knowing exactly when to stuff). However, none of this gets that STP bit cleared and/or causes a bus stop. I believe that if I do enough stuffing, the packet just goes out and then starts another -- or possibly I need to hit STT again (or get XUDF). Still: no stop though.

    Since I don't really know anything about the peripheral insides, I'll wildly speculate: I suspect that the STT and STP bits are latched in the peripheral -- that is, when you set (0->1) the bit, the peripheral queues up a pending start/stop. When the start is interrupted I think the pending start and stop are cleared -- but not necessarily the bits in the register? Flipping MST off and then back on correctly causes the peripheral to attempt a start, but there is no way to tell it to stop because you can no longer make a 0->1 transition on STP. I guess something is odd about STT as well since it does do a restart rather than hanging out at the end of the successful packet (or maybe that is intended behavior -- I'm confusing myself at this point).

    If this is true - then I guess I'm convinced there is little to do except a reset -- unless there is some way to force that stop...(which is why I was also interested in how to handle an AL condition... seems like it might wind up in a similar place...) I haven't tried it (and it might be difficult in my situation to cause it to occur) but I wonder what happens to a STP=1 bit in the case of AL...
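
    To spell out the speculation, here is a toy model of what I think might be going on. It is purely hypothetical: the struct, the internal "pending" latch and all function names are invented for illustration, and real silicon may behave quite differently.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Toy model of the speculation only. */
    typedef struct {
        bool stp_bit;       /* STP as read back from CON */
        bool stop_pending;  /* hypothetical internal "stop queued" latch */
    } stp_model;

    /* Software write to STP. Writing 0 is ignored (only the peripheral seems
     * able to clear the bit). A stop is queued only on a 0->1 transition. */
    void stp_write(stp_model *m, bool value)
    {
        assert(m != NULL);
        if (!value)
            return;
        if (!m->stp_bit)
            m->stop_pending = true;
        m->stp_bit = true;
    }

    /* Guessed effect of the start being interrupted by a slave receive:
     * the queued stop is dropped but the register bit stays set. */
    void stp_start_interrupted(stp_model *m)
    {
        assert(m != NULL);
        m->stop_pending = false;
    }

    /* Peripheral reaches the end of the packet: stop only if one is queued. */
    bool stp_emit_stop(stp_model *m)
    {
        assert(m != NULL);
        if (!m->stop_pending)
            return false;
        m->stop_pending = false;
        m->stp_bit = false;
        return true;
    }
    ```

    In this model, once the start has been interrupted, writing STP=1 again never queues a stop because the bit already reads 1, which matches what I see on the bus.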