Problem with I2C Routine hanging

Other Parts Discussed in Thread: BQ77PL900, BQ34Z100, TCA4311A, TCA9509, PCA9517, TCA9517, TXS0104E, TXS0102

Hi all,

I have seen similar problems in the forum but not yet any solutions.

In short: I initialize the I2C routine and start communicating with two slaves. The master is an lm4f230h5qr; the slaves are a bq77pl900 (internal 10k pull-ups) and a bq34z100. At first the communication is OK, but if there is any interruption on the lines (EMC noise), the communication hangs and only the start conditions can be seen on the bus.

Difference from the other reported problems: the SCL and SDA lines are high. The communication also does not time out (bus busy is cleared to 0 within the allowed time), and the function I2CMasterErr does not report any problems. The software runs as if the communication were OK.

The values, however, are bad, and on the oscilloscope I can see only start conditions.

I had a similar problem, when I tried pulling the scl line low on the eval-board, but that could be fixed by a peripheral reset.

I also added a function that disables the I2C and sends up to 16 clock pulses on the bus before restarting. Still to no avail. (I might add that in this case neither SDA nor SCL is externally pulled low by the slaves, so a slave hang is improbable.)

Also a software-reset in the debugger does not help, if the communication has been lost once. Only a hardware reset solved the problem. Therefore either a slave is the culprit or the i2c hardware has a problem.

All routines except the direct SCL clocking in the error case use the StellarisWare routines. The direct SCL clocking is only started in debug mode by setting a bit, so it does not interfere with the normal routine. The whole problem could be observed in debug mode as well as in free-running mode via the oscilloscope.

I attached two pictures, the first is the scope after a hardware reset, the second is the scope after the hangup. This keeps repeating with each supposed communication until a hardware reset.

 

 

  • Second picture and rest of text got lost. So here it is.

    Any ideas appreciated.

    Best regards,

    Jan

  • I talked to the apps engineer who has done the most work with I2C on your part and he tells me there is a known susceptibility to bad bus noise in the I2C master hardware that can cause this symptom. Unfortunately, the only fix is to ensure that the I2C lines are clean and do not suffer from noise that would be considered outside the specification for the bus.

  • Perhaps lower value pull-up R's on SCL & SDA - especially when the bus is expected to remain "idle" for prolonged period - may solve.  (or shed some investigatory light.)

    Dave doesn't quite qualify "clean" - but you may wish to employ schottky diodes to limit I2C line excursions both above 3V3 and below ground...

  • I have found a scope probe on the SCL line of the launchpad will generate sufficient noise to occasionally require un-powering and re-powering of I2C slave device when data transfer hangs. An I2C Error Code 0x10 persists even after peripheral reset is issued.

  • Well, I know that noise is bad on the bus. I also do not mind losing a telegram, but there is always a scenario where a lone EMC pulse or something similar can occur. Employing Schottky diodes or different pull-ups is not possible for me in this case, since the hardware is fixed.

    As long as there is some way to recover from a hung I2C, this would not be a problem, since these pulses do not occur under normal conditions. But if the system is to run unsupervised in a mobile application (probably a power tool or e-bike), some pulses could happen (EMC).

    If this is really a hardware bug, and the I2C hardware cannot be reinitialized after a problem without a full power-on reset, that would be a serious flaw in my opinion. Is this verified, or is there a possible solution in software?

     Greetings,

    Jan

    PS: I also stumbled upon this problem with a scope probe at first, but a change to a better scope and probe solved that.

  • One possible solution...

    Isolate each I2C slave power supply -- even a current-limiting resistor will do for the isolation. Attach a GPIO or other circuit to pull the slave power low if the IO error will not go away after an I2C peripheral reset. Most I2C peripherals are very low power, so sinking a maximum of a few milliamps to "ground" the power input should not be an issue.

    It might mean running a few wires on a current board production lot -- but this safety could easily be designed in.

    When I get the problem I lift the power lead of the I2C device out of the breadboard if I still get the "error 0x10" after a peripheral reset.
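
The escalation described in this post (peripheral reset first, slave power cycle only if the error persists) could be sketched roughly as follows. This is a minimal sketch, not code from the thread: the `recovery_ops_t` hooks, the `sim_*` helpers and the delay values are all hypothetical, and on real hardware the hooks would wrap the GPIO that switches the slave supply and whatever check reports the persistent error.

```c
#include <stdbool.h>

typedef enum {
    RECOVERY_NOT_NEEDED,
    RECOVERED_BY_RESET,
    RECOVERED_BY_POWER_CYCLE,
    RECOVERY_FAILED
} recovery_result_t;

/* Hypothetical board hooks; none of these names come from StellarisWare. */
typedef struct {
    bool (*bus_ok)(void *ctx);               /* true if a test transfer works */
    void (*peripheral_reset)(void *ctx);     /* reset and reconfigure master  */
    void (*slave_power)(void *ctx, bool on); /* GPIO-controlled slave supply  */
    void (*delay_ms)(void *ctx, unsigned ms);
    void *ctx;
} recovery_ops_t;

/* Try the cheap fix first and power-cycle the slave only as a last resort. */
recovery_result_t i2c_recover(const recovery_ops_t *ops,
                              unsigned off_ms, unsigned startup_ms)
{
    if (ops->bus_ok(ops->ctx)) {
        return RECOVERY_NOT_NEEDED;
    }
    ops->peripheral_reset(ops->ctx);
    if (ops->bus_ok(ops->ctx)) {
        return RECOVERED_BY_RESET;
    }
    ops->slave_power(ops->ctx, false);
    ops->delay_ms(ops->ctx, off_ms);        /* let the slave supply discharge */
    ops->slave_power(ops->ctx, true);
    ops->delay_ms(ops->ctx, startup_ms);    /* wait for the slave to boot     */
    return ops->bus_ok(ops->ctx) ? RECOVERED_BY_POWER_CYCLE : RECOVERY_FAILED;
}

/* Simulated board for desk-checking: a reset clears a hung master, but only
 * removing power clears a locked slave. */
typedef struct {
    bool master_hung;
    bool slave_locked;
    bool powered;
} sim_board_t;

bool sim_bus_ok(void *ctx)
{
    sim_board_t *b = ctx;
    return b->powered && !b->master_hung && !b->slave_locked;
}

void sim_reset(void *ctx)
{
    ((sim_board_t *)ctx)->master_hung = false;
}

void sim_power(void *ctx, bool on)
{
    sim_board_t *b = ctx;
    if (!on) {
        b->slave_locked = false;
    }
    b->powered = on;
}

void sim_delay_ms(void *ctx, unsigned ms)
{
    (void)ctx;
    (void)ms;
}
```

Keeping the power cycle as the last step means a glitch that a reset can clear never interrupts the sensor supply.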

  • In most cases that would work; however, in this specific case I cannot shut down the devices. The pl900 is directly powered from the main power source which it is measuring. I cannot shut it down, since it indirectly powers my 3.3V onboard supply. In order to switch it off, I need to send a string and pull down one pin, but this string can only be sent via I2C, hence my problem.

    Right now I am leaning toward implementing a software-based I2C, but it is somehow a step back.

  • While it disappoints me to have to agree, given that you have no way to modify the hardware to eliminate the noise and that the latest revision of the part you are using contains this problem, your idea of using a software I2C implementation is probably the best solution.

    Edit 12/21/12: After posting this, I realized that I had not mentioned that there is already a softi2c module available in StellarisWare. It's not included in the LM4F releases but you can find it in the utils directory in the lm3s9d96 release (and probably most other lm3s releases too). It should work on LM4F without any problems.

  • This sounds like a fairly serious design flaw -- and it looks like it could affect the entire LM4F line and every MCU available -- all two plus the LaunchPad. Does this also apply to the LM4F232 --specifically the kit?

    It's of interest to me because all current projects on my desk make heavy use of I2C. I would go so far as to say they are "un-doable" without reliable I2C.

    Is this issue being tackled by the "Foundry" workers?

    It's not like we can switch to an LM3S with I2C -- not in the TI line anyway. But even if we could -- do they have the same problems?

    So now I'm thinking about how to handle this... My best test gave me about 300M cycles before the I2C sensors quit due to some sort of problem -- but that's only about three days.

    Incidentally when the errors do occur it fires off all error conditions for the I2C bus, then locks with error 10 -- even after a peripheral reset. Power must be removed from the sensor and then reapplied to restart the I2C bus.

  • Dave, I'll have to defer to the hardware guys for more explanation on this but my understanding is that this problem affects all the LM4F parts and may also be seen on LM3S too but at lower frequency. I suspect you will see this in the errata fairly soon.

    Just to be absolutely sure we're talking about the same scenario, you find that, after some period of time, you start seeing "arbitration lost" errors from the I2C master even though you are running on a single master bus and that, after this, calling SysCtlPeripheralReset(SYSCTL_PERIPH_I2Cn) and reconfiguring the peripheral doesn't make a difference (you continue seeing arbitration lost errors). Further, you are sure that no slave peripheral is hung and holding SDA low while it waits for the end of the interrupted transaction and you have implemented a software reset approach which involves wiggling SCL to get the peripheral out of any mid-transaction state?
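
For readers trying to match the "error 0x10" reports in this thread to the "arbitration lost" wording above, here is a small decoding sketch. It assumes the error mask returned by I2CMasterErr uses the I2C_MASTER_ERR_* bit values from StellarisWare's i2c.h (0x04 address NACK, 0x08 data NACK, 0x10 arbitration lost). Those values are redefined locally under hypothetical `EX_` names so the snippet builds on its own, and should be checked against your library version.

```c
#include <stddef.h>
#include <string.h>

/* Assumed to mirror I2C_MASTER_ERR_ADDR_ACK, _DATA_ACK and _ARB_LOST. */
#define EX_ERR_ADDR_ACK 0x04ul
#define EX_ERR_DATA_ACK 0x08ul
#define EX_ERR_ARB_LOST 0x10ul
#define EX_ERR_KNOWN    (EX_ERR_ADDR_ACK | EX_ERR_DATA_ACK | EX_ERR_ARB_LOST)

/* Append name to buf, space-separated, never writing past len bytes. */
void append_flag(char *buf, size_t len, const char *name)
{
    size_t used = strlen(buf);
    if (used > 0 && used + 1 < len) {
        buf[used++] = ' ';
        buf[used] = '\0';
    }
    if (used + 1 < len) {
        strncat(buf, name, len - used - 1);
    }
}

/* Turn an error mask into readable text for a debug log. Returns buf. */
const char *i2c_err_describe(unsigned long err, char *buf, size_t len)
{
    if (len == 0) {
        return "";
    }
    buf[0] = '\0';
    if (err == 0) {
        append_flag(buf, len, "none");
    }
    if (err & EX_ERR_ADDR_ACK) {
        append_flag(buf, len, "addr-nack");
    }
    if (err & EX_ERR_DATA_ACK) {
        append_flag(buf, len, "data-nack");
    }
    if (err & EX_ERR_ARB_LOST) {
        append_flag(buf, len, "arbitration-lost");
    }
    if (err & ~EX_ERR_KNOWN) {
        append_flag(buf, len, "unknown-bits");
    }
    return buf;
}
```

Logging the decoded text next to the raw value makes it easier to tell "all errors at once" (0x1C under these assumptions) apart from a persistent 0x10.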

  • Oohh wow -- way too complicated for me... I ain't that smart. I look for simpler solutions.

    I took a working part out of my Atmel Controller -- since I had my suspicions.

    No I don't "Start Seeing" -- the crash is sudden and complete. All possible I2C errors are returned. I added code to log the errors to the USB/AUART comport line -- it logs them for me to my PC screen then does a flashing light dance so I know it occurred.

    No I don't bother wiggling the SCL line. I saw the CB1 solution on this and decided against it. I figger the port should work. Since I'm logging a wave train a "hole" in the data like this is not helpful. I need a part that works.

    I built all that reset stuff into a routine which reconfigures the part for me -- so I don't have to think. However anything but a power removal on the slave leaves an error 10. Whatever is happening locks up the slave.

    The only thing I might try is I2C isolation -- with a line driver. I attach sensors to points on a structure -- soooo a line driver would be helpful anyway. (These tests are with short runs of say 6-12".) If it is line noise -- a very short run to an I2C line driver might be helpful -- good for several feet??? Got a sample you can send me??? ;-) (My address is on file.)

    I checked my records. I get up to about 500M tuples (complete data points XYZ plus temperature) then it fails. But... sometimes the errors occur in minutes of run time.

    The same part gives no such problems on Atmel AVR processors (I drop data) -- which cannot keep up with the sensor -- hence the attempted change.

    I do have another system here (M4 Cortex -- but no FP) and I will move the code -- but it ain't a TI part.

    It's too bad -- I really do like the TI system. Even getting used to Code Composer.

  • Just as an aside here... In the previous design I was using a Spark Fun Level Changer -- which uses the BSS138. I am guessing that it may have been acting as an I2C driver for the longer cables -- and suppressing the noise. I cannot rule that out.

    I guess I really do need to get an I2C driver to see if that cures the issue. If it does I can live with that -- the extra cost in this design does not mean much. My timing of the various routines in the data collection proves I really need the floating point as well -- and at least 80MHz clock speed.

    And rewriting all that working code is not a pleasant proposition....

    If it works it might be acceptable to others as a "fix" as well.

    You might want to suggest a part. I can order from Digikey or Newark and have it here by the end of the month.

    Incorporating level shifting too would be good as some of my sensors are still 5V -- so one at 3.3V and one at 5V with 3.3V level shifting for the MCU end... -- I really only want to level change at the MCU end -- even if a higher voltage might have less noise susceptibility... I want to power the sensors from the MCU end.

    Any thoughts?

    Even the Concerto (since it has Ethernet) is a thought -- same code should mostly work -- just a minor recompile -- right? But that is a bigger change... seriously

  • How about :

    TCA9509 or the older TCA4311A ???

    PCA9517 or TCA9517 (Preview)

    The bigger the leads the better -- I have big fingers.

    Any thoughts?

  • Funny - I was feeling the same way (about "too complicated" - I'm a software guy!). In the past, when this kind of problem occurs, I've found that the slave peripheral can sometimes be stuck in a state where it holds SDA low because it is waiting for more clocks. After it's reset and you try to initiate a new transaction from the master, it detects that SDA is low and takes this to mean that some other master owns the bus. As a result, you end up in a stalemate where the slave is waiting for clocks and the master is thinking it can't generate the clock because it doesn't own the bus.

    To avoid this, during I2C peripheral configuration, I would sample the state of the SDA line (configured as a GPIO at this point) and, if it's low, wiggle the SCL line (also configured as a GPIO) a few times until the SDA line goes high again indicating that the peripheral has released it. This has cured I2C bus hangs that reported as arbitration lost errors for me in the past.

    Another option, which may not work for you if you need to drive the interface really fast, is the softi2c driver that you can find in the utils directory of the StellarisWare release for lm3s9d96 (and other lm3s releases). For some reason, we don't seem to ship this in the LM4F releases (presumably because those parts have a lot more I2C controllers integrated) but it should work fine on LM4F parts too.
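
The SDA-sampling and SCL-wiggling recovery described in this post could look roughly like the sketch below. It is written against hypothetical pin hooks (`bus_pins_t`, `read_sda`, `write_scl`, `delay_half_bit`) instead of real GPIO calls so that it can be run off-target; on the MCU those hooks would wrap GPIO reads and writes on the two pins while they are configured as open-drain GPIOs. The simulated slave is likewise only an assumption about how a stuck device behaves.

```c
#include <stdbool.h>

/* Hypothetical pin-access hooks; not part of StellarisWare. */
typedef struct {
    bool (*read_sda)(void *ctx);             /* true when SDA reads high     */
    void (*write_scl)(void *ctx, bool high); /* release (high) or pull low   */
    void (*delay_half_bit)(void *ctx);       /* about 5 us for a 100 kHz bus */
    void *ctx;
} bus_pins_t;

/* Pulse SCL until the slave releases SDA or max_pulses is reached.
 * Returns the number of pulses sent, or -1 if SDA is still low. */
int i2c_bus_clear(const bus_pins_t *pins, int max_pulses)
{
    int pulses = 0;
    while (!pins->read_sda(pins->ctx)) {
        if (pulses >= max_pulses) {
            return -1;
        }
        pins->write_scl(pins->ctx, false);
        pins->delay_half_bit(pins->ctx);
        pins->write_scl(pins->ctx, true);
        pins->delay_half_bit(pins->ctx);
        pulses++;
    }
    return pulses;
}

/* Simulated slave for desk-checking: it holds SDA low until it has seen
 * bits_pending rising edges on SCL. */
typedef struct {
    int bits_pending;
    bool scl;
} sim_slave_t;

bool sim_read_sda(void *ctx)
{
    return ((sim_slave_t *)ctx)->bits_pending == 0;
}

void sim_write_scl(void *ctx, bool high)
{
    sim_slave_t *s = ctx;
    if (high && !s->scl && s->bits_pending > 0) {
        s->bits_pending--;   /* slave shifts out one more bit */
    }
    s->scl = high;
}

void sim_delay(void *ctx)
{
    (void)ctx;
}
```

The I2C-bus specification describes nine clock pulses for this kind of bus clear; the original poster tried up to 16, so the limit is left as a parameter. A return value of 0 means SDA was already high, which is the situation reported at the start of this thread, where clocking SCL cannot help.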

  • Indeed "Too Complicated" means something different to everybody. lol.

    I am originally a processor designer but spent lots of time in software as well. I don't see a big difference on the "programming" side. It's just ones and zeros and when I started they only gave us zeros -- life was much tougher then... uphill both ways etc.

    The SDA line and the SCL lines were both hanging in a high state. I can check this by reprogramming the hardware to "freeze" and preserve the hardware in the fault state... I'll even stick the scope probes back on the beast to get a quicker fault.

    Anyway I found adapters at Digikey for SMD devices... I am just about to check the chip availability.

    Regardless, the slave is getting in that state because of "something" the master (or noise) is doing. I know that much.

    Every line of software has an ongoing maintenance cost -- about $2 to $10 per line per year -- $40 (for good code) to write (but only about $15 to $25 for bad hacked code). I'll take a hardware fix if I can get it.

    I will try to order the chips.

    Digikey 9082CA-ND is one possible adapter to make a trial easy... I'll try to add a link shortly.

    This is one possible adapter for anyone else wanting to try a quick fix.

    You can always send me a sample you know... PCA9517 -- in your copious free time. It's not like I'm fixing "my" problem... ;-(     grump, grump...

    Finally: I really do appreciate the ideas -- and I know that many of us appreciate your efforts in particular

  • Hi,

    I'm using TXS0104E (4-bit) and TXS0102 (2-bit) for I2C level shifting. When I need to deal with longer bus lengths (meaning higher bus capacitance) I'm using TCA4311A (hot-swappable I2C buffer).

    Rgds
    aBUGSworstnightmare 

    P.S. you can order free samples online at TI!

  • I'm going to have to punt this to the IC team at this point. If your bus is sitting in the clean idle state (SDA and SCL both high) then the problem has nothing to do with the hang I mentioned in my last post. If, further, the peripheral reset for I2C doesn't clear it, then I can't think of anything else to try other than switching to the bit-banged I2C approach via the softi2c module.

    I was going to suggest that you request samples directly from ti.com's product page for the PCA9517 but I see that that link is now grayed out. I'll see if anyone here knows how to request parts now but, in the meantime, you may have better luck if you request a contact from a sales person and discuss it with them since they are very much closer to the mechanisms that ship physical product than those of us down here in development.

  • Dave, I found a mechanism to get you some samples. Please send me a private message with your mailing address and I'll try to get you a few PCA9517 parts.

  • Dave Wilson said:
    I found a mechanism to get you some samples.

    Just another example of Dave Wilson's sterling care & responsiveness!

     

     

  • Hi,

    It seems we were mistaken about the I2C recovery. I was in contact with an FAE from TI and we went through this problem together. There was some strange behaviour on the I2C side, but it seems that if you do a peripheral reset during the first initialisation, it is possible to get a "frozen" I2C communication back later. The results differed when the first peripheral reset was not done at the beginning, but I am not sure whether a different problem was responsible for that. I also had an error in my source code, where I had previously chosen the wrong define for the peripheral reset. It would be nice to know whether the initial reset is really needed, but right now I am happy with the way things work and have to get back on schedule. If there is time, I will look into that issue again.

    I attached the code I used for Initializing the I2C and also for recovery (Only for the I2C Module).

    // Disable the master before resetting the peripheral.
    I2CMasterDisable(I2C0_MASTER_BASE);

    // Manual register-level reset, kept only for testing in the debug view:
    //    HWREGBITW(0x400fe500 + ((SYSCTL_PERIPH_I2C0 & 0xff00) >> 8),
    //      SYSCTL_PERIPH_I2C0 & 0xff) = 1;
    //    SysCtlDelay(16);

    // Reset the I2C0 peripheral and delay for a little bit.
    SysCtlPeripheralReset(SYSCTL_PERIPH_I2C0);
    SysCtlDelay(16);

    // I2C0, system clock, standard mode (not fast mode).
    I2CMasterInitExpClk(I2C0_MASTER_BASE, SysCtlClockGet(), FALSE);

    // 0x06 gives a clock-low timeout count of 0x60 SCL periods (0.96 ms at 100 kHz).
    I2CMasterTimeoutSet(I2C0_MASTER_BASE, 0x06);
    I2CMasterEnable(I2C0_MASTER_BASE);
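
The relation between the value 0x06 passed to I2CMasterTimeoutSet and the 0.96 ms in the comment can be worked out with a short sketch. It assumes, as the driver library documentation describes, that the 8-bit value forms the upper bits of a 12-bit counter of SCL periods, so the count is the value times 16. The helper names `i2c_timeout_us` and `i2c_timeout_value` are made up for illustration.

```c
#include <stdint.h>

/* Clock-low timeout in microseconds for a given timeout register value.
 * Assumption: the counter runs in SCL periods and the programmed value
 * is its upper 8 bits, so the count is value * 16. */
uint32_t i2c_timeout_us(uint8_t value, uint32_t scl_hz)
{
    if (scl_hz == 0) {
        return 0;
    }
    uint64_t scl_periods = (uint64_t)value << 4;
    return (uint32_t)((scl_periods * 1000000u) / scl_hz);
}

/* Smallest register value giving at least timeout_us, clamped to 0xFF. */
uint8_t i2c_timeout_value(uint32_t timeout_us, uint32_t scl_hz)
{
    uint64_t periods = ((uint64_t)timeout_us * scl_hz + 999999u) / 1000000u;
    uint64_t value = (periods + 15u) / 16u;
    return (uint8_t)(value > 0xFFu ? 0xFFu : value);
}
```

With a 100 kHz bus, `i2c_timeout_us(0x06, 100000)` gives 960, which matches the 0.96 ms noted in the code above.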

    So far the routine is running and can also be recovered after noise on the bus, as long as the error can be detected. (That is rather easy here, because I can cross-check the device name via I2C.)
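
Since the master reports no error in the hung state, the detection step mentioned here has to rely on the data itself. One possible shape for that cross-check is sketched below; it is not the poster's actual code. The `i2c_health_t` hooks, the `sim_*` helpers and the ID value 0x0100 are placeholders; on the target, `read_id` would read a known constant from the slave and `reinit_master` would run the disable, reset and re-init sequence shown above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks for the health check. */
typedef struct {
    bool (*read_id)(void *ctx, uint16_t *id_out); /* false on reported error */
    void (*reinit_master)(void *ctx);             /* reset + re-init master  */
    void *ctx;
    uint16_t expected_id;                         /* known constant of slave */
} i2c_health_t;

/* Read the ID; on a mismatch re-initialise the master and try again.
 * Returns true once the ID matches, false after max_reinits failed tries. */
bool i2c_check_and_recover(const i2c_health_t *h, int max_reinits)
{
    for (int attempt = 0; attempt <= max_reinits; attempt++) {
        uint16_t id = 0;
        if (h->read_id(h->ctx, &id) && id == h->expected_id) {
            return true;
        }
        if (attempt < max_reinits) {
            h->reinit_master(h->ctx);
        }
    }
    return false;
}

/* Simulated link for desk-checking: it returns garbage without reporting
 * an error until reinit has been called resets_needed times. */
typedef struct {
    int resets_needed;
    int resets_done;
} sim_link_t;

bool sim_read_id(void *ctx, uint16_t *id_out)
{
    sim_link_t *l = ctx;
    *id_out = (l->resets_done >= l->resets_needed) ? 0x0100 : 0xFFFF;
    return true;   /* as in the thread: no error flag, just bad values */
}

void sim_reinit(void *ctx)
{
    ((sim_link_t *)ctx)->resets_done++;
}
```

Running such a check periodically, or whenever a measurement looks implausible, bounds how long bad values can go unnoticed.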

     

  • All,

    I know this thread is fairly old, but I'm hoping someone will see this and maybe expand on my idea. I read this thread when I went to start I2C with my LaunchPad and was pretty bummed because, as Jan wrote, you really do NEED I2C for almost any application nowadays. So I went and made an I2C program, and sure enough, after some amount of time the LaunchPad would freeze completely with no way to bring back I2C, even if the rest of the chip was software reset.

    I was hopeful though and continued on with my project, trying to debug whatever I could, when I noticed a VERY strange feature on the LaunchPad. I have multiple read sequences in my program with multiple different addresses, and even write points, like most I2C applications. To debug, I put some simple GPIO outputs for the onboard RGB LED in between each command structure. When I did this the Stellaris did still hang, BUT it took well over 10 times as long for the hanging to start. I tried testing whether noise could get the Stellaris into the hanging state earlier, but I could not. I tried banging my breadboard, shaking the LaunchPad, even putting it over high-voltage inductors; it appears that noise was not the issue. I really have no idea why these simple GPIO outputs keep the Stellaris from freezing for such a long time, but with a little more investigation I feel that someone could possibly find a fix for this issue.

    I would really like someone to comment on this, it could be that I'm making a silly statement but it seems to me that this could really help a lot of people. 

    Oh, by the way, when I say 10 times, it was more like going from 30 bytes received before freezing to around 700 bytes. These reads are at 500ms intervals.

  • Patrick,

      The only I2C problem I'm aware of on the rev of part found on the Launchpad does relate to noise so if you are seeing something different, that's new to me. Could you provide a bit more information on the state of the application once it hangs. What are the SDA and SCL lines doing at this point? Is your slave device holding SCL low, for example? Has your software detected any error reported by the I2C master? If it's easy and quick to induce the hang by removing your new GPIO code, would it be possible for you to capture an oscilloscope or logic analyzer trace showing what's happening on the bus just prior to the hang?

  • Dave,

    Currently I am away from an oscilloscope. I will hopefully have one some time in the next few days, but no promises. I must have been misinformed on the noise part; I believed that was the reason the posts above were having issues with I2C. I don't have an oscilloscope right now, but with a multimeter I tried to get a decent guess as to whether SCL is being held low by probing SCL to ground. It reads around 3.2 volts, which suggests to me that it is high most of the time. I had some error checking in my code but recently removed it. I will add that back in and see what kind of error my device reports.