TPS65950 I2C Control Interface Hangs

Pev

Other Parts Discussed in Thread: TPS65950, OMAP3530

Hi all,

We have a board we're using that has the standard direct connection between the OMAP35x and the TPS65950 with no extra devices on the bus. This is running the standard BSquare 6.14.00 Windows CE BSP.

We have a test case where we set up the RTC interrupt to fire every second and poll the RTC time. In parallel we have a simple battery driver that interrogates the MADC for battery voltage. If we have both enabled, the I2C interface to the TPS65950 eventually locks up. This is characterised by both SCL and SDA getting held low.

The signals seen on the bus are normal :

When the bus hangs, we see the SDA and SCL held low thus :

In detail, the last transaction there looks like :

When we run this different times, it does die with different register ops, but typically it's always when reading from the RTC regs... An RTC reg dump on the OMAP at this point looks like :

I2C: REV = 0x003c
I2C: IE = 0x0000
I2C: STAT = 0x0000
I2C: WE = 0x0000
I2C: SYSS = 0x0001
I2C: BUF = 0x0505
I2C: CNT = 0x0001
I2C: SYSC = 0x0001
I2C: CON = 0x8600
I2C: OA0 = 0x400e
I2C: SA = 0x004b
I2C: PSC = 0x0009
I2C: SCLL = 0x0005
I2C: SCLH = 0x0007
I2C: SYSTEST = 0x0000
I2C: BUFSTAT = 0x0100
I2C: OA1 = 0x0000
I2C: OA2 = 0x0000
I2C: OA3 = 0x0000
I2C: ACTOA = 0x0000
I2C: SBLOCK = 0x0000

This is obviously a massive problem as it never recovers and renders all comms to the TPS impossible. We've tried resetting the I2C peripheral in the OMAP but to no avail. We then wondered whether it is the OMAP or the TPS holding the lines low, so we used SYSTEST to set SDA_O and SCL_O high (the documentation is unclear as there's no direction to set, but as it's open-collector I assume that output high equates to NOT driving the output and hence high-impedance). Looking at the lines at this point shows them still low, so the conclusion is that it must be the TPS pulling the lines down, not the OMAP. Note : We would have tested setting GPIOs to input instead as it's a better defined mechanism, but the I2C1 balls cannot be re-muxed.
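The reasoning behind the SYSTEST check can be written down as a small model. This is only an illustrative sketch in Python, not driver code: the function names `bus_level` and `who_holds_line` are made up for this example, and it relies on the same assumption stated above, namely that writing SDA_O / SCL_O high means the OMAP stops driving the line.

```python
def bus_level(*pulls_low: bool) -> int:
    """Open-drain (wired-AND) line: any device pulling low wins,
    otherwise the pull-up resistor takes the line high."""
    return 0 if any(pulls_low) else 1


def who_holds_line(master_released: bool, observed_level: int) -> str:
    """Interpret a measurement of SCL or SDA.

    master_released: True if the master is believed NOT to be driving
    the line (assumption: SYSTEST output high == high-impedance).
    """
    if observed_level == 1:
        return "line is free"
    if master_released:
        return "held low by another device on the bus"
    return "inconclusive: the master may be the one driving low"


# Only two devices on this bus: the OMAP (master) and the TPS65950.
omap_pulls_low = False   # released via SYSTEST (assumed)
tps_pulls_low = True     # what the measurement suggests
level = bus_level(omap_pulls_low, tps_pulls_low)
print(level, "->", who_holds_line(master_released=True, observed_level=level))
```

The conclusion is only as good as the assumption about SYSTEST: if the OMAP is in fact still driving the line, the same measurement is inconclusive.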

The problem boils down to this : The I2C Specification declares :

In the unlikely event where the clock (SCL) is stuck LOW, the preferential procedure is to reset the bus using the HW reset signal if your I2C devices have HW reset inputs. If the I²C devices do not have HW reset inputs, cycle power to the devices to activate the mandatory internal Power-On Reset (POR) circuit.

AFAICT we can't externally reset the I2C on the TPS65950 without resetting the system which is obviously not an option...!

Yours confusedly,

~Pev

over 15 years ago

0 Pev over 15 years ago

Prodigy 245 points

Of course when I said "RTC reg dump" what I meant was "I2C reg dump" ...!

~Pev

0 Soren Steen Christensen over 15 years ago in reply to Pev

Genius 3685 points

Hi Pev,

Have you checked the last successful I2C communication on the bus just before the fail condition? In my experience, problems like these tend to boil down to SW issues, where the SW writes some "invalid" stuff to the chip because it is not properly designed => occasionally leading to race conditions.

In your case I would double check whether everything sent on the I2C bus is correct until it suddenly fails, or whether the failure is caused by some kind of wrong read/write earlier, which afterwards makes the error show up by making the TPS65950 go bananas... :-)

Best regards - Good luck
Søren

0 Pev over 15 years ago in reply to Soren Steen Christensen

Prodigy 245 points

Hi Søren,

Thanks for the reply. I'd wondered about this being a possibility, but I'd discounted it as the hang seems to occur after different register accesses each time. In the example I demonstrated above it's a valid read from the TPS65950's RTC MINUTES_REG. The RTC access code does the same thing every time and succeeds for 20-30 previous polls before it hangs so I've a reasonable confidence the raw accesses should be OK. If something was being done wrong by the OMAP I'd expect to see that on the wire?

An interesting thing is to compare a previous successful operation on the same register to the one that fails as below :

This is interesting as it compares what we'd expect vs what we see. What we see in the working trace is that you try to read the minutes reg and the TPS RTC holds the clock line low until it's ready with the value. This then gets released and the value "0" gets clocked out followed by a NACK then a stop. I'm assuming the NACK is because there's something in progress internally but it should be OK as that's normal operation.

Cheers,

~Pev

0 Soren Steen Christensen over 15 years ago in reply to Pev

Genius 3685 points

Hi Pev,

I totally agree with your findings and way of debugging. And the last image you posted here is actually very interesting, since I agree that it seems to suggest a TPS65950 failure. Have you tried to do the measurements using an oscilloscope instead of the logic analyzer? Sometimes signals that look just fine on a logic analyzer show up completely different on a scope :-) - The binary threshold on logic analyzers is really forgiving and able to hide a lot of valuable information - For good and bad :-)

Are you working on an evaluation board or on a design done by yourself? In case of your own design - Have you tried the same test on an official EVM board just to further rule out layout/board issues?

Other than this I'm currently a bit out of brilliant ideas - Good luck
Søren

0 DavidVescovi over 15 years ago in reply to Soren Steen Christensen

Genius 5470 points

Also, the 6.14 BSP added I2C software locks that were not present in earlier versions. Check for resource contention.

We have also found that when the 65950 goes south it first shows up as I2C failures. We had problems with improper caps populated on our board and it showed up as intermittent I2C failures.

0 Pev over 15 years ago in reply to Soren Steen Christensen

Prodigy 245 points

Hi Søren,

I've *tried* but it's kind of hard to capture things occurring that fast :-D This is using our board but looking at the traces (see the first scope trace) the signals are clean and basically identical to testing on the EVM - which bizarrely seems to work OK which is even more confusing!

~Pev

0 Pev over 15 years ago in reply to DavidVescovi

Prodigy 245 points

Hi David,

I'd looked at the locks when 6.14 first came out. As I understand the code, these are extra locks that you can use through the driver to acquire exclusive access, so you can do a sequence of accesses in one go, and they don't affect normal usage of I2C?

The intermittent failure sounds like what we're observing potentially - I'm very interested in your comment about "improper caps" too as this is something that could potentially be a delta between our board and the EVM. Are you referring specifically to decoupling caps on the supplies? Could you tell me anything more about what you observed and how you traced it?

Interestingly I'd noticed that the NACKs that our logic analyser detected don't appear to raise the NACK bit in the I2C STAT register when it's polled too...?!

~Pev

0 Gandhar Dighe over 15 years ago in reply to Pev

TI__Genius 15100 points

Hi David, Soren,

Thank you for sharing your experiences on the forum. I have not seen or heard of such issues with any customer so far.

Your inputs are valuable and I hope it will help resolving Pev's issues with I2C.

Regards,

Gandhar.

0 DavidVescovi over 15 years ago in reply to Gandhar Dighe

Genius 5470 points

Our problem was improper decoupling caps populated on our custom board. The problem showed up as intermittent I2C errors.

Just on first look it looks like you have two threads competing for the same resource (the I2C bus): one reading the battery voltage from the MADC and the other reading the RTC time. If it works OK with just one running and only fails when both are running, it is most likely a resource lock issue. But this is just a guess.

Also be sure the pullup resistors are on the clock and data lines per I2C spec.

0 Soren Steen Christensen over 15 years ago in reply to DavidVescovi

Genius 3685 points

Hi Gandhar and Pev,

@Gandhar: No problem - You are welcome - We all need to help each other best possible :-)

@Pev: I just checked your register dump, and it has I2C_CON.TRX (bit 9) set, which indicates that the OMAP thinks it's in transmitter mode, which it shouldn't be according to the last byte sent on the I2C bus (the 0x97 for reading from TPS65950). Thinking about it, this still, in my opinion, points in the direction of a SW race condition, where the ADC thread somehow reconfigures the I2C module setup in the middle of a transfer. As asked by David: Is it rock solid when running only one of the tests (RTC and ADC) at a time?

Secondly - during my last reply I forgot that you already sent an oscilloscope dump of the signals, and I agree the signal quality looks OK...

In your case I would go ahead and put some debug statements in the I2C driver to continue debugging...
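For anyone who wants to repeat this check on their own dump, here is a minimal Python sketch. The bit positions are taken from the definitions used in the Linux i2c-omap driver and should be treated as an assumption to verify against the OMAP35x TRM. Only a handful of bits are listed, and the helper names `decode_con` and `address_byte` are invented for this example.

```python
# Subset of I2C_CON bits (assumed positions, verify against the TRM).
I2C_CON_BITS = {
    15: "I2C_EN",  # module enable
    10: "MST",     # master mode
    9: "TRX",      # transmitter mode
    1: "STP",      # stop condition requested
    0: "STT",      # start condition requested
}


def decode_con(value: int) -> list:
    """Return the names of the known I2C_CON bits set in value.
    Bits not listed in I2C_CON_BITS are ignored."""
    return [name
            for bit, name in sorted(I2C_CON_BITS.items(), reverse=True)
            if value & (1 << bit)]


def address_byte(slave_addr: int, read: bool) -> int:
    """First byte on the wire: 7-bit address shifted left, R/W in bit 0."""
    return (slave_addr << 1) | (1 if read else 0)


print(decode_con(0x8600))                  # ['I2C_EN', 'MST', 'TRX']
print(hex(address_byte(0x4B, read=True)))  # 0x97
```

With CON = 0x8600 and SA = 0x004b from the dump, this gives a master that is enabled and in transmitter mode, while a read of slave 0x4b puts 0x97 on the wire as its address byte.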

Good luck - I hope this helps you forward?
Søren

0 Pev over 15 years ago in reply to DavidVescovi

Prodigy 245 points

Hi All,

There's been some interesting food for thought so far, thanks all!

David : I'll look at the decoupling caps tomorrow and see if there's anything suspect going on. WRT race conditions, in the Win CE code I think all accesses are protected by a per-bus critical section so I wouldn't expect this to be an issue but again I'll check tomorrow. I don't think that any unprotected code modifies any of the critical I2C registers... Having said that, I'm not sure that I'd have expected this to specifically cause the slave to lock up though or would you anticipate this under the banner of "breaking in undefined ways" ?

Pullups are definitely correct. We're using 4.7k as used on all the other reference boards. OK, I know TI call them "evaluation boards" these days! :-) Interestingly, if you study the TPS TRM it mentions that you shouldn't use external pull-ups, as the TPS defaults to using internal pull-ups on its I2C control interface (IIRC, 2KOhm +/- 30%?). We've tested disabling the internal pull-ups and also removing the external ones, just to see if it affects the issue (it doesn't!)

Soren : That's an interesting note to point out...! I'm not sure why TRX was set at that point. I think it could be that the driver hadn't finished the transmit part of the operation when the dump was performed, or that the dump I quoted was after a soft reset and was when it was attempting to TX and failing again. I'll check and post some better info on it. I *think* in the past I've sanity checked this and it was OK...

What I find VERY interesting is that the STAT register is 0x0000...! This really should not happen I think? No one seems to know why though. This is a hole in the TRM - a bit like the lack of explanation about what to do in polling mode when XUDF / ROVR get set :-D

It does seem to be solid when not having two accesses. Having said that it seems OK with both accesses too.... until it crashes ;-)

Right, it's a beautiful Sunday afternoon. I hope you're all out enjoying the weekend instead of doing too much work!

~Pev

0 DavidVescovi over 15 years ago in reply to Pev

Genius 5470 points

One other note, There should be a software lock on the complete I2C "transaction" ..not just the bus access. Some transactions require multiple packets and the complete transaction should be protected or else out of sequence problems can occur. Your problem looks to be more bus related so probably this is not the root problem.
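As a rough illustration of what locking the complete transaction means, here is a toy Python model. It is a sketch only, not BSP code: the `FakePmic` class, the register addresses and the values are all hypothetical. A register read is two messages (set the register pointer, then read), and the lock is held across both so that another thread cannot move the pointer in between.

```python
import threading


class FakePmic:
    """Toy slave with an internal register pointer (hypothetical model)."""

    def __init__(self, regs):
        self.regs = dict(regs)
        self.pointer = 0

    def write_pointer(self, reg):
        # Message 1: the master writes the register address.
        self.pointer = reg

    def read_byte(self):
        # Message 2: the master reads from wherever the pointer is.
        return self.regs[self.pointer]


class Bus:
    def __init__(self, dev):
        self.dev = dev
        self.lock = threading.Lock()

    def read_reg(self, reg):
        # The lock covers BOTH messages, i.e. the whole transaction.
        with self.lock:
            self.dev.write_pointer(reg)
            return self.dev.read_byte()


def worker(bus, reg, expected, count, errors):
    for _ in range(count):
        if bus.read_reg(reg) != expected:
            errors.append(reg)


def run_demo(count=2000):
    # Register addresses and contents are made up for the example.
    bus = Bus(FakePmic({0x1D: 0x42, 0x37: 0x99}))
    errors = []
    threads = [
        threading.Thread(target=worker, args=(bus, 0x1D, 0x42, count, errors)),
        threading.Thread(target=worker, args=(bus, 0x37, 0x99, count, errors)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


print("mismatched reads:", len(run_demo()))
```

If the lock were taken separately around `write_pointer` and `read_byte`, the second thread could change the pointer between the two messages and a read could return the other register's value, which is the kind of out-of-sequence problem described above.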

0 tony martin51903 over 15 years ago in reply to DavidVescovi

Prodigy 50 points

Hi David,

Are you referring to the TPS decoupling capacitors?

0 GregoireGentil over 15 years ago in reply to tony martin51903

Expert 2330 points

Hello,

I'm experiencing a similar issue, meaning very frequent lost I2C connections between OMAP3530 and TPS65950. I suspect it is more of a hardware issue. Can David explain what his problem with the improper decoupling caps was? I'm very interested to hear about that. Thanks in advance,

Grégoire

0 tony martin51903 over 15 years ago in reply to GregoireGentil

Prodigy 50 points

Hi Gregoire,

We have found that the layout around VDD1 and the clocks is critical. When we fed VDD1 from a bench power supply, the VDD1 switcher stopped and the problem was eliminated. We don't yet know if we've fixed the problem, as we are waiting for new boards on which we have been very careful to make sure the track between VDD1 and the inductor is completely shielded from everything else.

Tony

0 Gil Zhaiek over 14 years ago in reply to tony martin51903

Prodigy 205 points

Hi All,

I was wondering how this problem was fixed?

We are using the same OMAP 343x and the same TPS65950, and our I2C hangs too.

We develop on an Android Linux platform.

Basically, this problem occurs rarely. I have a while loop that does a register dump for all of the devices on the I2C bus, and it works fine.

Every few hours - our I2C controller hangs and we get a "controller time out".

Thank you

Gil

0 tony martin51903 over 14 years ago in reply to Gil Zhaiek

Prodigy 50 points

Hi Gil,

We found that the problem was fixed by changing the layout around the 26MHz oscillator. We originally had the 26MHz tracks routed too close to the switcher outputs. We moved the inductor and the oscillator to improve isolation and shielded them from each other with the ground plane. That fixed it for us.

Best regards,

Tony

0 Gandhar Dighe over 14 years ago in reply to tony martin51903

TI__Genius 15100 points

Thanks Tony.

0 Gil Zhaiek over 14 years ago in reply to Gandhar Dighe

Prodigy 205 points

Thanks Tony!

Gandhar, we did look at the signals and they look fine. Clean and in great shape.

What I am thinking is this: looking at the tps65950 code in the driver - drivers/mfd/twl-core.c - there are mutex locks before each i2c_transfer. These mutex locks are only per driver and not per bus (I think). I am not sure, but I am thinking of the case where 2 threads try to communicate on the same bus with 2 different tps65950 virtual i2c devices - 0x48 0x49 0x4a and 0x4b.

Can this cause a conflict?

Should we do a global mutex lock for the bus?

Thank you

Gil

0 Gandhar Dighe over 14 years ago in reply to Gil Zhaiek

TI__Genius 15100 points

Hi Gil,

This is a software question and I am not able to answer. Can you please post this on the OMAP/AM software forum?

Someone in the software team will be able to help. I have forwarded your query to software experts, however, it will probably be faster if you post this on the software forum.
