UCD9090 alert response causes invalid command status

Other Parts Discussed in Thread: UCD9090, UCD9090A

Been chasing this problem far too long:-(.

Right now the setup is just a UCD9090 and a PIC (master).  After power is sequenced up with no failures, I can read the ARA with no problems; of course the UCD9090 does not ACK, since there is no alert.  I can read status, voltages, etc. without any issues.

If I cause a supply failure, the UCD asserts PMBALERT#.  There are 2 things I don't understand.  1) If I read STATUS_WORD before reading the ARA, the status is what I expect (8801h).  But if I read the ARA first and then STATUS_WORD, I get 8803h, and STATUS_CML reads 80h, indicating "invalid command".  I don't understand why reading the ARA causes the "invalid command" status to show up.  I have tried reading the ARA with and without the PEC byte, but for some reason reading the PEC byte results in a timeout on the following STATUS_WORD read, described in more detail below.

2) If I read the ARA with PEC byte after the PMBALERT# is asserted, then read STATUS_WORD, I seem to get a timeout on the first one, but a second STATUS_WORD completes in a normal time.  The message looks like S-Add-wr-Ack-0x79-Ack-RS-Add-rd-Ack then the UCD9090 stretches the clock to cause a timeout.  If I don't read the ARA with the PEC byte, this timeout behavior does not happen, but I still get the "invalid command" status described in #1 above.

Also odd to me is that reading the ARA does not de-assert the PMBALERT# as I would expect.  I have to either reset the UCD9090 or send CLEAR_FAULTS to de-assert the PMBALERT# output.
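For readers following along, the status values quoted above can be decoded with a small helper. This is only a sketch: the bit names are the standard PMBus STATUS_WORD and STATUS_CML assignments as I understand them, the UCD9090 may not implement every bit, and `decode` is a hypothetical helper, not part of any TI tool.

```python
# Standard PMBus bit names (assumption: the UCD9090 follows the standard
# assignments; check the device documentation for unsupported bits).
STATUS_WORD_BITS = {
    15: "VOUT",
    14: "IOUT/POUT",
    13: "INPUT",
    12: "MFR_SPECIFIC",
    11: "POWER_GOOD#",
    10: "FANS",
    9: "OTHER",
    8: "UNKNOWN",
    7: "BUSY",
    6: "OFF",
    5: "VOUT_OV_FAULT",
    4: "IOUT_OC_FAULT",
    3: "VIN_UV_FAULT",
    2: "TEMPERATURE",
    1: "CML",
    0: "NONE_OF_THE_ABOVE",
}

STATUS_CML_BITS = {
    7: "INVALID_COMMAND",
    6: "INVALID_DATA",
    5: "PEC_FAILED",
    4: "MEMORY_FAULT",
    3: "PROCESSOR_FAULT",
    1: "OTHER_COMM_FAULT",
    0: "OTHER_MEMORY_OR_LOGIC_FAULT",
}


def decode(value, names):
    """Return the names of all set bits, most significant first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if value & (1 << bit)]


print(decode(0x8801, STATUS_WORD_BITS))  # status read before the ARA
print(decode(0x8803, STATUS_WORD_BITS))  # status read after the ARA
print(decode(0x80, STATUS_CML_BITS))     # STATUS_CML read after the ARA
```

The only difference between 8801h and 8803h is bit 1 (CML), which is consistent with STATUS_CML going non-zero after the ARA read.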

I get the same behavior when I do this with the USB interface and have compared the transactions with a logic analyzer, so it is not related to the PIC.

Any guidance is appreciated.

  • If the clock stretching is within 25ms, it is within PMBus spec. The host should support it.

    Is there any other PMBus device on the same bus that may stretch the clock?

    The CML fault is very likely related to the timeout. So, please add clock-stretching support on the host and see if the problem goes away.

    The fault status and PMBus alert require CLEAR_FAULTS to clear. This is expected behavior.

    Regards,

    Zhiyuan

  • I am not positive about the length of the timeout since my logic analyzer can't measure that much time (if I trigger on START I can see 10's of ms with nothing following, and if I trigger on STOP I can see 10's of ms preceding the last 2 bytes). So, I suspect it is a valid timeout. What I do know is that if the master just waits to read, the clock stretching stops and it reads 2 bytes of FF. That sort of implies that the UCD9090 is not responding but has given up the transaction after the timeout.

    Edit: The UCD9090 is on the evaluation board, so there is a TI USB interface on it.  However, during these experiments, I do not have the Fusion GUI running and I think that interface is not interfering.  I know it is not generating any transactions.  I suppose it could be stretching the clock but I cannot really tell.  I suspect it is not.  There are no other devices on the bus, just the PIC, the UCD9090 and the built in TI USB interface.


    I have 25ms timeout in my code, and it appears that is not being exceeded. I will have to experiment a bit more and possibly adjust the time to see if that is the case.

    The invalid command is definitely not a result of the timeout; it is related only to the reading of the ARA. I have many times read STATUS_WORD and STATUS_CML when the PMBALERT# is asserted and the invalid command bit is not set. I can read the ARA once, then repeat STATUS_WORD and STATUS_CML, and the invalid command bit is set. And as I said before, if I don't read the PEC byte as part of the ARA I don't get the timeout; the transaction is in one contiguous block.

    So, why is invalid command being set?

    And why would clock stretching happen or not happen just because I read a PEC byte when reading the ARA?

    Thanks.

  • I have done a bit more fiddling.

    1). When a read does not complete in 25ms, I consider this a timeout and NACK the data when it eventually comes and follow this with a STOP. I finish the read transaction, and it contains FF at 34ms. So it looks like the UCD9090 is releasing the clock at about that time.

    2). The timeout only occurs after reading the ARA.

    3). The "invalid command" bit is set as result of reading the ARA.

    4). A subsequent STATUS_WORD does not timeout, nor do any further ones until I read the ARA.

    Can anyone tell me:
    1) Why on earth would the UCD9090 consider reading the ARA to be an invalid command?
    2) Why on earth would reading a PEC as part of the ARA cause a STATUS_WORD to timeout when reading the ARA with no PEC does not timeout?

    I plan to never read the ARA with PEC, since without it I am not plagued with the timeouts. But I do not understand why this is considered an invalid command, and am not sure what the best way to deal with it is.

    Anyone with any insight, your comments are most welcome. And thanks in advance!
  • For the INVALID CMD bit after ARA command, we were able to reproduce and confirm this is an existing issue of the device.

    ARA command does clear the ALERT signal, but ALERT will be back if the FAULT is not removed.

    As for the timeout after PEC with ARA, we could not reproduce the problem. Could you please capture a waveform to show this?

    Thanks,
    Zhiyuan
  • I hope this all comes through.  I am going to try and post a word document with the screen shots in it.  I can get more detail if required.

    First is a screen shot of the reading of the ARA without the PEC.

    Second screen shot is a STATUS_WORD followed by a STATUS_CML.  Note, my current code does both of these now, but the timeout happens if I only attempt a STATUS_WORD.  Let me know if you need screen shots with just this message.

    Third is a screen shot of reading the ARA and the PEC.

    Fourth screen shot is at the end of the timeout.  I am triggering on a STOP condition, once I detect the timeout I make the read with a NACK so just get one byte.  The byte is FF.  The next two transactions are the retry of the STATUS_WORD and the STATUS_CML which both work fine.

    The fifth screen shot is just a close up of the same transaction in the fourth screenshot.  You can see the tail end of the message that timed out, then the STATUS_WORD and the STATUS_CML.

    I can capture the time out by triggering on the START instead, but as described previously, after the address and read bit, there is just a long delay to the end of my logic analyzer buffer.

    My code now starts a ms timer before each read.  During the read, if the time exceeds 25ms, it NACKs the byte read, issues a STOP and retries the entire transaction.  My timer indicates that I finally get the byte (the FF) that I NACK at about 34ms, which jibes with what I would expect.

    This timeout always occurs after reading the ARA and PEC.  It never occurs if I read the ARA without the PEC.
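The timeout-and-retry scheme described above could be sketched roughly as follows. This is not the actual PIC code; `read_with_retry` and `attempt_read` are hypothetical names, and the transactions are simulated with the numbers reported in this thread (a 25 ms master limit, an FF byte arriving at about 34 ms, then a normal STATUS_WORD read; the 1 ms figure for the normal read is made up for the simulation).

```python
MASTER_TIMEOUT_MS = 25  # master-side limit quoted in the thread


def read_with_retry(attempt_read, timeout_ms=MASTER_TIMEOUT_MS, max_attempts=2):
    """Return (data, attempts_used).

    attempt_read() is a hypothetical callable that performs one complete
    transaction and returns (elapsed_ms, data). If elapsed_ms exceeds
    timeout_ms, the data is discarded (on real hardware the master would
    NACK the byte and issue a STOP at this point) and the whole
    transaction is retried.
    """
    for attempt in range(1, max_attempts + 1):
        elapsed_ms, data = attempt_read()
        if elapsed_ms <= timeout_ms:
            return data, attempt
    raise TimeoutError(
        f"no valid response within {timeout_ms} ms after {max_attempts} attempts"
    )


# Simulated bus: the first STATUS_WORD read stretches to ~34 ms and yields FF,
# the retry completes quickly and returns 8801h (low byte first on the wire).
simulated = iter([(34, b"\xff"), (1, b"\x01\x88")])
data, attempts = read_with_retry(lambda: next(simulated))
status_word = int.from_bytes(data, "little")
print(hex(status_word), attempts)
```

Keeping the retry at the level of the whole transaction, rather than the single byte, matches the observation that the second STATUS_WORD read completes normally.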

    Screenshots.docx

  • May I ask another question? Perhaps this should be in another thread.

    I have a set of 4 rails that require other rails to come up first (up dependencies), but if they fail it is not necessary to take any other rails down. However, if any of them does have a fault, I would like to deassert their enable to make sure they go down and stay down. I do not want them to try to resequence. I have not been able to figure out how to do this. If I can do it in the Fusion GUI, great. But, I have an MCU that can command the rail enable to turn off. The closest I can come to figuring out how to do this is use GPO_SELECT and GPO_CONFIG but I worry that will interfere with the automatic bring up of the rails.

    Thanks.
  • If you want the EN of the faulty rail to deassert and stay low, you can configure Fault Response in the Fusion GUI and specify no restart and no resequence. Note that you need to select the corresponding rail in the drop-down list in the top-right corner of the interface.

    If you want other rails to shut down as a result of a fault of a particular rail, you can select other rails as this rail's Fault Shutdown Slaves. When this rail has a fault and shuts down itself, its slaves will also be shut down.
  • Thank you.  My enables now work as intended.  I thought I had set them that way, but must not have.  Anyway, I am almost done.

    Any more information you need on the timeout?  For now, I am not going to read the PEC with the ARA and the UCD9090 seems to behave itself.  I also have code that does retry if a timeout occurs, so I am able to proceed.

    The setting of the "invalid command" when reading the ARA is a bit harder to deal with.  Is there a chance that a firmware update for the UCD9090 would fix that?  If so, what sort of time frame might that be released?

    Thanks,

    Bill

  • I suspect that the master and the slave are not seeing the same clock signal voltage level. The master might “think” the slave has released the clock, but the slave “thought” it was still holding the clock low. As a result, the master started to switch the clock, and the slave had no action on the data line. Consequently, the master saw a “0xFF”.

    It will be helpful if you can capture the I2C voltage waveform on both ends (near master and near slave) when the problem happens.

    Thank you very much.

    Zhiyuan
  • Well, I guess it wouldn't hurt to get a scope shot of all this. I can probably get the entire transaction with a timeout in one trace as well.

    However, my analysis is a bit different than what you suggest. First of all, the master is configured to disable all clock stretching, so I know it is not holding the clock low. Second, the master decides after 25ms that the slave is trying to reset the bus, so it prepares to NACK the read cycle it is waiting on. At around 34ms, the slave feels that it has held the clock low long enough to guarantee a reset and releases the clock. At that time, the read cycle completes, but with no slave responding the master reads FF.

    Previously, when I did not have the master looking at time, it just waited as long as necessary for the 2 bytes that it wanted. It ACKed the first and NACKed the 2nd since it was the last. The data on both bytes was also FF. That tells me that the slave was doing the same thing, it held the clock low for a minimum of 25ms and then released it and then ignored whatever was going on on the bus.

    And of course there is the whole issue that this only happens if I read the ARA with a PEC byte. By the way, I can do this using the UCD9090 EVM with its built in I2C interface, no other devices at all. I don't think the signal levels should matter when the circuit is confined to just that board.

    Also, I have disabled all fail logging just to make sure the UCD9090 is not writing to flash that would cause this.

    Comments?
    Bill
  • Hi Bill,

    The UCD9090 times out after 35ms of clock stretching. Could you please disable the master timeout and wait for the UCD9090 to terminate the clock stretching? If the returned value is still 0xFF, could you please check which bit in the CML status is set, in particular whether bit 1 is set? Please make sure the CML fault is cleared before the test.

    Thank you very much.

    Zhiyuan

  • I removed the master timeout as requested.  But my results are a bit different than in the past.  I am assuming it is related to other configuration changes I have made to the UCD9090.

    For now, I disabled the master timeout and read both bytes.  When I read them, they are both FF.  The clock stretching still appears to end at approximately 34ms.

    Now, something has changed in the UCD9090, but the only thing I think I changed was the behavior of the supply enables to turn off when a voltage fault is detected.  Now, when I read the ARA with PEC byte, the ALERT# gets deasserted.  This is new.

    Also, per my notes (before the supply enable change), I would introduce a voltage fault.  The STATUS_WORD returned was 0x8001 and STATUS_CML was 0x00.  I would read the ARA with PEC byte, then read the STATUS_WORD with timeout.  Then read the STATUS_WORD again and it would return 0x8803 and STATUS_CML would return 0x80 (invalid command).

    Now with the supply enable change, I introduce the voltage fault, STATUS_WORD returns 0x8801.  I read the ARA with PEC byte and ALERT is deasserted.  I again read STATUS_WORD and it is 0x8803 and STATUS_CML is now 0x02.

    I am a bit confused as to why the UCD9090 configuration would change the STATUS_CML from 0x80 to 0x02.  If I get some time I will try an old configuration and see if I can recreate this.

    Bill

  • More information.

    I returned my routine to read the ARA without the PEC. Now, the UCD9090 does not deassert the ALERT. With the proper timeout code in place, STATUS_WORD and STATUS_CML are 0x8801/0x00 before reading the ARA, and are 0x8803/0x80 after reading the ARA. But if I read the ARA with the PEC byte, STATUS_WORD/STATUS_CML return 0x8801/0x00 always, before and after reading the ARA and the ALERT is cleared properly.

    I thought reading the PEC was optional? Clearly it has an effect.
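For context on what the PEC byte is: SMBus defines it as a CRC-8 (polynomial x^8 + x^2 + x + 1, initial value 0) over every byte of the transaction, including the address bytes. The sketch below shows how a host could check the PEC on an ARA read. The function names are hypothetical, and the 7-bit device address 0x65 is only an example value, not something taken from this thread.

```python
ARA_7BIT = 0x0C  # SMBus Alert Response Address (7-bit)


def smbus_pec(data):
    """CRC-8 with polynomial 0x07 and initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def ara_pec_ok(returned_byte, pec_byte):
    """Check the PEC of an ARA read.

    The PEC covers the address byte sent by the master (ARA + read bit)
    and the single byte returned by the alerting device.
    """
    return smbus_pec([(ARA_7BIT << 1) | 1, returned_byte]) == pec_byte


# Example with an assumed 7-bit device address of 0x65 (hypothetical).
returned = 0x65 << 1
print(ara_pec_ok(returned, smbus_pec([0x19, returned])))
```

Whether the host checks the PEC or not, the point in this thread is that clocking out the extra byte changes how the UCD9090 behaves.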

    Bill
  • Spoke too soon:-(.

    1. Reading the PEC byte following the ARA is key to getting the UCD9090 to release the ALERT# signal. Reading without the PEC byte by itself results in the "invalid command" bit set in STATUS_CML.

    2. Reading STATUS_WORD after the ARA+PEC sometimes results in the UCD9090 stretching the clock to reset the bus, not always, but sometimes. When it does, the STATUS_WORD/STATUS_CML is 0x8803/0x02. The same status before reading the ARA+PEC is 0x8801/0x00. And unless the STATUS_WORD transaction times out, I get 0x8801/0x00.

    Now I am faced with a choice between two evils. Do I clear the CML error and let the UCD9090 release the ALERT line, or do I deal with the ALERT line and avoid the CML errors?
  • Hi Bill,

    There is a bug in the FW that can cause a CML fault when ARA is received without PEC. The new CML fault causes Alert to be asserted again. This is why you see ARA without PEC cannot clear Alert. We will fix it in the next release, possibly early next year. Sorry for the inconvenience.

    Best regards,

    Zhiyuan 

  • Zhiyuan,
    Thank you for the update.

    I am in the unfortunate position of having to deliver this product to my customer next month. Unless I am able to get some early version of the FW, I won't be able to wait for the fix. Is there any way I can get source code and fix it myself?

    What is TI's recommended workaround? I think from a performance standpoint, reading the ARA without the PEC is the best, as there are no timeouts introduced. If I decide to go this route, is there a way to mask the "invalid command" so that I can clear the alert line, or is my only option to send the CLEAR_FAULTS command?

    If I read the ARA + PEC, it seems I do not get a timeout every time, but when I do, it sets "other comm" bit in CML. What I typically see is a voltage fault occurs, the UCD9090 asserts the ALERT, I read the ARA and the ALERT is deasserted. I go to read the status and get a timeout, which causes another ALERT to be asserted that needs to be read again. This is often repeated as each timeout causes another ALERT.

    Suggestions?

    Bill
  • Hi Bill,

    We can't give out source code. Sorry.

    Can you directly poll the sequencer instead of using ARA?

    Or, you may read ARA without PEC, then read status word, then clear faults.

    Would either of the above two methods work for you?

    There is no way to mask alert in UCD9090. Sorry.
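The second suggestion above (read the ARA without PEC, read the status, then clear faults) could look roughly like this on the host side. It is a sketch only: `service_alert` and the `bus` methods are hypothetical, `FakeBus` is a recording stub rather than real hardware, and the 7-bit address 0x65 is an example value. Ignoring bit 7 of STATUS_CML in host software is my assumption about how to live with the firmware issue, not a TI recommendation.

```python
STATUS_WORD = 0x79
STATUS_CML = 0x7E
CLEAR_FAULTS = 0x03
CML_INVALID_COMMAND = 0x80  # bit 7, set as a side effect of ARA without PEC


def service_alert(bus):
    """Handle one PMBALERT# event using ARA without PEC.

    bus is a hypothetical object providing read_ara(), read_word(),
    read_byte() and send_byte().
    """
    addr = bus.read_ara() >> 1                 # device address is in bits 7:1
    status_word = bus.read_word(addr, STATUS_WORD)
    status_cml = bus.read_byte(addr, STATUS_CML)
    bus.send_byte(addr, CLEAR_FAULTS)          # releases the alert line
    return {
        "addr": addr,
        "status_word": status_word,
        "status_cml": status_cml,
        # Host-side masking of the known ARA side effect (assumption).
        "cml_ignoring_invalid_cmd": status_cml & ~CML_INVALID_COMMAND & 0xFF,
    }


class FakeBus:
    """Recording stub; returns the values reported earlier in the thread."""

    def __init__(self, device_addr):
        self.device_addr = device_addr
        self.log = []

    def read_ara(self):
        self.log.append("read_ara")
        return self.device_addr << 1

    def read_word(self, addr, cmd):
        self.log.append(("read_word", addr, cmd))
        return 0x8803

    def read_byte(self, addr, cmd):
        self.log.append(("read_byte", addr, cmd))
        return 0x80

    def send_byte(self, addr, cmd):
        self.log.append(("send_byte", addr, cmd))


bus = FakeBus(device_addr=0x65)  # 0x65 is an example address
result = service_alert(bus)
print(result)
```

Note that CLEAR_FAULTS clears the real fault bits too, so the host needs to keep the status it read before sending it.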

    Regards,

    Zhiyuan

  • Zhiyuan,

    I think I will look at polling a bit closer, and either of the two suggestions you made should work for me.  At least I know now that the behavior I saw was the UCD9090 and not my software:-).  I would very much like to know when new FW is released, if there is any way to get on a notification list, or if there is a way I can check, please let me know.

    I am marking this as answered.  Thank you for all your help.

    Bill

  • I'm seeing something similar with UCD9090s bought this year (2017).  Has the promised firmware update been released?

  • Could you try the UCD9090A instead of the UCD9090?

    Thanks
    Yihe
  • No, the boards are already built.