Hercules RM46 I2C Unexpected Restart

Other Parts Discussed in Thread: HALCOGEN

I’m seeing some very weird behavior that we can’t explain. The way the code is working now, we have an external computer sending commands to the Hercules through its serial port. The Hercules is running a debug command manager that translates the requests into I2C commands. So at its simplest, we are doing repeated I2C writes and reads to and from an EEPROM. The commands look like this:

Loop (

i2c write 50 0

i2c read 50 2 )

So all we’re doing is writing an EEPROM address of 0 to the EEPROM (I2C ID 0x50) and then reading back two bytes starting at address 0. This works fine most of the time, but occasionally instead of the write command working as expected, we see it send the I2C address and data (0x00) with no I2C Stop followed immediately by a Restart and the I2C address but no data. There IS a Stop after this address. This is much easier to explain looking at the scope trace in “badI2C.pdf” attached. I’ve also attached the “goodI2C.pdf” that shows what it’s supposed to look like. We can’t find anything in the software (built using Mentor Graphics Nucleus drivers) that explains the occurrence of the restart and the subsequent I2C address without any data. The write in the first part of this sequence is not successful, and the read of the EEPROM gets the wrong data back (reads from an address shifted one bit to the left). Can you please take a look and see if you can think of what could cause the Restart to occur? Does the hardware have the capability to create the Restart without being commanded to by the software? One other note is that putting some delay between the read and the write seems to help prevent the problem. Even without the delay, there seems to be plenty of time for the read to finish, so it doesn’t appear that the I2C calls are overlapping at all.

Thanks!

Dave Mack

5684.badI2C.pdf

0576.goodI2C.pdf

  • Hi Dave,

      Can you please tell me what speed are you running the I2C and if you have the proper pull up resistor on the I2C bus?

  • Hi, Charles. Thanks for responding to my question. We're running at 100K baud and have 4.99K pull-ups on SDA and SCL. I'm investigating a possible error in the calculations which makes the baud rate actually be 102K, but I doubt this is a problem. I've tried slower baud rates without the problem going away. One thing I ought to mention is that we're using an LTC4312 as a mux between separate local and interboard I2C buses. I'm currently only using it on the local bus. On the local bus side, there are also 4.99K pull-ups and 100 ohm series resistors. I'm attaching a piece of the schematic so you can see what I'm talking about.

    Thanks again,

    Dave

    8510.I2CBridge.docx
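
As an aside on the 100K vs. 102K question, here is a minimal sketch of how the SCL rate falls out of the divider settings, as I understand it from the user guide for this I2C IP. The offset `d` and its dependence on the prescaler are written from memory of that guide, and the function names and the 80 MHz VCLK in the usage note are hypothetical, so please check everything against the TRM before relying on it.

```c
#include <stdint.h>

/* Offset added to both the low and high divider values.
 * Assumption: 7 for prescaler 0, 6 for prescaler 1, 5 otherwise. */
static uint32_t i2c_divider_offset(uint32_t psc)
{
    if (psc == 0u) {
        return 7u;
    }
    if (psc == 1u) {
        return 6u;
    }
    return 5u;
}

/* Resulting SCL frequency in Hz for a given peripheral clock,
 * prescaler (psc) and low/high clock divider values (ckl, ckh). */
uint32_t i2c_scl_hz(uint32_t vclk_hz, uint32_t psc, uint32_t ckl, uint32_t ckh)
{
    uint32_t module_clk = vclk_hz / (psc + 1u);
    uint32_t d = i2c_divider_offset(psc);

    return module_clk / ((ckl + d) + (ckh + d));
}
```

With a hypothetical 80 MHz VCLK and a prescaler of 9 (8 MHz module clock), divider values of 35/35 give exactly 100000 Hz, while 34/34 give about 102.5 kHz. So a small slip in the divider arithmetic is enough to produce an error of a couple of percent.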

  • Hi Dave,

    I2C is synchronous, so I don't think a 102K baud rate should cause a failure unless the EEPROM cannot tolerate anything even slightly above 100K.

    In the bad PDF, the master transmitter is writing to slave address 0x75, followed by a data byte of 0x0. First of all, that slave address does not match the 0x50 you mentioned. Is my understanding correct? Are you writing to address 0x50 or 0x75?

    After the first write, there is a restart followed by another write command to 0x75. To really understand why there are two back-to-back write commands, we need to know how the code is generated.

    For the good PDF, the master also sends a write command to slave address 0x75, but just once. Why do you say this one is correct?

    I don't really know how the 'i2c write 50 0' high level command is translated into the actual commands to the I2C modules. After the write, does it wait for the ARDY before starting reading the EEPROM?

    Here is a user guide for the I2C. Even though it is written for the TI DSP, the Hercules I2C shares the same IP module. Please take a look at the flowchart in section 7.3.1 and check whether the commands you issue are in line with it.

    www.ti.com.cn/.../spru175d.pdf

    Here is another link with some tips for the TI DSP I2C. Again, they apply to the Hercules since it uses the same IP.

    processors.wiki.ti.com/.../I2C_Tips
  • Hi, Charles. Yes, sorry for that confusion about 0x50 vs 0x75. The data in the PDF is a scope trace acquired when the software was communicating with an I/O expander at address 0x75. I've duplicated that condition when communicating with an EEPROM at 0x50 which is somewhat less complicated. (Actually the external computer would stop running after the fault when talking to the I/O expander but not when talking to the EEPROM, so I've focused on that part just for convenience. The phenomenon is the same -- a restart with address but no data.)

    Just to clarify what I'm debugging here, this is all in a driver provided by Mentor Graphics. I'm certainly gaining in understanding of how it works, but I'm not an expert in the code. What I've found is that the code that is supposed to generate the Restart condition is never being called. Also, the driver never uses the ARDY interrupt -- that condition is prevented from generating interrupts. It does wait for "not busy" (the BB bit) before the next transaction. I'm saying the "just once" case is correct because that's what my software is doing. All it's supposed to do is write 0x00 to address 0x75. The read back from there is done in a (much) later read. We aren't using the Restart feature at all.

    I've read the "Tips" doc, but not the other user guide. I'll take a look. The 'i2c write 50 0' command translates to doing a single Write of 0x00 to I2C address 0x50.

    Hope this is clearer.
    Thanks,
    Dave
  • Hi Dave,
    The user guide for the DSP is almost the same as the I2C chapter in the Hercules TRM. Actually, our TRM is missing the flowcharts found in the DSP user guide, and we are planning to add them in a future release. The flowcharts provide more information on how to use the I2C module.

    The i2c write 50 0 is immediately followed by the i2c read 50 2. Is it possible that the read command sets the STT bit before ARDY is set? According to the flowchart, ARDY should be checked before writing to the MMR registers to either restart or stop the I2C.
  • 6283.NoStop.xlsx

    Hi, Charles. The reads and writes are interleaved, but there's about 15 milliseconds between them, so there should be plenty of time for the previous read to finish before the next STT is requested. I'm including an I2C trace (captured using a Beagle from TotalPhase) so you can see the timing and what it looks like. In the trace I'm attaching, each write is actually two bytes, so the readback is from address 0x01 rather than 0x00. I made it a two-byte write so I could see if the restart occurs after just one byte is written or waits until after the two bytes are successfully written. It does appear that the two bytes are written completely before the restart happens with the missing data. See what you think.

    As I said, I don't have a lot of control over the sequence of states in the driver, but I do have the source code, so I can fix it if we figure out what's going on.

    Thanks,

    Dave

  • Hi Dave,

     I see the trace, and I don't know why the start is not ended with a stop in the highlighted command.

     I understand that your driver is checking the BB bit. Is it possible for you to check the ARDY bit before issuing the next command? I want to know if it makes a difference.

     Do you know how much loading is on the bus? Would you be able to try 10K pull-ups and see if that makes a difference? I am referring to the figures below. Depending on your loading, and if it is less than 100 pF, you may try a higher-value pull-up resistor.

  • The Stop is missing because it's replaced by the Restart which ends up on the next line.

    I tried the code on our evaluation board (from TI) and it shows the same symptoms. It has 10K pullups on SDA and SCL, so I suspect that the pull-up values aren't the problem.

    I changed the code to wait for ARDY before sending the STT, but apparently it isn't that simple since ARDY is never true before starting the Write. I guess it's only associated with the Read operation. So I'm not sure where to add the check for ARDY being true.

    Thanks,
    Dave
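
One possible reading of the flowchart, which would be consistent with ARDY never being set before a fresh write: BB is the flag to poll before starting a new transfer, while ARDY only matters between the end of a data phase and a software-issued Stop or repeated Start. The sketch below shows that split with bounded polling. The bit positions are as I recall them from the I2C chapter, and the function names and the poll limit are hypothetical, so treat this as an illustration and not as the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status-register bit positions as I recall them; verify against the TRM. */
#define I2C_STR_ARDY (1u << 2)   /* register-access ready */
#define I2C_STR_BB   (1u << 12)  /* bus busy */

/* Poll until every bit in mask is set; give up after max_polls reads. */
bool i2c_wait_set(const volatile uint32_t *str, uint32_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*str & mask) == mask) {
            return true;
        }
    }
    return false;
}

/* Poll until every bit in mask is clear; give up after max_polls reads. */
bool i2c_wait_clear(const volatile uint32_t *str, uint32_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*str & mask) == 0u) {
            return true;
        }
    }
    return false;
}

/* Before a brand-new transfer: the bus must be free. */
bool i2c_ready_for_new_transfer(const volatile uint32_t *str, uint32_t max_polls)
{
    return i2c_wait_clear(str, I2C_STR_BB, max_polls);
}

/* After a data phase that did not request a Stop: wait for ARDY
 * before writing STP or STT to the mode register. */
bool i2c_ready_for_stop_or_restart(const volatile uint32_t *str, uint32_t max_polls)
{
    return i2c_wait_set(str, I2C_STR_ARDY, max_polls);
}
```

On the target, `str` would point at the module's status register. The bounded loop is a design choice so that a stuck bus shows up as a `false` return instead of a hang.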
  • Hi Dave,

     I think we can exclude the pullup as a cause.

     Not sure if you had a chance to look at the flowchart. Below is one example using polling mode. Would you be able to cross-check your code against the flowchart and see if it still creates the extra restart? If you still have the problem after following the flowchart, then I think you might be hitting a problem that we are not aware of. The I2C module has been out there for a very long time, not just in Hercules but in many TI products. We have yet to come across the problem that you reported.

  • I do suspect that the problem lies in the driver, but I just can't find any bit of code that might be creating that situation. And I haven't been able to catch it doing anything unusual using the debugger and all sorts of breakpoints. The driver for the Sitara processor (which we're also using in the same system) is quite different, even though it also comes from Mentor graphics. I haven't looked at the flowcharts yet, but I'll take a look. Since the problem occurs on the evaluation board as well as our new hardware, I think we can restrict the likely causes to the driver. My hope is that you can tell me what the software would have to do to cause that behavior, or what sort of settings could allow the hardware to do that without processor intervention. I think I said this before, but the problem is intermittent, sometimes very rare, going hundreds of cycles without failing. On other occasions, it happens right away. So there's some timing aspect to it, but I just don't know what it is.

    Thanks,
    Dave

  • Dave,
    I looked again at the bad/good PDFs and the trace, but I still can't come up with a reason why the I2C would appear to generate an extra write command. Since it may be a driver issue, and I'm not familiar with that driver, I suggest trying to reproduce the problem with your own code. You can start with the example from HALCoGen.

    Another question: do you see the same problem if you send only write commands to the EEPROM in a loop, or only read commands, instead of interleaving writes and reads? I'm curious to know if that makes a difference.
  • It's funny you should ask that, because that's exactly what I just did! The problem can occur with 100% writes. So it doesn't have anything to do with something the read does to the write timing. The errors occur between 1 and 4 minutes apart, roughly -- sometimes just 10 seconds apart. I'm planning on trying it without any activity on the serial port. It's possible that those interrupts are causing problems with the timing for the I2C. I'll keep you posted.
  • Hi Dave,

      Until now I never asked what else the CPU is handling besides the I2C. It is a good move to check whether other activities can affect the I2C operation. The I2C interrupt is mapped to channel 66 on the VIM, which is lower priority than many other peripherals. It may be that multiple higher-priority interrupts become pending around the time the I2C is having problems.

  • Hi, Charles. So it turns out that the serial port activity DOES affect the I2C transfers, though I don't know what the mechanism is. Here's what I did:

    1. Turned off the external computer sending commands to the serial port of the Hercules
    2. Wrote a test program running on the Hercules to send the write 50 0 command over and over again on the i2c

    With the commands running this way, there were no I2C errors whatsoever. I ran it for millions of transactions without problems, but when I started sending carriage returns on the serial port, I started seeing I2C errors. The occasional errors would persist for a few times after I'd stopped hitting carriage return. This was pretty surprising and I haven't been able to figure out why that would happen. But the errors would stop, and it could go millions more transactions without errors.

    Some observations about the drivers:

    1. The I2C driver disables interrupts during its Interrupt Service Routine
    2. The serial driver does not disable interrupts during its ISR

    So it looks like the I2C code gets interrupted when the serial interrupt happens (but not during the ISR) and when it returns from the serial ISR, sometimes it doesn't resume properly and creates the repeated start condition. I've tried to duplicate this by adding delays in the I2C code at "strategic spots", but I haven't had any luck causing the repeated start. The Start command doesn't appear to be getting called any extra times. My code checks to see that Starts and Stops are equal and we don't see two Starts in a row, and that never happens.

    So maybe it's not really a repeated start, even though it looks like one. I dunno. It's pretty frustrating.

    Thanks for your help so far!

    Dave
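
For anyone wanting to add the same kind of check, here is a minimal sketch of a Start/Stop audit along the lines described above: it counts Start and Stop requests and flags two Starts (or two Stops) in a row. The type and function names are hypothetical and are not taken from the Mentor driver.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { I2C_EV_START, I2C_EV_STOP } i2c_event_t;

typedef struct {
    uint32_t starts;       /* Start requests seen so far */
    uint32_t stops;        /* Stop requests seen so far */
    uint32_t violations;   /* out-of-order requests */
    bool     in_transfer;  /* true between a Start and its Stop */
} i2c_audit_t;

void i2c_audit_init(i2c_audit_t *a)
{
    a->starts = 0u;
    a->stops = 0u;
    a->violations = 0u;
    a->in_transfer = false;
}

/* Call from the driver wherever it requests a Start or a Stop. */
void i2c_audit_note(i2c_audit_t *a, i2c_event_t ev)
{
    if (ev == I2C_EV_START) {
        if (a->in_transfer) {
            a->violations++;   /* second Start without a Stop in between */
        }
        a->starts++;
        a->in_transfer = true;
    } else {
        if (!a->in_transfer) {
            a->violations++;   /* Stop without a preceding Start */
        }
        a->stops++;
        a->in_transfer = false;
    }
}

/* True when Starts and Stops balance and nothing arrived out of order. */
bool i2c_audit_ok(const i2c_audit_t *a)
{
    return (a->violations == 0u) && (a->starts == a->stops);
}
```

One limitation worth keeping in mind: an audit like this only sees calls into the Start and Stop routines. It cannot see how many times a routine writes the hardware register internally, so it can report a clean record while the bus still shows an extra Start.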

  • Dave,

    IRQs are automatically disabled by the processor upon entry into an IRQ exception.

    If you see that the driver for I2C is running with interrupts enabled, that means someone implemented a special sequence of code for nesting.   (or if they didn't then it won't work right!)

    What CPU mode is that ISR for the I2C running in?  If it is nesting - it would have had to change modes from IRQ mode to SVC or SYSTEM mode first, before re-enabling interrupts.   If you re-enable interrupts from inside IRQ mode then the IRQ LR would be corrupted when another interrupt interrupts the ISR.  

    I could see where if this isn't being handled correctly - it would be the source of an infrequent issue ...

    See appnote SPNA219 for details on what is required.

  • Hi Dave,

      Like Anthony said, IRQ interrupts are disabled (the I bit in the CPU's CPSR is set) upon entry into the IRQ vector. So you can check whether the Mentor Graphics driver is re-enabling IRQs (clearing the I bit), which would allow another higher-priority interrupt to preempt the I2C ISR before it finishes. If this is the case, you want to keep IRQs disabled during the I2C ISR.

      One more thing you can try is to remap the I2C interrupt to a higher-priority channel than your serial port interrupt. You can easily do this in the VIM module using the VIM Channel Mapping register. Currently the I2C is mapped to VIM channel 66. Try mapping it to a higher priority than the serial port and see if that makes a difference.

  • Hi, Anthony. Thanks for joining the investigation. You're right that IRQs are disabled for both of the ISRs. I checked the CPSR register and bit 7 is already set as it enters each of those ISRs. So when I said that interrupts were disabled in the I2C ISR, that was because they are explicitly disabled at the beginning of that ISR, but they aren't in the Serial ISR. It looks like that doesn't matter, though.

    So I don't think any nesting is going on.
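
For reference, the check described above (CPSR bit 7 set on ISR entry, plus the question of which mode the ISR runs in) can be written down as a small decoder for a CPSR value captured in the debugger. The mode encodings are the standard ARM ones; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_I_BIT     (1u << 7)   /* set = IRQs masked */
#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_IRQ  0x12u
#define CPSR_MODE_SVC  0x13u
#define CPSR_MODE_SYS  0x1Fu

/* True when IRQs are masked in the captured CPSR value. */
bool cpsr_irq_masked(uint32_t cpsr)
{
    return (cpsr & CPSR_I_BIT) != 0u;
}

/* Extract the processor mode field. */
uint32_t cpsr_mode(uint32_t cpsr)
{
    return cpsr & CPSR_MODE_MASK;
}

/* The risky combination: still in IRQ mode with IRQs unmasked,
 * so a nested interrupt would overwrite the IRQ-mode link register. */
bool cpsr_nesting_hazard(uint32_t cpsr)
{
    return (cpsr_mode(cpsr) == CPSR_MODE_IRQ) && !cpsr_irq_masked(cpsr);
}
```

A value such as 0x92 (IRQ mode, I bit set) is the expected state on ISR entry and reports no hazard, which matches the observation that no nesting is going on.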

  • Hi Dave,
    Have you had a chance to remap the I2C to higher priority in the VIM?
  • Well, here's what I did. I wasn't sure if it was the interrupt priority or the interrupt vector which is the critical factor for this, so I tried changing each of them. Changing the serial interrupt vector (to 70 or 74) disabled the serial port, so I guess that wasn't it. I changed the serial priority from 1 to 10 (I2C was set to 5). When I ran it this way, I got spontaneous I2C errors without doing anything to the serial port. Sadly, when I switched that priority back, I continued to get the errors. So now I can't seem to get back to the case where the errors don't happen unless I send characters into the serial port (from TeraTerm). Unfortunately, I've been asked to stop working on this since we have a temporary work-around and I've got a deadline for other software at the end of March. So for now, I have to leave this alone. If you can think of any condition of the hardware (as programmed by possibly messed up software) that would create this type of restart error, please do let me know, but I have to stop experimenting for the time being.

    Thanks very much for your help, Charles and Anthony.

    Dave
  • Hi Dave,

      I forgot to tell you that when you remap a channel you also need to enable the new channel via the REQENASETx register. Perhaps this is what is missing. Hopefully when you have a chance to get back to debug this again we are here to assist. 

  • So are the channel and vector the same thing? Is it the vector that determines the priority, rather than the other number which shows up as priority in the configuration file? Maybe this isn't clear because the description is unique to the Mentor build environment. But as a default, the serial driver has interrupt vectors of 66 and priority of 1, while the I2C driver has a vector of 68 and priority of 5. Which needs to be adjusted? And is a lower number higher priority?

    Thanks!
    Dave
  • Hi Dave,

      Not sure if you are using the VIM API to manage the channel remapping. 

      The I2C is currently mapped to interrupt request 66, which by default also maps to VIM channel 66. The interrupt request connection is hardwired, but the channel mapping is programmable. Suppose you want to remap interrupt request 66 to channel 2; you can do the following. This example assumes that you are not otherwise using VIM channel 2, which by default is reserved for the RTI interrupt.

      vimChannelMap(66,2, &i2cInterrupt);  // remap request 66 to channel 2. The interrupt vector for I2C is i2cInterrupt

      vimEnableInterrupt(2,SYS_IRQ); // enable channel 2 as a IRQ interrupt

      Perhaps you can try this when you have a chance to get back to this. 

  • And:
    - Lower channel numbers mean higher priority.
    - The vector is different from the channel number. The vector (in vectored mode) is a table lookup from the VIM RAM.
    So when channel 2 is the highest-priority pending interrupt, the vector that is passed to the CPU is the one you loaded into the VIM RAM for channel 2.
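
To make the request/channel/vector distinction concrete, here is a small host-side model (not the HALCoGen API) in which the lowest-numbered enabled channel with a pending request is serviced first, and the handler is looked up by channel. The table size, the stub handlers, and the serial request number 13 are hypothetical placeholders chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIM_MODEL_CHANNELS 128u   /* hypothetical table size for this model */

typedef void (*vim_isr_t)(void);

typedef struct {
    uint8_t   request[VIM_MODEL_CHANNELS]; /* request routed to each channel */
    bool      enabled[VIM_MODEL_CHANNELS];
    vim_isr_t vector[VIM_MODEL_CHANNELS];  /* handler table, indexed by channel */
} vim_model_t;

/* Hypothetical stub handlers used only to populate the table. */
void i2c_isr_stub(void) {}
void serial_isr_stub(void) {}

/* Default state: request n routed to channel n, everything disabled. */
void vim_model_reset(vim_model_t *v)
{
    for (uint32_t ch = 0u; ch < VIM_MODEL_CHANNELS; ch++) {
        v->request[ch] = (uint8_t)ch;
        v->enabled[ch] = false;
        v->vector[ch] = 0;
    }
}

/* Route a request to a channel, load its handler, and enable the channel. */
void vim_model_map(vim_model_t *v, uint8_t request, uint8_t channel, vim_isr_t handler)
{
    if (request >= VIM_MODEL_CHANNELS || channel >= VIM_MODEL_CHANNELS) {
        return;
    }
    v->request[channel] = request;
    v->vector[channel] = handler;
    v->enabled[channel] = true;
}

/* Return the channel that would be serviced next, or -1 if none is pending.
 * The lowest channel number wins. */
int vim_model_next(const vim_model_t *v, const bool pending[VIM_MODEL_CHANNELS])
{
    for (uint32_t ch = 0u; ch < VIM_MODEL_CHANNELS; ch++) {
        if (v->enabled[ch] && pending[v->request[ch]]) {
            return (int)ch;
        }
    }
    return -1;
}
```

In this model, with the I2C request on channel 66 and a serial request on channel 13 both pending, channel 13 is serviced first. After routing the I2C request to channel 2, the I2C handler wins instead. The model also enables the channel as part of the mapping step, which mirrors the point about needing to enable the new channel after a remap.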
  • Thanks, Anthony and Charles. I'll definitely use this information when I get back to debugging this.

    Dave

  • I finally did debug the problem, and it was an error in the device driver. In the very rare case that the SendStart routine was interrupted in the middle of a read/modify/write macro, it was possible for the Start signal to be sent twice. The driver wrote the same value (with the STT bit set) twice in a row. This made no difference when the two writes were very close together, but if there was a context switch between them, the second write set the STT bit again after the first Start had already taken effect and the hardware had cleared the bit. I removed the second write and the code works fine now. Thanks for your help.
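
The failure mode described above can be reproduced in a small host-side simulation. This is a simplified model under stated assumptions, not the Mentor driver or the real peripheral: the "hardware" step issues a Start whenever STT is set and then clears the bit, and a preemption between the two register writes is modeled by letting the hardware step run in between. The STT bit position is as I recall it from the I2C chapter, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MDR_STT (1u << 13)   /* start-condition bit; position assumed */

typedef struct {
    uint32_t mdr;            /* simulated mode register */
    uint32_t starts_on_bus;  /* Start conditions the hardware generated */
} i2c_sim_t;

/* Hardware model: act on STT if it is set, then clear it. */
void i2c_sim_hw_step(i2c_sim_t *s)
{
    if ((s->mdr & MDR_STT) != 0u) {
        s->starts_on_bus++;
        s->mdr &= ~MDR_STT;
    }
}

/* Buggy pattern: the same value, with STT set, is written twice. */
void send_start_buggy(i2c_sim_t *s, bool preempted_between_writes)
{
    uint32_t value = s->mdr | MDR_STT;  /* read and modify */

    s->mdr = value;                     /* first write */
    if (preempted_between_writes) {
        i2c_sim_hw_step(s);             /* hardware runs while the task is switched out */
    }
    s->mdr = value;                     /* second write sets STT again */
    i2c_sim_hw_step(s);
}

/* Fixed pattern: a single write, so a delay afterwards is harmless. */
void send_start_fixed(i2c_sim_t *s, bool preempted_after_write)
{
    s->mdr |= MDR_STT;
    if (preempted_after_write) {
        i2c_sim_hw_step(s);
    }
    i2c_sim_hw_step(s);
}
```

In this model the buggy routine produces one Start when the writes are back to back and two Starts when the hardware gets to run between them, while the single-write version produces one Start either way. That matches the intermittent behavior: the extra Start only appears when something lands between the two writes.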

  • Hi Dave,
    Very glad that your problem is resolved.