TMP117: temperature reading stuck significantly above the real temperature

Part Number: TMP117

We have a TI TMP117MA (WDFN-6) part mounted on a PCB. The TMP117 is covered by a brass block (25x25x3mm). Thermal paste is applied between them to ensure proper thermal contact.
The PCB has an internal resistive heater trace that is used to heat the block.

Usually, the temperature reading from the TMP117 agrees with the temperature of the brass block. As we apply current through the resistive trace, the temperature of the block and the sensor rises, and we can successfully stabilize it at around 36degC. The sensor is in its default mode (0x220 in the configuration register). The high-temperature limit is set to 60degC.
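
For anyone cross-checking the limit value or the firmware's temperature decoding, here is a minimal Python sketch of the conversion between degC and the 16-bit register encoding. It assumes the datasheet resolution of 7.8125 m°C per LSB with two's-complement values. The function names are hypothetical and not part of any TI library.

```python
# Assumption: TMP117 temperature and limit registers are 16-bit two's
# complement with a resolution of 7.8125 m°C (1/128 °C) per LSB.
LSB_DEG_C = 0.0078125


def raw_to_celsius(raw: int) -> float:
    """Convert a 16-bit register value to degrees Celsius."""
    raw &= 0xFFFF
    if raw & 0x8000:          # sign bit set: negative temperature
        raw -= 0x10000
    return raw * LSB_DEG_C


def celsius_to_raw(temp_c: float) -> int:
    """Convert degrees Celsius to the 16-bit register encoding."""
    counts = round(temp_c / LSB_DEG_C)
    if not -0x8000 <= counts <= 0x7FFF:
        raise ValueError("temperature outside the 16-bit register range")
    return counts & 0xFFFF


if __name__ == "__main__":
    print(hex(celsius_to_raw(60.0)))   # encoding of a 60 degC high limit
    print(raw_to_celsius(0x0C80))      # decode an example raw reading
```

Under these assumptions a 60degC high limit is written as 0x1E00, and a raw reading of 0x0C80 decodes to 25degC.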

Occasionally, we get an overtemperature alarm from the sensor, and our firmware logic shuts the heating off. By the time we can get to the unit, the brass block is at ambient temperature (verified by touch, about 25degC).
Very infrequently, the sensor appears to be stuck reading ~60degC. One case was 55degC.

* The TMP117 remains responsive to I2C commands. We are able to obtain register dumps.
* The temperature reading is updated. The value jumps +-0.2degC around the stuck value.
* The sensor reads 0x220 from the configuration register (except on the first reading, when the alarm bit is set). The temperature offset register is empty (0), and writing 0 to it changes nothing.
* A soft reset of the TMP117 and an I2C General Call Reset are both acknowledged but do not resolve the situation.
* A power cycle, even one as short as 0.5s, resolves the problem (the reading returns to the expected ambient temperature).
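
To make register dumps like these easier to read, the following Python sketch splits a configuration register value into its fields. The bit positions and the mode and averaging encodings are written from my reading of the TMP117 datasheet and should be verified against it. The helper name `decode_config` is hypothetical.

```python
# Assumed TMP117 configuration register layout (verify against the datasheet):
#   15 HIGH_Alert, 14 LOW_Alert, 13 Data_Ready, 12 EEPROM_Busy,
#   11:10 MOD, 9:7 CONV, 6:5 AVG, 4 T/nA, 3 POL, 2 DR/Alert, 1 Soft_Reset
MODES = {0b00: "continuous", 0b01: "shutdown",
         0b10: "continuous", 0b11: "one-shot"}
AVERAGES = {0b00: 1, 0b01: 8, 0b10: 32, 0b11: 64}


def decode_config(value: int) -> dict:
    """Split a 16-bit configuration register value into named fields."""
    return {
        "high_alert": bool(value & 0x8000),
        "low_alert": bool(value & 0x4000),
        "data_ready": bool(value & 0x2000),
        "eeprom_busy": bool(value & 0x1000),
        "mode": MODES[(value >> 10) & 0b11],
        "conv": (value >> 7) & 0b111,            # conversion cycle setting
        "averages": AVERAGES[(value >> 5) & 0b11],
        "therm_mode": bool(value & 0x0010),      # T/nA bit
        "alert_active_high": bool(value & 0x0008),
        "alert_pin_is_data_ready": bool(value & 0x0004),
    }


if __name__ == "__main__":
    for name, field in decode_config(0x0220).items():
        print(f"{name}: {field}")
```

With this layout, 0x220 decodes to continuous conversion, CONV = 4, and 8 averaged conversions, with no alert flags set. A first read with the high-alert flag set would show bit 15, for example 0x8220, possibly together with the data-ready bit.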

This happens intermittently; we are not able to reliably drive the system into this state.

Please help us troubleshoot.

  • Hi Mikhail,

    How are you receiving the alarm? Are you monitoring the ALERT pin output, or polling the configuration register?

    If you are monitoring the ALERT pin, please probe it with an oscilloscope in the hope of capturing a glitch. The power cycle you describe is a latching behavior. We would need to determine whether our output is latching in the asserted state or whether the latch is occurring further down the line in the application circuit.

    If you are polling the configuration register, please monitor your bus. You may need to use an oscilloscope to analyze bus failures. I will help you inspect any scope pictures you post.

    As an aside, have you unlocked the EEPROM and changed the default configuration? 

    What is the goal of this application?

    thanks,

    ren

  • Hi, Ren, and thanks for the reply,

    We are monitoring the ALERT pin output. The pin is still reported as asserted (active low) on the MCU (STM32). We do not read the configuration register automatically to deassert it. On the first (manual) readout of the configuration register, the bit is reported as asserted. The same bit is reported as deasserted on subsequent reads.

    We have about 20 units, and this error pops up about once every couple of months at random, on a random unit. Is your suggestion to monitor 20 simultaneous channels + 20*2 I2C bus channels for multiple months?

    What we are concerned about is not the overtemperature and not the latching of the ALERT pin. We are concerned by the "latching" of the temperature read from the temperature register. It is not quite latching, as the value keeps jumping +-0.2 degrees.

    You say that "The power cycle you describe is a latching behavior". Have you observed similar behavior before? I was not able to find it in either the datasheet or any of the application notes, and I was not able to find any errata either.

    Re: EEPROM unlocking. Yes, we use the 48 bits available to us to store the serial # of the unit. The EEPROM remains locked during the failure described above. The default configuration is not changed.

    The goal of this application is to control the temperature of a device that is mounted on top of the brass block. We use the sensor in the device as a primary one and TMP117 as a secondary one.

  • Dear Mikhail - 

    is it possible that the brass block is putting any weight on the TMP117?

  • Josh,

    Yes, but not directly. The brass block is attached to the PCB using screws. A special cutout is made for the sensor, but the cutout is filled with thermal paste, so the brass block may exert some pressure on the TMP117 through the paste.

    How would it affect the latching behavior?

  • Mikhail - that was just a general question to make sure the part is not being stressed. This is a very sensitive part, and it does not take much external influence to shift its reading. So, if you are not putting active stress on it, great.

    Were you able to get the scope captures Ren requested? Do you have any images of this setup and the schematic you are operating with?

  • Hi Mikhail, to address your first response to me:

    What you describe in your first paragraph is correct operation of the ALERT functionality. 

    Troubleshooting is often not easy! I understand your concern. It will be necessary to capture the failure to analyze it. Otherwise, we are only speculating. I'll continue to speculate with you, but we can't go very far without more information.

    If you can show me that the TMP117 reads an incorrect temperature, I would have to ask you to submit this device as a return so that it can be analyzed. Unfortunately, it does neither of us any good to request the unit you have now, because your condition is not repeatable. I can't think of an obvious configuration error that would cause the TMP117 to behave this way. One obvious thing to look for is communication problems. Do you have other devices on the I2C bus? Is there a noise event that is corrupting the received communication? Has something gone awry in your software decoding of the temperature?

    The latching I mentioned is a failure mode that happens in ICs from time to time. It's not specific to the TMP117. We strive to design products which don't have these faults. There's no known problem with the TMP117. Further, I was referring to an 'analog' type of latching where a single transistor is stuck in a state it should not be in. This wouldn't be the same type of latching you claim to experience, where the device repeatedly communicates an incorrect value. That fault would have to be deep within the digital logic, which is not prone to doing things like this.

    thanks,

    ren

  • Hi, Ren and Josh. Is there a way to move this conversation to a more private channel?

  • Sure, we can close this. I have sent you an email, cc'ing Ren.