Linux/AM5728: thermal: failed to read out thermal zone at low temperature

Part Number: AM5728

Tool/software: Linux

Hi,

We are using SDK 3.01.06 to test our AM5728 SoC at low temperature.

Here is what we observed.

We tested six boards at the same time at -40 degrees Celsius. Two of the boards work normally; their temperature readings are as follows.

root@am57xx-evm:~# ./omapconf show temp
OMAPCONF (rev v1.73-19-gbe8626b)

HW Platform:
Generic DRA74X (Flattened Device Tree)
DRA75X ES2.0 GP Device (STANDARD performance (1.5GHz))
TPS659038 ES2.2

SW Build Details:
Build:
Version: _____ _____ _ _
Kernel:
Version: 4.4.19
Author: cy@ubuntu
Toolchain: gcc version 6.2.1 20161016 (Linaro GCC 6.2-2016.11)
Type: #3 SMP PREEMPT
Date: Wed Mar 13 16:56:27 CST 2019

|--------------------------------------------|
| Sensor | Temperature (C) | Temperature (F) |
|--------------------------------------------|
| MPU    | -11             | 13              |
| GPU    | -12             | 11              |
| CORE   | -11             | 13              |
| IVA    | -27             | -16             |
| DSPEVE | -14             | 7               |
|--------------------------------------------|

root@am57xx-evm:~#

The other four boards do not work properly. Their temperature readings are as follows.

root@am57xx-evm:~# ./omapconf show temp
OMAPCONF (rev v1.73-19-gbe8626b)

HW Platform:
Generic DRA74X (Flattened Device Tree)
DRA75X ES2.0 GP Device (STANDARD performance (1.5GHz))
TPS659038 ES2.2

SW Build Details:
Build:
Version: _____ _____ _ _
Kernel:
Version: 4.4.19
Author: cy@ubuntu
Toolchain: gcc version 6.2.1 20161016 (Linaro GCC 6.2-2016.11)
Type: #4 SMP PREEMPT
Date: Mon Mar 18 15:45:48 CST 2019

|--------------------------------------------|
| Sensor | Temperature (C) | Temperature (F) |
|--------------------------------------------|
| MPU    | -17             | 2               |
| GPU    | -18             | 0               |
| CORE   | -17             | 2               |
| IVA    | NA              | NA              |
| DSPEVE | -17             | 2               |
|--------------------------------------------|

root@am57xx-evm:~#

Error messages are also printed on the serial port:

[2019/3/12 12:48:21] root@am57xx-evm:~# [ 15.813074] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:22] [ 16.313071] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:22] [ 16.813070] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:23] [ 17.313072] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:23] [ 17.813071] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:24] [ 18.313069] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:24] [ 18.813069] thermal thermal_zone4: failed to read out thermal zone (-5)
[2019/3/12 12:48:25] [ 19.313071] thermal thermal_zone4: failed to read out thermal zone (-5)

After debugging, we found that the ADC code read on the four abnormal boards was about 490, which is below the valid range of 540 to 945 given in Table 18-10 (ADC Values Versus Temperature), and this causes the error messages.
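The failure condition described above can be sketched as a simple range check. This is only an illustration under stated assumptions: the bounds 540 and 945 are the ones quoted from Table 18-10, the helper name `check_adc_code` is hypothetical, and the snippet does not reproduce the actual kernel driver code. The value 5 is `EIO` on Linux, which is consistent with the `(-5)` in the log.

```python
import errno

# Valid ADC code range quoted in the post from Table 18-10
# (ADC Values Versus Temperature).
ADC_CODE_MIN = 540
ADC_CODE_MAX = 945


def check_adc_code(adc_code: int) -> int:
    """Return 0 if the code is inside the table range, else -EIO.

    Hypothetical helper for illustration; not the driver's real function.
    """
    if ADC_CODE_MIN <= adc_code <= ADC_CODE_MAX:
        return 0
    return -errno.EIO


if __name__ == "__main__":
    # 490 is the approximate code reported on the abnormal boards.
    for code in (490, 540, 945):
        print(code, check_adc_code(code))
```

With these assumed bounds, a code of about 490 falls outside the table and is reported as an I/O error, while any code from 540 to 945 is accepted.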

The temperature sensor still tracks temperature changes linearly. When I run a high-power program, the temperature inside the chip rises, the ADC code becomes valid, and the error message disappears. When the program stops, the error message reappears.
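One way to watch this behavior from user space is to read the thermal zones through sysfs while the load changes. The sketch below assumes the standard Linux layout `/sys/class/thermal/thermal_zone*/` with a `type` and a `temp` attribute, where `temp` is in millidegrees Celsius. The function name `read_thermal_zones` is hypothetical, and a zone whose read fails is shown as NA rather than treated as fatal.

```python
import glob
import os


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: (zone_type, temp_in_celsius or None)}.

    A failed read (for example an I/O error from the driver) is
    reported as None instead of raising.
    """
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        name = os.path.basename(zone)
        try:
            with open(os.path.join(zone, "type")) as f:
                zone_type = f.read().strip()
        except OSError:
            zone_type = "unknown"
        try:
            with open(os.path.join(zone, "temp")) as f:
                # The sysfs "temp" attribute is in millidegrees Celsius.
                temp_c = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            temp_c = None
        readings[name] = (zone_type, temp_c)
    return readings


if __name__ == "__main__":
    for name, (zone_type, temp_c) in read_thermal_zones().items():
        shown = "NA" if temp_c is None else f"{temp_c:.1f} C"
        print(f"{name} ({zone_type}): {shown}")
```

Running this in a loop during the high-power program and again after it stops should show whether the failing zone switches between a valid reading and NA as the die temperature changes.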

Here are my questions:

1. Is the operating range of the five temperature sensors in the AM5728 chip -40℃ to 125℃?

2. Why do the temperatures measured by different chips at the same ambient temperature differ by more than 10 degrees Celsius?

3. Is there any code on the software side that can affect this? Or is the problem caused by the hardware design or material quality?

Please give me some debugging ideas or suggestions.

Thanks.

  • Hello CY,

    The temperature sensor range is -40C to 125C. However, IVA appears to be running colder than -40C, so it returns the NA value. IVA is a low-power IP that does not produce as much heat as the other cores, so it stays closer to ambient temperature than they do.

    Can you try using the command "omapconf show opp"? This can show you whether there are any modules powered down or if there are differences in the operating frequency and voltage.

  • Hi,

    I checked the power data today and it all looked normal, but there is one thing I don't quite understand. Please explain it to me.

    |-----------------------------------------------------------------------------------|
    |                        | Temperature | Voltage | Frequency      | OPerating Point |
    |-----------------------------------------------------------------------------------|
    | VDD_CORE / VDD_CORE0   | -39C / -38F | 1.150 V |                | NOM             |
    |   L3                   |             |         |  266  MHz      |                 |
    |   DMM                  |             |         |  266  MHz      |                 |
    |   EMIF1                |             |         |  266  MHz      |                 |
    |   EMIF2                |             |         |  266  MHz      |                 |
    |     LP-DDR2            |             |         |  532  MHz      |                 |
    |   L4                   |             |         |  266  MHz      |                 |
    |   IPU1                 |             |         | (425  MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (212  MHz) (1) |                 |
    |   IPU2                 |             |         | (425  MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (212  MHz) (1) |                 |
    |   DSS                  |             |         | (192  MHz) (1) |                 |
    |   BB2D                 |             |         | (354  MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_MPU / VDD_CORE1    | -17C / 2F   | 1.240 V |                | UNKNOWN         |
    |   MPU (CPU1 ON)        |             |         |  1450 MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_GPU / VDD_CORE2    | NA          | 1.080 V |                | HIGH            |
    |   GPU                  |             |         | (532  MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_DSPEVE / VDD_CORE3 | -19C / -2F  | 1.120 V |                | UNKNOWN         |
    |   DSP1                 |             |         | (750  MHz) (1) |                 |
    |   DSP2                 |             |         | (750  MHz) (1) |                 |
    |   EVE1                 |             |         | (0    MHz) (1) |                 |
    |   EVE2                 |             |         | (0    MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_IVA / VDD_CORE4    | NA          | 1.060 V |                | HIGH            |
    |   IVA                  |             |         | (532  MHz) (1) |                 |
    |                        |             |         |                |                 |
    |-----------------------------------------------------------------------------------|
    |-----------------------------------------------------------------------------------|
    |                        | Temperature | Voltage | Frequency      | OPerating Point |
    |-----------------------------------------------------------------------------------|
    | VDD_CORE / VDD_CORE0   | -15C / 5F   | 1.150 V |                | NOM             |
    |   L3                   |             |         |  266  MHz      |                 |
    |   DMM                  |             |         |  266  MHz      |                 |
    |   EMIF1                |             |         |  266  MHz      |                 |
    |   EMIF2                |             |         |  266  MHz      |                 |
    |     LP-DDR2            |             |         |  532  MHz      |                 |
    |   L4                   |             |         |  266  MHz      |                 |
    |   IPU1                 |             |         | (425  MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (212  MHz) (1) |                 |
    |   IPU2                 |             |         | (425  MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (212  MHz) (1) |                 |
    |   DSS                  |             |         | (192  MHz) (1) |                 |
    |   BB2D                 |             |         | (354  MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_MPU / VDD_CORE1    | -16C / 4F   | 1.180 V |                | UNKNOWN         |
    |   MPU (CPU1 ON)        |             |         |  1450 MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_GPU / VDD_CORE2    | -15C / 5F   | 1.050 V |                | HIGH            |
    |   GPU                  |             |         | (532  MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_DSPEVE / VDD_CORE3 | -16C / 4F   | 1.080 V |                | UNKNOWN         |
    |   DSP1                 |             |         | (750  MHz) (1) |                 |
    |   DSP2                 |             |         | (750  MHz) (1) |                 |
    |   EVE1                 |             |         | (0    MHz) (1) |                 |
    |   EVE2                 |             |         | (0    MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_IVA / VDD_CORE4    | -33C / -27F | 1.060 V |                | HIGH            |
    |   IVA                  |             |         | (532  MHz) (1) |                 |
    |                        |             |         |                |                 |
    |-----------------------------------------------------------------------------------|

    Above are two sets of test data: the first is from an abnormal board, and the second is from a normally operating board.

    I also read the IVA and GPU core voltage values through the registers and found that all six boards are at around 1.06V-1.10V, so nothing is abnormal there.

    But I noticed something strange. Reading the register values, I found that the Standard Fuse value is higher on the abnormal boards than on the normal boards: above 0xC80 on the abnormal boards, and less than or equal to 0xC80 on the normal boards. I read the 0x4a002458 and 0x4a002194 registers separately and reached the same conclusion from both: the value on the abnormal boards was around 0xCA0.

    I'm not familiar with the concept of Standard Fuse. Does this value change with temperature?
    Does this value have an impact on VDD_IVA or VDD_GPU, resulting in a temperature sensor reading error? If you have a document or wiki page about Standard Fuse, please let me know.

    Thanks.
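For the register reads mentioned in the post above, a minimal sketch of reading one 32-bit word at a physical address through `/dev/mem` might look like this. It assumes a Linux target where `/dev/mem` is accessible to root and the address range is not blocked by the kernel configuration. The function name `read_reg32` is hypothetical, and the copy performed by `struct.unpack_from` is not guaranteed to be a single 32-bit bus access, so a dedicated register tool is usually the safer choice on real hardware.

```python
import mmap
import os
import struct


def read_reg32(addr, mem_path="/dev/mem"):
    """Read one 32-bit little-endian word at physical address `addr`.

    Hypothetical helper for illustration only.
    """
    page = mmap.ALLOCATIONGRANULARITY
    base = addr & ~(page - 1)  # mmap offsets must be page aligned
    offset = addr - base       # position of the word inside the mapping
    fd = os.open(mem_path, os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, page, mmap.MAP_SHARED, mmap.PROT_READ,
                       offset=base) as mem:
            return struct.unpack_from("<I", mem, offset)[0]
    finally:
        os.close(fd)
```

On the target, calling `read_reg32(0x4A002458)` and `read_reg32(0x4A002194)` as root on each board would give the raw values to compare. How those values should be interpreted is not covered by this sketch.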

  • Every device has a minimum safe operating voltage we test for and save in the standard fuse registers. It optimizes every device to run at the lowest power possible. Due to normal variations in the wafers, the lowest power is variable for each device so some are hotter than others. It is best to let the software handle it all.

    It appears your multimedia domains are all disabled. You can reduce the power consumption of the devices by running them at the lower OPP NOM voltages instead. You can do this by changing the OPP in the U-Boot defconfig.

    It's interesting your CORE domain temperature is different between the two as well. Are you allowing enough time for the board and the chip to "soak" at the test temperature? Our team uses 10 minutes or more of soak time before measuring.