AM623: Thermal shut down sequence

Part Number: AM623

Hi,

My customer wants to implement a thermal shutdown mechanism themselves.
Please note that they are not using the Linux SDK; they use VxWorks.
I know the Linux SDK has such thermal management features:
https://software-dl.ti.com/processor-sdk-linux/esd/AM62X/11_01_05_03/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/VTM.html#voltage-thermal-management-vtm

The customer is asking about the shutdown sequence used in the Linux SDK so that they can implement it.
Could you describe the shutdown sequence in detail?

Thanks and regards,
Koichiro Tashiro

  • Hi Tashiro-san,

    There is a hardware and a software component to thermal shutdown. When the device 'enters thermal shutdown', it is not completely off, so it can execute a reboot when it is safe to do so.

    Hardware Defined Thermal Shutdown

    Within the AM62x processor, there are two on-die temperature sensors as indicated in this FAQ:  [FAQ] AM625 / AM623 / AM620-Q1 / AM62Ax / AM62D-Q1 / AM62Px / AM62L / AM64x / AM243x (ALV, ALX) Custom board hardware design – Voltage and Thermal Manager (VTM) 

    These sensors measure the junction temperature and send the data to the VTM module. Software needs to configure the VTM, define the critical temperature at which thermal shutdown should occur, and define a safe temperature at which the SoC may reboot.

    When the temperature sensors read a value above the critical temperature, the VTM puts the SoC into thermal reset until the SoC cools down to the safe temperature, based on the temperature sensors' readings. While the SoC is in thermal reset, the VTM remains on so it can continue to read the junction temperature and determine whether it is safe to reboot the SoC.

    Refer to 6.2.5.5 Voltage and Thermal Manager (VTM) in the AM62x TRM: https://www.ti.com/lit/ug/spruiv7b/spruiv7b.pdf

    Software Configuring the VTM

    The software is responsible for configuring the hardware in this case. Refer to the VTM Linux driver: git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/thermal/k3_j72xx_bandgap.c?h=ti-linux-6.12.y

    You can see in 'k3_j72xx_bandgap_init_hw' that software will need to initialize the VTM module using register writes including configuring the temperature sensors, defining the critical shutdown temperature and the cool down temperature.

    As the Linux driver is written, critical shutdown occurs at 123C to account for any inaccuracy of the VTM module. Once the SoC is in thermal reset, it has to cool to 105C before it attempts to reboot.

    There is a bit called 'K3_VTM_ANYMAXT_OUTRG_ALERT_EN' which enables the thermal shutdown mechanism.
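
The hysteresis between the critical and cool-down temperatures described above can be illustrated with a small software model. This is only a sketch for reasoning about the behavior, not code from the Linux driver or a description of the VTM logic: the type `thermal_model_t`, the function `thermal_model_step`, and the exact comparisons (`>` to enter reset, `<=` to leave it) are assumptions made for illustration. Only the 123C and 105C values come from the thread.

```c
#include <stdbool.h>

/* Thresholds quoted in the thread for the Linux driver, in millidegrees C. */
#define MAX_TEMP_MC       123000
#define COOL_DOWN_TEMP_MC 105000

/* Hypothetical model state: true while the SoC is held in thermal reset. */
typedef struct {
    bool in_thermal_reset;
} thermal_model_t;

/* Feed one temperature sample into the model and return the new state.
 * Assumption: reset is entered strictly above the maximum and released
 * at or below the cool-down temperature. */
bool thermal_model_step(thermal_model_t *m, int temp_mc)
{
    if (!m->in_thermal_reset && temp_mc > MAX_TEMP_MC) {
        m->in_thermal_reset = true;   /* too hot: hold the SoC in reset */
    } else if (m->in_thermal_reset && temp_mc <= COOL_DOWN_TEMP_MC) {
        m->in_thermal_reset = false;  /* cooled down: reboot is allowed */
    }
    return m->in_thermal_reset;
}
```

In this model, a sample of 110C after a shutdown still keeps the SoC in reset, because the state only clears once the temperature has fallen to the cool-down value.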

    Software Defined thermal shutdown

    The SDK docs link that you shared describes software defined thermal shutdown. The mechanism I described above is hardware defined thermal shutdown.

    In software defined thermal shutdown, the same VTM temperature values are periodically checked against the temperature ranges defined in the device tree. If the VTM temperature value exceeds the device tree trip point, the code in thermal_core.c (https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/thermal/thermal_core.c?h=ti-linux-6.12.y) is invoked to execute a software based thermal shutdown. When this occurs, Linux shuts down and requires a reboot afterwards.

    The software defined thermal shutdown and the hardware thermal shutdown typically work cooperatively.

    The software defined thermal shutdown is not required, since configuring the VTM registers, as the Linux driver does, is enough for the hardware to execute the thermal shutdown. So the device tree and thermal_core.c portions might not be needed, depending on the customer application.

    Please be aware that VxWorks is beyond the TI support scope on the E2E forums.

    Thanks,

    Anshu

  • Hi Anshu,

    Sorry for my late reply. The customer wants to implement a much simpler mechanism, such as:
    Step#1: Check the device temperature using the VTM_TMPSENS_VTM_TMPSENS_STAT_J_J[9:0] bits.
    Step#2: If the temperature reaches a specific threshold, stop the A53 cores. No need to restart cores when the temperature is down.
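
The check in the two steps above could look roughly like the following. This is a minimal sketch under stated assumptions, not TI code: the function names are hypothetical, the status register value is passed in as a plain `uint32_t` rather than read from hardware, and the comparison is done on raw 10-bit sensor codes, assuming that a larger code means a higher temperature. The conversion between raw codes and degrees is not covered here and has to come from the TRM or the Linux driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [9:0] of the sensor status register hold the raw reading,
 * as stated in the thread. */
#define TMPSENS_STAT_DATA_MASK 0x3FFu

/* Step #1: extract the 10-bit raw code from a status register value. */
uint32_t tmpsens_raw_code(uint32_t stat_reg)
{
    return stat_reg & TMPSENS_STAT_DATA_MASK;
}

/* Step #2 decision: has the reading reached the threshold?
 * Assumption: threshold_code is a raw code, and codes rise with temperature. */
bool tmpsens_over_threshold(uint32_t stat_reg, uint32_t threshold_code)
{
    return tmpsens_raw_code(stat_reg) >= threshold_code;
}
```

Masking before comparing matters because the upper bits of the status register are not part of the temperature field.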

    The customer wants to know any register bitfield which can be used at step#2.

    Thanks and regards,
    Koichiro Tashiro

  • Hi Tashiro-san,

    No need to restart cores when the temperature is down.

    Can you clarify what this means? Let's say the junction temperature reaches 125C then cools down to 50C, are you not requiring the SoC to attempt to boot the OS again?

    Thanks,

    Anshu

  • Hi Anshu,

    are you not requiring the SoC to attempt to boot the OS again?

    Correct. The system will be recovered (power cycle the system)  by an end-user.

    Thanks and regards,
    Koichiro Tashiro

  • Hi Tashiro-san,

    Correct. The system will be recovered (power cycle the system)  by an end-user.

    Thanks for clarifying. Then the Cool Down value can be set to an unrealistic value so it will never try to reboot the OS. Just be aware that this use case has not been tested by TI.

    Thanks,

    Anshu

  • Hi Anshu,

    Then the Cool Down value can be set to an unrealistic value so it will never try to reboot the OS.

    Do you mean the customer should just use "Hardware defined thermal shutdown" with an unrealistic cool down value?
    The customer wants to trigger the shutdown by software, as I mentioned in step#2 above.

    The customer wants to know any register bitfield which can be used at step#2.

    Is there no such register?

    Thanks and regards,
    Koichiro Tashiro

  • Hi Tashiro-san,

    Step#2: If the temperature reaches a specific threshold, stop the A53 cores.

    This can be handled by the registers.

    Refer to the TRM: https://www.ti.com/lit/ug/spruiv7b/spruiv7b.pdf

    MMR_CFG2_VTM_MISC_CTRL has a bit that enables the thermal out-of-range alert. When this bit is set, thermal shutdown occurs if the VTM reads a temperature outside of the range.

    MMR_CFG2_VTM_MISC_CTRL2 bits [9:0] define the temperature that is considered out of thermal range.

    If the Cool Down Temp (MMR_CFG2_VTM_MISC_CTRL2 bits [25:16]) is not required, it can be set to a value that is not achievable.
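
As an illustration of the field layout above, the value for MMR_CFG2_VTM_MISC_CTRL2 could be assembled as in the sketch below. The macro and function names are hypothetical, both arguments are raw 10-bit codes rather than degrees, and only the bit positions [9:0] and [25:16] are taken from this thread. The register address, the enable bit position in MMR_CFG2_VTM_MISC_CTRL, and which cool-down code is "not achievable" are not shown and should be checked against the TRM.

```c
#include <stdint.h>

#define VTM_CODE_MASK       0x3FFu  /* both fields are 10 bits wide */
#define VTM_MAXT_SHIFT      0u      /* bits [9:0]: out-of-range threshold */
#define VTM_COOL_DOWN_SHIFT 16u     /* bits [25:16]: cool-down threshold */

/* Build the MISC_CTRL2 value from two raw codes.
 * Codes wider than 10 bits are truncated by the mask. */
uint32_t vtm_misc_ctrl2_pack(uint32_t max_code, uint32_t cool_down_code)
{
    return ((cool_down_code & VTM_CODE_MASK) << VTM_COOL_DOWN_SHIFT) |
           ((max_code & VTM_CODE_MASK) << VTM_MAXT_SHIFT);
}
```

Keeping the packing in one function makes it easy to write the thresholds before the out-of-range alert is enabled in MMR_CFG2_VTM_MISC_CTRL.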


    Thanks,

    Anshu

  • Hi Anshu,

    This can be handled by the registers.

    I understand that the configuration is done through registers, but the point is that the shutdown process then happens automatically, without software knowing about it.
    The customer wants software to trigger the shutdown process when it checks the temperature and finds that the threshold has been reached,
    so that the software can do something first, for example put an alert on the display before shutting down.
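
The software-driven sequence requested here (check, alert, then stop) can be sketched as a small state machine. Everything in this block is hypothetical and only illustrates the ordering: the state names, the `alert_done` flag, and the use of raw sensor codes are assumptions. How the A53 cores are actually stopped is not answered in this thread and is left to the platform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states of a software-driven thermal shutdown. */
typedef enum {
    THERMAL_RUN,       /* normal operation, temperature is polled */
    THERMAL_ALERTING,  /* threshold reached, alert is being shown */
    THERMAL_STOPPED    /* shutdown requested, never left again */
} thermal_state_t;

/* Compute the next state from the current state and one poll result.
 * The sequence latches: once the threshold has been reached, a falling
 * temperature does not return the system to THERMAL_RUN. */
thermal_state_t thermal_next_state(thermal_state_t state,
                                   uint32_t raw_code,
                                   uint32_t threshold_code,
                                   bool alert_done)
{
    switch (state) {
    case THERMAL_RUN:
        return (raw_code >= threshold_code) ? THERMAL_ALERTING : THERMAL_RUN;
    case THERMAL_ALERTING:
        return alert_done ? THERMAL_STOPPED : THERMAL_ALERTING;
    default:
        return THERMAL_STOPPED;
    }
}
```

The latch matches the stated requirement that recovery is done by the end user with a power cycle, so no path back to normal operation is modeled.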

    Thanks and regards,
    Koichiro Tashiro

  • Hi Tashiro-san,

    If this process needs to be software driven, then the Linux driver will have to handle it.

    In the Linux device tree, we define the trip points to trigger a thermal shutdown: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/arch/arm64/boot/dts/ti/k3-am62-thermal.dtsi?h=ti-linux-6.12.y

    This invokes the Linux thermal core, 'thermal_core.c': https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/thermal/thermal_core.c?h=ti-linux-6.12.y

    When the junction temperature reading exceeds the trip point, the thermal core tells Linux to halt, and the system shuts down.

    Thanks,

    Anshu