TDA4AL-Q1: SOC batch: 38POT3S

Part Number: TDA4AL-Q1

During thermal cycling tests, an unhandled exception occurs with the sensor under the following conditions:

Test Procedure:
The sensor is subjected to a temperature cycle from -40°C to +85°C, repeated three times.

Observed Behavior:
After the third thermal cycle, the sensor consistently encounters an unhandled exception. This behavior suggests a potential issue with temperature-related stress or improper exception handling in extreme thermal conditions.

Our investigation:
While debugging, we extracted traces from just before the exception and found that a HW interrupt occurs during the shutdown sequence, which leads to the exception.

Expected Behavior:
The sensor should operate normally throughout the temperature cycling without triggering any unhandled exceptions.

Request:
Please investigate the root cause of the exception and advise on corrective actions, or on any similar issues known with the same SOC batch.

  • Hi Mohamed,

    Can you provide more details on the specific exception and HW Interrupt that is occurring?  

    Can you also provide a software log that reports out the on die temp sensors over time, say every 1 or 5 seconds?

    What is the ambient and SoC die temperature when the exception occurs?

    Thanks,

    Kyle

  • Can you provide more details on the specific exception and HW Interrupt that is occurring?  

    The exception is a Data abort exception. The HW interrupt is most probably an I2C interrupt, but we are not 100% sure, as the issue is very hard to reproduce.

    Can you also provide a software log that reports out the on die temp sensors over time, say every 1 or 5 seconds?

    Can you elaborate on what you need?

    What is the ambient and SoC die temperature when the exception occurs?

    40C and 85C

  • Hi Mohamed,

    Thanks for this information. The expert is currently on business travel and will follow up when they are back.

    Thanks,

    Neehar

  • Hi Mohamed,

    Are you using Linux or RTOS to read out temperature? 

    Can you share the logs of the exception?

    Best Regards,

    Keerthy 

  • We are using OSEK to read out the temperature.

    Which logs are you aiming for?

  • Hello Neehar

    We have a critical customer delivery to STLA, and we have identified an SoC issue without any root cause. We need you to identify and analyse this behavior on your side so that we can justify or explain the issue to the customer.

    Out of 12 SoCs from the same batch, only 1 SoC has been identified as NOK at 85 or -40 degrees, which is within the SoC limits.

    We are requesting an urgent meeting today to discuss and share what the issue could be, so please send an invite for a call today so that we can proceed.

  • We're aiming for software logs that show the overall behavior of the system/OS leading up to the failure. For example, we would need to know exactly which hardware interrupt is asserted and leads to the exception.
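
    One way to capture which interrupt fired last before a hard-to-reproduce exception is a small ring buffer of recent interrupt IDs that the abort handler can dump. The following is only a minimal sketch under stated assumptions: `irq_trace_t`, `irq_trace_record`, and `irq_trace_recent` are hypothetical names, not part of any TI SDK or OSEK API, and a real system would have to call the record function from its own interrupt entry path and take nested interrupts into account.

    ```c
    #include <stdint.h>

    /* Hypothetical trace depth; a power of two keeps the index math simple. */
    #define IRQ_TRACE_DEPTH 8u

    /* Ring buffer holding the most recent interrupt IDs (illustrative type). */
    typedef struct {
        uint32_t ids[IRQ_TRACE_DEPTH];
        uint32_t count; /* total number of interrupts recorded so far */
    } irq_trace_t;

    /* Record one interrupt ID; intended to be called at interrupt entry. */
    void irq_trace_record(irq_trace_t *t, uint32_t irq_id)
    {
        t->ids[t->count % IRQ_TRACE_DEPTH] = irq_id;
        t->count++;
    }

    /* Return the n-th most recent interrupt ID (n = 0 is the latest),
     * or UINT32_MAX if fewer than n + 1 entries are available. */
    uint32_t irq_trace_recent(const irq_trace_t *t, uint32_t n)
    {
        uint32_t stored = (t->count < IRQ_TRACE_DEPTH) ? t->count : IRQ_TRACE_DEPTH;

        if (n >= stored) {
            return UINT32_MAX;
        }
        return t->ids[(t->count - 1u - n) % IRQ_TRACE_DEPTH];
    }
    ```

    In the exception handler, walking `irq_trace_recent(&trace, 0)` through `irq_trace_recent(&trace, IRQ_TRACE_DEPTH - 1)` and writing the values to the log would show the interrupt sequence that preceded the failure.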

    What is OSEK?  

    The SoC limit is defined as junction (die) temperature Tj = -40 to 125C. I believe you're describing an ambient limit of -40 to 85C. We would want to know if the error is (for example) an "overheating" issue. The SoC would automatically reset if any of the on-die temp sensors reported >125C (MAXT_OUTRG_ALERT_THR). For this reason we ask that your software log the on-die temp sensors over time and output them to the log. That way we can understand whether the behavior is random with respect to temperature or associated with a high-temp, room-temp, or cold-temp condition.
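
    As a rough illustration of the kind of log line that would help, the sketch below formats one die-temperature sample with a timestamp and a coarse temperature band. It is an assumption-laden example, not TI code: the function names, the log format, and the 0C and 60C band edges are all hypothetical, and reading the actual on-die sensor is left to your existing OSEK driver. Only the -40C to 125C junction range comes from the limits mentioned above.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Junction temperature limits from the discussion, in milli-degrees C. */
    #define TJ_MIN_MILLI_C (-40000)
    #define TJ_MAX_MILLI_C (125000)

    typedef enum {
        TEMP_BAND_COLD,
        TEMP_BAND_ROOM,
        TEMP_BAND_HOT,
        TEMP_BAND_OUT_OF_RANGE
    } temp_band_t;

    /* Classify a sample; the 0 C and 60 C band edges are arbitrary choices. */
    temp_band_t classify_die_temp(int32_t milli_c)
    {
        if (milli_c < TJ_MIN_MILLI_C || milli_c > TJ_MAX_MILLI_C) {
            return TEMP_BAND_OUT_OF_RANGE;
        }
        if (milli_c < 0) {
            return TEMP_BAND_COLD;
        }
        if (milli_c <= 60000) {
            return TEMP_BAND_ROOM;
        }
        return TEMP_BAND_HOT;
    }

    /* Map a band to a short name for the log. */
    const char *temp_band_name(temp_band_t band)
    {
        switch (band) {
        case TEMP_BAND_COLD: return "cold";
        case TEMP_BAND_ROOM: return "room";
        case TEMP_BAND_HOT:  return "hot";
        default:             return "out-of-range";
        }
    }

    /* Format one log line into buf; returns the snprintf result. */
    int format_temp_log(char *buf, size_t len, uint32_t uptime_s,
                        unsigned sensor_id, int32_t milli_c)
    {
        assert(buf != NULL && len > 0);
        return snprintf(buf, len, "t=%lus sensor=%u tj=%ldmC band=%s",
                        (unsigned long)uptime_s, sensor_id, (long)milli_c,
                        temp_band_name(classify_die_temp(milli_c)));
    }
    ```

    Calling `format_temp_log` from a periodic task (every 1 or 5 seconds, once per on-die sensor) and writing the result to the existing log output would give a temperature trace that can be lined up against the time of the exception.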

    Can you also simplify your stress test to run a simple Linux memtester against external DDR while temperature cycling?

    Regards,

    Kyle