This thread has been locked.

RM48L530: Evaluation of nERROR required for SIL certificate? Newbie in SIL

Part Number: RM48L530
Other Parts Discussed in Thread: TPS65381A-Q1

Hello everybody,

I am new to the field of SIL (and to this forum) and have taken over the tasks of a former colleague.

It is now up to me to finish a circuit that was already in progress, and I would appreciate an indication of whether the nERROR signal of the μC must be evaluated for a SIL-compliant product (and if so, how).

It would be nice if you could also give me the reference on which you base your statement (datasheet, reference guide, standard, ... if possible with the chapter or page).
Because time is short, I have not been able to find and read all the documents that may be necessary.
I am grateful for any reference.

Thanks.

Regards

WA

  • Hi Winfried,

    The actual implementation of monitoring the nERROR signal and the system response depends on your application and the safety requirements associated with it. We do not require any specific implementation. The functionality of the nERROR signal is documented in the technical documentation: datasheet, technical reference manual, safety manual.

    The nERROR signal is asserted (driven low) by the MCU whenever an error of high severity is detected. It means that an error has been detected on-chip which the CPU cannot address, e.g. a double-bit ECC error from flash or RAM. In such an error condition code execution is unpredictable, and an external "safety monitor" is recommended to put the system in a "safe state". This safe state depends on the actual application and can only be defined by the system integrator. Quite often a safe state for an MCU is to hold it in power-on reset. This can be done by using a power management IC (PMIC) as an external "safety monitor" that holds the MCU in this power-on-reset condition. See the TPS65381A as an example PMIC. This PMIC is designed to work with Hercules MCUs to specifically address functional safety requirements.

    Regards,
    Sunil
  • Hello,
    thanks for your service.

    I have checked the safety manual, datasheet and technical reference manual for the keyword "nERROR", but I still have quite a few questions about the behavior and the evaluation of the nERROR signal.

    I understand that nERROR goes low when a high severity error occurs.
    I'm still not clear:
    1.) whether it is necessary in a SIL-2 application to evaluate this signal to be compliant with SIL-2.

    2.) whether the nERROR signal occurs only in case of high severity or also in case of a reset (or in case of a nERROR test / diagnostic check - is there such a thing?).

    3.) how to avoid getting stuck in a reset loop when I connect the nERROR signal, e.g., to the manual reset input of my watchdog.

    4.) In the data sheet it is written that nERROR "can be used as an indicator to an external monitor circuit to put the system into a safe state. "

    We do not use the TPS65381A as a PMIC, but have used different PMIC devices and a separate watchdog device based on different requirements.
    In this context I do not quite understand what the TPS65381A actually does with the nERROR signal and how I can transfer the behavior on our PMIC / Watchdog.

    Can you help me there?
    I am grateful for any pertinent advice.
    Regards,
    WA
  • 1.) whether it is necessary in a SIL-2 application to evaluate this signal to be compliant with SIL-2.

    >> Compliance with SIL 2 still requires a particular safe failure fraction (I think up to 99% for complex devices such as MCUs and processors). The nERROR signaling and handling is a key mechanism for addressing many of the failures detected by the safety diagnostics implemented on-chip. Not evaluating nERROR will therefore increase the number of undetected (unaddressed) dangerous failures in the system.

    2.) whether the nERROR signal occurs only in case of high severity or also in case of a reset (or in case of a nERROR test / diagnostic check - is there such a thing?).

    >> nERROR is asserted in case of errors of high severity. The datasheet includes descriptions of all conditions in which nERROR is asserted. Some of the diagnostic self-check routines also assert nERROR. The application can configure the duration for which nERROR is asserted each time an error condition is detected or inserted (during a diagnostic self-test). An external monitor can use this difference in duration to identify the cause of nERROR (real error versus diagnostic self-test).

    >> nERROR is not asserted in case of a reset.
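
    As an illustration of the duration-based distinction described above, here is a minimal sketch of how an external monitor might classify a measured nERROR low pulse. The function name, the enum and the microsecond values used in the example are hypothetical and not taken from any TI document; the real low times follow from what the application programs on the MCU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of classifying one nERROR low pulse. */
typedef enum {
    NERROR_PULSE_SELF_TEST,   /* short pulse: expected diagnostic self-test */
    NERROR_PULSE_REAL_ERROR   /* longer pulse: treat as a real high-severity error */
} nerror_pulse_t;

/*
 * Classify a measured nERROR low time.
 *
 * low_time_us  : low time measured by the external monitor
 * self_test_us : low time the application is assumed to configure for self-tests
 * tolerance_us : measurement tolerance of the monitor (assumed value)
 *
 * Anything longer than the expected self-test pulse plus tolerance is treated
 * as a real error, which is the conservative choice.
 */
nerror_pulse_t classify_nerror_pulse(uint32_t low_time_us,
                                     uint32_t self_test_us,
                                     uint32_t tolerance_us)
{
    /* Guard against overflow when adding the tolerance. */
    assert(tolerance_us <= UINT32_MAX - self_test_us);

    if (low_time_us <= self_test_us + tolerance_us) {
        return NERROR_PULSE_SELF_TEST;
    }
    return NERROR_PULSE_REAL_ERROR;
}
```

    For example, with an assumed self-test low time of 100 µs and a tolerance of 20 µs, a 120 µs pulse would still count as a self-test, while a 121 µs pulse would be handled as a real error.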

    3.) how to avoid getting stuck in a reset loop when I connect the nERROR signal, e.g., to the manual reset input of my watchdog.

    >> nERROR is typically driven low only for errors of high severity, such as a double-bit uncorrectable error when reading from program memory. If such an error persists in the application, then the external safety monitor could decide to put the system in a safe state by keeping the MCU in a power-on-reset condition. Tying nERROR directly to nPORRST is not correct, as you would then not be able to run diagnostic self-tests that assert nERROR.
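
    One way to picture the "persistent error" decision is a small counter in the external safety monitor: retry a limited number of times, then latch the safe state instead of looping through resets forever. The following is only a sketch under that assumption; the structure, the function names and the retry limit are hypothetical and do not describe the behavior of any specific PMIC or watchdog device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state kept by an external safety monitor. */
typedef struct {
    uint8_t consecutive_errors;  /* real nERROR events since the last healthy run */
    uint8_t max_errors;          /* assumed limit before the safe state is latched */
    bool    safe_state_latched;  /* once set, the MCU stays in power-on reset */
} safety_monitor_t;

void monitor_init(safety_monitor_t *m, uint8_t max_errors)
{
    assert(m != NULL && max_errors > 0);
    m->consecutive_errors = 0;
    m->max_errors = max_errors;
    m->safe_state_latched = false;
}

/*
 * Call when a real (non-self-test) nERROR event has been detected.
 * Returns true if the MCU may be released from reset for another attempt,
 * false if it must be held in power-on reset (safe state).
 */
bool monitor_on_real_error(safety_monitor_t *m)
{
    assert(m != NULL);
    if (m->safe_state_latched) {
        return false;
    }
    m->consecutive_errors++;
    if (m->consecutive_errors >= m->max_errors) {
        m->safe_state_latched = true;  /* error persists: stop retrying */
        return false;
    }
    return true;
}

/* Call after the MCU has run without error for an application-defined time. */
void monitor_on_healthy_run(safety_monitor_t *m)
{
    assert(m != NULL);
    if (!m->safe_state_latched) {
        m->consecutive_errors = 0;
    }
}
```

    With a limit of 3, the first two errors would allow a retry and the third would latch the safe state; only an action outside this sketch (for example a power cycle) would leave it. Whether any retry is acceptable at all depends on the safety requirements of the application.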

    4.) In the data sheet it is written that nERROR "can be used as an indicator to an external monitor circuit to put the system into a safe state. "

    We do not use the TPS65381A as a PMIC, but have used different PMIC devices and a separate watchdog device based on different requirements.
    In this context I do not quite understand what the TPS65381A actually does with the nERROR signal and how I can transfer the behavior on our PMIC / Watchdog.

    >> Please refer to the TPS65381A-Q1 datasheet for more information about the device's response to nERROR signaled by the MCU.

    Regards,
    Sunil

  • I am going to discuss a side topic, and this is just my opinion.
    Let us imagine a SIL-2 safety device and the uC is there just to keep a reliability log (not associated with any safety action). You MIGHT not be required to report or act on the condition of that uC.
    If the uC is associated with a safety function, it MIGHT be accepted industry good practice to use the status value and act on the information it provides. You would certainly want to document the requirements and confirm that the design is adequate.
  • Thanks for your comments. Yes, the actual implementation will certainly depend on the functional safety requirements being targeted and the actual role of the MCU in the system.

    Regards,
    Sunil
  • Hello all, thanks for making an effort to help me.
    We have decided to evaluate the nERROR pin. Because we don't use a special PMIC like the TPS65381A, we are looking at another way to evaluate it.
    In our system we have a watchdog, which is triggered by the processor (what else? :-)).
    Using the nERROR signal to break the trigger line from the CPU to the watchdog could be a solution.

    For diagnosis, the nERROR pulse would have to be so short that it does not cause a reset, even if one watchdog trigger pulse is missed.
    In normal operation, if a high-severity error occurs, this trigger line is interrupted permanently. The watchdog would then no longer be triggered even if the CPU (because it is malfunctioning) kept trying to do so.
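
    To make the timing argument concrete, here is a rough host-side model of this idea: a watchdog whose trigger line is blocked while nERROR is low. It is only a sketch with a 1 ms time step; the names, the timeout and the trigger period are hypothetical, and a real watchdog device (windowed triggers, startup delays) may behave differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simple (non-windowed) watchdog model. */
typedef struct {
    uint32_t timeout_ms;        /* assumed watchdog timeout */
    uint32_t since_trigger_ms;  /* time since the last trigger that got through */
    bool     expired;           /* latched once the timeout has elapsed */
} watchdog_t;

/* Advance the model by 1 ms. A CPU trigger only reaches the watchdog
 * while nERROR is high (not asserted). */
void watchdog_step(watchdog_t *wd, bool cpu_trigger, bool nerror_low)
{
    if (wd->expired) {
        return;                       /* stays expired until an external reset */
    }
    if (cpu_trigger && !nerror_low) {
        wd->since_trigger_ms = 0;     /* trigger passed the gate */
        return;
    }
    wd->since_trigger_ms++;
    if (wd->since_trigger_ms >= wd->timeout_ms) {
        wd->expired = true;
    }
}

/* Simulate run_ms milliseconds: the CPU triggers every period_ms and nERROR
 * is low for the first nerror_low_ms milliseconds. Returns true if the
 * watchdog expired during the run. */
bool watchdog_expires(uint32_t timeout_ms, uint32_t period_ms,
                      uint32_t nerror_low_ms, uint32_t run_ms)
{
    assert(timeout_ms > 0 && period_ms > 0);
    watchdog_t wd = { timeout_ms, 0u, false };
    for (uint32_t t = 0; t < run_ms; t++) {
        bool cpu_trigger = (t % period_ms) == 0u;
        bool nerror_low = t < nerror_low_ms;
        watchdog_step(&wd, cpu_trigger, nerror_low);
    }
    return wd.expired;
}
```

    In this model, with an assumed 50 ms timeout and a 20 ms trigger period, a 25 ms nERROR pulse does not cause an expiry, but a 45 ms pulse does, even though it is shorter than the timeout. The pulse plus roughly one trigger period has to fit within the timeout.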

    To verify this solution I have some questions that you can certainly answer very quickly (references to the corresponding chapters in the relevant documents are also sufficient).

    1.) How does the nERROR pin behave during a diagnostic check? Is a special diagnostic facility provided in the CPU?
    How would that work? Is it possible to generate a pulse on the nERROR pin?
    To see that this pulse really occurred, one could read it in with another GPIO.
    2.) How long is the nERROR signal active when a high-severity error occurs?
    3.) I have read that the occurrence of high-severity errors is stored in a status register. Did I understand that correctly? Does the status remain even after a reset? How can I delete the status?

    I am very grateful for any hint. Thanks for your effort and help.

    Regards,
    Winfried
  • Hello Winfried,

    Sorry for missing this post from August. See my comments for your questions:

    1.) How does the nERROR pin behave during a diagnostic check? Is a special diagnostic facility provided in the CPU?
    How would that work? Is it possible to generate a pulse on the nERROR pin?
    To see that this pulse really occurred, one could read it in with another GPIO.

    >> The nERROR pin is asserted even while testing on-chip diagnostic features. The external safety monitor needs to be configured to not put the system in a safe state when nERROR is asserted during these diagnostic checks. You can control the duration for which nERROR is asserted by configuring the low-time-counter preload value in the ESMLTCPR register.

    >> You can also read the status of the nERROR pin from the ESM Error Pin Status Register (ESMEPSR).

    2.) How long is the nERROR signal active when a high-severity error occurs?

    >> This is configurable via the low-time-counter preload value in the ESMLTCPR register.

    3.) I have read that the occurrence of high-severity errors is stored in a status register. Did I understand that correctly? Does the status remain even after a reset? How can I delete the status?

    >> Yes, there is a separate ESM error status register for each of the three error groups. These flags can be read even after system reset. They are only cleared by software (write 1 to clear).
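
    For readers unfamiliar with write-1-to-clear flags, the following host-side model sketches the semantics: writing a 1 to a bit position clears that flag, and writing a 0 leaves it unchanged. The type and function names are hypothetical and this is not the ESM register definition; on the real device the status registers are accessed at the addresses given in the technical reference manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear (W1C) status register. */
typedef struct {
    uint32_t flags;
} w1c_reg_t;

/* Models the hardware setting error flags. */
void w1c_hw_set(w1c_reg_t *r, uint32_t mask)
{
    r->flags |= mask;
}

/* Models a software write: every 1 bit clears the matching flag,
 * every 0 bit leaves its flag untouched. */
void w1c_write(w1c_reg_t *r, uint32_t value)
{
    r->flags &= ~value;
}

uint32_t w1c_read(const w1c_reg_t *r)
{
    return r->flags;
}

/* Typical handling pattern: read the pending flags, then write back exactly
 * that value so only the flags that were actually seen are cleared. */
uint32_t w1c_read_and_clear(w1c_reg_t *r)
{
    assert(r != NULL);
    uint32_t pending = w1c_read(r);
    w1c_write(r, pending);
    return pending;
}
```

    Writing back the value that was read, rather than an all-ones mask, avoids clearing a flag that was set between the read and the write. After a reset, the application could use the same pattern to log which error occurred before clearing it.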

    Regards,
    Sunil