Hercules TI Board RM48L952 ZWT Self Tests/Safety Init Tab HalCoGen 04.03.00

Hello, we are using the TI Hercules RM48L952ZWT board with HalCoGen 04.03.00. Under the Safety Init tab, there are 7 available self tests. When we tried to enable them all (we assumed it would be a good idea to always run the safety tests at boot up), the board hung and did not boot up all the way (or maybe the self tests just take a long time and we were not patient enough to wait).

Can you explain the different self tests, what TI recommends for when to run each one (i.e., which would be good options to run at each boot up versus periodically), and what behavior we should expect (i.e., do some of the self tests take several hours to execute)? Thank you again.

  • Hi Tammy,

    I'll have a look at the safety init code when I return on Monday and provide some guidance. It will take a little longer at boot up but should be completed in relatively short order. Most likely what is happening is that the safety init code has some traps in it for when there is a failure and it is getting hung in a while loop that requires some application code. You should be able to see where it is hung up with the debugger. Also, let me know if you have any event or peripheral device in your system that imposes a time limit on the startup of the Hercules device.
  • Hi Tammy,

    I think that there are fairly descriptive comments in the generated code so I am not certain what additional level of description you might need. From an applicability point of view, I think this is covered with the Safety Manual and in spna106 which recommends various tasks to perform at startup.

    The safety manual for the RM48L952 can be found here:
    www.ti.com/.../spnu577

    The startup code recommendation is here:
    www.ti.com/.../spna106d

    Hopefully these help. Unfortunately, I really can't recommend what makes sense to run on a periodic basis since I don't have all the details of your application or your industry standards. This is highly dependent on the use case and the standards to which you want to certify. Generally speaking, if the system is powered on for a finite amount of time, it probably isn't necessary to run the safety init tests during runtime, but if there is a very long power-on time, it might make sense to force a reset and go through the safety init every so often (daily, weekly, monthly? depends on your application).
  • Hi Chuck. Thank you, we will read through the documents you included. When we first got the board, we enabled the tests with the basic driver library (we are doing medical devices, and we do not want anyone to die during a medical procedure, so testing is very important), and we waited over an hour. So from what you write, something must have failed early on (there was no application code, just the auto-generated HalCoGen drivers). We have disabled most of the self tests since then. Now, for production, we hope to utilize some of these safety features and want to go back, test by test, and see what can be included (and what makes sense to include). We asked for a recommendation in case something changed between the HalCoGen version we used then (4.01.00) and HalCoGen 4.03.00 that could explain why the tests hung on us (rather than taking the few minutes during boot up that you describe above). We have had a few other weird issues that we had to solve with TI's support because HalCoGen was not generating what was expected. Are there any known issues around these tests in older HalCoGen versions compared with the latest HalCoGen release?
  • Hi Tammy,

    I certainly understand the need to assess the safety init in regard to your product's safety. The safety manual was written with the ISO 26262 ASIL D and IEC 61508 SIL 3 safety certification standards in mind. If you are targeting IEC 61508 SIL 3 or working towards compliance with comparable safety levels in IEC 60601, the recommendations in the safety manual should be beneficial. However, in the end, it is up to the application developers to decide what to include, what not to include, and how frequently the tests should be performed.

    As background, the self tests which can be part of the startup sequence are designed so that you can test the different diagnostics available in the device and then use these mechanisms to perform diagnostics during runtime. This is what TI calls a safe island approach, i.e., prove the test mechanism is good at power up, then rely on those mechanisms during runtime to test other parts of the chip. Given this, I would say that it is worthwhile to implement as many of the diagnostic tests at startup as possible, which gives you confidence that the other mechanisms described in the safety manual are effective during runtime.

    In regard to the test code during the startup sequence, it should finish in far less than a second. If you are getting hung in a part of the boot-up sequence, you should be able to determine where by strategically placing breakpoints in a debug session, since there aren't that many places that enter a "forever" loop during these initialization tests. Conversely, there are many places where, if an error is detected, the decision is left open to the application (i.e., user code sections where you can add code to take an appropriate action for the application). In many cases, there are opportunities to re-test, clear the error, reset, and/or notify the outside world that a critical system fault has occurred.

    In regard to known issues with the safety init, I am not aware of any known issues. However, I will ping our software team to verify if this is the case.
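To illustrate the point above about leaving the failure decision to the application, here is a minimal host-side sketch in C. It is not HALCoGen-generated code: `init_step`, `run_init_steps`, `g_last_step_started`, and the two stub steps are all hypothetical names invented for this example. The assumption is that each startup check can return a status instead of spinning in a "forever" loop, and that a marker variable records the last step started so a hang can be located from the debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical step signature: returns 0 on success, nonzero on failure. */
typedef int (*init_step_fn)(void);

typedef struct {
    const char  *name;  /* label for logs or debugger inspection */
    init_step_fn run;   /* the check itself */
} init_step;

/* 1-based index of the last step that was started (0 = none yet).
 * Volatile so the value is always written to memory and can be read
 * from a debugger if a step never returns. */
volatile uint32_t g_last_step_started = 0U;

/* Runs the steps in order. Returns 0 if all pass, otherwise the
 * 1-based index of the first failing step, so the caller (the
 * application) decides whether to re-test, reset, or report. */
uint32_t run_init_steps(const init_step *steps, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        g_last_step_started = (uint32_t)(i + 1U);
        if (steps[i].run() != 0) {
            return (uint32_t)(i + 1U);
        }
    }
    return 0U;
}

/* Stand-in steps for host testing only; real steps would call the
 * device's diagnostic routines. */
int step_ok(void)   { return 0; }
int step_fail(void) { return -1; }
```

With a table such as `{ {"a", step_ok}, {"b", step_fail} }`, the runner stops at the second entry and returns its index; what happens next is left to the caller, which is the design choice described above.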
  • Hi

    Just a quick note in case you are not already aware: neither the CCM self test nor the CPU self test can be run in debug mode. If you try to run them in debug mode, the test will hang. The tooltip points this out when you hover the cursor over these test selections in HALCoGen. Debug mode here means having CCS/IAR/KEIL connected. Once you have flashed the code, terminate the connection and issue a hard reset, then reconnect. By that time both the CCM and CPU tests will have completed, if they are enabled on the SafetyInit tab in HALCoGen.

    For debugging, I put a branch-to-itself loop in main, like asm(" B $");, so that code execution is stuck at this line. Once you connect, you can move your PC to the next line and debug your application.
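A variation on the branch-to-itself trick above, sketched in portable C. This is an alternative technique, not what the reply describes: instead of `asm(" B $")` and moving the PC by hand, the loop spins on a volatile flag that can be cleared from the debugger's variable view after connecting. The names `g_hold_for_debugger` and `wait_for_debugger` are hypothetical, and the spin budget exists mainly so the sketch can be exercised on a host; on a target one would presumably pass 0 to wait indefinitely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hold flag. After attaching the debugger, write false
 * to this variable to let execution continue. */
volatile bool g_hold_for_debugger = true;

/* Spins while *hold is true.
 * max_spins == 0 means wait indefinitely (the on-target case).
 * Returns true if released via the flag, false if the spin budget
 * ran out first. */
bool wait_for_debugger(volatile bool *hold, uint32_t max_spins)
{
    uint32_t spins = 0U;

    while (*hold) {
        if (max_spins != 0U && ++spins >= max_spins) {
            return false;
        }
    }
    return true;
}
```

Called as `wait_for_debugger(&g_hold_for_debugger, 0U)` at the top of `main`, this would hold the CPU after the startup tests have already run, in the same spirit as the `B $` loop.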

  • Hi Prathap, thank you. We were not aware, and we had been running with these tests enabled in debug mode, with the CCS debugger connected (which is why we opened this case). We were not sure if it had hung or the tests were simply taking a long time to execute.

    We have been reading the documentation around the different tests. If any of these tests return an error, does it always mean it is unsafe to use the hardware for a safety critical device?

  • Hi Tammy,

    An error doesn't necessarily mean a complete device failure. Some errors such as an ECC fault might be a soft error and could be re-tested to check for repeatability. If the errors are persistent, then perhaps there is a critical fault. Another example of this is with the core compare module tests. Again, these types of faults could be soft errors meaning they will go away upon retest.

    Note, however, that even soft errors may raise concerns about suitability for operation, since they could mean the device is under some condition that is causing them, e.g., a high amount of radiation from alpha or beta particles causing bit flips in the device memory or logic, or low-voltage conditions causing intermittent failures.

    I think there is some amount of recommendation in the safety documentation (safety manual, safety report, safety analysis report), but in the end you have to make a call on these based on the risks in your application and the requirements in your industry. Certainly, if there are specific questions about what effect certain errors can have and possible mitigation for specific error conditions, we can help provide data about the microcontroller so that you can make educated decisions based on your application.
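The re-test idea in this reply can be sketched as a small helper that repeats a failing check and classifies the outcome. This is only an illustration under stated assumptions: `self_test_fn`, `test_verdict`, `run_with_retest`, and the `fake_self_test` stand-in are hypothetical names, not part of HALCoGen or any TI library, and the retest limit is an application decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical test signature: returns true when the check passes. */
typedef bool (*self_test_fn)(void *ctx);

typedef enum {
    TEST_PASS,             /* passed on the first attempt */
    TEST_SOFT_ERROR,       /* failed at first, then passed on a retest */
    TEST_PERSISTENT_FAULT  /* failed the first attempt and every retest */
} test_verdict;

/* Runs the test once, then up to max_retests more times if it fails. */
test_verdict run_with_retest(self_test_fn test, void *ctx, uint32_t max_retests)
{
    if (test(ctx)) {
        return TEST_PASS;
    }
    for (uint32_t i = 0U; i < max_retests; ++i) {
        if (test(ctx)) {
            return TEST_SOFT_ERROR;
        }
    }
    return TEST_PERSISTENT_FAULT;
}

/* Host-side stand-in for a real diagnostic: fails the first
 * `failures_left` calls, then passes. */
typedef struct {
    uint32_t failures_left;
} fake_test_ctx;

bool fake_self_test(void *ctx)
{
    fake_test_ctx *c = (fake_test_ctx *)ctx;
    if (c->failures_left > 0U) {
        c->failures_left--;
        return false;
    }
    return true;
}
```

Because even soft errors can point to an environmental problem, an application would likely also count `TEST_SOFT_ERROR` results over time rather than treat them the same as `TEST_PASS`.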
  • Tammy,

    I might also add that there is another e2e forum, a private one, in which detailed safety information can be discussed. It is accessible only with an NDA, but it offers a lot more detail and will put you in touch with our dedicated safety experts. I have sent a message to our Safety Team lead to see if he can send you an invite or, at least, initiate contact to confirm an NDA is in place so that access can be given.