RM48L952: SafeTI: handles error pin reset against TRM usage notes

Part Number: RM48L952

Hello (again),

TRM figure 12-8 (example 5) describes "very interesting" behavior of the error pin when a reset is given while the error pin is up.

There is also a direct note which prohibits that kind of usage:
"This case is not recommended and should be avoided by the application"

What does SafeTI do? In the CCMR4F_SELF_TEST_ERROR_FORCING test it disables both the ISR and the error pin action:
sl_esmREG->IECR1 = GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST);
sl_esmREG->DEPAPR1 = GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST);

It then generates the error, checks that the ESM channel bit goes active, and after that *** drum roll ***
/* Clear nERROR */
 _SL_HoldNClear_nError();

This leads to a situation where, if a real ESM group 1 error that is not routed to an ISR arrives after that test, the error pin just briefly goes down and then back up...

Every other error pin reset appears to relate to ESM group 2 and 3 error handling, where the error pin action is always included...

Is this finding a (fatal) bug or not? I tested that this reset is not needed there: the error pin stays up without it.

The TRM also misses a case that is like example 4, but where the error pin reset is given between the failures. How does the CPU behave in that case? Will the reset given before the 2nd failure still raise the pin after t_err_low from the 2nd failure? If yes, is a possible 2nd reset, given after the 2nd failure but before the error pin has been raised, "banked" and applied later like in example 5?
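
To keep the cases apart, the pin behavior described for the normal release and for example 5 can be written down as a toy state model in C. This is only a sketch under assumptions, not a description of the silicon: the type and function names are made up for illustration, the 150-tick low time stands in for t_err_low, and a failure arriving while the pin is already low (the example 4 variant asked about above) is deliberately left unmodelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model; all names here are hypothetical, not TRM or SafeTI definitions. */
#define MODEL_T_ERR_LOW 150u /* assumed minimum low time, in ticks */

typedef struct {
    bool     pin_low;       /* true while nERROR is driven low */
    bool     release_req;   /* a key write (0x5) is latched and not yet consumed */
    uint32_t low_time_left; /* ticks left until the minimum low time has elapsed */
} nerror_model_t;

static void model_init(nerror_model_t *m)
{
    m->pin_low = false;
    m->release_req = false;
    m->low_time_left = 0u;
}

/* Release the pin once the low time has elapsed and a key write is latched. */
static void model_update(nerror_model_t *m)
{
    if (m->pin_low && (m->low_time_left == 0u) && m->release_req) {
        m->pin_low = false;
        m->release_req = false;
    }
}

/* A failure with nERROR action enabled is flagged. */
static void model_error(nerror_model_t *m)
{
    if (m->pin_low) {
        return; /* failure while already low: open question, not modelled */
    }
    m->pin_low = true;
    m->low_time_left = MODEL_T_ERR_LOW;
}

/* Software writes the key 0x5; the request is latched even if the pin is high. */
static void model_key_write(nerror_model_t *m)
{
    m->release_req = true;
    model_update(m);
}

/* Advance time by the given number of ticks. */
static void model_tick(nerror_model_t *m, uint32_t ticks)
{
    if (m->pin_low) {
        m->low_time_left = (ticks >= m->low_time_left) ? 0u : (m->low_time_left - ticks);
    }
    model_update(m);
}
```

In this model a key write while the pin is low releases it once the low time is over, whereas a key write while the pin is high is consumed by the next failure, so the pin comes back up after 150 ticks with no further write, which is the example 5 pattern.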


TRM document bug: this sentence is also wrong. It says something that is then contradicted in example 5 (it clearly says that the reset request is allowed/noticed only when the pin is low):
"This request is done by writing an appropriate key (0x5) to the key register (ESMEKR) during the ERROR pin low time"

  • Hello Jarkko,

    I apologize for the delay in responding to this post. I am currently looking into both the SW implementation and the source of the note in the TRM to determine the impact of the discrepancy. I will have an initial response/update of my findings either later today or by tomorrow morning CST.
  • Hello Jarkko,

    TRM figure 12-8 (example 5) describes "very interesting" behavior of the error pin when a reset is given while the error pin is up.

    There is also a direct note which prohibits that kind of usage:
    "This case is not recommended and should be avoided by the application"

    What does SafeTI do? In the CCMR4F_SELF_TEST_ERROR_FORCING test it disables both the ISR and the error pin action:
    sl_esmREG->IECR1 = GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST);
    sl_esmREG->DEPAPR1 = GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST);

    It then generates the error, checks that the ESM channel bit goes active, and after that *** drum roll ***
    /* Clear nERROR */
     _SL_HoldNClear_nError();

    This leads to a situation where, if a real ESM group 1 error that is not routed to an ISR arrives after that test, the error pin just briefly goes down and then back up...

    I've had a look at the code in question. I do not believe this code violates the note in the TRM. The note in the TRM applies to clearing the nERROR pin prior to the error being asserted as noted in the following excerpt.

    Also, the function call _SL_HoldNClear_nError(); is gated by the test type. It is not called in the event of fault injection but is called for the CCMR4F_ERROR_FORCING_TEST test type. In this test type, the purpose is to perform the test including error notification via ESM, so the resulting action needs to clear the interrupt and the nERROR signal to ensure there is no false action based on the test.

    Also note that for the CCMR4F_ERROR_FORCING_TEST the expected ESM flag is BF_CCMR4_CMP_ERROR (CCMR4 compare error), which is the G2, Ch2 ESM flag.

    Note also that the ESM and interrupt registers are saved at the start of the test (context save) and restored at the end. So the intent is that the customer register configuration/status is retained after the test is completed.

    Finally, to address your question regarding real errors being missed during the test of functions (specifically the CPU self-test): the CCMR4 is offline during the CCMR4 self-test, so this isn't a possibility or has a very low probability. In general, the execution frequency of this test on a periodic basis needs to be weighed against project requirements such as the FTTI and latent fault detection. In short, I don't see any bugs associated with the process/SW design as implemented relative to example 5 in the TRM.


    The TRM also misses a case that is like example 4, but where the error pin reset is given between the failures. How does the CPU behave in that case? Will the reset given before the 2nd failure still raise the pin after t_err_low from the 2nd failure? If yes, is a possible 2nd reset, given after the 2nd failure but before the error pin has been raised, "banked" and applied later like in example 5?

    I am not 100% certain what you are discussing here, but in the event an error condition is flagged, the system should process that error. From a safety perspective, Hercules certification is based on an HFT = 0 so there is no guarantee for handling of a stack up of faults or redundant faults. Understand that this could be a system concern, and it is something that should be evaluated/characterized such that on a single fault, the system responds to place the system including the RM48 into a safe state.

    TRM document bug: this sentence is also wrong. It says something that is then contradicted in example 5 (it clearly says that the reset request is allowed/noticed only when the pin is low):
    "This request is done by writing an appropriate key (0x5) to the key register (ESMEKR) during the ERROR pin low time"

    As I mentioned above, I think there was a misinterpretation of example 5. Example 5 discusses a very specific use where nERROR is cleared before a known error condition. Basically, this is a warning that any clearing of the nERROR pin could result in a sort of "semaphore" type of operation, but clearing nERROR is necessary when processing error conditions unless the result is an nPORRST assertion by an external monitoring circuit. Certainly, before manually clearing nERROR, you could also check the ESM status registers for any error conditions to minimize the possibility of a lost error.

  • Hi,

    First, before going further: I think that you mixed up 2 different tests (CCMR4F_ERROR_FORCING_TEST and CCMR4F_SELF_TEST_ERROR_FORCING). I have also mixed them up multiple times, both when I originally inspected this finding and when working on other tests, since many test names are so close to each other. The first one is a group 2 test, the second one is a group 1 test. The group 2 test is OK.

    From SafeTI code:
    case CCMR4F_SELF_TEST_ERROR_FORCING:
    ...
    if(GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST) == (sl_esmREG->SR1[0] & GET_ESM_BIT_NUM(ESM_G1ERR_CCMR4_SELFTEST))){
                     config->stResult = ST_PASS;
                 }

    See, it is G1ERR. Can you confirm?


    Chuck Davenport said:
    I've had a look at the code in question. I do not believe this code violates the note in the TRM. The note in the TRM applies to clearing the nERROR pin prior to the error being asserted as noted in the following excerpt.

    Since this is the group 1 test (where the nERROR action is masked out), not the group 2 test which you targeted above, I still think that it violates the note (see the bottom of this post: I made test code to prove that the TRM text is correct and that you have to be really careful with this feature).
    - In the original post I mentioned that this is the ONLY group 1 test I found which acknowledges nERROR. Other group 1 tests do not operate the error pin: all group 1 "normal" tests (not fault insertion) mask both the ISR and the nERROR action away. This deviation from the other code was the "thing" that caught my eye; only after this did I start to read the TRM and find example 5.


    Chuck Davenport said:
    Note also that the ESM and interrupt registers are saved at the start of the test (context save) and restored at the end. So the intent is that the customer register configuration/status is retained after the test is completed.

    - Yes, this is done in every test and is of course expected behavior (except in some fault insertion ones, most likely due to a copy & paste mistake or typo; I think the context should always be restored, since I have managed to run all FI tests under SafeTI with small modifications to the '{' and '}' locations).

    Chuck Davenport said:
    I am not 100% certain what you are discussing here, but in the event an error condition is flagged, the system should process that error. From a safety perspective, Hercules certification is based on an HFT = 0 so there is no guarantee for handling of a stack up of faults or redundant faults. Understand that this could be a system concern, and it is something that should be evaluated/characterized such that on a single fault, the system responds to place the system including the RM48 into a safe state.

    Excuse my poor English as a non-native speaker; you can ask "what do you mean" at any time and I will try to describe it better. As will be proven at the bottom of this post, the TRM is correct for example 5, so I started to wonder whether it somehow also affects a scenario that is like example 4 but not 1:1 the same.

    In some SafeTI tests (for example RAM or FLASH (FLASH_ECC_TEST_MODE_2BIT)) where the test generates multiple failures in a row (both banks), _SL_HoldNClear_nError() is used between the failures. So the error pin is raised (in the unmodified code, which we cannot use since it really blocks the whole application for ~150 us) and then the next error is generated. But the error generation in the test does not check the state of the error pin (the pin state is checked only at function entry), so the clearing (and busy waiting) between error generations is a "useless waste of time": it just causes extra delay in the unmodified code and could be skipped.

    Because of that ~150 us busy wait, I have had to replace the content of _SL_HoldNClear_nError() with code which either works like the original (used at startup) or only sets EKR = 5 (used at run time), so that our application can do something else while waiting for the error pin to rise after the end of the test.
    - With my modification, EKR = 5 is now written between the 2 failure generations, so the 1st reset is given between the 2 failures of example 4 and the 2nd reset after both failures. Do you now see what kind of reset pattern I mean? (Failure - ack (nERROR has no time to rise before the next failure comes) - Failure - ack.)
    - As example 5 is real, I started to wonder whether it is "safe" to give those 2 resets (because I am not waiting for the pin to go up in between) or whether I should modify the SafeTI code so that there is one and only one reset in every test (because I have had to modify the original reset function). I am trying to avoid modifying the SafeTI code as much as feasibly possible, since if a new release comes some day it will be quite a pain to redo the changes which may still be needed (hopefully, of course, no modifications will be needed).
    - I assume that this 2nd reset is not "banked/stored for later usage", but since example 4 only says that "the low time counter will be reset", I cannot be sure whether it continues running or is actually "reset & stopped" so that a new EKR = 5 acknowledge is needed to release the pin after the failure. Basically the examples miss the scenario where the error pin reset is given between 2 failures.
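
    The two-mode replacement described above (blocking at startup, key write only at run time) could look roughly like the following. It is a minimal sketch, not the SafeTI implementation: nerror_port_t, nerror_clear() and the bounded poll count are hypothetical, and the pin state is read through an assumed status word instead of the real pin readback.

```c
#include <stdbool.h>
#include <stdint.h>

#define NERROR_RELEASE_KEY 0x5u /* key value quoted from the TRM text in this thread */

typedef enum {
    NERROR_CLEAR_BLOCKING,   /* startup: write the key and poll until the pin is high */
    NERROR_CLEAR_NONBLOCKING /* run time: write the key and return at once */
} nerror_clear_mode_t;

/* Hypothetical port layer so the sketch also builds on a host. */
typedef struct {
    volatile uint32_t       *key_reg; /* where the key is written (EKR on target) */
    const volatile uint32_t *pin_low; /* nonzero while nERROR is low (assumed status word) */
} nerror_port_t;

/* Returns true if the pin was seen high before returning. */
static bool nerror_clear(const nerror_port_t *port, nerror_clear_mode_t mode,
                         uint32_t max_polls)
{
    *port->key_reg = NERROR_RELEASE_KEY;

    if (mode == NERROR_CLEAR_BLOCKING) {
        /* Bounded busy wait instead of an endless loop. */
        while ((*port->pin_low != 0u) && (max_polls > 0u)) {
            max_polls--;
        }
    }
    return (*port->pin_low == 0u);
}
```

    Whether the non-blocking path is safe when the key ends up being written between two failures is exactly the open question in this post; the sketch does not settle it.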

    QUESTION: what is the state of the nERROR pin after t_err_low from the latest fault if a reset is given once between 2 failures? (I know that you can ack nERROR even while ESM channel errors are active.)


    I have implemented the fault injection tests in such a way that I check that there is no active ESM error and also that the test in question has been configured to control nERROR; only in this case is the error pin acknowledged (so this prevents an acknowledge even in a situation where one FI test has acknowledged its ESM channels but forgot to ack nERROR). I have really tried to prevent "accidental" acks against example 5 in my own code...

    We also have a HW "lock" (latch) for the case where nERROR has gone down: the safe state is not removed immediately when nERROR goes back up; you need to release that latch manually after nERROR has gone up. We do not have much time to release the latch after nERROR has gone up, since the minimum nERROR down time is already quite a whopping 150 us (yes, that is a lot in our environment). For rapid latch release we have wired the nERROR pin to a GPIO so that we receive an interrupt (since the ESM module does not provide this kind of IRQ?) and can release the latch immediately. For this reason it is quite critical that nERROR does not rise by 'itself'... If that can happen, I guess I need to implement yet another SW guard for the latch release (for example, check that no ESM channel errors are active, or something similar).

    The SafeTI code works a bit differently: it typically (if not always) first acks the nERROR pin and only after that acks the ESM channel error, so I cannot put my "sanity-checked nERROR reset" into SafeTI's error pin release function unless I also modify the code so that the channel errors are cleared first.
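
    The "sanity-checked nERROR reset" described above can be sketched as a pure decision function. This is an illustration only: nerror_ack_inputs_t, the three-word status snapshot and nerror_ack_allowed() are hypothetical names, and the extra "pin must be low" condition is taken from the test code in this post, where the key is written only if the pin reads as active.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ESM_STATUS_WORDS 3u /* hypothetical: one snapshot word per status register checked */

typedef struct {
    uint32_t esm_status[ESM_STATUS_WORDS]; /* snapshot of the ESM status flags */
    bool     pin_low;                      /* sampled nERROR state, true while low */
    bool     test_controls_pin;            /* the test enabled the nERROR action itself */
} nerror_ack_inputs_t;

/* Decide whether writing the release key is acceptable right now. */
static bool nerror_ack_allowed(const nerror_ack_inputs_t *in)
{
    size_t i;

    if (!in->pin_low) {
        return false; /* a key write with the pin high is the example 5 case */
    }
    if (!in->test_controls_pin) {
        return false; /* this test never asked for nERROR action */
    }
    for (i = 0u; i < ESM_STATUS_WORDS; i++) {
        if (in->esm_status[i] != 0u) {
            return false; /* some ESM channel flag is still pending */
        }
    }
    return true;
}
```

    A caller would fill the structure from the status registers and the pin readback, and write the key only when the function returns true; that requires the channel flags to be cleared before the pin is acknowledged.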

    For this reason it is quite critical to understand every scenario in which nERROR could rise by 'itself' and how it actually behaves, do you agree? We may have channels which have only an ISR action or only an nERROR action (or neither); that depends on the application. For example, in our case a correctable RAM error would be nice to log, but since the code/CPU still works as designed we do not want to trigger nERROR... And MAYBE some other error could be one which has only an nERROR action (I cannot immediately think of a use case, but it is a possible configuration), and depending on the other mechanisms in the system this may cause serious problems if nERROR goes back up by "itself" (itself == a bug in the SW error acking, as per example 5).

    QUESTION: if t_err_low counting is stopped (and the register value reset) when a failure occurs, we do not have to worry about consecutive errors. But if the t_err_low value is only reset, the nERROR pin would rise in the case where it has been acked once and a real error then arrives during the t_err_low window (this is not example 5, because in example 5 the pin is UP when the reset is given).
    - And in the ultimate case, is the 2nd acknowledge perhaps even "banked", so that the system works like in example 5? (But I doubt that.)


    ==== Here is example 5 proven to be a real feature ====

    Here are the code & debug logs which prove that TRM example 5 is correct and that it most likely also has a direct impact on the CCMR4F_SELF_TEST_ERROR_FORCING test
    - Time is a us counter from the RTI module
    - An OS delay is used between tests to get all prints out of the buffers
    - OWN_GET_ESM_BIT_NUM is practically the same as GET_ESM_BIT_NUM

    The test makes 3 rounds:
    - first time: as the TRM suggests
    - second time: against the TRM
    - third time: as the TRM suggests
    #if 1
             #include "sl_api.h"
             #include "macros.h"
             #include "esm_application_callback.h"

             uint32 u32Rounds = 0u;
             for( u32Rounds = 0u; u32Rounds < 3U; u32Rounds++ )
             {
                 boolean bAgainsTRM = FALSE;
                 if( u32Rounds == 1 )
                 {
                     bAgainsTRM = TRUE;
                     DBG_PRINT( "=== Testing ESM ERROR against TRM clear ===\r\n" );
                 }
                 else
                 {
                     bAgainsTRM = FALSE;
                     DBG_PRINT( "=== Testing ESM ERROR suggested clear ===\r\n" );
                 }

                 DBG_PRINT( "nERROR1: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 if( bAgainsTRM )
                 {
                     DBG_PRINT( "Make clear against TRM\r\n" );
                     sl_esmREG->EKR = 0x5u;
                 }
                 else
                 {
                     DBG_PRINT( "Skip pin clear since it is already up\r\n" );
                 }

                 uint32 u32Channel = ESM_G1ERR_B0TCM_CORRERR;

                 volatile uint32* pu32RegErrPin = ( (u32Channel < 32U) ? &sl_esmREG->EEPAPR1 : &sl_esmREG->IEPSR4 );
                 volatile uint32* pu32RegIsr = ( (u32Channel < 32U) ? &sl_esmREG->IESR1 : &sl_esmREG->IESR4 );

                 *pu32RegErrPin = OWN_GET_ESM_BIT_NUM( u32Channel );
                 //*pu32RegIsr = OWN_GET_ESM_BIT_NUM( u32Channel );

                 DBG_PRINT( "Before test, nERROR2: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 SL_SelfTest_Result failInfo;
                 boolean bOwnRet = SL_SelfTest_SRAM( SRAM_ECC_1BIT_FAULT_INJECTION, TRUE, &failInfo );

                 DBG_PRINT( "bOwnRet: %s\r\n", BOOL2ASCII(bOwnRet) );
                 uint32 u32Time = HAL_u32TimeGet();
                 DBG_PRINT( "After test, nERROR3: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );

                 DBG_PRINT( "Starting pin active wait loop: %u us\r\n", u32Time );
                 while( SL_ESM_nERROR_Active() )
                 {
                     if( (HAL_u32TimeGet() - u32Time) > 500U )
                     {
                         DBG_PRINT( "timeout break\r\n" );
                         break;
                     }
                 }
                 DBG_PRINT( "Out from loop %u us, loop elapsed %u us\r\n", HAL_u32TimeGet(), HAL_u32TimeGet()-u32Time );
                 volatile uint32* pu32Reg = ( (u32Channel < 32U) ? &sl_esmREG->SR1[0] : &sl_esmREG->SR4[0] );
                 *pu32Reg = (uint32)OWN_GET_ESM_BIT_NUM( u32Channel );
                 *pu32Reg = (uint32)OWN_GET_ESM_BIT_NUM( ESM_G1ERR_B1TCM_CORRERR ); // test activates this also

                 DBG_PRINT( "nERROR4: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 if( SL_ESM_nERROR_Active() )
                 {
                     DBG_PRINT( "clear error\r\n" );
                     sl_esmREG->EKR = 0x5u;
                 }

                 while( SL_ESM_nERROR_Active() )
                 {
                 }
                 DBG_PRINT( "nERROR5: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );

                 // required to clear these after the test in order to trig new error
                 sl_tcram1REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR;/*lint !e9033 */
                 sl_tcram2REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR; /*lint !e9033 */

                 DBG_PRINT( "=== End of testing ===\r\n" );

                 WOS_vDelayMs( 100U ); // time to empty print buffers
             }
    #endif
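
    The wait loop in the test code compares an elapsed time computed by unsigned subtraction against a limit. The small sketch below shows why that stays correct across a counter wrap, assuming the time source is a free-running 32-bit microsecond counter; timeout_elapsed() is a hypothetical helper, not part of SafeTI or of the test code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True once more than limit_us microseconds have passed since start_us.
 * Unsigned subtraction is modulo 2^32, so the difference is still the
 * elapsed time when the counter has wrapped between the two samples,
 * as long as the real elapsed time is below 2^32 us.
 */
static bool timeout_elapsed(uint32_t now_us, uint32_t start_us, uint32_t limit_us)
{
    return (uint32_t)(now_us - start_us) > limit_us;
}
```

    With the sample values 299225 (start) and 299736 (now), the difference is 511 us, which exceeds a 500 us limit.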

    And here is the results:
    === Testing ESM ERROR suggested clear ===<CR><LF>
    nERROR1: active 0, 299122 us<CR><LF>            //// Error pin is UP
    Skip pin clear since it is already up<CR><LF>         //// no reset given
    Before test, nERROR2: active 0, 299173 us<CR><LF>    //// still UP
    bOwnRet: TRUE<CR><LF>                    //// test executed
    After test, nERROR3: active 1, 299225 us<CR><LF>        //// Error pin is DOWN
    Starting pin active wait loop: 299225 us<CR><LF>        //// wait....
    timeout break<CR><LF>                     //// timeout
    Out from loop 299736 us, loop elapsed 511 us<CR><LF>    //// see elapsed time
    nERROR4: active 1, 299770 us<CR><LF>                    //// Error pin is still DOWN (as expected)
    clear error<CR><LF>                    //// ack the pin
    nERROR5: active 0, 299806 us<CR><LF>             //// Error pin is UP
    === End of testing ===<CR><LF>

    === Testing ESM ERROR against TRM clear ===<CR><LF>
    nERROR1: active 0, 299897 us<CR><LF>            //// Error pin is UP
    Make clear against TRM<CR><LF>                //// reset against TRM
    Before test, nERROR2: active 0, 299938 us<CR><LF>    //// still UP
    bOwnRet: TRUE<CR><LF>                    //// test executed
    After test, nERROR3: active 0, 299989 us<CR><LF>    //// Error pin is DOWN
    Starting pin active wait loop: 299989 us<CR><LF>    //// wait.... BUT no TIMEOUT
    Out from loop 300108 us, loop elapsed 119 us<CR><LF>    //// see elapsed time
    nERROR4: active 0, 300143 us<CR><LF>            //// Error pin is UP (NOT expected, works == example 5 says)
    nERROR5: active 0, 300169 us<CR><LF>
    === End of testing ===<CR><LF>


    === Testing ESM ERROR suggested clear ===<CR><LF>
    nERROR1: active 0, 299020 us<CR><LF>            //// Error pin is UP
    Skip pin clear since it is already up<CR><LF>        //// no reset given
    Before test, nERROR2: active 0, 299071 us<CR><LF>    //// still UP
    bOwnRet: TRUE<CR><LF>                    //// test executed OK
    After test, nERROR3: active 1, 299122 us<CR><LF>        //// Error pin is DOWN
    Starting pin active wait loop: 299122 us<CR><LF>        //// wait....
    timeout break<CR><LF>                    //// timeout
    Out from loop 299633 us, loop elapsed 511 us<CR><LF>    //// see elapsed time (matches to timeout)
    nERROR4: active 1, 299667 us<CR><LF>                    //// Error pin is still DOWN (as expected)
    clear error<CR><LF>                                     //// ack the pin
    ISR: ERROR pin up<CR><LF>                               //// our IO interrupt notices that pin goes up
    nERROR5: active 0, 299702 us<CR><LF>                    //// Error pin is UP
    === End of testing ===<CR><LF>

    === Testing ESM ERROR against TRM clear ===<CR><LF>
    nERROR1: active 0, 400019 us<CR><LF>                    //// Error pin is UP
    Make clear against TRM<CR><LF>                            //// reset against TRM
    Before test, nERROR2: active 0, 400062 us<CR><LF>       //// still UP
    bOwnRet: TRUE<CR><LF>                                   //// test executed OK
    After test, nERROR3: active 1, 400114 us<CR><LF>        //// Error pin is DOWN
    Starting pin active wait loop: 400114 us<CR><LF>        //// wait.... BUT no TIMEOUT
    ISR: ERROR pin up<CR><LF>                               //// our IO interrupt notices that pin goes up!!!!!!!
    Out from loop 400286 us, loop elapsed 171 us<CR><LF>    //// see elapsed time
    nERROR4: active 0, 400320 us<CR><LF>                    //// Error pin is UP (NOT expected, works == example 5 says)
    nERROR5: active 0, 400346 us<CR><LF>
    === End of testing ===<CR><LF>

    === Testing ESM ERROR suggested clear ===<CR><LF>       
    nERROR1: active 0, 501018 us<CR><LF>                    //// Error pin is UP
    Skip pin clear since it is already up<CR><LF>           //// no reset given
    Before test, nERROR2: active 0, 501069 us<CR><LF>       //// still UP
    bOwnRet: TRUE<CR><LF>                                   ///////// look from first round, follows same pattern
    After test, nERROR3: active 1, 501120 us<CR><LF>
    Starting pin active wait loop: 501120 us<CR><LF>
    timeout break<CR><LF>
    Out from loop 501631 us, loop elapsed 511 us<CR><LF>
    nERROR4: active 1, 501665 us<CR><LF>
    clear error<CR><LF>
    ISR: ERROR pin up<CR><LF>
    nERROR5: active 0, 501700 us<CR><LF>
    === End of testing ===<CR><LF>


    So if we ran the SafeTI code as is, and after the CCMR4F_SELF_TEST_ERROR_FORCING test a real error occurred, let's say for example a group 1 error with no ISR configured and only the error pin action, the nERROR pin would go low and then pop back up by "itself".

    I added "my test code" to be performed after CCMR4F_SELF_TEST_ERROR_FORCING but before any other tests (this simulates a "real error", which could occur at any time), and here are the results:
    - basically removed the rounds and added a critical section around the test to guarantee that no other IRQ delays reading the nERROR state immediately after the test
    #if 1
    if( ptTest->eTest == CCMR4F_SELF_TEST_ERROR_FORCING )
    {
             #include "sl_api.h"
             #include "macros.h"
             #include "esm_application_callback.h"
             {

                 DBG_PRINT( "=== Testing after CCMR4F_SELF_TEST_ERROR_FORCING ===\r\n" );

                 uint32 u32Channel = ESM_G1ERR_B0TCM_CORRERR;

                 volatile uint32* pu32RegErrPin = ( (u32Channel < 32U) ? &sl_esmREG->EEPAPR1 : &sl_esmREG->IEPSR4 );
                 volatile uint32* pu32RegIsr = ( (u32Channel < 32U) ? &sl_esmREG->IESR1 : &sl_esmREG->IESR4 );

                 *pu32RegErrPin = OWN_GET_ESM_BIT_NUM( u32Channel );
                 //*pu32RegIsr = OWN_GET_ESM_BIT_NUM( u32Channel );

                 DBG_PRINT( "Before test, nERROR2: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 SL_SelfTest_Result failInfo;
                 WOS_vCsEnter();
                 boolean bOwnRet = SL_SelfTest_SRAM( SRAM_ECC_1BIT_FAULT_INJECTION, TRUE, &failInfo );

                 DBG_PRINT( "bOwnRet: %s\r\n", BOOL2ASCII(bOwnRet) );
                 uint32 u32Time = HAL_u32TimeGet();
                 DBG_PRINT( "After test, nERROR3: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 WOS_vCsExit();

                 DBG_PRINT( "Starting pin active wait loop: %u us\r\n", u32Time );
                 while( SL_ESM_nERROR_Active() )
                 {
                     if( (HAL_u32TimeGet() - u32Time) > 300U )
                     {
                         DBG_PRINT( "timeout break\r\n" );
                         break;
                     }
                 }
                 DBG_PRINT( "Out from loop %u us, loop elapsed %u us\r\n", HAL_u32TimeGet(), HAL_u32TimeGet()-u32Time );
                 volatile uint32* pu32Reg = ( (u32Channel < 32U) ? &sl_esmREG->SR1[0] : &sl_esmREG->SR4[0] );
                 *pu32Reg = (uint32)OWN_GET_ESM_BIT_NUM( u32Channel );
                 *pu32Reg = (uint32)OWN_GET_ESM_BIT_NUM( ESM_G1ERR_B1TCM_CORRERR ); // test activates this also

                 DBG_PRINT( "nERROR4: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );
                 if( SL_ESM_nERROR_Active() )
                 {
                     DBG_PRINT( "clear error\r\n" );
                     sl_esmREG->EKR = 0x5u;
                 }

                 while( SL_ESM_nERROR_Active() )
                 {
                 }

                 DBG_PRINT( "nERROR5: active %u, %u us\r\n", SL_ESM_nERROR_Active(), HAL_u32TimeGet() );

                 // required to clear these after the test in order to trig new error
                 sl_tcram1REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR;/*lint !e9033 */
                 sl_tcram2REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR; /*lint !e9033 */

                 DBG_PRINT( "=== End of testing ===\r\n" );
             }
    }
    #endif


    === Testing after CCMR4F_SELF_TEST_ERROR_FORC...<CR><LF>
    Before test, nERROR2: active 0, 293464 us<CR><LF>
    bOwnRet: TRUE<CR><LF>
    After test, nERROR3: active 1, 293518 us<CR><LF>
    Starting pin active wait loop: 293518 us<CR><LF>
    ISR: ERROR pin up<CR><LF>
    Out from loop 293653 us, loop elapsed 135 us<CR><LF>
    nERROR4: active 0, 293710 us<CR><LF>
    nERROR5: active 0, 293735 us<CR><LF>
    === End of testing ===<CR><LF>



    Then, if I comment out that nERROR clear in the SafeTI code, nERROR again properly stays low:
                 /* Clear nERROR */
                 //_SL_HoldNClear_nError();
                 /* Clear the interrupt */


    === Testing after CCMR4F_SELF_TEST_ERROR_FORC...<CR><LF>
    Before test, nERROR2: active 0, 293282 us<CR><LF>
    bOwnRet: TRUE<CR><LF>
    After test, nERROR3: active 1, 293338 us<CR><LF>
    Starting pin active wait loop: 293338 us<CR><LF>
    timeout break<CR><LF>
    Out from loop 293650 us, loop elapsed 311 us<CR><LF>
    nERROR4: active 1, 293683 us<CR><LF>
    clear error<CR><LF>
    ISR: ERROR pin up<CR><LF>
    nERROR5: active 0, 293719 us<CR><LF>
    === End of testing ===<CR><LF>


    I think this proves that the _SL_HoldNClear_nError() call shouldn't be there, since it is a test which does not control nERROR?

  • I accidentally left some wrong logs at the beginning (from when my test code had bugs and the error was not generated at all the 2nd time). The first 2 logs should be removed: I didn't have these lines at that time, so the error was not generated, as can be seen from the error state after the 2nd test despite what my comment there says:
    // required to clear these after the test in order to trig new error
    sl_tcram1REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR;/*lint !e9033 */
    sl_tcram2REG->RAMERRSTATUS = TCRAM_RAMERRSTATUS_ADDR_SERR; /*lint !e9033 */

    The first 2 "tests" after the "And here is the results:" text should be removed (the ones which do not have the "ISR: ERROR pin up" text). There are a total of 5 test logs below the "rounds testing".

    Here are the bottom 3, which were the only ones supposed to be in the post (there is no preview option available? So it is a bit tricky to verify that the content is what you want)...

    === Testing ESM ERROR suggested clear ===<CR><LF>
    nERROR1: active 0, 299020 us<CR><LF> //// Error pin is UP
    Skip pin clear since it is already up<CR><LF> //// no reset given
    Before test, nERROR2: active 0, 299071 us<CR><LF> //// still UP
    bOwnRet: TRUE<CR><LF> //// test executed OK
    After test, nERROR3: active 1, 299122 us<CR><LF> //// Error pin is DOWN
    Starting pin active wait loop: 299122 us<CR><LF> //// wait....
    timeout break<CR><LF> //// timeout
    Out from loop 299633 us, loop elapsed 511 us<CR><LF> //// see elapsed time (matches to timeout)
    nERROR4: active 1, 299667 us<CR><LF> //// Error pin is still DOWN (as expected)
    clear error<CR><LF> //// ack the pin
    ISR: ERROR pin up<CR><LF> //// our IO interrupt notices that pin goes up
    nERROR5: active 0, 299702 us<CR><LF> //// Error pin is UP
    === End of testing ===<CR><LF>

    === Testing ESM ERROR against TRM clear ===<CR><LF>
    nERROR1: active 0, 400019 us<CR><LF> //// Error pin is UP
    Make clear against TRM<CR><LF> //// reset against TRM
    Before test, nERROR2: active 0, 400062 us<CR><LF> //// still UP
    bOwnRet: TRUE<CR><LF> //// test executed OK
    After test, nERROR3: active 1, 400114 us<CR><LF> //// Error pin is DOWN
    Starting pin active wait loop: 400114 us<CR><LF> //// wait.... BUT no TIMEOUT
    ISR: ERROR pin up<CR><LF> //// our IO interrupt notices that pin goes up!!!!!!!
    Out from loop 400286 us, loop elapsed 171 us<CR><LF> //// see elapsed time
    nERROR4: active 0, 400320 us<CR><LF> //// Error pin is UP (NOT expected, works == example 5 says)
    nERROR5: active 0, 400346 us<CR><LF>
    === End of testing ===<CR><LF>

    === Testing ESM ERROR suggested clear ===<CR><LF>
    nERROR1: active 0, 501018 us<CR><LF> //// Error pin is UP
    Skip pin clear since it is already up<CR><LF> //// no reset given
    Before test, nERROR2: active 0, 501069 us<CR><LF> //// still UP
    bOwnRet: TRUE<CR><LF> ///////// look from first round, follows same pattern
    After test, nERROR3: active 1, 501120 us<CR><LF>
    Starting pin active wait loop: 501120 us<CR><LF>
    timeout break<CR><LF>
    Out from loop 501631 us, loop elapsed 511 us<CR><LF>
    nERROR4: active 1, 501665 us<CR><LF>
    clear error<CR><LF>
    ISR: ERROR pin up<CR><LF>
    nERROR5: active 0, 501700 us<CR><LF>
    === End of testing ===<CR><LF>
  • Jarkko,

    Thanks for the detailed response. I am in the process of reviewing all the information so I can provide the answers you need and also provide some guidance based on your system requirements/application needs.
  • Hello Jarkko,

    First, no apologies are necessary for 'poor' English. Your English is quite good and admirable that you can communicate in more than one language for these technical issues. The understanding issue is common on the E2E since things can sometimes be interpreted in different ways and it is always difficult to communicate technical issues via the E2E.

    I apologize for looking at the wrong test mode. You are correct that the Self-Test Error Forcing test results in a GRP1 Ch31 flag being set. Group 1 errors are, generally, the lower-priority issues and thus are configurable in their response with respect to interrupts and nERROR action. I have also observed, as you did, that this GRP1 error is handled differently from the others, in that the ESM nERROR signal is cleared/reset after the test. I need to ask the original developer of the code whether this is a simple cut-and-paste issue or whether there is a specific reason for it. As to whether it is a bug or not, I can't judge at this point. Certainly it does lend itself to the potential to mask a real CCMR4F error since the CCMR4 operation is still ongoing. My thought is to raise a CQ ticket on this issue so it can be further evaluated by the SW team and a decision made if it needs to be corrected based on the risk.

    Relative to your specific use case, I see your point on the assertion of nERROR being an issue. The basis of our chip architecture and fault reporting is to assert the nERROR pin for everything so the external system can be notified and appropriate action can be taken. In many cases, we assume that a reset will be taken to correct any transient faults that might cause the failures, so the assertion of nERROR is intended to signal this externally (thus the HFT = 0). For SW tests of function and error paths, this presents some challenges for handling the nERROR pin, as you have noted. With our companion chip, it is possible to program the nERROR monitoring to allow for an extended time on the pin, so that short assertions do not cause entry into a safe state.

    You also mention that you are tying the nERROR pin to a GPIO pin so that an interrupt can trigger when nERROR is asserted. Is there a reason the ESM interrupt generation is not sufficient for this purpose? i.e., I believe any error resulting in the assertion of nERROR also has an ESM interrupt (NMI or FIQ) associated with it or can be configured to have an interrupt associated with it. Is the latency an issue here?

    You also mention/highlight a couple of specific questions regarding the behavior of the nERROR pin and counter. I need to look into these questions a little deeper as this gets into both the design of the logic behind the nERROR pin/counter, and into the specific use case/ SW interaction. I will come back with further information on this.
  • Chuck Davenport said:
    Certainly it does lend itself to the potential to mask a real CCMR4F error since the CCMR4 operation is still ongoing. My thought is to raise a CQ ticket on this issue so it can be further evaluated by the SW team and a decision made if it needs to be corrected based on the risk.

    I wasn't worried about any specific test (for example just that CCMR4 one). As my example shows, if any ESM error pulls nERROR down, then due to example #5 nERROR rises again by itself, and you can do basically nothing to prevent it (except maybe forcing nERROR down by writing 0xA to EKR, if that ESM error gives you an NMI/FIQ/configurable IRQ and you decide there that the system needs to be kept in the safe state). The effect of this "auto" rising depends of course on what other protections the application has (we, for example, have that HW latch among others, but if you just blindly ack the latch after nERROR rises, it would not help at all in this case :)).

    But I am first waiting for confirmation that example #5 in the TRM is correct and that I didn't make a mistake when trying to test it.

    Chuck Davenport said:
    You also mention that you are tying the nERROR pin to a GPIO pin so that an interrupt can trigger when nERROR is asserted. Is there a reason the ESM interrupt generation is not sufficient for this purpose? i.e., I believe any error resulting in the assertion of nERROR also has an ESM interrupt (NMI or FIQ) associated with it or can be configured to have an interrupt associated with it. Is the latency an issue here?

    We use this GPIO to be notified of the RISING edge of nERROR, because we have that HW "latch" (flip-flop, whatever the name is) after the nERROR pin, which cannot be released until the pin is up (if you release it earlier, the output stays down). And since the time the latch can stay down before external devices start tripping is rather short, we need an IRQ-based (or otherwise rather fast) reaction to nERROR returning to its normal state to prevent spurious trips. The ESM module does not provide that, which is why we have also hooked nERROR to a GPIO.

    For example, in our system with my current code, the rising-edge GPIO IRQ would just "blindly" ack the latch, and the safe state could be removed, depending on the other controls and protections, since currently I do not check anything on latch release. I have trusted that nERROR cannot be raised by accident, i.e. that you need a deliberate action to raise it. Now I am pretty sure that this trust is not an ideal approach: such a minor mistake as an extra acknowledge while the pin is up would most likely cause nERROR to rise by itself after it goes down due to a real error. My own code already checks that no ack is given at all while nERROR is up. This by itself should protect quite well against example #5, provided that function is used everywhere (a similar function in the SafeTI code would have prevented that extra call in the test from causing any side effects). I am still planning to add some kind of SW guard to the latch release. I do not know yet what kind: maybe a flag that needs to have the proper value, and/or a check that the SW mode is suitable, and/or a check that no ESM channels with nERROR control capability are active...
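
    A minimal sketch of the "no ack while nERROR is up" check described above. It is not SafeTI code: the `esm_regs_t` struct, the function name, and the assumption that bit 0 of the error pin status register reads 1 while the pin is high are illustrative and should be checked against the TRM and the device header. Only the 0x5 key comes from the TRM text quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the ESM registers used here. On the target
 * this would be the register block from the device header. */
typedef struct {
    volatile uint32_t EPSR; /* error pin status; bit 0 = 1 means pin high (assumed) */
    volatile uint32_t EKR;  /* error key register */
} esm_regs_t;

#define ESM_EPSR_PIN_HIGH 0x1u /* assumed bit position, verify against the TRM */
#define ESM_EKR_RESET_KEY 0x5u /* reset key from the TRM sentence quoted in the thread */

/* Request an nERROR reset only while the pin is low.
 * Returns true if the key was written, false if the request was skipped. */
static bool esm_ack_nerror_if_low(esm_regs_t *esm)
{
    if ((esm->EPSR & ESM_EPSR_PIN_HIGH) != 0u) {
        /* Pin is already high: a key written now would be held and would
         * release the pin after the next failure (TRM example 5). */
        return false;
    }
    esm->EKR = ESM_EKR_RESET_KEY;
    return true;
}
```

    In the log earlier in this thread, the skipped case corresponds to the "Skip pin clear since it is already up" line. Note that this check only covers example #5; it does not help with the "failure-ack-failure" case discussed below.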

    I am still not saying that "blindly" releasing the latch would automatically remove the safe state by accident, but if I can eliminate that accident (or at least greatly reduce its probability) it is worth doing. This of course requires first understanding that the scenario can happen, and that required quite a careful reading of the TRM. Say I have a bug in my own handling of real ESM errors, so that an error is not properly signalled to the application (but nERROR is still not acked there), and the latest diagnostic test I ran was that CCMR4F_SELF_TEST_ERROR_FORCING test: an "improved latch release" would catch this, and our current solution doesn't. Typically, for example, aeroplanes never come down due to a single mistake or flaw; it always takes a series of more or less unlikely events and design failures (someone has been watching "Mayday"/Air Crash Investigation on TV), and if you manage to tackle even one of them, the end result would have been different.
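
    One possible shape for such a SW guard on the latch release, as a sketch only. The `latch_inputs_t` struct, its field names, and the idea of passing a snapshot of the ESM status registers are assumptions made for illustration; the rule that Group 2 and 3 channels always act on nERROR while Group 1 channels do so only when enabled follows the description in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of everything the release decision looks at. */
typedef struct {
    uint32_t esm_sr1[3];       /* ESM status registers, groups 1..3 */
    uint32_t g1_pin_action;    /* Group 1 channels with nERROR action enabled */
    bool     nerror_pin_high;  /* current level of the nERROR pin */
    bool     release_expected; /* set by code that deliberately acked nERROR */
} latch_inputs_t;

/* Returns true only if releasing the external latch looks safe. */
static bool latch_release_allowed(const latch_inputs_t *in)
{
    if (!in->nerror_pin_high) {
        return false; /* latch cannot be released while the pin is low */
    }
    if (!in->release_expected) {
        return false; /* pin rose without a deliberate acknowledge */
    }
    if ((in->esm_sr1[0] & in->g1_pin_action) != 0u) {
        return false; /* a Group 1 channel that drives nERROR is still active */
    }
    if ((in->esm_sr1[1] != 0u) || (in->esm_sr1[2] != 0u)) {
        return false; /* any active Group 2 or 3 channel blocks the release */
    }
    return true;
}
```

    The rising-edge GPIO interrupt would then call this check before acknowledging the latch, instead of acknowledging it unconditionally.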

    ==================
    Of course things get trickier if the "failure-ack-failure" pattern still releases nERROR (which I hope it does not), but in that case I think the ESM channel activity check planned above for the latch release kicks in if the software misses that error due to some other bug in error handling...

    Tried to test that now using the FLASH_ECC_TEST_MODE_2BIT test, since it creates errors in 2 stages, and there I commented out the 2nd error acknowledgement (remember, I have modified that acknowledge function for our runtime purposes, and in this case it only writes EKR=0x5 and does not wait for nERROR to rise).

    So 1st part of the test is as is

                if ((F021F_FEDACSTATUS_B1_UNC_ERR == (uint32)(sl_flashWREG->FEDACSTATUS & F021F_FEDACSTATUS_B1_UNC_ERR))
                        && (sl_flashWREG->FUNCERRADD == (uint32)0x8u)
                        && (BIT(ESM_G3ERR_FMC_UNCORR) == (sl_esmREG->SR1[2] & BIT(ESM_G3ERR_FMC_UNCORR)))) {
                    _SL_HoldNClear_nError(); /* Clear nError */

    And from the 2nd part I took the nERROR ack out
                    if ((F021F_FEDACSTATUS_B1_UNC_ERR == (uint32)(sl_flashWREG->FEDACSTATUS & F021F_FEDACSTATUS_B1_UNC_ERR))
                            && (sl_flashWREG->FUNCERRADD == (uint32)0x0u) && (BIT(ESM_G3ERR_FMC_UNCORR) == (sl_esmREG->SR1[2] & BIT(ESM_G3ERR_FMC_UNCORR)))) {
                        //_SL_HoldNClear_nError();
                        *flash_stResult = ST_PASS;
                    } else {
                        *flash_stResult = ST_FAIL;
                    }

    This should give that "failure-ack-failure" pattern, and unless I made a mistake, the result is that nERROR rises in this case too. So the TRM text saying that the counter is just "reset" is most likely correct, and it is up to the reader to understand the consequences, since there is no pictured example of it. Basically, after a diagnostic test that requires an nERROR ack (Group 2 or 3, or FI to a G1 channel which has the action enabled), the situation is a bit similar to #5, but with a limited time frame, namely the configured nERROR low time.

    But of course I am still waiting for your confirmation about this... If this is real, then a pictured example in the TRM would be more than welcome.

    If this is "confirmed", then just tackling example #5 with the "do not ack nERROR if the pin is up" method is not enough, since a 'similar' effect could be caused by "real error before nERROR has risen + bug in real error handling". Of course the probability of a real error during the nERROR low-time window is really small, and much smaller than when #5 is violated, since there the error has an "unlimited" time frame... But if this becomes a "confirmed feature", then now that I know what it could do, I just can't leave the latch unlocking as it is now...

    ========

    Also tried to test, with the help of that FLASH_ECC_TEST_MODE_2BIT flash test, the following pattern: "failure-ack-failure(-ack)", let the error pin go up, then run this same SRAM fi test after nERROR has risen. Result: nERROR stays down, as it should. So consecutive acks made while nERROR is down are not "banked" to wait for example #5, as was most likely to be expected based on the TRM examples.

    =======

  • Hello Jarkko,

    First let me address your specific question from a couple posts back.

    First:
    "QUESTION: what is the status of the nERROR pin after t_err_low from latest fault incase if reset is given once between 2 failures? (I know that you can ack nERROR even in esm channels are active errors)"

    I believe the second fault will initiate a second fault notification causing the nERROR pin to be driven low for t_err_low time period until it could rise again if a second ack/reset is given. i.e., the second fault will cancel out the 'ack/reset' of the first.

    QUESTION: if t_err_low counting is stopped (& register value reseted) when failure occurs we do not have to worry about consecutive errors but if the t_err_low value is only reseted the nERROR pin raise up in case it has been once acked and after that comes some real error during t_err_low window (this is not example 5 because in 5 the pin in UP when reset is given).
    - And in ultimate case the 2nd acknowledge, is it perhapsly "banked" even the so that system works like in example 5 (but I doubt that)

    This second question seems to be the same as your first but discussing more the t_err_low counter if I am understanding correctly. So I think the answer is the same with respect to the nERROR signal. Essentially, if there is a fault that results in nERROR going low and a following 'ack/reset', the counter continues to its completion before the pin rises, so you have a minimum low time of t_err_low. If a second fault comes after the 'ack/reset' for the first, the counter starts over upon the receipt of the second fault and the pin will stay low until the next 'ack/reset' is given. Essentially, the second fault should cancel the 'ack/reset' request from the first fault. This scenario is different from example 5 since the state of the nERROR pin and the fault assertion is not ongoing in example 5. If you do see a different behavior, let me know and will double check with our design team and experts on the ESM module to see if this is a silicon bug or expected behavior that should be documented.

    Are you able to monitor the nERROR pin with an Oscilloscope during execution of your test case mentioned above? It would be ideal to be able to see this pin action captured in this way since code execution/delays from print statements could miss pin states happening faster than code execution and serial communications.
  • Hi,

    Chuck Davenport said:

    This second question seems to be the same as your first but discussing more the t_err_low counter if I am understanding correctly.
    I think so too.

    Chuck Davenport said:

    Are you able to monitor the nERROR pin with an Oscilloscope during execution of your test case mentioned above? It would be ideal to be able to see this pin action captured in this way since code execution/delays from print statements could miss pin states happening faster than code execution and serial communications.
    Yes, I am able to monitor it with an oscilloscope as well. I did that, and the results are the same as in my previous test: nERROR rises if a second failure comes after the EKR=5 ack but before the line has gone up.

    How did I test it this time? In startup.c (so FIQ and IRQ and so on are disabled and the OS is not running, so there is not even a remote possibility of context switching etc.) I ran the FLASH_ECC_TEST_MODE_2BIT test with similar modifications as in the last test (and verified by stepping through with the debugger that the modified code lines are executed during the test).

    1st error: modified so that the function is not called at all, only 5 is written to EKR (this is practically the same as using our modified "runtime periodic" function, but at least now it is really clear what is done).
                if ((F021F_FEDACSTATUS_B1_UNC_ERR == (uint32)(sl_flashWREG->FEDACSTATUS & F021F_FEDACSTATUS_B1_UNC_ERR))
                        && (sl_flashWREG->FUNCERRADD == (uint32)0x8u)
                        && (BIT(ESM_G3ERR_FMC_UNCORR) == (sl_esmREG->SR1[2] & BIT(ESM_G3ERR_FMC_UNCORR)))) {
                    //_SL_HoldNClear_nError(); /* Clear nError */
                    sl_esmREG->EKR = 0x5u;

    2nd error, commented out that acknowledge so we should have "fail-ack-fail"-pattern
                    if ((F021F_FEDACSTATUS_B1_UNC_ERR == (uint32)(sl_flashWREG->FEDACSTATUS & F021F_FEDACSTATUS_B1_UNC_ERR))
                            && (sl_flashWREG->FUNCERRADD == (uint32)0x0u) && (BIT(ESM_G3ERR_FMC_UNCORR) == (sl_esmREG->SR1[2] & BIT(ESM_G3ERR_FMC_UNCORR)))) {
                        //_SL_HoldNClear_nError();
                        *flash_stResult = ST_PASS;
                    } else {
                        *flash_stResult = ST_FAIL;
                    }


    The result was that nERROR was low for 156 us, which matches quite well the fact that between the 2 failures there is also the data abort vector code to be executed before the ack is given (failure-data_abort-ack-clear_some_register_ack-failure). But let's not stop here: since I wanted to be sure what I am measuring, I added asm NOP loops between the 1st and 2nd failure, after the ack, like this (1000 rounds, so 10000 NOPs executed in total).

                if ((F021F_FEDACSTATUS_B1_UNC_ERR == (uint32)(sl_flashWREG->FEDACSTATUS & F021F_FEDACSTATUS_B1_UNC_ERR))
                        && (sl_flashWREG->FUNCERRADD == (uint32)0x8u)
                        && (BIT(ESM_G3ERR_FMC_UNCORR) == (sl_esmREG->SR1[2] & BIT(ESM_G3ERR_FMC_UNCORR)))) {
                    //_SL_HoldNClear_nError(); /* Clear nError */
                    sl_esmREG->EKR = 0x5u;

                    uint32 u32Temp = 1000U;
                    while( u32Temp-- )
                    {
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                        __asm("nop");
                    }

                    /* Clear flash & ESM status registers */
                    sl_flashWREG->FEDACSTATUS = F021F_FEDACSTATUS_B1_UNC_ERR;
                    sl_esmREG->SR1[2] = BIT(ESM_G3ERR_FMC_UNCORR);

    And the result was that the nERROR low time was extended from 156 us to 244 us, so the NOPs had the effect of making the 2nd error appear later. And when I increased the number of NOPs, at some point nERROR stayed down at the end of the test, since nERROR had time to rise between the generation of the 2 failures.

    I am now pretty sure that my testing exercises the "fail-ack-fail" pattern, but ultimately of course the debugger connection etc. might affect the test, so I cannot be 100% sure, only 99.9% sure, that this effect is real and that consecutive failures just reset the counter and do not stop it.

    Chuck Davenport said:

    If you do see a different behavior, let me know and will double check with our design team and experts on the ESM module to see if this is a silicon bug or expected behavior that should be documented
    I do not think
    I see different behavior: it looks like once t_err_low counting has been started (via writing 5 to EKR), consecutive failures after that do not stop the counter; they just reset it.

    This is most likely documented correctly in the TRM (except that a clear example with a picture is missing), since it says that the counter is reset (not stopped). It just takes a bit of improvisation from the reader to understand what that actually means, since the effect on a 'pending' nERROR acknowledge is not mentioned at all.

    "Another failure occurs within the time the pin stays low. In this case, the low time counter will be reset when the other failure occurs."

    So if it is documented correctly, it cannot be a silicon bug, I assume? It is just a bit of "unexpected behavior" (I do not know a better term), since logically this should work as you described: after the failure, one needs to give a new ack in order to get the pin up. This is quite similar to example #5 as far as the "unexpected" part goes, but #5 is clearly documented. #5 could maybe be exploited in testing: first give the ack, then generate the failure, and there is no need to worry about the timing of getting the pin back up. (It would of course also ack real errors, but since it looks like there is no silver bullet to prevent nERROR from rising in every scenario, exploiting it should not hurt much more. But maybe it is still better not to use it, even though it would be rather tempting...)

    Now waiting for confirmation that the finding is real; after that I have to think about what to do to make the latch release as robust as possible. Currently I think there is no method that would work in all cases, so most likely some parallel means should be selected to try to catch a possible error in the application's primary error handling. I also noticed that, based on the TRM (chapter 12.2.3 step 1), error forcing (EKR=0xA) cannot be used when nERROR is already down, so there is no way to stop the LTC from counting once it is acked. This means nERROR cannot be put permanently into the DOWN state, so basically it could pop back UP at any time, since you cannot prevent real errors from appearing after testing something while waiting for nERROR to rise...

    There also look to be differences between tests in whether the ESM bits are left active when the test fails (the flash test always acks the ESM channels away, but the SRAM test does not), and in whether the nERROR ack is given (the SRAM test always acks nERROR regardless of the test result). So one test always acks the ESM channels and the other always acks nERROR...
    SRAM_RADECODE_DIAGNOSTICS:
                _SL_SelfTest_SRAM_RAD(sl_tcram2REG, sram_stResult);
                _SL_HoldNClear_nError(); /* Clear nError */
                if (ST_PASS == *sram_stResult) {
                    sl_esmREG->SR1[1] = (uint32)(1U << ESM_G2ERR_B1TCM_UNCORR);
                    sl_esmREG->SSR2 = (uint32)(1U << ESM_G2ERR_B1TCM_UNCORR);
                }

    vs.
    FLASH_ECC_TEST_MODE_2BIT (and you remember from the previous snippets that the nERROR ack was in the same code block as the ST_PASS setting, so it is done only in case of success):
                /* Anyways clear flash & ESM status registers */
                #if defined(_TMS570LS31x_) || defined(_TMS570LS12x_) || defined(_RM48x_) || defined(_RM46x_) || defined(_RM42x_) || defined(_TMS570LS04x_)
                sl_flashWREG->FEDACSTATUS = F021F_FEDACSTATUS_B1_UNC_ERR;
                flashread = sl_flashWREG->FUNCERRADD;
                #endif
                sl_esmREG->SR1[2] =  BIT(ESM_G3ERR_FMC_UNCORR);



    There should be 2 open things waiting for confirmation
    1) SafeTI CCMR4F_SELF_TEST_ERROR_FORCING-test causes TRM #5 example behavior for future errors since it makes ack while nERROR should be UP (it is up if no other real errors happened during the test)
    2) "failure-ack-failure(-failure)"-pattern results the nERROR being UP after t_err_low-time from last failure (and there is no way to prevent it)
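
    The behavior discussed in this thread can be summarized as a small software model. It only reproduces the observations reported here (TRM example 5, the "failure-ack-failure" test, and the acks that are not "banked"); it is not a description of the silicon logic. The type and function names are made up for illustration, and time is counted in abstract ticks instead of the real low-time counter units.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of the nERROR pin, derived from the thread's observations. */
typedef struct {
    bool     pin_low;        /* true while nERROR is driven low */
    bool     ack_pending;    /* an EKR=0x5 request has been given and not yet used */
    uint32_t low_ticks_left; /* remaining low time */
    uint32_t low_time;       /* configured t_err_low in ticks */
} nerror_model_t;

static void model_init(nerror_model_t *m, uint32_t low_time)
{
    m->pin_low = false;
    m->ack_pending = false;
    m->low_ticks_left = 0u;
    m->low_time = low_time;
}

/* The pin rises only when the low time has expired AND an ack was given. */
static void release_if_due(nerror_model_t *m)
{
    if (m->pin_low && (m->low_ticks_left == 0u) && m->ack_pending) {
        m->pin_low = false;
        m->ack_pending = false; /* one release consumes the request */
    }
}

/* A failure drives the pin low and restarts the low time. It does NOT
 * cancel an ack that was already given (the observed behavior). */
static void model_failure(nerror_model_t *m)
{
    m->pin_low = true;
    m->low_ticks_left = m->low_time;
}

/* An ack is remembered even if the pin is currently high (example 5). */
static void model_ack(nerror_model_t *m)
{
    m->ack_pending = true;
    release_if_due(m);
}

static void model_tick(nerror_model_t *m, uint32_t ticks)
{
    for (uint32_t i = 0u; i < ticks; i++) {
        if (m->pin_low && (m->low_ticks_left > 0u)) {
            m->low_ticks_left--;
        }
        release_if_due(m);
    }
}
```

    Calling `model_ack()` followed by `model_failure()` reproduces example 5: the pin rises by itself after `low_time` ticks. Calling failure, ack, failure reproduces item 2: the pin rises `low_time` ticks after the last failure without a second ack.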

  • Hello Jarkko,

    Thanks again for your detailed summary and explanation of your observations.

    1) SafeTI CCMR4F_SELF_TEST_ERROR_FORCING-test causes TRM #5 example behavior for future errors since it makes ack while nERROR should be UP (it is up if no other real errors happened during the test)

    I confirmed this through code inspection and have opened a ticket in our CQ system for the SafeTI Diag Library.

    2) "failure-ack-failure(-failure)"-pattern results the nERROR being UP after t_err_low-time from last failure (and there is no way to prevent it)

    Again, I think you have confirmed this behavior in all of your carefully constructed test cases and observations. I will submit a documentation enhancement request for this to include a 6th example that, essentially, combines example 4 and 5. I believe the behavior you are describing is as shown in the diagram below. Once you confirm this I will submit the documentation enhancement request.

  • Hi,

    The picture looks correct.