esmREG->SR1 [2] modified w/o dabort call

Vladimir Romanov

Hi all!

At a random time (5-200 ms) after board startup, esmREG->SR1 [2] becomes 0x80 without the dabort handler being called. This happens only once. dabort is called during the self-check procedure at startup, but not afterwards. How can I intercept this error?

I suspect this error is related to interrupt handling, possibly to CAN interrupts.

This is my code

int main (void) {
    int i;
    platform_init ();
    io_init ();

    LOG_INFO (SYS, "ulDAbortCount=%u", ulDAbortCount);
    LOG_INFO (SYS, "esmREG->SR1 [0]=0x%x esmREG->SR1 [1]=0x%x esmREG->SR1 [2]=0x%x", esmREG->SR1 [0], esmREG->SR1 [1], esmREG->SR1 [2]);
    for (i=0;i<500;i++) {
        delay(1);
        if (esmREG->SR1 [2] != 0) {
            LOG_ERROR (SYS, "Flash error esmREG->SR1 [0]=0x%x esmREG->SR1 [1]=0x%x esmREG->SR1 [2]=0x%x", esmREG->SR1 [0], esmREG->SR1 [1], esmREG->SR1 [2]);
            esmREG->SR1 [0] = 0xFFFFFFFF;
            esmREG->SR1 [1] = 0xFFFFFFFF;
            esmREG->SR1 [2] = 0xFFFFFFFF;
        }
        wd_reset();
    }
    LOG_INFO (SYS, "ulDAbortCount=%u", ulDAbortCount);
    main_loop ();
    return 0;
}

Body Control (Head) Version 5.1.0 (3) 24/11/2016 14:01:42 FlashId=0x8B6230B5 Serial=0xD9F70F0B0013 MCU=TMS570LS0914APGEQQ1
0.000:I [SYS] Digital Watchdog config update. Reset time=399. Reset period = 3905
0.000:I [CAN] Set CAN1 speed to 500, loopback mode off
0.000:I [CAN] Set CAN2 speed to 500, loopback mode off
0.000:I [CAN] Set CAN3 speed to 500, loopback mode off
0.002:W [SYS] Valid NVRAM block not found!
0.002:I [J1939] Set J1939 address to 0x1E
0.003:I [SYS] Init completed in 3281 us
0.005:I [SYS] ulDAbortCount=0
0.005:E [SYS] esmREG->SR1 [0]=0x0 esmREG->SR1 [1]=0x0 esmREG->SR1 [2]=0x0
0.050:E [SYS] Flash error esmREG->SR1 [0]=0x40 esmREG->SR1 [1]=0x0 esmREG->SR1 [2]=0x80
0.505:I [SYS] ulDAbortCount=0
0.505:N [SYS] Main Task started

over 8 years ago

0 Bob Crosby over 8 years ago

TI__Guru 72500 points

Which device are you using? (R4 or R5 based?)

0 Vladimir Romanov over 8 years ago in reply to Bob Crosby

Intellectual 940 points

R4
TMS570LS0914APGEQQ1

0 Bob Crosby over 8 years ago in reply to Vladimir Romanov

TI__Guru 72500 points

esmREG->SR1 [2] bit 7 is the flash wrapper uncorrectable error. If you get this error but did not get an abort, it was caused by a speculative fetch to an unused part of flash. (The R4 does this to improve performance by anticipating memory fetches.) The solution is to program correct ECC for the entire flash TCM space.
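
To make the bit position concrete, here is a minimal sketch of a decode helper. The macro and function names are hypothetical and not taken from HALCoGen; only the meaning of bit 7 comes from the explanation above.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical name. Bit 7 of esmREG->SR1[2] is described above as the
 * flash wrapper uncorrectable error flag. */
#define ESM_G3_FLASH_UNCORRECTABLE (1u << 7)

/* Returns true when the flash uncorrectable error flag is set in the
 * value read from esmREG->SR1[2]. */
static bool esm_g3_flash_uncorrectable(uint32_t sr1_2)
{
    return (sr1_2 & ESM_G3_FLASH_UNCORRECTABLE) != 0u;
}
```

With the value from the log above, esm_g3_flash_uncorrectable(0x80) is true, while the 0x40 seen in SR1 [0] does not match this mask.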

0 Bob Crosby over 8 years ago in reply to Bob Crosby

TI__Guru 72500 points

If you use the linker to generate the ECC, use a virtual fill so the linker generates ECC for the entire MEMORY section without filling the object file with 0XFFFFFFFF for the unused space.

MEMORY {
FLASH : origin=0x0000 length=0x4000 vfill=0xffffffff
}

0 Vladimir Romanov over 8 years ago in reply to Bob Crosby

Intellectual 940 points

I added the following lines as a test:
uint8_t* ptr=(uint8_t*)0x000A0000;
LOG_INFO (SYS, "esmREG->SR1 [0]=0x%x esmREG->SR1 [1]=0x%x esmREG->SR1 [2]=0x%x", esmREG->SR1 [0], esmREG->SR1 [1], esmREG->SR1 [2]);
LOG_INFO (SYS, "Value=%u",*ptr);
LOG_INFO (SYS, "esmREG->SR1 [0]=0x%x esmREG->SR1 [1]=0x%x esmREG->SR1 [2]=0x%x", esmREG->SR1 [0], esmREG->SR1 [1], esmREG->SR1 [2]);

After this, dabort is called and the program loops endlessly at the label flashErrorReal in dabort.asm. In my original case dabort is not called.

0 Vladimir Romanov over 8 years ago in reply to Vladimir Romanov

Intellectual 940 points

Another test, with uint8_t* ptr=(uint8_t*)0x002FFFFF;
In this case dabort is called again and again.

0 Bob Crosby over 8 years ago in reply to Vladimir Romanov

TI__Guru 72500 points

Sorry, I must be missing the big picture. Are you trying to create an abort? Is the ECC properly programmed for location 0x000A0000?

0 Vladimir Romanov over 8 years ago in reply to Bob Crosby

Intellectual 940 points

I am trying to show that my case is not related to an access to flash memory with wrong ECC. The main difference is that dabort is not called. When I read a byte at 0xA0000, dabort is called. That memory area has wrong ECC.

IMHO this is a very strange situation. ESM Group 3 events must always lead to dabort. Even if my code writes to a random address that happens to be the address of ESM->SR1[2], this result is impossible, because writing to this register can only clear bits.
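
The write-to-clear argument above can be sketched as a small host-side model. This is not the real ESM hardware; the type and function names are hypothetical, and it only illustrates why a stray software write cannot set a flag in such a register.

```c
#include <stdint.h>

/* Hypothetical host-side model of a write-1-to-clear status register. */
typedef struct {
    uint32_t bits;
} w1c_reg_t;

/* Hardware side of the model: an error event sets a flag. */
static void w1c_hw_set(w1c_reg_t *r, uint32_t mask)
{
    r->bits |= mask;
}

/* Software side of the model: every 1 written clears that flag,
 * every 0 written leaves the flag unchanged. */
static void w1c_sw_write(w1c_reg_t *r, uint32_t value)
{
    r->bits &= ~value;
}
```

In this model, w1c_sw_write with any value on a register holding 0 leaves it at 0, so only w1c_hw_set (a hardware event) can produce 0x80.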

0 Bob Crosby over 8 years ago in reply to Vladimir Romanov

TI__Guru 72500 points

When a speculative fetch is made to an address with wrong ECC, but the data is not used by the CPU, the ESM error is flagged, but no abort is taken. Your read of address 0x000A0000 was an intentional read, so the abort was taken. A classic case of a speculative fetch is when the compiler generates a compare against a register's contents, and the next instruction is an LDREQ Rd,[Rs] (conditional load). The R4 starts the memory access based on the contents of Rs before it evaluates the result of the compare. If the compare is not equal, Rs may hold some random value. The CPU discards the memory contents it read (they are not put into Rd) and does not take an abort. But the memory read still generated the CPU event signal that tells the ESM there was an uncorrectable flash error. Please try generating ECC for the entire program flash space and see if your original problem goes away. Here is the information in the TRM SPNU607.
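
As an illustration of the kind of source code that can lead to the compare-plus-conditional-load sequence described above, here is a hypothetical helper. Whether a given compiler and optimization level actually emits CMP followed by LDREQ for it is an assumption; the generated assembly has to be inspected to confirm.

```c
#include <stdint.h>

/* Hypothetical helper: reads *src only when key matches expected.
 * A compiler targeting the Cortex-R4 may (not must) turn the guarded
 * read into a conditional load, so the bus access can start before
 * the compare result is known. */
static uint32_t load_if_equal(uint32_t key, uint32_t expected,
                              const uint32_t *src, uint32_t fallback)
{
    uint32_t value = fallback;
    if (key == expected) {
        value = *src;   /* candidate for a conditional load */
    }
    return value;
}
```

At the C level the read never happens when the keys differ, which is why the effect is invisible in the source and shows up only as an ESM flag without an abort.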

0 Vladimir Romanov over 8 years ago in reply to Bob Crosby

Intellectual 940 points

Thank you very much!
Problem fixed!
But I found two other bugs :).
