TMS570LS1227 nError asserted on enabling ESM channels

Steven Aris26

Prodigy 210 points

Hi,

Microcontroller: TMS570LS1227 Rev A silicon

Platform: bespoke dev board

Compiler: "TI v5.2.8"

I've encountered an intermittent problem with nError being asserted seemingly spuriously on enabling "Error Pin Action\Response" for all ESM Group 1 Channels.

At the time of enabling the Error Pin Action, all the ESM status registers (ESMSR1 - ESMSR4) are clear - there are no ESM errors pending.

At some point previously in the code, the error for Group 1 Channel 6 (FMC correctable ECC error) has been set and then cleared, so I am wondering whether this error is somehow (sometimes) 'sticky' even after having been cleared. I should also say that if I do not enable the Error Pin Action for Group 1 Channel 6, then nError does not get asserted.

I do not want to implement a work-around until I understand the cause of this issue, so any relevant information or help would be much appreciated.

Thanks in advance,

Steve

over 8 years ago

0 Bob Crosby over 8 years ago

TI__Guru 72500 points

Do you enable ECC, and do you program proper ECC for the entire 1.25MB of Program Flash? The Cortex R4 will speculatively fetch data in anticipation of needing it, including from addresses within the TCM space that you are not using. One of those fetches could be causing the ESM Group 1 Channel 6 error.

0 Steven Aris26 over 8 years ago in reply to Bob Crosby

Prodigy 210 points

Hi Bob,

Thanks for the quick response.

ECC has previously been enabled in our Bootloader, but at the point nError gets asserted (in our application) ECC is disabled.

We haven't programmed the entire flash, but our data abort exception handler and high level ESM interrupt (all ESM errors are routed through our high level interrupt) have been modified to handle reading erased flash (including speculative fetches), so I expect to see some ESM Group 1 Channel 6 errors, but these are cleared in our data abort handler and also immediately after disabling ECC checking in our Bootloader (although the Group 1 Channel 6 Error pin action is not enabled at these points - does this make a difference?).

I've also confirmed that the ESM status registers are all clear (so there are no errors pending), both immediately before and after the point nError gets asserted...which happens on enabling the Error pin action.

I guess what I'm really after understanding is what could be triggering nError if the ESM status registers are all clear?

Is clearing an ESM error in the ESM status registers for a channel that is not enabled, sufficient to prevent it asserting nError if that channel is later enabled?

Best regards,

Steve

0 Bob Crosby over 8 years ago in reply to Steven Aris26

TI__Guru 72500 points

An ECC error on a speculative fetch that is subsequently not used by the CPU will not create an abort, but will set the ESM flag, single bit or uncorrectable.

If you are reading unprogrammed flash locations, you are probably also getting uncorrectable errors. These are group 3 errors so the toggle on the nERROR pin cannot be disabled. If this was an intentional read, you would also generate a data abort. Would your abort routine clear that ESM flag?

But I agree it does not make sense that you would see all 4 status registers clear, enable the pin action for group 1, see an nERROR toggle and see that all 4 status registers are still clear. Did you read and verify that ESMSR1 and ESMSR4 were all clear, or just write to clear them? The flag will be set again if the error condition persists. Do you generate an interrupt as well? Do you read the ESMSRx values in the interrupt routine and can you verify that no bits were set?
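The difference between "write to clear" and "read and verify" can be sketched as below. This is only an illustration: `esm_status_t`, `esm_pending` and `esm_clear_and_verify` are hypothetical names, the struct is a reduced stand-in rather than the real ESM register map (take that from the TRM or the HALCoGen headers), and write-1-to-clear behaviour is assumed for the status registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the four ESM status registers.
 * Not the real register layout; offsets must come from the TRM. */
typedef struct {
    volatile uint32_t sr1;  /* ESMSR1 */
    volatile uint32_t sr2;  /* ESMSR2 */
    volatile uint32_t sr3;  /* ESMSR3 */
    volatile uint32_t sr4;  /* ESMSR4 */
} esm_status_t;

/* OR of all four status registers: non-zero means something is pending. */
uint32_t esm_pending(const esm_status_t *esm)
{
    return esm->sr1 | esm->sr2 | esm->sr3 | esm->sr4;
}

/* Write ones to every flag (assumed write-1-to-clear), then read back.
 * Returns true only if all four registers read zero afterwards, i.e. the
 * clear took effect and the error condition did not set a flag again. */
bool esm_clear_and_verify(esm_status_t *esm)
{
    esm->sr1 = 0xFFFFFFFFu;
    esm->sr2 = 0xFFFFFFFFu;
    esm->sr3 = 0xFFFFFFFFu;
    esm->sr4 = 0xFFFFFFFFu;
    return esm_pending(esm) == 0u;
}
```

If `esm_clear_and_verify` returns false on target, a flag was raised again after the clear, which points at a persisting error source rather than a stale flag.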

0 Steven Aris26 over 8 years ago in reply to Bob Crosby

Prodigy 210 points

I take it an ECC error on a speculative fetch will still trigger the ESM high level interrupt?

And that if the ESM correctable ECC error flag is cleared by writing to the status register, then the error flag will only be set again if the data at the offending address is re-read?

Our data abort routine does clear all the ESM Bank 0 and Bank 7 correctable and uncorrectable ECC error flags, although I don't perform a subsequent read to verify this (but perhaps I should). I have however, stepped through the code to ensure it works as expected and to confirm that the ESM flags are cleared.

For the purpose of debugging this problem, I output the contents of all four status registers to a DIO pin, along with a test pattern to ensure the DIO code is working correctly, both before and after enabling the ESM error pin action, and monitor this DIO pin along with nError using a USB logic analyser. (On our system the debugger prevents nError resets, which was making it harder to see the problem.) This shows that all four status registers ESMSR1 - ESMSR4 are '0' both immediately before and immediately after nError gets asserted. When I have used the debugger, it also shows that there are no pending ESM errors during execution of the application, which is where the problem occurs. ECC checking is disabled in the application, which I guess means that we shouldn't be generating any new ECC errors?
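For anyone wanting to reproduce this kind of DIO dump, a minimal sketch might look like the following. Everything here is an assumption for illustration: the 8-bit test pattern `0xA5`, the MSB-first framing, the `dio_write_fn` callback and the capture helper are hypothetical and are not the code used in this thread. Bit timing (a clock pin or fixed delays so a logic analyser can decode the stream) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIO_TEST_PATTERN 0xA5u            /* hypothetical marker byte */
#define DIO_FRAME_BITS   (8u + 4u * 32u)  /* pattern + four 32-bit words */

/* Callback that drives the pin; on target this would write the GIO register. */
typedef void (*dio_write_fn)(bool level, void *ctx);

/* Shift out the lowest nbits of value, most significant bit first. */
static void dio_send_bits(uint32_t value, unsigned nbits,
                          dio_write_fn write_pin, void *ctx)
{
    for (unsigned i = nbits; i > 0u; --i) {
        write_pin(((value >> (i - 1u)) & 1u) != 0u, ctx);
    }
}

/* Emit the test pattern followed by the four status register snapshots. */
void dio_dump_status(const uint32_t sr[4], dio_write_fn write_pin, void *ctx)
{
    dio_send_bits(DIO_TEST_PATTERN, 8u, write_pin, ctx);
    for (size_t n = 0; n < 4u; ++n) {
        dio_send_bits(sr[n], 32u, write_pin, ctx);
    }
}

/* Host-side stand-in for the pin driver: records each level written. */
typedef struct {
    bool   bits[DIO_FRAME_BITS];
    size_t count;
} dio_capture_t;

void dio_capture_write(bool level, void *ctx)
{
    dio_capture_t *cap = ctx;
    if (cap->count < DIO_FRAME_BITS) {
        cap->bits[cap->count++] = level;
    }
}
```

Snapshotting the four registers into a local array first, then shifting them out, keeps the dump from changing while it is being transmitted.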

In the Bootloader where ECC checking is enabled we do have an interrupt routine (high level ESM interrupt only) where we check for correctable and uncorrectable ECC errors in case of a speculative fetch causing an ECC error and if necessary clear the errors from the status registers (so that it will be the data abort handler that actually handles all ECC errors) and write a value of 0x05 to the ESM Error Key register to reset nError - this prevents our FPGA from resetting the TMS570.
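The Bootloader handling described above could be sketched roughly as follows. This is not the actual handler from this thread: `esm_regs_t` and `esm_handle_ecc_flags` are hypothetical names, the struct is a reduced stand-in for the real ESM register block, only the Bank 0 flags are shown, and write-1-to-clear status bits are assumed. The channel numbers (Group 1 channel 6, Group 3 channel 7) and the 0x05 key value are the ones quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESM_G1_FMC_CORRECTABLE   (1u << 6)  /* Group 1 channel 6, per the thread */
#define ESM_G3_FMC_UNCORRECTABLE (1u << 7)  /* Group 3 channel 7, per the thread */
#define ESM_ERROR_KEY_RESET      0x5u       /* key value quoted in the thread */

/* Hypothetical, reduced register view; not the real ESM layout. */
typedef struct {
    volatile uint32_t sr1;  /* ESMSR1: Group 1 status */
    volatile uint32_t sr3;  /* ESMSR3: Group 3 status */
    volatile uint32_t ekr;  /* ESM Error Key register */
} esm_regs_t;

/* Clear any pending flash ECC flags and request release of nERROR.
 * Returns true if a flag was found and handled, false if nothing was set. */
bool esm_handle_ecc_flags(esm_regs_t *esm)
{
    uint32_t g1 = esm->sr1 & ESM_G1_FMC_CORRECTABLE;
    uint32_t g3 = esm->sr3 & ESM_G3_FMC_UNCORRECTABLE;

    if ((g1 | g3) == 0u) {
        return false;               /* not an ECC event: leave it alone */
    }
    if (g1 != 0u) {
        esm->sr1 = g1;              /* assumed write-1-to-clear */
    }
    if (g3 != 0u) {
        esm->sr3 = g3;              /* assumed write-1-to-clear */
    }
    esm->ekr = ESM_ERROR_KEY_RESET; /* ask the ESM to release nERROR */
    return true;
}
```

Writing back only the bits that were actually read as set avoids clearing an unrelated flag that arrives between the read and the write.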

For the purposes of debugging I've also copied the ESM interrupt routine from our Bootloader (with all associated VIM initialisation) into the Application to try and catch any ESM errors, but this does not get called, likewise any while loops I've put in to catch any ESM errors (before enabling the ESM error pin action) never trap the problem - which would seem to back up what I see reported on the DIO pin, namely that no ESM errors are pending when nError gets asserted.

I take it that if the ESM status register is clear then you wouldn't expect any previously cleared errors to assert nError when the error pin action is enabled, even if the error pin action was disabled when the error was cleared?

And there's nothing else I need to do other than clear the relevant flag in the ESM status register to clear the error, assuming I don't read the data at the offending address again?

Best regards,
Steve

0 Bob Crosby over 8 years ago in reply to Steven Aris26

TI__Guru 72500 points

A speculative fetch that generates an uncorrectable error will not generate an ESM interrupt. If the data is not used, it will not create an abort either; the only effect is the nERROR pin toggle. That is why the correct ECC should be programmed for the entire program flash space if ECC is used.

0 Steven Aris26 over 8 years ago in reply to Bob Crosby

Prodigy 210 points

Am I correct in thinking that if the data cache is disabled, then there will be no speculative data fetches? We don't currently have the caches (data or instruction, or even TCM error) enabled and I suspect have no intention of enabling them, so should hopefully be okay.

I guess speculative instruction fetches will always happen because of the pipeline? Although, if we experience ECC errors from reading erased memory on an instruction fetch, we will probably have more serious problems lurking in our code.

And this current problem is not going to be a speculative fetch as ECC checking is disabled at the time nError gets asserted (and has been for many more instructions than would be needed for a pipeline fetch to trigger an ECC error)...although if speculative fetches are going to be an issue for us they could cause problems elsewhere in our code.

And as for programming the entire flash, that unfortunately is not going to be possible as we need some of Bank 0 for storing "updateable" data - we're already fully using Bank 7 so that is not an option - which is why we're attempting to get round this problem of ECC errors occurring on reads to erased flash.

Best regards,

Steve

0 Bob Crosby over 8 years ago in reply to Steven Aris26

TI__Guru 72500 points

There is no Cache on the TMS570LS1227, and the speculative fetches by the CPU are independent of the pipeline mode. (Pipeline fetches do not cause the ECC errors because the ECC is not checked until the data is read by the CPU.)

That said, I agree you should not get ECC errors when the ECC is disabled. Have you checked if any bits are set in the FEDACSTATUS register before clearing the ESM status flags and enabling the ESM actions?

0 Steven Aris26 over 8 years ago in reply to Bob Crosby

Prodigy 210 points

Thanks Bob,

I wasn't checking the FEDACSTATUS register and it appears that was the problem - whenever nError is asserted unexpectedly, bit 1 (ERR_ZERO_FLG) is set in the FEDACSTATUS register (and conversely FEDACSTATUS is clear when nError doesn't get asserted), so clearing the FEDACSTATUS register after disabling ECC checking seems to prevent nError being unexpectedly asserted.
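The sequence that resolved this could be sketched roughly as below. It is a minimal illustration under stated assumptions, not the code from this thread: `startup_regs_t`, its field names and `enable_g1_error_pin` are hypothetical, the struct gathers registers from two different peripherals purely so the logic can be exercised on a host, and write-1-to-clear behaviour is assumed for FEDACSTATUS and ESMSR1. Only the ERR_ZERO_FLG position (bit 1) comes from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEDAC_ERR_ZERO_FLG (1u << 1)  /* bit 1, as reported in the thread */

/* Hypothetical grouping of the registers involved; not a real memory map. */
typedef struct {
    volatile uint32_t fedacstatus;  /* flash wrapper FEDACSTATUS */
    volatile uint32_t esmsr1;       /* ESM Group 1 status */
    volatile uint32_t eepapr1;      /* ESM Group 1 error pin action enable */
} startup_regs_t;

/* Call after ECC checking has been disabled. Clears latched flash wrapper
 * and ESM flags, verifies they stay clear, and only then enables the error
 * pin action for the requested Group 1 channels.
 * Returns false (and enables nothing) if a flag is still set. */
bool enable_g1_error_pin(startup_regs_t *regs, uint32_t channel_mask)
{
    uint32_t latched = regs->fedacstatus;
    if (latched != 0u) {
        regs->fedacstatus = latched;  /* assumed write-1-to-clear */
    }

    uint32_t pending = regs->esmsr1;
    if (pending != 0u) {
        regs->esmsr1 = pending;       /* assumed write-1-to-clear */
    }

    if ((regs->fedacstatus != 0u) || (regs->esmsr1 != 0u)) {
        return false;                 /* something is still latched */
    }

    regs->eepapr1 = channel_mask;
    return true;
}
```

The point of the ordering is that a flag left latched in the flash wrapper is dealt with before the pin action is enabled, since in this thread ESMSR1 read clear while FEDACSTATUS still held ERR_ZERO_FLG.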

On an aside - the speculative fetches. Can you confirm whether my understanding is correct?

From looking through the Cortex-R4 TRM it seems that:

1) speculative fetches only occur for instruction fetches

2) if an ECC error occurs on a speculative fetch, then the relevant ESM channel error bits get set in the ESMSRn register(s)

3) an ECC error on a speculative fetch does NOT cause an interrupt (I've not actually been able to confirm this either way in the TRM)

4) any ESM 'ECC error' set to trigger nError will assert nError if it occurs on a speculative fetch - in our case the FMC correctable ECC error (Grp1 Chan6)

Thanks again,

Steve

0 Bob Crosby over 8 years ago in reply to Steven Aris26

TI__Guru 72500 points

Hi Steve,

Ok, that makes sense. On your follow-on questions:

Steven Aris26 said:
1) speculative fetches only occur for instruction fetches

No, a speculative fetch can occur on a data fetch. I have seen it on an LDR instruction with a conditional (like: LDRNE).

Steven Aris26 said:
2) if an ECC error occurs on a speculative fetch, then the relevant ESM channel error bits get set in the ESMSRn register(s)

Yes

Steven Aris26 said:
3) an ECC error on a speculative fetch does NOT cause an interrupt (I've not actually been able to confirm this either way in the TRM)

A speculative fetch of a location which generates an uncorrectable error will not generate an abort or an interrupt unless the fetch is actually used. Then an abort is taken. It will generate a group 3 channel 7 ESM error, which toggles the nERROR pin, but does not generate an interrupt.

A speculative fetch of a location which generates a correctable (single bit) error will generate a group 1 channel 6 event which will generate an interrupt if so enabled.

Steven Aris26 said:
4) any ESM 'ECC error' set to trigger nError will assert nError if it occurs on a speculative fetch - in our case the FMC correctable ECC error (Grp1 Chan6)

Yes.

0 Steven Aris26 over 8 years ago in reply to Bob Crosby

Prodigy 210 points

Thanks Bob.

I guess we may need to have a bit more of a think about our ECC handling scheme - although it may be okay - but thanks again for your help with our nError issue.

Best regards,
Steve
