TM4C129XNCZAD: Limits of eeprom corruption if SYSRESREQ asserted during eeprom program?

Mark Scovel

Part Number: TM4C129XNCZAD

SPMZ850F Errata says:

EEPROM data may be corrupted if any of the following soft resets are asserted during
an EEPROM program or erase operation:
• Software reset (SYSRESREQ)

Would multiple locations be corrupted, or only the one being programmed when the reset occurs?

Edit: I'm asking the question because we've found multiple CPUs whose EEPROMs have what looks like corrupted data in multiple locations so we're trying to determine the cause.

over 8 years ago

0 Robert Adsett over 8 years ago

Guru 27665 points

Since it doesn't say only one, you are best off assuming the whole EE is in peril. Check the other related errata; if the 129 is similar to the 123, this is one of the more benign errata related to the EE.

BTW I have seen this kind of fault even in real EE and there is no way of predicting what will be corrupted.

Robert

0 Robert Adsett72 over 8 years ago in reply to Robert Adsett

Guru 10570 points

And actually, if this is a concern, then I would suggest an SPI-based EEPROM or FRAM. Either would be less vulnerable and faster, especially the FRAM, which has the additional advantage of essentially limitless updates.

Robert

0 Robert Adsett72 over 8 years ago

Guru 10570 points

Mark Scovel said:
Edit: I'm asking the question because we've found multiple CPUs whose EEPROMs have what looks like corrupted data in multiple locations so we're trying to determine the cause.

Ah, slightly different question. My first response holds. Even real EEs, if interrupted during a write, can produce corruption in bytes that are not being written. Since the EE here is only emulated with flash (which leads to slow writes, which makes it even more vulnerable), the likelihood is even higher. Since it has to write a whole flash line, it may even approach certainty that multiple locations would be affected.

There are other errata that may affect this as well. Basically, if there is any chance that the write operation will be interrupted under any conditions, you may corrupt memory, you may disable the EE and, worse, in some cases IIRC, you may brick the IC.

My fix recommendation would remain the same.

Robert

0 Amit Ashara over 8 years ago

TI Guru 244400 points

Hello Mark

There could be two issues at play here

1. Wear of the EEPROM, if it has exceeded the maximum P-E cycles. A simple check would be to read the EEPROM words that seem to be affected multiple times and compare whether bits are changing without having been programmed.
2. The EEPROM is sector organized, so a terminated EEPROM program-erase operation may have left multiple words corrupted. However, all the affected EEPROM words should be in the same block. Can you check and confirm this?
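
The same-block question in point 2 can be checked mechanically from a dump. A minimal sketch in C, assuming 64-byte (16-word) EEPROM blocks, which is what TivaWare's `EEPROMBlockFromAddr` macro implies; the helper names are hypothetical and the list of corrupted byte offsets would come from your own comparison against known-good data:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte EEPROM blocks (16 words), i.e. block = offset >> 6.
 * Verify the block size against the data sheet for your part. */
#define EE_BLOCK_SHIFT 6u

/* Hypothetical helper: block index for a byte offset into the EEPROM. */
static uint32_t ee_block_of(uint32_t byte_offset)
{
    return byte_offset >> EE_BLOCK_SHIFT;
}

/* Returns true when every corrupted offset falls inside one block.
 * An empty list is treated as trivially "same block". */
static bool ee_all_in_same_block(const uint32_t *offsets, size_t count)
{
    assert(offsets != NULL || count == 0);
    for (size_t i = 1; i < count; i++) {
        if (ee_block_of(offsets[i]) != ee_block_of(offsets[0])) {
            return false;
        }
    }
    return true;
}
```

If this returns false for the observed offsets, a single interrupted program-erase within one block does not by itself explain the pattern.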

0 Mark Scovel over 8 years ago in reply to Amit Ashara

Prodigy 130 points

We are sure that the program erase cycles have not been exceeded. We can confirm that re-reading the affected cells always returns the same result. It appears that some of the affected cells are farther apart than one sector.

0 Amit Ashara over 8 years ago in reply to Mark Scovel

TI Guru 244400 points

Hello Mark,

Is it possible to run a data integrity check on the EEPROM to verify whether the stored data has been corrupted over time?

0 Mark Scovel over 8 years ago in reply to Amit Ashara

Prodigy 130 points

Unfortunately the data does not have an integrity check. I've concluded that the data was probably corrupted by a SYSRESREQ while a write was in progress. The code uses that exact sequence: write data to eeprom (one task) and then reset using SYSRESREQ (from a different task that does not wait for/check for the eeprom write to finish).
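
For anyone in the same position, one way to add the missing integrity check is to store a CRC with each record. The following is only a sketch: the record layout and the `ee_record_*` names are hypothetical, and a standard bitwise CRC-32 is used purely for illustration (any CRC of suitable width would do).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected polynomial 0xEDB88320,
 * initial value and final XOR both 0xFFFFFFFF. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the LSB was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical record layout: application data followed by its CRC.
 * The size is a multiple of 4 bytes so it maps onto whole EEPROM words. */
typedef struct {
    uint32_t payload[7];
    uint32_t crc; /* CRC-32 over payload[] */
} ee_record_t;

/* Call before writing the record to EEPROM. */
static void ee_record_seal(ee_record_t *rec)
{
    rec->crc = crc32_calc((const uint8_t *)rec->payload, sizeof rec->payload);
}

/* Call after reading the record back; false means the data is suspect. */
static bool ee_record_valid(const ee_record_t *rec)
{
    return rec->crc == crc32_calc((const uint8_t *)rec->payload,
                                  sizeof rec->payload);
}
```

A CRC only detects corruption. Recovering from it would need something extra, such as keeping two copies of each record and falling back to the one that still validates.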

0 Amit Ashara over 8 years ago in reply to Mark Scovel

TI Guru 244400 points

Hello Mark,

If the code does not wait for the completion check, then evidently the programming model is not being followed. In such a case it is not possible to determine what can be affected in time. As the EEPROM word is used, the underlying flash sees wear and tear, causing the duration of the program-erase operation to increase over time. Also, some bit cells can be more affected than others.
The only correct approach is to update the mechanism to ensure that a SYSRESREQ is not executed until all such operations are complete.
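
One possible shape for such a mechanism, sketched in C under stated assumptions: the `reset_gate_*` names are hypothetical, the writer task brackets each EEPROM write with begin/end calls, and the reset task passes in the hardware busy state (on target this could come from something like `EEPROMStatusGet() & EEPROM_RC_WORKING`) and only calls `SysCtlReset()` when the gate allows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping shared by the writer task and the reset task.
 * On an RTOS the updates below must be protected, for example with a
 * critical section or atomic operations. */
typedef struct {
    volatile uint32_t writes_in_flight; /* writes started but not finished */
    volatile bool reset_pending;        /* a reset was requested but deferred */
} reset_gate_t;

/* Writer task: call immediately before starting an EEPROM write. */
static void reset_gate_write_begin(reset_gate_t *g)
{
    g->writes_in_flight++;
}

/* Writer task: call after the write has completed and been checked. */
static void reset_gate_write_end(reset_gate_t *g)
{
    assert(g->writes_in_flight > 0);
    g->writes_in_flight--;
}

/* Reset task: returns true only when it is safe to issue SYSRESREQ now.
 * Otherwise the request is recorded and the caller should retry later.
 * ee_hw_busy is the EEPROM controller's busy state as read by the caller. */
static bool reset_gate_request(reset_gate_t *g, bool ee_hw_busy)
{
    if (g->writes_in_flight == 0 && !ee_hw_busy) {
        g->reset_pending = false;
        return true;
    }
    g->reset_pending = true;
    return false;
}
```

Checking both the software counter and the hardware busy flag is deliberate in this sketch: the counter covers a multi-word write that is between words, and the flag covers an operation the controller is still finishing.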

0 Robert Adsett72 over 8 years ago in reply to Amit Ashara

Guru 10570 points

And Mark, you really, really need to read the errata. This is one of the least disruptive effects of the errata.

Also note that most EEs do not guarantee reliability if writes are interrupted in an uncontrolled fashion. The only real difference here is that the EE writes take much longer on the TM4Cs because the EE is emulated in flash.

Robert

0 Mark Scovel over 8 years ago in reply to Robert Adsett72

Prodigy 130 points

Robert, I've read the errata and noticed those that apply AFAIK. How about you be more helpful and tell me what you're thinking that makes you say that?

Insofar as memory/EEPROM goes, these are the errata items that I see as applicable. Please correct me if I'm wrong. What am I missing that you know?

Memory
MEM#07 Soft Resets should not be Asserted During EEPROM Operations X
MEM#11 The ROM Version of the TivaWare EEPROMInit API Does not Correctly Initialize the EEPROM X
MEM#15 Specific Flash Locations in any Sector do not get Erased X X X
MEM#16 JTAG Unlock Issue when BOOTCFG is Committed with Debug Disabled X X X
MEM#17 Clearing Reserved Bits of BOOTCFG Causes ROM Boot Loader to Fail X X X
MEM#18 Only Lower 512KB Flash can be Protected on 1MB Devices X X X
MEM#19 Certain GPIOs Cannot be Configured as boot pins X X X
MEM#20 Setting Mirror mode bit Causes Bus Faults on 512KB Flash X X X

0 Amit Ashara over 8 years ago in reply to Mark Scovel

TI Guru 244400 points

Hello Mark

In this case MEM#07 seems to be the more likely cause.

0 Robert Adsett over 8 years ago in reply to Mark Scovel

Guru 27665 points

Mark, the '123 has a number of errata, at least one of which literally prevents the micro from working after a failure. The '129 ones listed so far do not seem that extreme, but note:

MEM #03 means you cannot have a power failure while writing to EEPROM. In the best case that requires a large hold-up to ensure the write completes, since the maximum write time is specified in seconds.

MEM #07 basically means that you cannot write while your program is running since doing so requires you not use the watchdog.

MEM #12 means that you cannot use the ROM API if you use EE. That may not be a concern and the workaround is only space.

MEM #13 means you cannot use uDMA if you use EE

That's a quick overview.

Robert

And then there are the "normal" maximum write time specifications.

0 Mark Scovel over 8 years ago in reply to Mark Scovel

Prodigy 130 points

Thanks Robert,

MEM #03: we're using rev3 chips so it does not apply.

MEM #07: we have an external watchdog

MEM #12: we don't use the ROM API.

MEM #13: we're using rev3 chips.

0 Robert Adsett over 8 years ago in reply to Mark Scovel

Guru 27665 points

Mark Scovel said:

MEM #07: we have an external watchdog

Note that it also applies to an external reset

Mark Scovel said:
we're using rev3 chips.

Do you check?

Robert

0 Mark Scovel over 8 years ago in reply to Robert Adsett

Prodigy 130 points

"Note that it also applies to an external reset"

Yes, thank you.

"Do you check?"

We are not implementing code to evaluate it. We simply looked at our lots/board revs.

0 Robert Adsett over 8 years ago in reply to Mark Scovel

Guru 27665 points

Mark Scovel said:
"Do you check?"

We are not implementing code to evaluate it. We simply looked at our lots/board revs.

Since you are dependent on HW revision, that's probably something you should specify on ordering and check on incoming inspection. And since inspection is of limited effectiveness I think there's a good argument for a runtime check.

Stock can get stalled in the eddies of the supply stream before returning to the main flow so you cannot be certain of the age of your incoming parts unless you verify.
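
A runtime check along those lines could look like the sketch below. It assumes the DID0 layout as I read it in the data sheet (MAJOR in bits 15:8, MINOR in bits 7:0, register at 0x400FE000); the `si_rev_*` names are hypothetical, and the mapping from the errata's "revision 3" to a MAJOR/MINOR pair is something to confirm against the errata document rather than take from here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed DID0 field masks; verify against the data sheet for your part. */
#define DID0_MAJOR_MASK  0x0000FF00u
#define DID0_MAJOR_SHIFT 8u
#define DID0_MINOR_MASK  0x000000FFu

/* Hypothetical decoded silicon revision. */
typedef struct {
    uint8_t major;
    uint8_t minor;
} si_rev_t;

/* Decode a raw DID0 value (on target, read from the DID0 register). */
static si_rev_t si_rev_decode(uint32_t did0)
{
    si_rev_t rev;
    rev.major = (uint8_t)((did0 & DID0_MAJOR_MASK) >> DID0_MAJOR_SHIFT);
    rev.minor = (uint8_t)(did0 & DID0_MINOR_MASK);
    return rev;
}

/* True when rev is the same as or newer than min. */
static bool si_rev_at_least(si_rev_t rev, si_rev_t min)
{
    if (rev.major != min.major) {
        return rev.major > min.major;
    }
    return rev.minor >= min.minor;
}

/* Startup guard: traps via assert when the part is older than required.
 * A product would more likely log the mismatch and refuse EEPROM writes. */
static void si_rev_require(uint32_t did0, si_rev_t min)
{
    assert(si_rev_at_least(si_rev_decode(did0), min));
}
```

Taking the raw register value as a parameter keeps the decoding testable on a host; on target the caller would pass the value read from DID0.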

Robert
