TM4C1294NCPDT: TM4C1294NCPDT flash write issue

Dmitry Govorov

Part Number: TM4C1294NCPDT
Other Parts Discussed in Thread: EK-TM4C1294XL, TM4C123GH6PM, TM4C129ENCPDT, SEGGER

Hello,

I have a strange issue during flash erase/write operations in my project. I'm using the TM4C1294NCPDT microcontroller with its internal flash split into two equal parts, and I'm also using mirror mode. My project has a firmware upgrade procedure that writes data to the unused part of flash: if the firmware started from the lower part of flash (0x00000000), it writes to the upper part (from 0x00080000), and vice versa.

The issue is that the CPU freezes during the FlashErase() or FlashProgram() functions, and at that point all core registers hold exactly the same value. These functions are from TivaWare 2.1.4.178. I have also tried the MAP_ versions of these functions, with the same results. You can see the issue with MAP_FlashErase() in the first picture. The difficulty is that changing some code, even in another part of the project, can make the issue disappear or appear again.

I've managed to create a simple project which demonstrates the same issue and can be run on the EK-TM4C1294XL LaunchPad board. The project just writes a counter to flash from 0x90000 to 0x100000. In my setup the issue disappears if you comment out

vTaskDelayUntil(&ui32LastTime, 30000 / portTICK_RATE_MS);

and un-comment the next line

//while(1){};

in 'FlashWrTask()' function in 'main.c' file.

FlashFailureTest.zip

To run my project, set 'Linker command file' to 'FlashFailureTest.cmd' on the 'Project Properties -> CCS General -> Main' tab, and set the TIVAWARE_DIR variable on the 'C/C++ Build -> Build Variables' tab to your TivaWare path.

The project uses:

FreeRTOS v9.0.0
TivaWare 2.1.4.178
TI v16.9.7.LTC compiler
CCS 7.2.0.00013

Can anybody help me to determine the root cause of this issue?

Is it possibly a bug in hardware? I found a similar issue on the TM4C123 CPU.

With best regards
Dmitry

over 6 years ago

0 cb1_mobile over 6 years ago

Guru 117855 points

Hi - wish I could assist - (Flash "erase/write" - w/MCUs here - is outside my experience.) So often - posters are advised to, "Increase the stack."

The excellence of your post must be noted - nicely detailed - the narrative, screen-caps, and "use listings" are sure to prove helpful to "more competent" others...

0 Bob Crosby over 6 years ago

TI__Guru 72500 points

Hi Dmitry,
Thank you for doing a good job of documenting this issue. I have imported your project and recreated the issue. Just like you I have trouble debugging it because when the CPU is locked, all the information returned is invalid. I am looking into it.

I suspect that for some reason the CPU is trying to do a read or write to the flash bank that is being programmed or erased. In that case the flash wrapper puts the CPU in a hold (very long wait state).

0 cb1_mobile over 6 years ago in reply to Bob Crosby

Guru 117855 points

Hi Bob,

As the CPU is able to "detect" (some) "issue/trouble/irregularity" (proved by its vectoring to the LONG HOLD) - rather than just, "Placing itself on "hold"" - could not (SOME - even slight) identification of the issue - be additionally provided?

Clearly, "Not your job!" - (little "street-lingo" - to "liven up the joint.") - yet perhaps you can "pass upstream" - most anything BEATS, "Being Stuck on (very long) HOLD!"

Firm/I are implementing such "trouble identification" - for a 100A+ BLDC Controller - forced to operate w/in (very) hostile environments. "Identification of such issues" (so that a LIVE, COMBATING DEFENSE may be launched) is now our, "ISSUE #1!" (we never/ever (even) considered a, "LONG Hold!")

0 Bob Crosby over 6 years ago in reply to cb1_mobile

TI__Guru 72500 points

Hi CB1,
At this point it is merely speculation on my part. If the flash is being erased or programmed and the CPU tries to read the flash, the CPU will be held until the erase or program operation completes. It should not be indefinite. Also, a watchdog can bring the part out of this state. Typically, the main flash should only be erased or programmed during a program change.

Unfortunately it might take a while for me to work this one out as I am out of the office the rest of the week. I will try to work on it as I get time.

0 cb1_mobile over 6 years ago in reply to Bob Crosby

Guru 117855 points

Thank you, Bob ... yet (even) if speculative - "Launch into (any) "hold/cessation of expected activity" - without ANY, "Signaling of such fact" - or "Attempt to harvest "- (and convey) "some clue" - appears not quite, "best/brightest" - does it not?

Flash "Erase/Write" IS a complex process - and it would appear "reasonable" - that (some) defensive and/or identifying measures - would prove greatly helpful. Especially so - to (any/all) "becalmed" posters - un-notified - and "waiting helplessly"... Again, "Not your job" - but suggested "improved handling" (surely) trumps, "Wait/Question/Prayer."

0 cb1_mobile over 6 years ago

Guru 117855 points

While claiming (almost) no experience nor qualification w/your MCU - and this, "Flash Erase/Write" issue - "KISS" may (again) rise to your assistance. (one would hope)

Firm/I (also) favor FreeRTOS (which is notably licensed by MIT & Amazon) - yet its use may "cause/contribute" to your issue - is that not so?

Follows several general - KISS-based - suggestions:

Might you (temporarily) "Exit from any RTOS" - and repeat your test. Our intent is to "Minimize the number" of (potentially) IMPACTING Variables.
Are your two "Flash Regions" PROPERLY SPLIT - into their specified (distinct) locations. I believe that such Flash "Erase/Write" may (only - or best) operate between such, "separate regions."
Prior to this operation (again temporarily) might you, "Slow the System Clock" (perhaps by Half) just to see if this impacts your result.
Insertion of "Delays" - unless expressly forbidden - between key/critical portions of your specific "Flash Erase/Write" code sections - often proves useful.
Cannot you "prove" that "Flash Erase" succeeded - prior to launching "Flash Write"? Does that not prove insightful?
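
The last suggestion (proving the erase succeeded before writing) could be sketched as a simple blank check. This is only a minimal sketch: it assumes erased flash on this part reads back as 0xFFFFFFFF, and `region_is_erased` is a hypothetical helper, not a TivaWare function. It takes a plain pointer so it can be tried on a RAM buffer first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of TivaWare): returns true when every
 * 32-bit word in the region holds the assumed erased pattern 0xFFFFFFFF. */
static bool region_is_erased(const volatile uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        if (base[i] != 0xFFFFFFFFu) {
            return false;   /* first non-blank word: erase incomplete */
        }
    }
    return true;
}
```

On the target, the pointer would be the sector base address cast to `const volatile uint32_t *`, checked after the erase call returns and before the program call starts.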

KISS dictates that we, "Decompose the issue into far smaller, more reasonably Measured & Analyzed parts, and then systematically progress - on a "One at a Time" basis.

I offer this as vendor's Bob has noted his delayed return - and your recovery of (some) of the information here (or that you uncover) will likely, "Speed, Ease & Enhance" Bob's Diagnostic Effort (if not solved) prior to his return...

0 Dmitry Govorov over 6 years ago in reply to cb1_mobile

Prodigy 130 points

I'll try to answer your questions:

As I said in my first post, this issue is very sensitive to code being added to or removed from the project. A simple project that just writes to flash in the main loop doesn't demonstrate the issue. I added FreeRTOS because it is used in my 'big' project; my assumption was that the RTOS uses the SysTick timer and that a SysTick interrupt occurring during flash write operations might cause the failure. Even in this 'simple' project, if you comment out the call to 'CheckResetCause()', which is called only once and before the FreeRTOS scheduler starts, the issue doesn't appear.
I don't know what you mean by '"Flash Regions" PROPERLY SPLIT'. They are not split physically, because both are in internal flash. In my 'big' project I've limited the flash size to half of the available internal flash. In this 'simple' project I didn't do this.
I've tried reducing the clock speed to 60MHz 'globally' at the start of the main function, and this fixes the issue. But I can't be sure whether it really fixes the issue or just hides it, because I made a change in the code.
The FlashErase and FlashProgram functions are from TivaWare, and they check the flash busy bit internally. The MAP_FlashErase and MAP_FlashProgram functions are in ROM, and I don't know how they work.

But none of these suggestions answers the question of why all core registers end up with the same value, including the CTRL_FAULT_BASE_PRI register, which is not saved to the stack and shouldn't be corrupted by a stack overflow. In the screenshot you can see where the program is after the failure. I don't think this is a random address. But why do all core registers contain it?

0 cb1_mobile over 6 years ago in reply to Dmitry Govorov

Guru 117855 points

Thank you - have read/reviewed your writing - I've calls into several of our firm's clients - who use the '129 family. (we do not - preferring the 180MHz M4 & 200MHz+ M7 offerings of others - I did note my inexperience w/your MCU...)

You wrote, " I'm using TM4C1294NCPDT microcontroller and split it's internal flash to a two equal parts and also I'm using mirror mode."

And I wrote, "Flash Regions" PROPERLY SPLIT". By this - I noted the fact that " (certain) '129 MCUs employ - "separate & independent" - Flash Bank Pairs." And these appear, "Split Physically." From my experience w/that implementation (upon "others' MCUs") and from several (past) posts here - use of MCUs w/those "Flash Bank Pairs" - often proves best. While I'm not prepared to state that this will (positively) "work for you" - it does merit your consideration...

You note that my suggestion to, "Reduce the System Clock" - "fixes the issue." But you then, "cloud the issue" by adding, (you) "made some changes in code." Any such "changes" were OUTSIDE my suggestion - Anti-KISS - and confounding - were they not? (the goal was to MINIMIZE (Unknowns) & Variables - not introduce new ones.)

Not using - nor "liking the '129" - I've no explanation for your report of, "All core registers containing equal values." Does that common value provide any insight? (is it at least expected and/or "legal?") This repetition of value may occur when a, "Write to a Core Register (unwantedly) "loops" - with the Register Address set to increment. Your review of the pertinent "Flash Erase/Write" code is suggested.

The more advanced MCU family from this vendor (C2000) includes this note: "The Erase operation is a dynamic operation - pulses are applied until the erase is complete - or the maximum number of pulses is reached and the erase fails." Such may - or may not - describe your issue. I did ask that you, "Test your Erase operation" - you did not respond in a manner which I could understand...

I've offered the best guidance I can - thus far "thanks free" - and as an "outsider" - firm's/my ability to answer, "Highly specific (insider) vendor MCU issues" - may not be, "in the cards." I have tried...

0 Genatco over 6 years ago

Guru 54568 points

Hi Dmitry,

From the screenshots in your first post, I notice that the RTOS Debug ROV has not been launched. Debug ROV may help you see whether an M3 module interrupt or other exception flag was thrown by the CPU during flash writes. Is IntMasterDisable() not being called in the flash-write part of the application? That may disrupt debugging if the ICDI settings for F5/F6 stepping are not set correctly.

Mirroring or shadow copies of lower flash sectors into higher ones may require that the 8k page boundary and 256KB blocks be maintained during writes, as CB1 mentions. That would be one reason the CPU may be halting and ICDI loses contact with the DAP. Perhaps ensure MPU is enabled in the ICDI ARM advanced settings, which allows it to run during a fault condition.

0 Dmitry Govorov over 6 years ago in reply to Genatco

Prodigy 130 points

Hi BP101,

I'm using FreeRTOS, and as far as I know Debug ROV doesn't work with it; it works only with TI-RTOS. Also, ICDI isn't disconnected from the target during the failure: I'm still able to observe peripheral registers and can run and pause firmware execution, but I can't single-step (F6).

I'm writing flash in 16k sectors, and the RAM buffer I use as the data source is also aligned on a 16k boundary.

Returning to my project:
The failure occurs while writing the 4th sector each time. I tried pausing the firmware after the 3rd sector and then placing a breakpoint in the SysTick interrupt handler. I see that the SysTick interrupt occurs several times during the flash write operation. The flash write function executes from ROM (in the region at 0x1000000). I resume the firmware (F8) after each breakpoint stop. This way the 4th sector is written successfully and the firmware starts writing the 5th sector. Then I disable the breakpoint in the SysTick handler and run the firmware without breakpoints, and it fails while writing the 8th sector.
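
For reference, the sector arithmetic behind "4th sector" and "8th sector" can be sketched like this. It assumes the test range from the first post (0x90000 up to 0x100000), 16k sectors, and sectors counted from 1 starting at 0x90000; that counting convention and the helper names are assumptions for illustration only.

```c
#include <stdint.h>

#define SECTOR_SIZE  0x4000u    /* 16 KB sector, as described above */
#define REGION_START 0x90000u   /* start of the test range (first post) */
#define REGION_END   0x100000u  /* end of the test range, exclusive */

/* Number of whole sectors in the test range. */
static uint32_t sector_count(void)
{
    return (REGION_END - REGION_START) / SECTOR_SIZE;
}

/* Base address of the n-th sector, counting from 1 (assumed convention). */
static uint32_t sector_base(uint32_t n)
{
    return REGION_START + (n - 1u) * SECTOR_SIZE;
}
```

Under these assumptions the range holds 28 sectors, the 4th sector starts at 0x9C000 and the 8th at 0xAC000.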

Regards,
Dmitry

0 Dmitry Govorov over 6 years ago in reply to Dmitry Govorov

Prodigy 130 points

Hi All,

I have an update. I've managed to make my project fail or not just by adding some NOPs.

See the same project, with some NOPs added in the main function. There are 26 NOPs, and flash operations work well. If you comment out the last (26th) NOP, flash operations fail. If you then keep commenting out NOPs one at a time from the bottom, flash operations keep failing until 21 NOPs are in the code; continue commenting out NOPs and the project works fine until 11 NOPs are in the code. If you comment out the 11th NOP, the project fails again.

FlashFailureTest2.zip

0 cb1_mobile over 6 years ago in reply to cb1_mobile

Guru 117855 points

cb1_mobile said:
Insertion of "Delays" - unless expressly forbidden - between key/critical portions of your specific "Flash Erase/Write" code sections - often proves useful.

As earlier suggested - your insertion of "NOPs" - complies w/the "Insertion of Delays" suggestion - as posted earlier...

It is suspected that the additional, "Slowing of System Clock" - also suggested earlier - will reduce the number of "required NOPs."

As earlier advised - the "combination" of, "Reduced System Clock Speed" and "Insertion of key Delays" - should work to your favor... (has succeeded w/our Cortex M7 devices & M4s (from others))

0 Genatco over 6 years ago in reply to Dmitry Govorov

Guru 54568 points

Dmitry Govorov said:
As I see the SysTick interrupt occurs several times during FlashWrite operation.

Perhaps that is why calling IntMasterDisable() prior to flashing a 16KB block may stop undesired interruptions, with IntMasterEnable() just after. Similar interruptions were noted in UART data transfers from SRAM, where index pointers into the buffer array were being corrupted.

0 Chester Gillon over 6 years ago

Guru 92251 points

Dmitry Govorov said:
Is it possible bug in hardware? I found the similar issue in TM4C123 CPU

Possible TM4C123 hardware bug (TI E2E Community)

Summary of this long thread as of 2/4/2015: There is a serious bug in Tiva TM4C123 microprocessors that may lead to undefined behavior (such as stack corruption) under the following conditions: - A...

That bug in the TM4C123 device was documented as errata MEM#14 "Flash Write Operation During Execute from Flash may Result in Wrong Instruction Fetch". Looking at the "Tiva™ C Series TM4C129x Microcontrollers Silicon Revisions 1, 2, and 3 Errata" SPMZ850G dated March 2017 I can't see a corresponding errata documented for the TM4C129 devices.

At the time of the investigation into the TM4C123 MEM#14 the example program which failed on a TM4C123 didn't fail on a TM4C129 device, believed to be because the TM4C129 used a different flash architecture.

However, based upon your description of the problem on a TM4C129 device, in that changing NOPs can make the problem come and go, suspect you have encountered a bug in the TM4C129.

0 cb1_mobile over 6 years ago in reply to Chester Gillon

Guru 117855 points

Love the investigatory "proof/detail" - which most always accompany your postings.

You properly note, re: TM4C123 MEM#14: "the example program which failed on a TM4C123 didn't fail on a TM4C129 device, believed to be because the TM4C129 used a different flash architecture." Yet - was the "offending program" (properly) mated/constrained to the (lesser) TM4C123? (had You run the program - that surely would have been implemented correctly.)

Has it been established that the example program WAS well suited to TM4C123?

Another point - there are (multiple) versions of TM4C129 - and the Flash Architecture varies. Use of the segregated/independent, "Flash Bank Pairs" (as past suggested, this thread) appears to offer the, "Highest Odds of Success!" It appears useful then - to employ a '129 family member" - equipped w/those, "Flash Bank Pairs."

As reported - other (alien) ARM MCUs have reported similar such issues. These issues (usually) have not been promoted to "bugs" - instead the use of, "Conservative System Clock" AND proper (sufficient) Delays - between discrete Flash operations - along w/use of "Flash Bank Pairs" (when/where available) has proven to succeed...

0 Chester Gillon over 6 years ago in reply to cb1_mobile

Guru 92251 points

cb1_mobile said:
You properly note, re: TM4C123 MEM#14: "the example program which failed on a TM4C123 didn't fail on a TM4C129 device, believed to be because the TM4C129 used a different flash architecture." Yet - was the "offending program" (properly) mated/constrained to the (lesser) TM4C123? (had You run the program - that surely would have been implemented correctly.)

I started with a program which failed on a TM4C123GH6PM and then ported the program to a TM4C1294NCPDT (which didn't fail). Unfortunately I can't find the version of the program for the TM4C1294NCPDT to find exactly what I changed.

Therefore, will investigate with the example from Dmitry.

0 Chester Gillon over 6 years ago in reply to Dmitry Govorov

Guru 92251 points

Dmitry Govorov said:
See the same project, but I add some NOPs in the main function. There are 26 NOPs. And Flash operations work well. If you comment out the last 26th NOP, the Flash operations will fail.

I can repeat the failure, using the same version of the TI compiler and TivaWare but using CCS 8.0.0.00016.

For the tests used a TM4C129ENCPDT part revision 3.

A summary of the tests so far is:

1) The unmodified program ran successfully.

2) Used a XDS110 and the CCS Statistical Function Profiling, with the default of sampling the PC every 1024 cycles, to capture a profile of a successful run.

3) Commented out the last 26th NOP and the program started failing. After most failures the debugger was unable to halt the target. Used the CCS Statistical Function Profiling to capture a profile of a failed run. The Statistical Function Profiling appears to show the program get stuck in the xPortPendSVHandler() function since the trace ends with all sampled PC values being the same address 0x55ae which is the bx r14 instruction at the end of xPortPendSVHandler().

4) Taking the previous failing program, if a hardware breakpoint is set on xPortPendSVHandler and then the program is resumed when the breakpoint is hit the program runs successfully to completion. I have performed multiple runs and:

a) Without a breakpoint on xPortPendSVHandler the program fails.

b) With a breakpoint on xPortPendSVHandler, and resuming when the breakpoint is hit, the program does complete successfully.

Don't yet understand what the root cause is, but given setting a breakpoint can make the problem go away suspect some timing issue.

0 Chester Gillon over 6 years ago in reply to Chester Gillon

Guru 92251 points

Chester Gillon said:
b) With a breakpoint on xPortPendSVHandler and resuming when the breakpoint is hit and the program does complete successfully.

Also, setting a breakpoint on xPortPendSVHandler with a non-zero Skip Count, so that the debugger automatically resumes after the breakpoint has been hit, allows the program to complete successfully. The Skip Count is handled by the CCS IDE, so the target will be halted on the order of milliseconds before being resumed.

With the TM4C123 bug MEM#14 the act of single stepping the test program was also sufficient to prevent a failure.

0 cb1_mobile over 6 years ago in reply to Chester Gillon

Guru 117855 points

Spectacular display of time/effort on your part Chester - thank you - very much appreciated.

Is it not now (almost) obvious that the, "Insertion of Delays" - at key/critical portions of user code - surely adds "robustness" - and may even lead to program's success?

Such, "Insertion of Delays" was urged right here - 3 days past...

From firm's & my experience - such "Flash Erase/Writes" INDEED REQUIRE a "Guard-Band" - even if - and ESPECIALLY IF - they "appear" to succeed - presently. This proves so as such operations are sure to "INCREASE in the time required to "fully/properly" execute - and these times will (only) expand as the device ages - and proves "temperature sensitive" - as well.

Guard-Bands - provided by Delays - offer "Inexpensive yet effective Insurance" - and have been "proven to succeed" with (other/alien) ARM MCUs - and (now) w/those here - as well...

0 Chester Gillon over 6 years ago in reply to cb1_mobile

Guru 92251 points

cb1_mobile said:
Is it not now (almost) obvious that the, "Insertion of Delays" - at key/critical portions of user code - surely adds "robustness" - and may even lead to program's success?

The issue is determining the "key/critical" portions of code at which to insert a delay to provide a robust fix for the underlying cause, rather than masking the cause.

With the example program from Dmitry the act of inserting a NOP in the main function, which doesn't program the flash, can make the failure "go away".

The example program leaves the Flash Configuration Register (FLASHCONF) at its default value of zero. If after loading the failing program with the program halted at main:

a) If I start the program running with the Flash Configuration Register at its default value, the program fails.

b) If I use the debugger to set the FPFOFF (Force Prefetch Off) bit in the Flash Configuration Register before running the program, the program runs successfully.

Therefore, I think the flash prefetch buffers are having some effect on the problem. Inserting a NOP into the main function can make the failure "go away", but that can shuffle the addresses of other functions in memory, so I need to find which function is sensitive to its address alignment relative to the flash prefetch buffers.
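
For anyone wanting to repeat test b) from software rather than from the debugger, a minimal sketch follows. The register address and bit mask are assumptions taken from memory of TivaWare's inc/hw_flash.h and should be checked against the TM4C129x data sheet before use; `flash_prefetch_force_off` is a hypothetical helper name. This reproduces the experiment only and is not a confirmed fix.

```c
#include <stdint.h>

/* Assumed values, to be checked against the TM4C129x data sheet and
 * TivaWare's inc/hw_flash.h before use. */
#define FLASHCONF_ADDR   0x400FDFC8u   /* assumed FLASHCONF register address */
#define FLASHCONF_FPFOFF 0x00010000u   /* assumed Force Prefetch Off bit */

/* Sets FPFOFF with a read-modify-write so the other bits are preserved. */
static void flash_prefetch_force_off(volatile uint32_t *flashconf)
{
    *flashconf |= FLASHCONF_FPFOFF;
}
```

On the target the call would be `flash_prefetch_force_off((volatile uint32_t *)FLASHCONF_ADDR);` early in main, before the first flash erase or program operation.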

0 cb1_mobile over 6 years ago in reply to Chester Gillon

Guru 117855 points

I would note that my (again earlier) suggestion of employing MCUs (properly) equipped w/the, "DUAL BANKED FLASH" - along w/the insertion of key/critical "Delays" - specifically address the, "Sensitivity imposed by the Flash Prefetch Buffers!"

Importantly - both w/in Vendor's MCUs - AND those of, "others!" Such (near) universal solution markedly trumps, "One & only one vendor" (solution)!

0 Chester Gillon over 6 years ago in reply to Dmitry Govorov

Guru 92251 points

Dmitry Govorov said:
See the same project, but I add some NOPs in the main function. There are 26 NOPs. And Flash operations work well. If you comment out the last 26th NOP, the Flash operations will fail.

Following the previous tests, which show the failure seems related to the flash prefetch buffers interacting with the xPortPendSVHandler function, I looked at the addresses of instructions in the xPortPendSVHandler function with respect to the boundaries of the 32-byte flash prefetch buffers. For these tests I ran the program without any breakpoints and with the Flash Configuration Register at its default value.

This is the disassembly of xPortPendSVHandler of the working program with 26 NOPs in main:

          xPortPendSVHandler():
00005554:   F3EF8009            mrs        r0, psp
109       	isb
00005558:   F3BF8F6F            isb        sy
112       	ldr	r3, pxCurrentTCBConst
0000555c:   F85F302C            ldr.w      r3, [pc, #-0x2c]
113       	ldr	r2, [r3]
00005560:   681A                ldr        r2, [r3]
116       	tst r14, #0x10
00005562:   F01E0F10            tst.w      lr, #0x10
117       	it eq
00005566:   BF08                it         eq
118       	vstmdbeq r0!, {s16-s31}
00005568:   ED208A10            vstmdb     r0!, {s16, s17, s18, s19, s20, s21, s22, s23, s24, s25, s26, s27, s28, s29, s30, s31}
121       	stmdb r0!, {r4-r11, r14}
0000556c:   E9204FF0            stmdb      r0!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
124       	str r0, [r2]
00005570:   6010                str        r0, [r2]
126       	stmdb sp!, {r3}
00005572:   F84D3D04            str        r3, [sp, #-0x4]!
127       	ldr r0, ulMaxSyscallInterruptPriorityConst
00005576:   F85F0040            ldr.w      r0, [pc, #-0x40]
128       	ldr r1, [r0]
0000557a:   6801                ldr        r1, [r0]
129       	msr basepri, r1
0000557c:   F3818811            msr        basepri, r1
130       	dsb
00005580:   F3BF8F4F            dsb        sy
131       	isb
00005584:   F3BF8F6F            isb        sy
132       	bl vTaskSwitchContext
00005588:   F7FBFC46            bl         #0xe18
133       	mov r0, #0
0000558c:   F04F0000            mov.w      r0, #0
134       	msr basepri, r0
00005590:   F3808811            msr        basepri, r0
135       	ldmia sp!, {r3}
00005594:   F85D3B04            ldr        r3, [sp], #4
138       	ldr r1, [r3]
00005598:   6819                ldr        r1, [r3]
139       	ldr r0, [r1]
0000559a:   6808                ldr        r0, [r1]
142       	ldmia r0!, {r4-r11, r14}
0000559c:   E8B04FF0            ldm.w      r0!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
146       	tst r14, #0x10
000055a0:   F01E0F10            tst.w      lr, #0x10
147       	it eq
000055a4:   BF08                it         eq
148       	vldmiaeq r0!, {s16-s31}
000055a6:   ECB08A10            vldmia     r0!, {s16, s17, s18, s19, s20, s21, s22, s23, s24, s25, s26, s27, s28, s29, s30, s31}
150       	msr psp, r0
000055aa:   F3808809            msr        psp, r0
151       	isb
000055ae:   F3BF8F6F            isb        sy
152       	bx r14
000055b2:   4770                bx         lr

And this is the disassembly of xPortPendSVHandler of the failing program with 25 NOPs in main:

          xPortPendSVHandler():
00005550:   F3EF8009            mrs        r0, psp
109       	isb
00005554:   F3BF8F6F            isb        sy
112       	ldr	r3, pxCurrentTCBConst
00005558:   F85F302C            ldr.w      r3, [pc, #-0x2c]
113       	ldr	r2, [r3]
0000555c:   681A                ldr        r2, [r3]
116       	tst r14, #0x10
0000555e:   F01E0F10            tst.w      lr, #0x10
117       	it eq
00005562:   BF08                it         eq
118       	vstmdbeq r0!, {s16-s31}
00005564:   ED208A10            vstmdb     r0!, {s16, s17, s18, s19, s20, s21, s22, s23, s24, s25, s26, s27, s28, s29, s30, s31}
121       	stmdb r0!, {r4-r11, r14}
00005568:   E9204FF0            stmdb      r0!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
124       	str r0, [r2]
0000556c:   6010                str        r0, [r2]
126       	stmdb sp!, {r3}
0000556e:   F84D3D04            str        r3, [sp, #-0x4]!
127       	ldr r0, ulMaxSyscallInterruptPriorityConst
00005572:   F85F0040            ldr.w      r0, [pc, #-0x40]
128       	ldr r1, [r0]
00005576:   6801                ldr        r1, [r0]
129       	msr basepri, r1
00005578:   F3818811            msr        basepri, r1
130       	dsb
0000557c:   F3BF8F4F            dsb        sy
131       	isb
00005580:   F3BF8F6F            isb        sy
132       	bl vTaskSwitchContext
00005584:   F7FBFC48            bl         #0xe18
133       	mov r0, #0
00005588:   F04F0000            mov.w      r0, #0
134       	msr basepri, r0
0000558c:   F3808811            msr        basepri, r0
135       	ldmia sp!, {r3}
00005590:   F85D3B04            ldr        r3, [sp], #4
138       	ldr r1, [r3]
00005594:   6819                ldr        r1, [r3]
139       	ldr r0, [r1]
00005596:   6808                ldr        r0, [r1]
142       	ldmia r0!, {r4-r11, r14}
00005598:   E8B04FF0            ldm.w      r0!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
146       	tst r14, #0x10
0000559c:   F01E0F10            tst.w      lr, #0x10
147       	it eq
000055a0:   BF08                it         eq
148       	vldmiaeq r0!, {s16-s31}
000055a2:   ECB08A10            vldmia     r0!, {s16, s17, s18, s19, s20, s21, s22, s23, s24, s25, s26, s27, s28, s29, s30, s31}
150       	msr psp, r0
000055a6:   F3808809            msr        psp, r0
151       	isb
000055aa:   F3BF8F6F            isb        sy
152       	bx r14
000055ae:   4770                bx         lr

For the failing program, the isb instruction that follows the dsb in xPortPendSVHandler (the "1st isb" in the results below) is at address 0x5580, which is the start of a flash prefetch buffer boundary, whereas in the working program the same isb instruction is at address 0x5584. The ARM documentation for the isb instruction is:

ISB acts as an instruction synchronization barrier. It flushes the pipeline of the processor, so that all instructions following the ISB are fetched from cache or memory again, after the ISB instruction has been completed.

To investigate whether the failure was sensitive to the address alignment of the isb instruction relative to the flash prefetch buffer boundary, I made two changes in the third_party\FreeRTOS\Source\portable\CCS\ARM_CM4F\portasm.asm source file:

a) Changed the .align directive for the xPortPendSVHandler and following vPortSVCHandler function from 4 to 32 bytes (the size of a flash prefetch buffer):

	.align 32
xPortPendSVHandler: .asmfunc

<snip>

	.align 32
vPortSVCHandler: .asmfunc

b) In the xPortPendSVHandler function, inserted an increasing number of NOPs between the pair of dsb and isb instructions.

The effect of this is to control the address alignment of the isb instructions in the xPortPendSVHandler function relative to the flash prefetch buffer boundaries.

The results are the following (with 3 runs for each row to check always got the same result):

Number of NOPs between dsb and isb	Address of 1st isb	Address of 2nd isb	Program result
0	0x55b0	0x55da	Pass
1	0x52b2	0x52dc	Pass
2	0x52b4	0x52de	Pass
3	0x52b6	0x52e0	Pass
4	0x52b8	0x52e2	Pass
5	0x52ba	0x52e4	Fail - all core registers the same value when halted. Saw 0x00005CA3, 0xffffffff or 0x0
6	0x52bc	0x52e6	Fail - all core registers the same value when halted. Saw 0x00005CA3 or 0x0
7	0x52be	0x52e8	Fail - all core registers the same value when halted. Saw 0x00005CA3 or 0xffffffff
8	0x52c0	0x52ea	Fail - all core registers the same value when halted. Saw 0x00005CA3, 0x0 or 0xffffffff
9	0x52c2	0x52ec	Fail - all core registers the same value when halted. Saw 0x00005CA3, 0x0 or 0xffffffff
10	0x52c4	0x52ee	Pass
11	0x52c6	0x52f0	Pass
12	0x52c8	0x52f2	Pass
13	0x52ca	0x52f4	Pass
14	0x52cc	0x52f6	Pass
15	0x52ce	0x52f8	Pass

Based upon the above tests when the 1st isb instruction in xPortPendSVHandler has an address offset of 0, 2, 26, 28 or 30 bytes relative to a flash prefetch buffer boundary then the program fails.
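
The offsets quoted above are just the isb address modulo 32. As a minimal sketch for checking other builds against the same pattern (the helper names are hypothetical, and the "failed" set only encodes the results observed in the table, not a documented rule):

```c
#include <stdbool.h>
#include <stdint.h>

#define PREFETCH_LINE_BYTES 32u  /* flash prefetch buffer size stated above */

/* Byte offset of an address within its 32-byte prefetch buffer. */
static uint32_t prefetch_offset(uint32_t addr)
{
    return addr % PREFETCH_LINE_BYTES;
}

/* True for the offsets that failed in the table above (0, 2, 26, 28, 30).
 * This only records the observed results; it is not a documented rule. */
static bool offset_failed_in_tests(uint32_t addr)
{
    switch (prefetch_offset(addr)) {
    case 0u: case 2u: case 26u: case 28u: case 30u:
        return true;
    default:
        return false;
    }
}
```

For example, 0x5580 gives offset 0 (the failing build) and 0x5584 gives offset 4 (the working build), matching the two disassembly listings.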

Hopefully TI will be able to determine whether this program has exposed an erratum in the TM4C129 devices with regard to the address placement of isb instructions when the flash prefetch buffer is in use and a program running in flash is programming flash sectors.

In the meantime, a work-around may be to ensure the xPortPendSVHandler function is given 32-byte alignment so that the problematic address offset for its isb instructions does not occur as a result of other changes to the program.

0 cb1_mobile over 6 years ago in reply to Chester Gillon

Guru 117855 points

Whether your tests, "Prove the cause - or not" - your great (and focused) effort is commended.

That said - if the test you created explores (only, or primarily) "the address alignment of the isb instructions" - such may not confirm that the, "entirety of the "Erase/Write" process" has (fully) succeeded...

It (would) prove of interest if the "failures" you've just identified are impacted by:

"System Clock" reduced (say to 60MHz)
MCU chosen employs, "dual bank" Flash - and the "Erase/Write" occurs (between) each bank
Extending the "size" of the data transfer

Again - when employing the "IAR" IDE - and (other) ARM Cortex MCUs (both M4 & M7) - and complying (as above) - no such "MCU Hang" - as o.p. reports - occurs...

0 Chester Gillon over 6 years ago in reply to cb1_mobile

Guru 92251 points

cb1_mobile said:
such may not confirm that the, "entirety of the "Erase/Write" process" has (fully) succeeded...

The example program checks the return result from MAP_FlashErase() and MAP_FlashProgram() functions and there are no errors reported from these functions. When the program fails it crashes, rather than reporting that the flash erase or program failed.

cb1_mobile said:
MCU chosen employs, "dual bank" Flash - and the "Erase/Write" occurs (between) each bank

The TM4C1294NCPDT has a 512 KB lower flash bank and a 512 KB upper flash bank. The example program is running from the lower bank and erasing / programming the upper bank.
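
As a rough illustration of the "run from one bank, program the other" arrangement, the sketch below picks the bank that does not hold the running code. It ignores mirror mode, the function names are hypothetical, and the 16 KB erase sector size is an assumption that should be checked against the device datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_BANK_BYTES   0x80000u /* 512 KB per bank, as stated above */
#define FLASH_SECTOR_BYTES 0x4000u  /* assumption: 16 KB erase sector   */

/* Base address of the bank that is NOT holding the running code. */
static uint32_t inactive_bank_base(uint32_t running_addr)
{
    assert(running_addr < 2u * FLASH_BANK_BYTES);
    return (running_addr < FLASH_BANK_BYTES) ? FLASH_BANK_BYTES : 0u;
}

/* Number of erase operations needed to cover len bytes, rounded up. */
static uint32_t sectors_to_erase(uint32_t len)
{
    return (len + FLASH_SECTOR_BYTES - 1u) / FLASH_SECTOR_BYTES;
}
```

With code running from the lower bank, inactive_bank_base() returns 0x80000, which matches the example program erasing and programming the upper bank.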

0 cb1_mobile over 6 years ago in reply to Chester Gillon

Guru 117855 points

Again - in "NO WAY" is my writing to be interpreted as "criticism" of your "great" effort. As often revealed here - you are, "Among the few" - who make consistent efforts to "Aid others." Bravo!

Chester Gillon said:
The example program checks the return result from MAP_FlashErase() and MAP_FlashProgram() functions and there are no errors reported from these functions. When the program fails it crashes, rather than reporting that the flash erase or program failed

Do keep in mind that the "Error Checking" methodology - which (proves) "INCAPABLE of IDENTIFYING ANY ERROR SOURCE" (beyond "INEPTLY" Crashing) - (may) not - at all times - prove conclusive! In light of the (very) light (i.e. non-existent) "Error Identification" - "How are we assured that Error Reporting proves 'always' perfect?" (I base this possibility upon the NOTABLE ABSENCE of "ANY Error Identification" - which (may) indicate a "Less than Inspired" Error Reporting Methodology - as well.)

In your opinion - might (other) influences have impacted the reliability of such, "Flash Erase/Write" objective? To include:

Choice of RTOS ... (FreeRTOS) in this case
Choice of IDE ... (CCS here)
Optimization Settings - and their (possible) impact upon the resulting ASM - (if any) ... (Unknown, here)

Not being a "fan" of the '129 - or CCS (we employ neither) - I was unable to "tease out" these answers via our IAR IDE. I can report that, "Flash Erase/Write" - when complying w/"Dual Bank" - and the (sometimes) addition of strategic delay (between Flash erase & write) and performed under "IAR" - has enabled "great success" w/(others') Cortex M4/M7 - operating @ 180MHz and beyond...

0 Chester Gillon over 6 years ago in reply to cb1_mobile

Guru 92251 points

cb1_mobile said:
It (would) prove of interest if the "failures" you've just identified are impacted by:

"System Clock" reduced (say to 60MHz)

I have performed some more investigations into what can make the crash come-and-go. The initial conditions were:

TM4C129ENCPDT rev A2
Clock frequency of 120 MHz.
Calling ROM_FlashErase to erase flash.
Calling ROM_FlashProgram to program flash.
xPortPendSVHandler() in flash with 32-byte alignment.
Using TivaWare_C_Series-2.1.4.178.

Summary of tests:

Change to initial condition	Result of adjusting the number of NOPs in xPortPendSVHandler() between dsb and isb
None	With 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 15 NOPs the test runs to completion. With 5, 6, 7, 8, 9 NOPs the test crashes.
Use FlashErase from TivaWare (running from flash), instead of ROM_FlashErase (running from ROM)	With 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 15 NOPs the test runs to completion. With 5, 6, 7, 8, 9 NOPs the test crashes.
Use FlashProgram from TivaWare (running from flash), instead of ROM_FlashProgram (running from ROM)	With 0 to 15 NOPs the test runs to completion.
Use FlashProgram from TivaWare (running from SRAM), instead of ROM_FlashProgram (running from ROM)	With 0 to 15 NOPs the test runs to completion.
Place the portasm.obj containing xPortPendSVHandler() in SRAM rather than flash	With 0 to 15 NOPs the test runs to completion.
Reduce the clock frequency from 120 MHz to 80 MHz	With 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15 NOPs the test runs to completion. With 9 NOPs the test crashes.
Set the FLASH_CONF_FPFOFF bit to force the flash prefetch off	With 0 to 15 NOPs the test runs to completion.
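
The last row of the table can be sketched as a read-modify-write of the flash configuration register. This is only a sketch: the register address and bit values below are assumptions that mirror the names in TivaWare's inc/hw_flash.h and should be verified against that header, and the function name is hypothetical. The register is passed in as a pointer so the logic can also be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; check them against inc/hw_flash.h before use. */
#define FLASH_CONF_ADDR   0x400FDFC8u /* FLASH_CONF register address */
#define FLASH_CONF_FPFON  0x00020000u /* force prefetch on           */
#define FLASH_CONF_FPFOFF 0x00010000u /* force prefetch off          */

/* Force the flash prefetch buffers off, leaving other bits untouched. */
static void flash_prefetch_force_off(volatile uint32_t *flash_conf)
{
    uint32_t value;

    assert(flash_conf != NULL);
    value = *flash_conf;
    value &= ~FLASH_CONF_FPFON;  /* do not request "on" and "off" together */
    value |= FLASH_CONF_FPFOFF;
    *flash_conf = value;
}
```

On a target this would be called as flash_prefetch_force_off((volatile uint32_t *)FLASH_CONF_ADDR) before the first flash erase or program operation.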

Here "crash" means the program stops reporting progress to the UART after 3 or 4 flash sectors have been erased / programmed. After a "crash" the CCS debugger can't be used to perform a post-mortem of what has happened. The errors reported by the CCS debugger after a "crash" vary according to the debug probe used:

Debug probe	Result of trying to post-mortem after a crash
Stellaris ICDI (JTAG)	Can suspend the program. However, all core registers show the same value (where the value displayed in all core registers can vary from run to run). Attempting to step reports the error: CORTEX_M4_0: Can't Single Step Target Program
XDS110 (JTAG or SWD)	Can't suspend with the following error reported: CORTEX_M4_0: Trouble Halting Target CPU: (Error -2062 @ 0x0) Unable to halt device. Reset the device, and retry the operation. If error persists, confirm configuration, power-cycle the board, and/or try more reliable JTAG settings (e.g. lower TCLK). (Emulation package 7.0.188.0)
Segger J-Link (JTAG or SWD)	Can't suspend with the following error reported: CORTEX_M4_0: Trouble Halting Target CPU: Halt failed!
Blackhawk USB560-M (JTAG)	Can't suspend with the following error reported: CORTEX_M4_0: Trouble Halting Target CPU: (Error -2062 @ 0x0) Unable to halt device. Reset the device, and retry the operation. If error persists, confirm configuration, power-cycle the board, and/or try more reliable JTAG settings (e.g. lower TCLK). (Emulation package 7.0.188.0)

Therefore, the "crash" puts the Cortex-M4F core into a state where the debugger is locked out. A reset is needed before the debugger can connect again.

I haven't yet managed to determine the root cause of the crash, but using the Statistical Function Profiler to sample the program counter every 896 clocks shows the following:

a) The program is in ROM_FlashProgram().

b) Get two samples in FreeRTOS context switching functions, which I think are a result of the timer interrupt used by FreeRTOS.

c) All remaining samples are the same program counter, which is for the "bx lr" at the end of xPortPendSVHandler().

0 Dmitry Govorov over 6 years ago in reply to Chester Gillon

Prodigy 130 points

Hello Chester,

This is great work you are doing to resolve this issue.

I can just add some info about b) Get two samples in FreeRTOS context switching functions, which I think are a result of the timer interrupt used by FreeRTOS.
I saw 2-3 SysTick timer interrupts while the MAP_FlashErase() and MAP_FlashProgram() functions were running.

As a quick fix for the issue I did several things:

Use MAP_ versions of the Erase and Write functions, which run from ROM
Move the SysTickTimerISR handler to RAM
Move the parent function that calls the Erase and Write functions to RAM

Of course this does not resolve the root cause of the issue. But maybe it can give you some ideas.

P.S. Can you explain how to get functions from TivaWare to run from SRAM? My only idea is to copy the source of these functions into my file and then add '__attribute__((ramfunc))' before them.

0 Chester Gillon over 6 years ago in reply to Dmitry Govorov

Guru 92251 points

Dmitry Govorov said:
P.S. Can you explain how to get functions from TivaWare to run from SRAM? My only idea is to copy the source of these functions into my file and then add '__attribute__((ramfunc))' before them.

I edited the linker command file to manually add the TivaWare FlashErase() and FlashProgram() functions to the .TI.ramfunc section:

    .TI.ramfunc : {*(.text:FlashErase) *(.text:FlashProgram)} load=FLASH, run=SRAM, table(BINIT)

This works because the TivaWare library has been compiled with each function placed in its own section.

0 Dmitry Govorov over 6 years ago in reply to Chester Gillon

Prodigy 130 points

Thanks. Good trick.

0 Chester Gillon over 6 years ago in reply to Dmitry Govorov

Guru 92251 points

Dmitry Govorov said:
Of course this does not resolve the root cause of the issue. But maybe it can give you some ideas.

I still haven't determined the root cause.

The MSP432E devices seem to be based upon the TM4C129 devices, and I created an equivalent test program for an MSP432E401Y which fails with the same symptoms - see MSP432E401Y: If FreeRTOS context switch occurs during a flash programming operation processor can crash

0 WJ over 6 years ago in reply to Chester Gillon

Intellectual 340 points

Hi Dmitry,

We are also using the TM4C1294NCPDT and, at our customer's request, we have implemented an update of the application part.

FYI, we have no issue updating the flash.

Perhaps some points to think about:

- SysClk is set to 120 MHz
       /* set sysClock '120MHz' from crystal '25MHz' */
       sysClock = MAP_SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ |
                                           SYSCTL_OSC_MAIN |
                                           SYSCTL_USE_PLL |
                                           SYSCTL_CFG_VCO_480), 120000000 );

- we use the 'ROM_' versions of all flash service functions
( ROM_FlashErase(); ROM_FlashProgram() )

- each 'ROM_FlashProgram()' portion (write length) is 512 bytes

- after 'ROM_FlashProgram()' we delay before programming the next flash portion
      /* delay */
      ROM_SysCtlDelay( sysClock/100uL );
      Note: in your case the delay may need to be a little longer, because
      in the meantime we also read from the USB memory stick.

- no watchdog is in use (still disabled)

- no RTOS is in use
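
The "512-byte portion, then delay" pattern from the list above can be sketched as follows. This is a minimal, hedged sketch rather than the code used in the project: program_in_chunks(), the callback types and the fake_ helpers are hypothetical names, and the callbacks stand in for thin wrappers around ROM_FlashProgram() and ROM_SysCtlDelay() so the loop can be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 512u /* portion size mentioned in the list above */

/* Same shape as TivaWare's FlashProgram(): returns 0 on success. */
typedef int32_t (*program_fn)(uint32_t *data, uint32_t addr, uint32_t len);
typedef void (*delay_fn)(void);

/* Program len bytes in portions of up to 512 bytes, delaying after each. */
static int32_t program_in_chunks(uint32_t *data, uint32_t addr, uint32_t len,
                                 program_fn program, delay_fn delay)
{
    /* Flash programming works on whole, word-aligned 32-bit words. */
    assert(addr % 4u == 0u && len % 4u == 0u);

    while (len > 0u) {
        uint32_t n = (len < CHUNK_BYTES) ? len : CHUNK_BYTES;
        int32_t rc = program(data, addr, n);

        if (rc != 0) {
            return rc;  /* stop at the first reported failure */
        }
        delay();
        data += n / 4u; /* data is a word pointer, n is in bytes */
        addr += n;
        len -= n;
    }
    return 0;
}

/* Host-side stand-ins (hypothetical) that only count the calls. */
static uint32_t g_program_calls;
static uint32_t g_delay_calls;

static int32_t fake_program(uint32_t *data, uint32_t addr, uint32_t len)
{
    (void)data; (void)addr; (void)len;
    g_program_calls++;
    return 0;
}

static void fake_delay(void)
{
    g_delay_calls++;
}
```

On a target, the two callbacks would wrap ROM_FlashProgram() and ROM_SysCtlDelay( sysClock/100uL ). Since the SysCtlDelay loop takes 3 cycles per iteration, that delay is roughly 30 ms.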

Maybe it helps.

Best regards
WJ
