Hi,
I am currently checking the CPU behaviour in various conditions of SEU and MEU errors in RAM.
1) First, I want to refer to another post, in which I think there is a mistake: https://e2e.ti.com/support/microcontrollers/hercules/f/312/t/630389
For the third point of the original question of this post, the answer is that ECC generation is not disabled when setting ECC DETECT EN field of RAMCTRL register to 0x5.
With the testing performed recently, I would say that this is wrong. Here is the test procedure:
>> ECC memory reads are identical
Same test with letting the ECC enabled in RAM:
>> ECC memory reads are different due to ECC enabling and different values written
To avoid confusion, I recommend adding this point to the original post, or invalidating the given answer.
2) According to Table 8-1 of the TMS570LC4357 TRM, the L2RAMW is supposed to generate a bus error when a double-bit Read-Modify-Write (RMW) error is detected during a sub-64-bit write by the Cortex-R5F.
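With point 1) clarified, I was able to inject a double-bit fault in RAM and perform a 16-bit write from the core to check this behaviour. I noticed the following:
- if the core MPU is disabled (using the default memory map documented in Cortex-R5 TRM Table 7-1) >> no abort is generated
- if the core MPU is enabled with one region for the RAM configured as DEVICE or NORMAL memory >> no abort is generated
- if the core MPU is enabled with one region for the RAM configured as STRONGLY-ORDERED memory >> a data abort is generated and the data fault is logged in the Cortex-R5 DFSR register.
- independently of the MPU configuration, the ESM group 3 channel 3 is triggered as expected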
As far as I understand, the "bus error" documented in the TMS570LC4357 TRM is related to the Cortex-R5 TRM "External faults" documented in chapter "8.3.1 Faults > External Faults". Am I correct on this point?
By the following sentence "Non-exclusive stores to normal-type or device-type memory generate asynchronous aborts", I understand that when the MPU is configured as NORMAL or DEVICE memory for the RAM region, an asynchronous abort should be generated when executing a non-exclusive store, but even with a while loop after the 16-bit write to RAM is performed, the data abort is not generated.
This behaviour does not seem consistent with the documentation. Could you clarify the normal and expected behaviour?
If this behaviour is incorrect, is it related to the erratum "DEVICE#40" documented in the Silicon B errata document?
Best regards,
Gael
Hello Gael,
First, for the easy part of your questions, I do not believe that this is related to the known issue Device#40 since the issue described there is for accesses to unimplemented addresses/memory locations within those peripheral frames.
For the rest of your comments and questions, I need to consult with one of our former design leads and a device expert on this device. I will get back to you soon with additional comments and, possibly, some follow-up questions.
Hi,
On your first question about the answer provided by Chuck in the other post, I think he was referring to the fact that the ECC checking at the CPU cannot be disabled. You can disable the ECC code generation at the RAM wrapper level as you have demonstrated.
Gael Le Moing said:
- if the core MPU is disabled (using the default memory map documented in Cortex-R5 TRM Table 7-1) >> no abort is generated
- if the core MPU is enabled with one region for the RAM configured as DEVICE or NORMAL memory >> no abort is generated
- if the core MPU is enabled with one region for the RAM configured as STRONGLY-ORDERED memory >> a data abort is generated and the data fault is logged in the Cortex-R5 DFSR register.
- independently of the MPU configuration, the ESM group 3 channel 3 is triggered as expected
Your observation is correct. When the RAM wrapper fails the ECC check during a Read-Modify-Write operation, it generates a bus error signal back to the CPU. Note that this error is associated with the sub-word write operation that you perform. How did you write to the RAM when the region was configured as NORMAL or DEVICE? Note that the Cortex-R5F can perform write merging. It may have merged multiple sub-word writes into a single 64-bit write, in which case the entire 64 bits along with the 8-bit ECC will be written to the RAM, overwriting what you previously had there. A complete 64-bit write does not trigger a Read-Modify-Write operation.
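The distinction between a sub-word write and a full 64-bit write can be sketched as a small host-side model. This is only an illustration of the behaviour described above, not the actual L2RAMW logic; the type `write_result_t` and the function `model_l2ramw_write` are hypothetical names introduced for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Width of one ECC-protected word in the RAM: 64 data bits + 8 ECC bits. */
#define ECC_WORD_BYTES 8u

typedef enum { WRITE_OK, WRITE_BUS_ERROR } write_result_t;

/* Hypothetical model: a write narrower than 64 bits forces the wrapper to
 * read the stored word first (read-modify-write), so an uncorrectable error
 * already present in that word is detected. A full 64-bit write replaces
 * data and ECC together, so no read and no ECC check take place. */
static write_result_t model_l2ramw_write(size_t nbytes,
                                         bool stored_word_has_double_bit_error)
{
    bool needs_rmw = nbytes < ECC_WORD_BYTES;

    if (needs_rmw && stored_word_has_double_bit_error) {
        return WRITE_BUS_ERROR;
    }
    return WRITE_OK;
}
```

In this model a 16-bit write to a corrupted word reports a bus error, while the same data arriving as part of a merged 64-bit write does not.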
Gael Le Moing said: As far as I understand, the "bus error" documented in the TMS570LC4357 TRM is related to the Cortex-R5 TRM "External faults" documented in chapter "8.3.1 Faults > External Faults". Am I correct on this point?
Your understanding is correct.
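When a data abort is taken, the fault status field of the DFSR indicates whether the external abort was synchronous or asynchronous. A minimal sketch of the decoding, assuming the ARMv7-R DFSR layout (FS[3:0] in bits 3:0, FS[4] in bit 10, RW in bit 11); the helper names are hypothetical, and reading the DFSR itself (a CP15 access) is left out so that the code can be compiled on a host:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fault status encodings for external aborts in the ARMv7-R DFSR. */
#define DFSR_FS_SYNC_EXTERNAL_ABORT  0x08u  /* 0b01000 */
#define DFSR_FS_ASYNC_EXTERNAL_ABORT 0x16u  /* 0b10110 */

/* Reassemble the 5-bit fault status from FS[4] (bit 10) and FS[3:0]. */
static uint32_t dfsr_fault_status(uint32_t dfsr)
{
    return (((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu);
}

/* RW (bit 11) is set when the aborted access was a write. */
static bool dfsr_is_write(uint32_t dfsr)
{
    return ((dfsr >> 11) & 0x1u) != 0u;
}
```

An abort handler could compare `dfsr_fault_status(value)` against the two constants to tell which kind of external abort was reported.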
Gael Le Moing said: By the following sentence "Non-exclusive stores to normal-type or device-type memory generate asynchronous aborts", I understand that when the MPU is configured as NORMAL or DEVICE memory for the RAM region, an asynchronous abort should be generated when executing a non-exclusive store, but even with a while loop after the 16-bit write to RAM is performed, the data abort is not generated.
See my comment above. Try a single 16-bit write instead of several consecutive ones, which may be merged into one wider write.
Hi Charles,
Ok for first point.
For the second point:
I performed only one 16-bit write to the MEU-impacted memory: "*((unsigned int *)0x08000000) = 3;". So there is no merging by the Cortex core. Moreover, the error is flagged in the ESM on Group 3 channel 3.
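Note that with the ARM EABI `unsigned int` is 32 bits wide, so the statement above compiles to a 32-bit store. That is still narrower than 64 bits, so the read-modify-write path applies either way. A minimal sketch of a store that is guaranteed to be 16 bits wide, using a `volatile` pointer so the compiler emits exactly one access of that width; the helper name `store_u16` is an assumption introduced here:

```c
#include <stdint.h>

/* Perform a single 16-bit store. The volatile qualifier prevents the
 * compiler from removing the access or combining it with neighbouring
 * ones; it does not control what the core's store buffer does afterwards. */
static inline void store_u16(volatile uint16_t *addr, uint16_t value)
{
    *addr = value;
}
```

On the target, the call would be `store_u16((volatile uint16_t *)0x08000000u, 3u);`.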
Can you explain why the bus error (data abort) is not generated when MPU is configured in NORMAL or DEVICE memory for RAM region?
For third point:
I did perform a single 16-bit write followed by an infinite loop and the asynchronous abort never occurs (though the Group 3 channel 3 error is raised in the ESM). How do you explain this abort is not generated?
Thanks,
Gael
Charles,
For the first point, I would then ask for a documentation clarification. Indeed, RAMCTRL[ECC DETECT EN] has a misleading name and its description only says ECC detection will be disabled. Moreover, there is no text in the whole L2RAMW module chapter that says how the ECC generation can be disabled. Adding this information in the bit description seems necessary.
Thanks.