Hello,
I am having difficulties using external SRAM (via the EMIF) on the TMS570 MCU from FreeRTOS tasks. From my point of view, the problem appears as various memory-related errors, which eventually leave the MCU stuck in aborts or simply behaving incorrectly.
One of the more obvious errors I've seen while debugging is in a function similar to:
void func(void* x)
{
void* y;
y = x;
}
The SRAM is used as the stack; x points to a memory region in the SRAM, and so does y. After some successful calls to this function, the assignment fails and the pointer y still holds 0xA5A5A5A5 (the value FreeRTOS initializes the stack with). If I add a dummy variable before this code, it never fails and the program works as expected (the program in this case is an LWIP web server example).
The following has been tested:
- Write and read to the entire SRAM, with MPU region set up both as device and normal. Successful.
- Lowering EMIF clock frequency. No difference.
- Raising EMIF clock frequency. Works in this special case but not in another part of the code. Indicates some kind of race condition?
Any advice or guidance on how to set up the memory region or what could cause this behavior is appreciated. Or tips on how to troubleshoot the memory accesses on the TMS570.
Thanks and regards
Mikael Rothin
Hi Mikael,
Unfortunately, there is an erratum that we will publish this month explaining that there are some cases where STM (store multiple), and likely PUSH and other multi-word instructions, don't work with the EMIF under certain conditions. The cause has to do with on-chip bus protocols and not EMIF timings, so there isn't really anything you can adjust on the EMIF to fix the problem.
One of the conditions has to do with an STM to an EMIF address that is not aligned to a 64-bit boundary. Another has to do with an STM occurring at the same time as a DMA access to the EMIF. From a practical standpoint you really can't control or prevent these conditions using the C compiler. Therefore you should not use memory on the EMIF for stack.
But you should not really want to use external memory for stack anyway, since this processor has no caches and external memory does not have ECC protection.
I would suggest thinking about external memory as useful for non-critical buffer storage, and using the DMA module or the DMAs built into the EMAC and/or other peripherals (e.g. FlexRay FTU) for transfers between external memory and on-chip memory.
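As an illustration of the 64-bit alignment condition mentioned above, here is a minimal sketch of how an address could be checked or rounded up to an 8-byte boundary. The function names are hypothetical and not part of any TI library; the sketch only shows the arithmetic and does not by itself prevent the compiler from emitting an STM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr lies on a 64-bit (8-byte) boundary. */
bool is_64bit_aligned(uintptr_t addr)
{
    return (addr & 0x7u) == 0u;
}

/* Hypothetical helper: round addr up to the next 64-bit boundary.
 * An address that is already aligned is returned unchanged. */
uintptr_t align_up_64bit(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)0x7u;
}
```

A check like this could be used, for example, when placing buffers in external memory by hand, so that their start addresses fall on 8-byte boundaries.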
Hi Anthony,
First of all, thanks for the quick reply.
Does this problem involve STR as well? (Do you mean multi-word from the EMIF-SRAM bus point of view or from the CPU point of view?) I ask because STR is the assembly operation performed in this case. The following is where the STR fails:
00071210: E59DC000 LDR R12, [R13]
00071214: E58DC024 STR R12, [R13, #36]
If I understand it right, this rules out all direct access to the external SRAM and SDRAM and forces us to access this memory using the DMA instead? And there are no known issues with that approach?
Regards
Mikael Rothin
Hi Mikael,
The issue that's being written up now is pretty specific to STM. There's a case where an STR can be a problem but this would be when preceded by an STM directly. Of course the issue you are seeing could be something else.
But, in the above code segment:
00071210: E59DC000 LDR R12, [R13] << Is this your code fetching 'void *x' from the stack?
00071214: E58DC024 STR R12, [R13, #36] << and is this your code writing the value back to y, on the stack?
Then, wouldn't the problem be really that the argument 'x' wasn't written to the stack in the first place, if the LDR is reading 0xA5A5A5A5 from the stack? Meaning the actual fail would have occurred during the function call sequence not during these two instructions - wouldn't it?
Since the EMIF issues are also related to the MPU settings, it would be good to record for reference how you have the EMIF configured and also to confirm that the MPU has been initialized. That information won't produce a 'eureka' answer at the moment, but as we dig further I think we'll need to know this.
Second, which silicon exactly are you using? It would be great if you could record what is written on the 3rd line of the part number symbolization after the TMS570 part # (which should be split into 1st and 2nd lines). What I'm looking for is a code of the form ###-####### and critical is the # character right before the '-', this should be A, B, or C. Also I would like to know the part # in the TMS570 family? Is it LS3137, or LS1227, or some other part.
The final EMIF SDRAM timings for the LS3137 were disappointingly low due to an internal timing path. It can't run reliably above about 50MHz, but this applies only to the parts in the LS3137 'niche'. The LS1227 can run faster.
You mentioned that you had slowed the SDRAM frequency down, but 50MHz is pretty slow and I'm not sure you tried running it that slow...
Hi,
Yes, x and y are on the stack. At address 0x60000E00 (x) and 0x60000E24 (y).
And no, the LDR reads the correct value, verified by browsing core registers in the debugger. I have a breakpoint placed at the instruction following the STR.
The EMIF (ASYNC1) is configured with the following values:
emifREG->CE2CFG =
(0U << 31U)| // Select strobe
(0U << 30U)| // Extended wait
(2U << 26U)| // W_SETUP
(6U << 20U)| // W_STROBE
(1U << 17U)| // W_HOLD
(1U << 13U)| // R_SETUP
(11U << 7U)| // R_STROBE
(3U << 4U)| // R_HOLD
(3U << 2U)| // Turnaround
(emif_16_bit_port);
emifREG->AWCC = 0x00000000U;
emifREG->PMCR = 0x04040404U;
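To make the register value above easier to cross-check in a debugger, the same fields can be packed by a small helper. This is only a sketch: the struct and function names are hypothetical, the shift positions are the ones used in the assignment above, and the mask widths are inferred from the gaps between those shifts. It also assumes that emif_16_bit_port has the value 1, which should be verified against the HAL header in use.

```c
#include <stdint.h>

/* Hypothetical container for the asynchronous chip-select fields. */
typedef struct {
    uint32_t select_strobe;  /* shifted to bit 31 */
    uint32_t extended_wait;  /* shifted to bit 30 */
    uint32_t w_setup;        /* shifted to bit 26 */
    uint32_t w_strobe;       /* shifted to bit 20 */
    uint32_t w_hold;         /* shifted to bit 17 */
    uint32_t r_setup;        /* shifted to bit 13 */
    uint32_t r_strobe;       /* shifted to bit 7  */
    uint32_t r_hold;         /* shifted to bit 4  */
    uint32_t turnaround;     /* shifted to bit 2  */
    uint32_t asize;          /* bits 1:0, assumed 1 for a 16-bit port */
} emif_async_cfg_t;

/* Pack the fields into one 32-bit word, masking each field to the
 * width implied by the distance to the next shift position. */
uint32_t emif_pack_async_cfg(const emif_async_cfg_t *c)
{
    return ((c->select_strobe & 0x1u)  << 31) |
           ((c->extended_wait & 0x1u)  << 30) |
           ((c->w_setup       & 0xFu)  << 26) |
           ((c->w_strobe      & 0x3Fu) << 20) |
           ((c->w_hold        & 0x7u)  << 17) |
           ((c->r_setup       & 0xFu)  << 13) |
           ((c->r_strobe      & 0x3Fu) << 7)  |
           ((c->r_hold        & 0x7u)  << 4)  |
           ((c->turnaround    & 0x3u)  << 2)  |
            (c->asize         & 0x3u);
}
```

The packed word can then be compared with the CE2CFG value read back from the device to confirm that the configuration was applied as intended.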
Reading the MPU regions via CP15_MPU_REGION_* I have the following regions at that moment:
(region 4-10 are switched between tasks by FreeRTOS, number 4 is the task stack area)
number, base_address, size_enable, access
0: 0x00000000, 0x0000002B, 0x00000603
1: 0x00000000, 0x0000001D, 0x00000503
2: 0x08000000, 0x00000021, 0x00000103
3: 0xFC000000, 0x00000033, 0x00001301
4: 0x60000000, 0x00000017, 0x00000303
5: 0x64000000, 0x0000002D, 0x00000303
6: 0x68FC0600, 0x00000009, 0x00000301
7-10: 0x00000000, 0x00000000, 0x00000000
11: 0xFFF80000, 0x0000025, 0x00001101
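For anyone reading the table above, the raw values can be decoded with a few shifts and masks. The sketch below assumes the ARMv7-R (Cortex-R4) MPU register layout as I understand it: the size/enable word holds the enable bit in bit 0 and the size field N in bits 5:1, with a region size of 2^(N+1) bytes, and the access word holds B, C and S in bits 0 to 2, TEX in bits 5:3, AP in bits 10:8 and XN in bit 12. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of an MPU region access control word. */
typedef struct {
    uint32_t xn;   /* execute never, bit 12        */
    uint32_t ap;   /* access permissions, bits 10:8 */
    uint32_t tex;  /* type extension, bits 5:3      */
    uint32_t s;    /* shareable, bit 2              */
    uint32_t c;    /* C flag, bit 1                 */
    uint32_t b;    /* B flag, bit 0                 */
} mpu_access_t;

/* Region size in bytes: 2^(N+1), where N is bits 5:1 of size_enable. */
uint64_t mpu_region_size_bytes(uint32_t size_enable)
{
    uint32_t n = (size_enable >> 1) & 0x1Fu;
    return (uint64_t)1 << (n + 1u);
}

/* Bit 0 of the size/enable word is the region enable bit. */
bool mpu_region_enabled(uint32_t size_enable)
{
    return (size_enable & 0x1u) != 0u;
}

/* Split the access control word into its individual fields. */
mpu_access_t mpu_decode_access(uint32_t access)
{
    mpu_access_t a;
    a.xn  = (access >> 12) & 0x1u;
    a.ap  = (access >> 8)  & 0x7u;
    a.tex = (access >> 3)  & 0x7u;
    a.s   = (access >> 2)  & 0x1u;
    a.c   = (access >> 1)  & 0x1u;
    a.b   =  access        & 0x1u;
    return a;
}
```

Under these assumptions, region 4 (base 0x60000000, size/enable 0x17, access 0x303) decodes to an enabled 4 KB region with AP = 3, TEX = 0 and both the C and B flags set.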
The 3rd line on the device reads YFB-2CA6N4W, it's from the LS3137 family.
We are using the SRAM ASYNC1, is the upper limit of 50MHz valid for this as well? Lowering the clock frequency (VCLK3) to below 50MHz (From 60MHz) does not resolve the issue. However, raising the clock frequency to 90MHz does solve this specific issue (same addresses used ..E00 and ..E24). But another memory related issue which is less clear still remains.
Regards
Mikael Rothin
Hi again,
I have been working on this issue with other TI contacts available here. It seems that changing the C and B flags in the MPU settings for the SRAM region (i.e. using the memory type "Strongly Ordered") solved some issues. However, I then saw problems with store-multiple operations at addresses not aligned to 64 bits, just as you mentioned. With an understanding of this, it should be possible to avoid them.
Another issue with the external SRAM arose when using it to store packets to send on the EMAC interface. Large outgoing packets are dropped and the FIFO underflow counter is incremented. We have not been able to solve this, but as a workaround we have moved the packets to internal RAM.
Thanks for the support
Best Regards
Mikael Rothin
Dear Anthony,
I would like to ask whether my issue is the same as the STM instruction issue in EMIF memory.
My case differs somewhat from the one discussed above: the CPU is different, and it is accessing SDRAM, not SRAM.
Could you confirm whether my issue is the same as the one in that conversation?
I would also like to have the errata document describing this issue, if it applies to my case.
The CPU part number is:
XRM48
L952ZWTT
YFB-23AGPSW
Best regards,
M. Watanabe
Hi Mr. Watanabe,
There is a separate issue with SDRAM where an unaligned STM or STR (unaligned to 64-bit boundaries) causes problems: either the write goes to the wrong address or the write fails. The root cause and possible workarounds for this problem are still being analyzed, and it should show up in the errata when it is released again. At the moment, however, my understanding is that there is no workaround from the CPU side other than to avoid the unaligned accesses, but DMA to/from SDRAM is OK.
Dear Anthony,
Thank you for the clarification.
I have one more question.
I am using RM48L952ZWT. Does it have the same issues?
Best regards,
M.Watanabe
Hi Mr. Watanabe,
Yes, unfortunately it does. All of the current RM48 and RM46 products are affected.
Dear Anthony,
we are using the LS3137 derivative of the TMS570 in our project.
Attached to the EMIF port are an SRAM, a NOR flash and an FPGA. The EMIF's memory range, except for the SRAM's area, is currently configured as strongly ordered. The memory area for the SRAM is currently configured as device type. The EMIF timings should also be correct according to the TMS570 datasheet and the devices' datasheets.
So far, we have had no problems configuring and interacting with the FPGA and its IP.
For a while now I have been noticing random data faults in the heap, which is placed in the external SRAM's address range. Because of those random problems, I tried today to change the SRAM's memory to strongly ordered. As a consequence, the CPU hangs and is no longer accessible even with the debugger.
Now that I have read this forum post, I think I have found the root cause. Am I right that this is exactly erratum DEVICE#179?
So: how do I have to set up my MPU and/or EMIF so that the heap works correctly in the external EMIF SRAM?
Could we also run into problems when accessing some FPGA registers?
Kind regards,
Michael
Hi Michael,
Just to confirm something first - is your SRAM asynchronous memory on CS2,3,4 or is it SDRAM?
Best Regards,
Anthony
Dear Anthony,
it's an asynchronous SRAM, connected via CS2. So it's memory mapped to 0x60000000.
-> speed grade is 10 ns
-> bus width is 16 bits
The EMIF is configured to be clocked at 90 MHz. The settings for the SRAM are as follows:
R_SETUP: 1
R_STROBE: 4
R_HOLD: 1
W_SETUP: 1
W_STROBE: 1
W_HOLD: 1
page mode is disabled
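As a rough cross-check of settings like these, the access time can be estimated from the field values. The sketch below assumes, based on my reading of the EMIF documentation, that each field encodes its phase length minus one EMIF clock cycle, so a phase lasts (field + 1) cycles. It ignores turnaround time and the extra cycles needed when a 32-bit access is split over a 16-bit bus. The function names are hypothetical.

```c
#include <stdint.h>

/* Total cycles for one asynchronous access, assuming each phase
 * (setup, strobe, hold) lasts (field + 1) EMIF clock cycles. */
uint32_t emif_access_cycles(uint32_t setup, uint32_t strobe, uint32_t hold)
{
    return (setup + 1u) + (strobe + 1u) + (hold + 1u);
}

/* Convert a cycle count to nanoseconds for a clock given in MHz.
 * Integer division truncates any fractional nanosecond. */
uint32_t emif_cycles_to_ns(uint32_t cycles, uint32_t emif_clk_mhz)
{
    return (cycles * 1000u) / emif_clk_mhz;
}
```

Under these assumptions, the read settings listed above (1, 4, 1) add up to 9 cycles, which is 100 ns at 90 MHz.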
Kind regards,
Michael
Michael,
We are currently updating the errata documents and this specific one is being updated as well. The bug effectively rules out the use of an STM instruction to write to any memory accessed via the EMIF.
The latest version of the compiler for the R4F devices (v5.1.3) includes an option to disable generation of STM instructions. This option can be enabled at a source file level, so you can enable it on any file that has a function that accesses external memory. This does cause a significant drop in performance especially if you are using external memory as stack or heap.
The option to disable STM instructions is "--no_stm" (note the two dashes). This is a "hidden" option in CCS, in that you cannot configure it via the GUI and the option needs to be manually added to the compile command.
Regards,
Sunil
Thanks Sunil!
I think you meant this, but I want to clarify, because 'manually added to the compile command' might be interpreted to mean that you cannot use the CCS IDE builder and have to switch to a command-line build flow outside of CCS.
This isn't the case; you can add the --no_stm option through the IDE, but it's not there as a 'checkbox'.
You add it under the 'Set Additional Flags' button and you have to type in --no_stm. Once you do this you can still build through the IDE's builder; there is no need to set up a different build flow.
The option can be added at project level by right-clicking on the project and selecting 'Properties'. Or you can apply the option to specific files by right-clicking on the file and selecting 'Properties'.
Hi Anthony and Sunil,
since we are not directly using CCS, the configuration of the compiler settings is no problem for us at all. We are directly calling the compiler, assembler and linker in a Makefile based environment.
We are using the ARM Code Generation Tools 5.1.1 at the moment but it will be no problem to switch to the newest version.
What worries me more is the "significant drop in performance". Because the LS3137's EMIF has some design issues, where the data signals need to be stable for at least 30 ns before they can be sampled, the overall performance will get even worse with this new restriction.
Can you quantify the performance drop?
When will you be able to provide the updated errata?
Kind regards,
Michael
Michael,
This issue forces single store instructions (STRxx) to be used instead of the "store multiple" (STMxx) instructions. This essentially prohibits the application from benefiting from the burst write capability of the CPU. These STM instructions are typically used by the compiler for saving registers onto stack, or during some memory copy routine. Any dynamic memory allocated in the heap is typically only accessed in single word sizes. If that is the case you may not see any significant drop. But the fact that you ran into the issue indicates that you do have some code construct that is causing the compiler to generate multiple-store instructions for writing to external memory.
The actual performance drop that you will observe will depend on how many such store-multiple accesses you have in your current application.
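To illustrate the difference in C terms, here is a sketch of a copy routine that writes the destination one 32-bit word at a time through a volatile pointer. The function name is hypothetical. Note the assumption: the C language does not guarantee which instructions the compiler emits, so this only expresses the intent of one store per word; the compiler option discussed in this thread remains the actual mechanism for suppressing STM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n_words 32-bit words into dst.
 * Each iteration performs exactly one volatile 32-bit write, so the
 * compiler may not drop or reorder the individual stores. */
void copy_words_single_store(volatile uint32_t *dst,
                             const uint32_t *src,
                             size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = src[i];
    }
}
```

A routine like this trades the burst write capability for predictability, which is the performance cost described above. Inspecting the generated assembly is the only way to confirm that no STM targets external memory.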
Regards, Sunil
Hi Sunil,
would it also be possible to configure the external asynchronous SRAM as device type and allow the usage of this memory for the heap anyway?
Or is it a must to have it configured as strongly ordered?
Some additional information:
Inside the heap, a lot of big buffer structures will be stored by a run-time system consisting of several timer-interrupt-based tasks. So far I have noticed some data inconsistencies when configuring this memory range as device type, but maybe I did something wrong...
Kind regards,
Michael
Hi Michael,
Configuring the external memory as device type is also okay. The key requirement is to avoid store-multiple instructions when writing to external memory. Such instructions are even more likely to be generated if you are storing big buffer structures. The latest compiler version (5.3.x) supports an option called "--no_stm" to replace all instances of store-multiple instructions with single stores.
Regards, Sunil
Do you have any advice on how to work around this on IAR compilers (Functional Safety edition)?