Tool/software: TI-RTOS
Good evening all,
I have a problem where the first 56 bytes of either a 128k global buffer array or a calloc'ed buffer get corrupted when an ISR fires. However, if I make the buffer a local variable, the problem goes away.
The code in question is an FTP client running with LWIP. We are using a custom AM335x PCB, CCS v6.1.1.00022, SYS/BIOS v642335, XDC tools v331024_core. I have cut the code back as far as I can while still reproducing the bug, so only the FTP task and the LWIP ISRs are running. As far as I can tell, the corruption only happens when an ISR fires while the FTP task is executing the code below, and it is either gp_ftp_buffer or the buffer inside the MlsdBuffer class that gets corrupted.
fs_fread((uint8_t *)gp_ftp_buffer, btr, &br, ftp_instance->file_read);
// sys_prot_t hwi_state = sys_arch_protect();
ftp_instance->MlsdBuffer->push(gp_ftp_buffer, br);
if(ftp_instance->bytes_read == 0)
{
    char buf[4];
    //ftp_instance->MlsdBuffer->push(gp_ftp_buffer, br);
    ftp_instance->MlsdBuffer->peep(buf,4);
    if((gp_ftp_buffer[3] != 'S') || (buf[3] != 'S'))
        stdout("Bad start to file!!! %.2x %.2x", (int)gp_ftp_buffer[3], (int)buf[3]);
    ftp_instance->dbg_file_start = 1;
    ftp_instance->tx_state = SM_FTP_TRANSFER_IN_LIMBO;
}
// sys_arch_unprotect(hwi_state);
As I said, if I make the buffer local the problem goes away, but this is quite a large application and we have seen corruption elsewhere that may or may not be related, so I would like to get to the bottom of this issue and understand what is going on.
I have ruled out:
- Any particular part of the task code being the cause of the corruption: I detected where the corruption can happen and introduced delays to make the ISR fire at different points.
- Stack or heap overflow: I checked the buffer's position in the memory map, observed that nothing around it gets corrupted, and doubled all the stacks and heaps to make sure.
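For reference, the kind of detection described in the first item can be sketched as below. This is only a minimal illustration, not the code from the application: the names guard_fill and guard_check, the GUARD_LEN constant, and the 0xA5 pattern are all hypothetical. The idea is to write a known pattern over the 56 bytes that get corrupted and to check it at several points in the task, so that the first failing check narrows down when the damage occurred.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: number of leading bytes observed to be corrupted. */
#define GUARD_LEN 56u

/* Fill the first GUARD_LEN bytes of buf with a position-dependent pattern. */
static void guard_fill(uint8_t *buf, size_t len)
{
    assert(buf != NULL && len >= GUARD_LEN);
    for (size_t i = 0; i < GUARD_LEN; i++) {
        buf[i] = (uint8_t)(0xA5u ^ i);
    }
}

/* Return the index of the first byte that no longer matches the pattern,
 * or -1 if all GUARD_LEN bytes are intact. */
static int guard_check(const uint8_t *buf, size_t len)
{
    assert(buf != NULL && len >= GUARD_LEN);
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (buf[i] != (uint8_t)(0xA5u ^ i)) {
            return (int)i;
        }
    }
    return -1;
}
```

Calling guard_check before and after each step of the task (and after each deliberate delay) gives the index of the first damaged byte, which also shows whether the corruption always starts at offset 0.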
What I am left with is the suspicion that this is cache related, and that how I have set up the cache, or how I am using it, is not thread safe. When I disable the cache, or set bufferable to false in the MMU setup for the DDR3 region, I have not observed any corruption. However, the performance is terrible, and LWIP keeps reporting asserts and eventually kills the connection. Does this explanation sound feasible? If so, what can I try to stop the corruption without ruining the performance?
This is how I set up the MMU and cache in my project cfg file:
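One check that follows from the cache hypothesis is whether the start of the buffer shares a cache line with whatever sits just before it in memory. The sketch below is only an illustration under stated assumptions: a 64-byte cache line (the Cortex-A8 line length), and hypothetical helper names line_offset and shared_head_bytes. It does not establish that this is the cause; it only computes how many leading bytes of a buffer live in a line that also holds preceding data.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on the Cortex-A8. */
#define CACHE_LINE 64u

static_assert((CACHE_LINE & (CACHE_LINE - 1u)) == 0u,
              "cache line size must be a power of two");

/* Offset of addr within its cache line. */
static uint32_t line_offset(uintptr_t addr)
{
    return (uint32_t)(addr & (CACHE_LINE - 1u));
}

/* Number of leading bytes of a buffer starting at addr that share a
 * cache line with data located before the buffer; 0 if line-aligned. */
static uint32_t shared_head_bytes(uintptr_t addr)
{
    uint32_t off = line_offset(addr);
    return (off == 0u) ? 0u : (CACHE_LINE - off);
}
```

For example, a buffer that starts 8 bytes into a line has 56 leading bytes in a shared line. Printing shared_head_bytes((uintptr_t)gp_ftp_buffer) for the global, calloc'ed, and local variants of the buffer would show whether the 56-byte figure lines up with the buffer's alignment.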
var Cache = xdc.useModule('ti.sysbios.family.arm.a8.Cache');
var Mmu = xdc.useModule('ti.sysbios.family.arm.a8.Mmu');

// Enable the cache
Cache.enableCache = true;

// Enable the MMU (Required for L1/L2 data caching)
Mmu.enableMMU = true;

// Force peripheral section to be NON cacheable
var peripheralAttrs = {
    type : Mmu.FirstLevelDesc_SECTION, // SECTION descriptor
    tex: 0,
    bufferable : false,                // bufferable
    cacheable : false,                 // cacheable
    shareable : false,                 // shareable
    noexecute : true,                  // not executable
};

// Set the descriptor for each entry in the address range
for (var i=0x44000000; i < 0x80000000; i = i + 0x00100000) {
    // Each 'SECTION' descriptor entry spans a 1MB address range
    Mmu.setFirstLevelDescMeta(i, i, peripheralAttrs);
}

// descriptor attribute structure
var attrs = {
    type: Mmu.FirstLevelDesc_SECTION,  // SECTION descriptor
    tex: 0x1,
    bufferable: true,                  // bufferable
    cacheable: true,                   // cacheable
};

// Set the descriptor for each entry in the address range
for (var i=0x80000000; i < 0x90000000; i = i + 0x00100000) {
    // Each 'SECTION' descriptor entry spans a 1MB address range
    Mmu.setFirstLevelDescMeta(i, i, attrs);
}
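As a sanity check on the two loops above, the number of 1MB SECTION descriptors each one writes can be worked out with a few lines of C. This is just arithmetic on the address ranges in the cfg file; the helper name section_count is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Each first-level SECTION descriptor maps 1MB. */
#define SECTION_SIZE 0x00100000u

/* Number of 1MB sections in the half-open range [start, end). */
static uint32_t section_count(uint32_t start, uint32_t end)
{
    assert(start <= end);
    assert((start % SECTION_SIZE) == 0u && (end % SECTION_SIZE) == 0u);
    return (end - start) / SECTION_SIZE;
}
```

With the ranges from the cfg file, section_count(0x44000000u, 0x80000000u) gives 960 non-cacheable peripheral sections, and section_count(0x80000000u, 0x90000000u) gives 256 cacheable sections, i.e. 256MB of DDR3 mapped as cacheable and bufferable.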
I am at a loss as to where to go with this now. Any advice on what I am doing wrong would be greatly appreciated.
Thanks
Sean