I've narrowed down a crash to code that clears MAR bits covering DDR3. When I step into the code with the debugger, the stack becomes corrupted. I've confirmed corruption occurs the moment the MAR bit that covers the stack area in DDR3 is cleared.
Is this expected behavior? According to spru862a section 2.7.1, it seems like clearing MAR bits should not affect data already cached. Or am I misinterpreting this paragraph?
"Disabling external memory caching after it was enabled should not be generally necessary. However if it is, then the following considerations should be taken into account. If the MAR bit is changed from 1 to 0, external memory addresses already cached stay in the cache and accesses to those addresses still hit. The MAR bit is only consulted if the external memory address misses in L2. (This includes the case where L2 is all SRAM. Since there is no L2 cache, this can also be interpreted as an L2 miss).
If all addresses in the respective external memory address space are made noncacheable, the addresses need to be written back and invalidated first."
The corruption occurs on both the C6678 and the C6657 devices. I noticed an L2 corruption erratum, but it covers only coherency operations, so I'm assuming it does not apply here.
This happens whether or not I do a full cache clean before the MAR bits are cleared. Of course, a cache clean can only do so much: any bit of code is likely to touch the cache if the stack is in DDR3.
In our application, clearing the MAR bits only occurs as part of a global cache disable. I'm assuming the original code touched the MAR bits either to address an old errata or in the hope that this would make the cache-disabled performance slightly better. So the workaround for me is simple - I just move the MAR bit clearing until after the cache is already turned off. But I worry that other folks might fall into the same trap.
It sounds like the data has not been written back to DDR. Disabling the MARs is OK, and the data that is cached will remain cached. But once you do the Global Cache Disable, it won't check whether those addresses were cached, and if you haven't written them back to DDR, then DDR would still physically hold only the last items that were written back to it. If the stack was always in cache, which is highly likely, then you'd have only the initial data in DDR and not what was pushed onto the stack.
Best Regards,
Chad
Thanks for the response! When the crash occurs, I have not yet performed the Global Cache Disable (i.e. setting all cache sizes to zero). I've observed the crash with clearing a MAR bit being the very first action that touches the cache configuration.
Just to be crystal-clear, the corruption occurs immediately after the MAR bit is cleared, without any other cache action.
If you have CCS up, can you display the Memory window where the crash/corruption is occurring, and have it display what's in cache versus actual memory? It should have color coding.
-Chad
Just before the call that clears the MAR bit that covers the stack, I can see the variable values in L1D and L2 and stale data in DDR3.
The moment that the MAR bit is cleared, I see corrupt values in L1D and L2. I confirmed that this is not just a JTAG display issue: when I step through the disassembly and the values are loaded from the stack into registers, the corrupt values get loaded, not the stale values in DDR3 (which have not changed).
This smacks of the L2 corruption erratum (Advisory 9: L2 Cache Corruption During Block and Global Coherence Operations Issue), but that seemed to cover just the narrow case of an explicit cache coherence operation.
Can you display and capture the before and after in the memory window, showing what's in L1D, L2 and DDR during the captures? I'd like to see the captures.
Here are "before" and "after" screenshots. I ended up showing the memory that covers variables in the calling function, but for this purpose it still demonstrates the issue. The MAR index is set to 0x86 or 134, which covers the area 0x86000000 to 0x86FFFFFF....
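For anyone checking which MAR covers a given address, the arithmetic implied above can be sketched in C. This is only an illustration: it assumes each MAR covers one 16 MB region (consistent with index 0x86 covering 0x86000000 to 0x86FFFFFF), and the helper names are made up for this post rather than taken from any TI header.

```c
#include <stdint.h>

/* Assumption: one MAR per 16 MB region, so the index is the top 8 address bits. */
#define MAR_REGION_SHIFT 24u

/* Hypothetical helper: which MAR index covers this address? */
static inline unsigned mar_index_for_addr(uint32_t addr)
{
    return (unsigned)(addr >> MAR_REGION_SHIFT);
}

/* Hypothetical helper: first byte address covered by a MAR index. */
static inline uint32_t mar_region_first(unsigned index)
{
    return (uint32_t)index << MAR_REGION_SHIFT;
}

/* Hypothetical helper: last byte address covered by a MAR index. */
static inline uint32_t mar_region_last(unsigned index)
{
    return mar_region_first(index) | 0x00FFFFFFu;
}
```

With these, any stack address between 0x86000000 and 0x86FFFFFF maps to index 134, which is the MAR being cleared in the captures.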
Now we step past the MAR bit set, and the memory updates...
I went back to your original quote because I couldn't remember that being in there, and it doesn't apply to the C66x family of devices. That quotation is from the C64x+ DSP Cache UG.
If you go to the C66x CorePac UG, it discusses the caches. You will need to perform a Write-back Invalidate first.
From SPRUGW0
4.3.7.3 Requirements for updating MAR registers at runtime

MAR registers are runtime programmable, except as noted in Section 4.3.7.2. All MAR register bits reset to a value of 0, thereby making the entire address space non-cacheable by default (except as noted in Section 4.3.7.2).

Whenever MAR registers are updated dynamically, programs must follow the following sequence to ensure that all future accesses to the particular address range are not cached in L1 and L2 caches.

1. Ensure that all addresses within the affected range are removed from the L1 and L2 caches. This is accomplished in one of the following ways. Any one of the following operations should be sufficient.
   a. If L2 cache is enabled, invoke a global writeback-invalidate using L2WBINV. Wait for the C bit in L2WBINV to read as 0. Alternately, invoke a block writeback-invalidate of the affected range using L2WIBAR/L2WIWC. Wait for L2WIWC to read as 0.
   b. If L2 is in all SRAM mode, invoke a block writeback-invalidate of the affected range using L1DWIBAR/L1DWIWC. Wait for L1DWIWC to read as 0.
   Note that the block-oriented cache controls can only operate on a 256K-byte address range at a time, so multiple block writeback-invalidate operations may be necessary to remove the entire affected address range from the cache.
2. Clear the PC bit in the appropriate MAR to 0.

It does not appear that the global writeback-invalidate has been done, since the data in DDR memory is not the same as what is in cache. That would appear to be the issue.
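A rough C sketch of option (a) in that sequence follows, for illustration only. The register addresses (L2WBINV at 0x01845004, MAR0 at 0x01848000) and bit positions are assumptions that should be verified against SPRUGW0, and the `reg_io_t` hooks plus the `sim_*` stand-in are hypothetical, added only so the ordering can be exercised on a host. On a real target the hooks would be plain volatile register accesses, and code whose stack lives in the affected range may still pull lines back into cache between the two steps.

```c
#include <stdint.h>

/* Assumed C66x CorePac addresses and bits; verify against SPRUGW0. */
#define L2WBINV_ADDR  0x01845004u  /* global L2 writeback-invalidate */
#define MAR_BASE_ADDR 0x01848000u  /* MAR0; MARn assumed at base + 4*n */
#define L2WBINV_C_BIT 0x1u
#define MAR_PC_BIT    0x1u

/* Hypothetical register-access hooks (not a TI API). */
typedef struct reg_io {
    uint32_t (*read)(struct reg_io *io, uint32_t addr);
    void     (*write)(struct reg_io *io, uint32_t addr, uint32_t value);
} reg_io_t;

/* Writeback-invalidate first, then clear PC. Returns 0 on success,
 * -1 if the C bit did not clear within max_polls reads. */
int mar_disable_caching(reg_io_t *io, unsigned mar_index, unsigned max_polls)
{
    uint32_t mar_addr = MAR_BASE_ADDR + 4u * mar_index;
    unsigned polls = 0;

    /* Step 1: start the global writeback-invalidate and wait for C == 0. */
    io->write(io, L2WBINV_ADDR, L2WBINV_C_BIT);
    while (io->read(io, L2WBINV_ADDR) & L2WBINV_C_BIT) {
        if (++polls >= max_polls)
            return -1;  /* leave the MAR untouched on timeout */
    }

    /* Step 2: only now clear the PC bit, preserving the other bits. */
    io->write(io, mar_addr, io->read(io, mar_addr) & ~MAR_PC_BIT);
    return 0;
}

/* Host-side stand-in for the hardware, for illustration only. */
typedef struct {
    reg_io_t io;              /* first member, so the casts below are valid */
    uint32_t mar_addr;        /* address of the single MAR being modeled */
    uint32_t mar;             /* its current value */
    unsigned busy_reads;      /* L2WBINV reads that still return C == 1 */
    int wbinv_done;           /* set once C has been read back as 0 */
    int cleared_after_wbinv;  /* was the MAR written after that point? */
} sim_regs_t;

static uint32_t sim_read(reg_io_t *io, uint32_t addr)
{
    sim_regs_t *s = (sim_regs_t *)io;
    if (addr == L2WBINV_ADDR) {
        if (s->busy_reads > 0) { s->busy_reads--; return L2WBINV_C_BIT; }
        s->wbinv_done = 1;
        return 0;
    }
    return (addr == s->mar_addr) ? s->mar : 0;
}

static void sim_write(reg_io_t *io, uint32_t addr, uint32_t value)
{
    sim_regs_t *s = (sim_regs_t *)io;
    if (addr == s->mar_addr) {
        s->mar = value;
        s->cleared_after_wbinv = s->wbinv_done;
    }
}

/* Build a simulated device whose MAR starts with PC = 1. */
sim_regs_t sim_make(unsigned mar_index, unsigned busy_reads)
{
    sim_regs_t s = { { sim_read, sim_write },
                     MAR_BASE_ADDR + 4u * mar_index,
                     MAR_PC_BIT, busy_reads, 0, 0 };
    return s;
}
```

The timeout is a design choice of this sketch, not something the CorePac UG prescribes; it just avoids clearing the MAR if the writeback-invalidate never reports completion.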
Thanks Chad! Somehow I fell into the wrong datasheet. The two cache documents are very close - that removes the confusion!
Actually, this same quote can be found in the C66x cache datasheet (sprugy8.pdf), same section (2.7.1). It looks like it is a cut-and-paste from the C64x+ cache document.
But thanks again for pointing out the complete MAR bit warning! I was searching around for any MAR explanation and I missed that section in the CorePac doc.
Thanks for pointing out the quote in the C66x Cache UG; it shouldn't be there, and I'll make sure it gets cleaned up. It was probably a cut-and-paste issue. That said, I confirmed with our design team that a MAR change takes effect immediately for the associated TAGs (i.e., the TAG RAMs would immediately show them as not cached if the MAR went 1->0). Since they're no longer associated, I'm not sure what the emulation shows the association to, but the DSP would be getting the data straight from DDR at that point.
I'll make sure this is corrected for the next revision. For now, you may want to reference the C66x CorePac UG for caching info. It's not a ported document like the C66x Cache UG was, and it should cover most caching-related questions.