In the process of trying to enable L2 cache on DDR3 memory, I've run across odd behavior that I can't explain or eliminate. I'm working with a C6657 DSP on both a TMSXEV6657LE evaluation card and our production card.
In DDR3 memory, I have marked the first two regions (0x80000000 - 0x81ffffff) as non-cached; the rest of DDR3 memory is cached. Later, I read a block from NAND flash (on the EMIF16 bus) into memory at 0x82000000, which is the first cached area in the DDR3 region. The data arrives in DDR3 correctly. However, it also shows up in program memory. Here's where it gets really strange (well, strange to me, anyhow).
Two consecutive cache lines of data (128 bytes) in DDR3 space correspond to the same amount of memory in the program area. The next 128 bytes in DDR3 space also correspond to 128 bytes in program memory, but not immediately after the previous 128-byte chunk: they appear 512 bytes before it.
Specifically, the first four 128-byte chunks map like this:
0x82000000 --> 0x008fff80
0x82000080 --> 0x008ffd80
0x82000100 --> 0x008ffb80
0x82000180 --> 0x008ff980
Further experiments show that this pattern continues for the entire size of the NAND flash block (128 KB).
If I modify the cached DDR3 by hand, using CCS's memory browser, the corresponding program memory changes as well. When I modify the first location in DDR3, the entire 128-byte chunk in program memory changes to match DDR3. It's as if two cache lines had been fetched for that DDR3 address.
In CCS's memory browser, I have cache coloring enabled. The DDR3 memory is shaded light green, which indicates that the region is in L2 cache. The program memory has no shading, indicating that it is not in any cache.
Here is the code I used to initialize and enable L2 cache. Can anyone see what I've done wrong?
typedef unsigned long WORD32;
typedef unsigned char BYTE;

#define DDR3_BASE        0x80000000
#define DDR3_SIZE        0x20000000
#define MAR_REGION_SIZE  0x01000000
#define MAR_INDEX(addr)  ((addr) / MAR_REGION_SIZE)
#define SRIO_HEAP_SIZE   0x02000000

#pragma DATA_SECTION(srio_heap,"noncached_ddr3");
BYTE srio_heap[SRIO_HEAP_SIZE];

static void ddr3_cache_init (void)
{
    Uint32 mar;
    Uint8  pcx, pfx;
    WORD32 offset, addr, nc_rgn_start, nc_rgn_end;

    // Get the start and (exclusive) end of the non-cached region buffer.
    nc_rgn_start = (WORD32)srio_heap;
    nc_rgn_end   = (WORD32)srio_heap + SRIO_HEAP_SIZE;

    // Set DDR3 caching.
    for (offset = 0; offset < DDR3_SIZE; offset += MAR_REGION_SIZE)
    {
        // Get the address of this DDR3 region and calculate which MAR
        // controls it.
        addr = DDR3_BASE + offset;
        mar  = MAR_INDEX (addr);

        // If this region lies entirely outside the non-cached part of DDR3,
        // turn on caching; otherwise, disable caching for this region.
        // (nc_rgn_end is one past the end of the buffer.)
        if ((nc_rgn_end <= addr) || (nc_rgn_start > ((addr + MAR_REGION_SIZE) - 1)))
            CACHE_enableCaching (mar);
        else
            CACHE_disableCaching (mar);

        // Ensure the memory region is not prefetchable.
        // NOTE: It behaves the same if these two lines are deleted.
        CACHE_getMemRegionInfo (mar, &pcx, &pfx);
        CACHE_setMemRegionInfo (mar, pcx, 0);
    }

    // Enable L2 cache.
    CACHE_setL2Size (CACHE_1024KCACHE);

    // Make caches coherent.
    CSL_XMC_invalidatePrefetchBuffer(); // NOTE: It behaves the same if this line is deleted.
    CACHE_invAllL1p (CACHE_WAIT);       // NOTE: I've tried multiple combinations of invalidation
    CACHE_wbInvAllL1d (CACHE_WAIT);     // and writeback. They didn't make any difference.
    CACHE_wbInvAllL2 (CACHE_WAIT);
}