BIOS Cache and CSL Cache

Hello,

I don't feel like an expert on caching, so these may be simple questions.

1.  When and why should I use the BIOS cache APIs, and is there an example?

2.  I am using CSL caching.  When I ask it to invalidate just the size of the variable, I get undefined behavior.  Sometimes it works, but if I readjust the memory map, then it might not.

CACHE_invL1d((void *) &glAssocationStartFlag, sizeof(unsigned int), CACHE_WAIT);

All the examples use 128, so I thought that value must just have been chosen as the default.  But I am thinking now, maybe not.  Why do they use 128?

 

CACHE_invL1d((void *) &glAssocationStartFlag, 128, CACHE_WAIT);

3.  What if I want to invalidate a variable (such as a struct) that is larger than 128 bytes? What should I do?

 

Thanks,

brandy

 

  • Brandy,

    First of all, the C66x DSP Cache User Guide (sprugy8) could be a good starting point for learning the cache operations.

    1. The BIOS and CSL cache APIs are very similar: both program the device cache registers, and the hardware then performs the cache operations accordingly. If you are using an RTSC (BIOS) project, it can be convenient to use the BIOS cache APIs, since the RTSC project will include the BIOS library for you. In a regular C project, you can include the CSL library and use the CSL cache APIs, as you are doing currently. I do not see a scenario in which you have to use BIOS. You can take a look at the "BIOS user guide" or the other documents in " \ti\bios_xx_xx\docs" for the API details.

    2. A cache line is the smallest block of data that the cache operates on. For instance, although the core may request single bytes from memory, on a read miss the cache reads an entire line's worth of data to satisfy the request. Please refer to the C66x DSP Cache User Guide for details.

    On C66x, the L1D cache line size is 64 bytes and the L2 cache line size is 128 bytes. If your variable is in L2 SRAM and is cached only in L1D, the "byteCnt" passed to "CACHE_invL1d()" should be a multiple of the L1D cache line size (64 bytes). If your variable is in external memory (e.g. DDR) and is cached in both L1D and L2, you may need to use "CACHE_invL2()" to invalidate the variable from both L1D and L2 cache, instead of using only "CACHE_invL1d()", which invalidates the L1D cache only.

    3. As discussed above, "byteCnt" should be a multiple of the cache line size, and it can be more than 128.

    I am not sure where your variable is located or how it is cached. If both L1D and L2 cache are enabled and your variable is in external memory (such as DDR), it is cached in both L1D and L2.

    Please take a look at the CSL documentation in "ti\pdk_C66xx_xx\packages\ti\csl\docs" for the API descriptions.
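The rounding described in points 2 and 3 can be sketched on a host PC in plain C. This is only the arithmetic and does not call CSL or BIOS. The names `L1D_LINE_SIZE`, `L2_LINE_SIZE`, `round_up_to_line`, `lines_touched`, and `BigShared` are hypothetical and exist only for this illustration; the 64-byte and 128-byte line sizes are the ones quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Line sizes quoted in this thread for C66x; hypothetical names, not CSL macros. */
#define L1D_LINE_SIZE 64u
#define L2_LINE_SIZE  128u

/* Round a byte count up to the next multiple of the cache line size. */
static size_t round_up_to_line(size_t bytes, size_t line)
{
    assert(line != 0u);
    return ((bytes + line - 1u) / line) * line;
}

/* Count how many cache lines the byte range [addr, addr + bytes) touches. */
static size_t lines_touched(uintptr_t addr, size_t bytes, size_t line)
{
    uintptr_t first, last;

    assert(line != 0u);
    if (bytes == 0u) {
        return 0u;
    }
    first = addr / line;                 /* index of the first line */
    last  = (addr + bytes - 1u) / line;  /* index of the last line  */
    return (size_t)(last - first + 1u);
}

/* Hypothetical struct larger than 128 bytes, as in question 3. */
typedef struct {
    uint32_t flags[40];                  /* 40 * 4 = 160 bytes */
} BigShared;
```

For the 160-byte struct, `round_up_to_line(sizeof(BigShared), L1D_LINE_SIZE)` gives 192, which is the kind of value one might pass as "byteCnt". `lines_touched` shows why alignment matters: a 128-byte range that starts 64 bytes into a 128-byte line spans two lines rather than one.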

  • Hello Steven,

    Okay, so you confirm my thinking.  I tried to fix my problem by aligning my variable and by doubling the bytes in the invalidate line.  My problem still exists.

    Here is my problem.  I have a variable that seemingly gets invalidated from cache, and then when I go to read it, I get the old value from cache again, not the new one from memory.  It is very odd.  Odder still, the code worked fine until I deleted some unused variables and the memory map changed.  Here are more details:

    In my custom platform, I am using only L1d cache - L2 cache is 0 bytes.

    I have a flag that is in MSMCSRAM.  This flag is written by Core 0 and read by Cores 1 to 7.  Cores 1 to 7 read this flag continuously until it has a certain value.  Every time it does not have that value, the core invalidates the cache and reads again.

    Here is the declaration:

    #pragma DATA_ALIGN(64)
    #pragma DATA_SECTION(".srioSharedMem")
    unsigned int far glAssocationStartFlag;

    Here is the loop on Cores 1 to 7:

    while((glAssocationStartFlag & (CORE0_ASSOC_TASK << CORE_NUM)) == 0) //wait for your flag to be active.
    {
        Task_sleep(ASSOC_SLEEP_DELAY);
        CACHE_invL1d((void *) &glAssocationStartFlag, 128, CACHE_WAIT);
    }

    Here is the cfg file:

     Program.sectMap[".srioSharedMem"] = "MSMCSRAM";

     

    Basically, when I step through the code on Core 1, the variable is zero at the while statement.  I step again, and the variable is zero.  I step to the cache invalidate and the variable turns to 0x00007F00, which is the expected value.  I step again to the while statement and the variable returns to zero.  I am stuck in the while loop.

    This code was working before I made changes to the memory map by deleting unused variables.

    Please advise if you have any ideas!

     

    Thanks,

    Brandy 

     

  • Brandy,

    Could you define the flag variable as "volatile" and try again ("volatile unsigned int glAssocationStartFlag")?

    If that does not solve your problem, could you please attach a simplified example project that reproduces the issue? A two-core version (Core 0 writes and Core 1 reads) should be fine if it reproduces the issue. Thanks.

  • I just tried volatile and it didn't help. 

    #pragma DATA_ALIGN(64)
    #pragma DATA_SECTION(".srioSharedMem")
    volatile unsigned int far glAssocationStartFlag;

    I am sorry, I can't attach a version of the project.  The value appears to be there when I check and uncheck the L1D Cache box at the top of the memory browser, but zero keeps getting loaded into the register.

    Screen 1:

    Shows the flag as zero right before the function sleeps.

    Screen 2:

    Shows the flag as 0x0007f000 before the comparison

    Screen 3:

    Shows the flag back to zero.

    Screen 4: 

    I overwrote the variable manually with 0x0007f000 and then the while loop exited.  You can see the change this time in the registers.

    Sorry about all the screenshots.  It's as if the cache invalidate did not work.

    Thanks for your help!

    Brandy

  • Hi Steven,

    If I move the variable to DDR3, it works just fine.  Here is the screenshot:

    It must have to do with MSMCSRAM.  I am still trying different ideas, but suggestions would be great!  I don't want this variable to be in DDR3; since it is polled so often, I would like it on board.

    Thanks!

    Brandy

  • Hi,

    Try using DATA_ALIGN(128), or an invalidation/write-back size of 64, so that you are consistent with the cache line size and are sure not to write back or invalidate variables other than the required ones. I think DATA_ALIGN(128) or higher is the fastest way to check whether there are overlaps between the inv/wb areas.

    When you break in the test (screenshot 1), try to halt core 0 as well, then step in assembler and look at the coherency of the memory/cache by clearing the "L1 Cache" tick box in the memory view. Maybe also open a memory view associated with core 0 and check that the two are consistent.
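One way to rule out overlaps of this kind is to give the shared flag a whole cache line to itself, by both aligning it and padding it to the line size. The sketch below is a host-side C11 illustration under that assumption. `SHARED_LINE_SIZE`, `SharedFlag`, `assocStartFlag`, and `owns_whole_lines` are hypothetical names. On the TI toolchain, the DATA_ALIGN and DATA_SECTION pragmas from this thread would take the place of `alignas`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: the largest line size mentioned in this thread. */
#define SHARED_LINE_SIZE 128u

/* A flag padded so that it fills one complete 128-byte line. */
typedef union {
    volatile uint32_t value;             /* the actual flag word        */
    uint8_t pad[SHARED_LINE_SIZE];       /* padding to the line size    */
} SharedFlag;

/* Compile-time check that the padding gives exactly one line. */
static_assert(sizeof(SharedFlag) == SHARED_LINE_SIZE,
              "SharedFlag must fill exactly one cache line");

/* Host-side stand-in for the flag placed in shared memory. */
static alignas(SHARED_LINE_SIZE) SharedFlag assocStartFlag;

/* Returns 1 if the object starts on a line boundary and fills whole lines,
 * so a write-back or invalidate of it cannot touch a neighbouring variable. */
static int owns_whole_lines(const volatile void *p, size_t size)
{
    return ((uintptr_t)p % SHARED_LINE_SIZE == 0u) &&
           (size % SHARED_LINE_SIZE == 0u);
}
```

With a declaration like this, `sizeof(assocStartFlag)` is already a multiple of the line size, so it can be used directly as the byte count of the cache operation.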

  • Hi,

     

    It worked when I changed it to 128; however, I don't understand why.  I have disabled all L2 cache through RTSC Tools->Platform.

    Why would it work with alignment 128 and not alignment 64 when it should only have been in L1D cache?

    Thanks,

    Brandy

  • Hi,

    If you use 128, you write back/invalidate at least 128 bytes (rounded up to the cache line, that is, 2 lines for L1 and 1 for L2). So if you align to 64 but flush/invalidate 128, you could overwrite or invalidate a portion of memory you did not intend to. As a general rule, independent variables should be aligned to the cache line, and inv/wb operations should use the size of the variable you are operating on, that is, sizeof(var). It is the hardware that will round first to 32 bits and then to the cache line.

    Example:

      int var1:  aligned to 64
      var2:      placed at &var1 + sizeof(var1), aligned to 64
      ...
      L1wb(&var1, 128) will write back var1 and also var2, regardless of cache line size

    So I suppose core 0 invalidates or writes back 128 bytes for a variable located just before your control flag, less than 128 bytes away from it.
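The overlap in this example can be checked with a small host-side C sketch. It models only the address arithmetic, assuming the hardware rounds the requested range outward to whole cache lines as described above. `op_covers_var` is a hypothetical helper, and the addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a cache operation requested on [op_addr, op_addr + op_bytes),
 * once rounded outward to whole cache lines, also covers any byte of
 * [var_addr, var_addr + var_bytes). Returns 0 otherwise. */
static int op_covers_var(uintptr_t op_addr, size_t op_bytes,
                         uintptr_t var_addr, size_t var_bytes,
                         size_t line)
{
    uintptr_t start, end;

    assert(line != 0u && op_bytes != 0u && var_bytes != 0u);
    start = (op_addr / line) * line;                         /* round down          */
    end   = ((op_addr + op_bytes + line - 1u) / line) * line; /* round up, exclusive */
    return (var_addr < end) && (var_addr + var_bytes > start);
}
```

Take var1 at 0x0C000000 and var2 at 0x0C000040, both 4 bytes and 64-byte aligned. A 128-byte operation on var1 covers var2, while an operation of sizeof(var1) does not. If var2 is instead aligned to 128 (0x0C000080), the 128-byte operation on var1 no longer reaches it.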