How to enable cache on DM6437?



Hi all,

I would like to know how to enable the cache on the DM6437.

First of all, I have implemented a codec and measured the cycle
count of each frame on the C64x+ CPU cycle simulator. Because the simulator
uses a flat memory model, it does not account for cache miss stalls or
memory bank conflict stalls.

Later on, I ported the same codec to the DM6437 EVM and measured the cycle
count. Initially all the code and data are placed in external memory.
Now I would like to know how to enable the cache.

From earlier posts, I understand that the CSL used for C64x
cannot be used for C64x+ DSPs. But SPRU862A uses calls like:

CACHE_L1pSetSize();
CACHE_L1dSetSize();
CACHE_enableCaching(CACHE_CE00);
CACHE_SetL2Size(CACHE_256KCACHE);

to configure the L1 and L2 caches and make external memory cacheable.

1) How do I use these calls on the DM6437 platform?

(I know csl.h and csl_cache.h support chips only up to the DM642, but the example on page 27
of SPRU862A uses these header files along with the calls above, which confuses me.)

2) Is using the BCACHE API from within DSP/BIOS (SPRU403) the only way to enable the cache, or is there a way to enable it without involving DSP/BIOS?

3) After the board is reset, the project is built, and the code is loaded, what is the state
of the L1P, L1D, and L2 caches?

4) What role does the register-level CSL (RCSL) play with respect to the DM6437 cache?

Awaiting your replies,

Regards,
Sandeep

 

  • To respond to your questions:

    1. On the DM6437, the old CSL shown in SPRU862A is no longer available; only a small remnant of it survives in the form of the register-level CSL. SPRU862 mentions these function calls because it was written for the C645x line of devices, which are C64x+ based and still had the old CSL. Essentially these calls do not exist on the DM643x devices; the document should probably be updated to reflect that.
    2. The BCACHE API is the suggested method of managing the cache, and it is the easiest way to do so on the DM6437. You could also write to the cache control registers directly, or use the register-level CSL as an interface to the registers, but either would require more code than using the BCACHE API in BIOS. The registers involved in cache operations are described in chapters 2, 3, and 4 of SPRU871 for L1P, L1D, and L2 respectively. http://focus.ti.com/lit/ug/spru871i/spru871i.pdf
    3. The ROM boot loader on the DM6437 disables the cache on power-up, so the cache is only enabled if your initialization/application code or a GEL file enables it. It looks like the default evmdm6437.gel file sets L1P and L1D to the maximum cache size but leaves L2 as RAM, so if you are loading from CCS, chances are this is the cache configuration you are seeing.
    4. As mentioned above, the register-level CSL can be used to abstract the cache registers, which makes it easier to write your own cache management code. However, it does not provide simple functions for cache operations like those in the BCACHE API.
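
To give a feel for the direct-register route mentioned in point 2, here is a minimal sketch, not TI code. The register addresses and mode encodings are how I read SPRU871 and should be checked against your revision of the guide; the helper names (l1_mode_from_kb, l2_mode_from_kb, cache_set_mode) are hypothetical. The write helper takes a pointer, so it can be tried on a plain variable before being pointed at a real register.

```c
#include <stdint.h>

/* Addresses as I read them in SPRU871 (assumption: verify before use). */
#define L2CFG_ADDR   0x01840000u
#define L1PCFG_ADDR  0x01840020u
#define L1DCFG_ADDR  0x01840040u

#define CACHE_MODE_MASK 0x7u  /* mode field in bits 2..0 (assumption) */

/* Map an L1 cache size in KB to the L1PMODE/L1DMODE encoding; -1 if invalid. */
int l1_mode_from_kb(unsigned kb)
{
    switch (kb) {
    case 0:  return 0;
    case 4:  return 1;
    case 8:  return 2;
    case 16: return 3;
    case 32: return 4;
    default: return -1;
    }
}

/* Map an L2 cache size in KB to the L2MODE encoding; -1 if invalid.
 * The generic C64x+ encoding goes up to 256 KB; a given device may
 * support less, depending on how much L2 it has. */
int l2_mode_from_kb(unsigned kb)
{
    switch (kb) {
    case 0:   return 0;
    case 32:  return 1;
    case 64:  return 2;
    case 128: return 3;
    case 256: return 4;
    default:  return -1;
    }
}

/* Read-modify-write the mode field of a cache configuration register,
 * then read the register back. On the device, the read-back is what
 * makes the CPU wait until the mode change has completed. */
uint32_t cache_set_mode(volatile uint32_t *cfg, uint32_t mode)
{
    uint32_t value = *cfg;
    value = (value & ~CACHE_MODE_MASK) | (mode & CACHE_MODE_MASK);
    *cfg = value;
    return *cfg;
}
```

On the target, a call would look something like cache_set_mode((volatile uint32_t *)L1PCFG_ADDR, 4) for 32 KB of L1P cache. The BCACHE API does this bookkeeping for you, which is why it is the suggested route.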
  • Hi all,

    Thanks for the reply.

    I would like to ask a few more questions to better understand cache effects on C64x+ devices.

    A) How do code and data reach the CPU in the following scenarios?

        1) L1P and L1D set to maximum cache size

            a) Assume the code and data are placed in external memory and all of L2 is configured as SRAM

                i) External memory made non-cacheable (default scenario)
                ii) External memory made cacheable

        2) L1P and L1D caches disabled

            a) Assume code and data are placed in external memory and all of L2 is configured as SRAM
            b) Assume code and data are placed in L2 SRAM

    B) Is there any material other than SPRU403 that explains, with examples, how to configure the cache with the BCACHE API?


    Awaiting your replies,

    Regards,
    Sandeep

  •  

    I am not sure I entirely understand your question, but I will try to answer it. The CPU can still read code and data from external memory even if caching is disabled for that location or the cache is turned off entirely within the device. However, this slows things down significantly, because the CPU stalls on each such access until the data is available. Because every access is effectively a cache miss in that case, the CPU will be very slow. In general I would suggest enabling caching to the maximum extent possible, unless you are setting up your memory map manually so that the majority of the code and data accessed over time is in internal L1 or L2 memory.

     

    A.1.a.i. If external memory is made non-cacheable, then every access to it is effectively a cache miss, so your speed is limited to the rate at which you can access external memory, which is a huge performance loss. You can set the cacheability of external memory with the BCACHE_setMar function, and you can set the cache sizes with the BCACHE_setSize function.

     

    A.1.a.ii. If external memory is cacheable in this case, then the L1 caches will allow the CPU to run much faster than if no cache were available. Because these caches are relatively small, it can be fairly easy to run into cache thrashing. As with the prior example, you would set the cacheability with BCACHE_setMar and the cache sizes with BCACHE_setSize.
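
As a rough illustration of scenario A.1.a.ii, here is a sketch under some assumptions. Cacheability is controlled per 16 MB region through the MAR registers, so the first part just works out which MARs cover an address range; the MAR base address is how I read SPRU871. The second part, compiled only with the TI compiler, shows the BCACHE calls as I understand them from SPRU403. The DDR2 base and size are assumptions for the EVM, and the function name enable_cache_sketch is hypothetical; please check all of these against your data sheet and bcache.h.

```c
#include <stdint.h>

#define MAR_BASE_ADDR    0x01848000u  /* address of MAR0 (assumption, per SPRU871) */
#define MAR_REGION_SHIFT 24           /* each MAR covers a 16 MB region */

/* Index of the MAR register that covers a given byte address. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_REGION_SHIFT);
}

/* Address of MAR register number `index`. */
uint32_t mar_reg_addr(unsigned index)
{
    return MAR_BASE_ADDR + 4u * index;
}

/* Number of MAR registers touched by the range [base, base + size), size > 0. */
unsigned mar_count(uint32_t base, uint32_t size)
{
    uint32_t last = base + (size - 1u);
    return mar_index(last) - mar_index(base) + 1u;
}

#ifdef _TMS320C6X  /* target-only part; skipped when compiled on a host */
#include <std.h>
#include <bcache.h>

#define DDR2_BASE 0x80000000u  /* assumption: DDR2 start on the DM6437 */
#define DDR2_SIZE 0x08000000u  /* assumption: 128 MB fitted on the EVM */

/* L1P and L1D at maximum cache, all L2 as SRAM, external memory cacheable. */
void enable_cache_sketch(void)
{
    BCACHE_Size size;

    size.l1psize = BCACHE_L1_32K;
    size.l1dsize = BCACHE_L1_32K;
    size.l2size  = BCACHE_L2_0K;
    BCACHE_setSize(&size);

    BCACHE_setMar((Ptr)DDR2_BASE, DDR2_SIZE, BCACHE_MAR_ENABLE);
}
#endif
```

With the assumed range, mar_index gives 128 for the first region, and eight MARs are involved. For scenario A.1.a.i you would pass BCACHE_MAR_DISABLE instead.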

     

    A.2.a. With no cache enabled at all, this scenario should have essentially the same performance as A.1.a.i above: your execution speed is limited by your external memory access speed. Again, you can control your cache sizes with BCACHE_setSize, setting them to 0 for this scenario (though you could skip the call, as this is the default).

     

    A.2.b. If code and data are kept in internal L2 SRAM, you will see a huge performance increase over having them stored in external memory, particularly when caching is disabled. There will still be some performance loss because L2 memory does not run as fast as L1 memory; enabling L1 caching would increase performance further. This scenario would also use BCACHE_setSize to configure your cache sizes.

     

    B. SPRU403 is the primary reference for the BCACHE API; I do not believe there are other application notes that discuss it in greater detail. Some of the examples in the DVSDK use BCACHE, but mostly only for cache coherency operations (i.e. writebacks and invalidates), because the examples assume that the cache has already been configured by CCS through the GEL file.
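
For the coherency side (writebacks and invalidates), one thing worth keeping in mind is that the hardware works on whole cache lines, so a buffer that is not line-aligned shares its first or last line with neighbouring data. The sketch below computes how many bytes of whole lines a buffer actually touches. It assumes a 128-byte line, which is the C64x+ L2 line size and a safe upper bound for the 64-byte L1D line; the helper names are hypothetical. The target-only part shows a BCACHE_wbInv call as I understand it from SPRU403.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 128u  /* assumption: C64x+ L2 line size; L1D lines are 64 bytes */

/* Round an address down to the start of its cache line. */
uintptr_t line_align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t)(LINE_BYTES - 1u);
}

/* Bytes of whole cache lines touched by the range [addr, addr + bytes). */
size_t line_span(uintptr_t addr, size_t bytes)
{
    uintptr_t start = line_align_down(addr);
    uintptr_t end = (addr + bytes + (LINE_BYTES - 1u)) & ~(uintptr_t)(LINE_BYTES - 1u);
    return (size_t)(end - start);
}

#ifdef _TMS320C6X  /* target-only part; skipped when compiled on a host */
#include <std.h>
#include <bcache.h>

/* Write back and invalidate a buffer shared with a DMA or peripheral,
 * waiting for the operation to finish before returning. */
void flush_shared_buffer(void *buf, size_t bytes)
{
    BCACHE_wbInv((Ptr)buf, bytes, TRUE);
}
#endif
```

If line_span reports more than the buffer size, the buffer shares a line with something else; aligning and padding such buffers to the line size avoids surprises when invalidating.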

  • Sorry,

    I also work with the DM6437, but I have not been able to determine the recommended cache configuration for the best performance on this device.

     

    Is it L1P = 32 KB and L1D = 32 KB (all of L1P and L1D used as cache memory),

    with L2 = 0 KB (all of L2 used as SRAM)?

     

    Regards,