RTOS/TMS320C6678: BIOS Cache implementation

Part Number: TMS320C6678
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hi,

My project receives data from an FPGA via RIO and sends some intermediate results via TCP. I am seeing random exceptions in different places and under varying conditions.

After several hours of investigation, I found that the most likely cause is L1/L2 cache non-coherence. This issue has been discussed here many times in various circumstances.

The original TCP client application, which I took as a reference, mixes BIOS and CSL Cache functions.

It seems that the CSL Cache functions are not safe in a multitasking environment.

While working on this issue I walked through the BIOS code (ti/sysbios/family/c66/Cache.c) and found some pieces of code that may also cause problems.

1. The c66x family-specific implementation of BIOS Cache_xxx() completely ignores the cache type parameter (the third argument of a BIOS Cache_xxx() call) and always operates on the L2 cache control registers.

This shouldn't cause any trouble when the block is cached in both L1 and L2, since L2 cache operations should also make the appropriate updates in L1, but it may not work when the L2 cache is disabled or not used for the block.

2. Address alignment in Cache_block() is not correct: the address is aligned on a double-word boundary instead of on the cache line size. This results in an incorrectly calculated byte-aligned value.

3. Cache_block() checks whether the cache module is ready by reading the L2 Write Word Count control register, which mirrors the 2 other L2 Word Count control registers, but I guess this does not hold for checking whether the L1 cache is ready.

4. The wait parameter of Cache_xxx() (the fourth argument of a BIOS Cache_xxx() call) is ignored, except for the last cache block operation, when the xdc Cache_atomicBlockSize parameter has its default non-zero value. It would be better if, when the wait parameter is not set, the function only waited for the cache module to become ready, then programmed the cache control registers and returned without waiting.
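
To make the alignment concern in point 2 concrete, here is a minimal sketch of the arithmetic only. It is not the BIOS implementation; `LineSpan`, `line_span` and `L2_LINE_SIZE` are hypothetical names introduced for illustration. It rounds an address range outward to whole lines of a given size, so the same range can be compared with 8-byte (double-word) and 128-byte (L2 cache line) rounding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L2_LINE_SIZE 128u   /* C66x L2 cache line size in bytes */

/* Hypothetical helper type: an address range expanded to whole lines. */
typedef struct {
    uintptr_t start;   /* address of the first line touched */
    size_t    bytes;   /* length, rounded up to a multiple of lineSize */
} LineSpan;

/* Expand [addr, addr + byteCnt) outward to lineSize boundaries.
 * lineSize must be a non-zero power of two. */
static LineSpan line_span(uintptr_t addr, size_t byteCnt, size_t lineSize)
{
    LineSpan s;
    uintptr_t mask = (uintptr_t)lineSize - 1u;

    assert(lineSize != 0 && (lineSize & (lineSize - 1u)) == 0);

    s.start = addr & ~mask;                                   /* round down */
    s.bytes = (size_t)(((addr + byteCnt + mask) & ~mask) - s.start); /* round up */
    return s;
}
```

For an 8-byte block at 0x107C, `line_span(0x107C, 8, 8)` covers 16 bytes starting at 0x1078, while `line_span(0x107C, 8, L2_LINE_SIZE)` covers 256 bytes starting at 0x1000, because the block straddles two 128-byte lines.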

One more thing. The DSP silicon designers should implement a shadow control register mechanism in the cache module, similar to what was introduced in other processor subsystems such as RIO. It would avoid losing DSP performance while checking and waiting for the cache module to become ready in a multitasking environment.

  • Hi,

    I've forwarded this to the design team. Their feedback will be posted here.

    Best Regards,
    Yordan
  • Hi,

    Can you tell us which version of SYSBIOS this is? I saw in 6.46.01.38 that functions like Void Cache_inv(Ptr blockPtr, SizeT byteCnt, Bits16 type, Bool wait) don't use the third argument "type", but I would like to confirm with you.

    Regards, Eric
  • Hi Dmitry

    I am not sure if this is your observation, but I want to add one more comment. You may already know it.

    There is a known issue with cache write-back addresses. As you know, all cache operations are done on whole cache lines (128 bytes for L2, 64 bytes for L1D), so if a 128-byte line contains variables that "belong" to two cores and one core does a write-back, it overwrites the variables that belong to the other core. (If what I have written is not clear enough, tell me and I will elaborate.)

    And you do understand that when I say write-back, I include cache misses that cause a new cache line to be loaded (and the previous cache line to be written back).

    To prevent this from happening, TI recommends that variables in an area that may be used by multiple cores be stored on 128-byte boundaries.

    Does this help? Please tell us what you think.

    Regards

    Ran
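
As an illustration of the 128-byte recommendation above, the sketch below gives each core's data its own cache line. It uses standard C11 `alignas` so it can be built on a host compiler; on the TI toolchain the same effect is usually achieved with compiler-specific alignment pragmas or attributes. `PerCoreData`, `shared` and `line_index` are hypothetical names, not anything from BIOS or CSL.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 128u   /* L2 line size quoted in this thread */

/* Hypothetical per-core record. Aligning the first member to 128 bytes
 * makes the whole struct 128-byte aligned and pads its size up to a
 * multiple of 128, so two records never share a cache line. */
typedef struct {
    alignas(CACHE_LINE) uint32_t counter;
    uint32_t status;
} PerCoreData;

static_assert(alignof(PerCoreData) == CACHE_LINE, "record must be line aligned");
static_assert(sizeof(PerCoreData) % CACHE_LINE == 0, "record must fill whole lines");

/* One record per core; element i is written only by core i. */
static PerCoreData shared[2];

/* Index of the 128-byte line that contains address p. */
static size_t line_index(const void *p)
{
    return (size_t)((uintptr_t)p / CACHE_LINE);
}
```

With this layout, `line_index(&shared[0])` and `line_index(&shared[1])` always differ, so a write-back of one core's line cannot carry stale copies of the other core's variables.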

  • Hi Eric,

    I'm using BIOS 6.46.01.38, but these functions are the same in all the previous BIOS versions I have on my PC.

    There are 3 BIOS cache functions with the same number and meaning of parameters:

    Void Cache_inv(Ptr blockPtr, SizeT byteCnt, Bits16 type, Bool wait);

    Void Cache_wb(Ptr blockPtr, SizeT byteCnt, Bits16 type, Bool wait);

    Void Cache_wbInv(Ptr blockPtr, SizeT byteCnt, Bits16 type, Bool wait);

    which internally call

    Cache_block(blockPtr, byteCnt, wait, <some L2 Cache BAR>);

    The Bits16 type parameter is not used.

    Hope this helps.

    Dmitry

  • Hi Ran,

    I'm aware that any cache operation involves a full cache line, and of TI's recommendation on aligning cached variables.

    Regarding cache operations in a multicore environment:

    If, for example, core0 and core1 in some application use MSMC shared memory with L1D cache enabled, I suppose that an L1D cache write-back (forced or indirect) from core0 will NOT update the corresponding L1D cache content on core1, but only the MSMC address range belonging to core0's L1D cache line. Am I wrong?

    Thanking you.

    Dmitry.

  • Yes, you are right.

    One more reminder (and I am sure you know it): unless special tricks are used, all MSMC memory is cached by L1D but not by L2.


    Ran
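
The behaviour confirmed in the last exchange can be pictured with a toy host-side model. This is only a sketch under simplifying assumptions: one shared word, one private cache line per core, read-allocate only, and no hardware coherence between the private caches. All names (`CacheLine`, `msmc_word`, `core_read`, `core_write`, `cache_wb`, `cache_inv`) are hypothetical and unrelated to the BIOS or CSL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one private L1D line holding a copy of one shared word. */
typedef struct {
    uint32_t data;
    bool     valid;
    bool     dirty;
} CacheLine;

static uint32_t msmc_word;   /* the shared word in MSMC memory */

/* Read: on a miss, allocate the line from shared memory. */
static uint32_t core_read(CacheLine *c)
{
    if (!c->valid) {
        c->data  = msmc_word;
        c->valid = true;
        c->dirty = false;
    }
    return c->data;
}

/* Write: a hit updates only the private copy; a miss goes to memory. */
static void core_write(CacheLine *c, uint32_t v)
{
    if (c->valid) {
        c->data  = v;
        c->dirty = true;
    } else {
        msmc_word = v;
    }
}

/* Write-back: copy a dirty line to shared memory. Other caches are untouched. */
static void cache_wb(CacheLine *c)
{
    if (c->valid && c->dirty) {
        msmc_word = c->data;
        c->dirty  = false;
    }
}

/* Invalidate: drop the private copy so the next read refetches. */
static void cache_inv(CacheLine *c)
{
    c->valid = false;
    c->dirty = false;
}
```

In this model, if both cores have read the word and core0 then writes and writes back, the shared memory is updated but core1 keeps returning its old copy until it invalidates its own line and reads again.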