Cache coherency issue within a task

Hi,

I am trying to run an application from DDR memory. A 50 Hz periodic function triggers the application by posting a semaphore. In the application, there is a simple counter that counts the number of runs.

I have noticed that although the counter is incremented by one in the current run, it is 0 again at the next run, when the task is released from pending on the semaphore. This can be corrected with cache invalidate and cache writeback operations. At this point I have several questions:

a) Is this normal? I assumed that cache operations are needed when cores communicate over a shared external memory or with a hardware peripheral (DMA, PA, etc.). What is expected within a task, or between tasks running on the same core?

b) If this is expected, should I apply cache operations to all global and local variables? That does not seem practical to me. Is there another rule for deciding which variables need cache operations?

c) I have read that a variable should be aligned to a 128-byte boundary before applying cache operations, but I have observed that it seems to work even when it is not aligned to 128 bytes. Is the alignment advice meant to improve speed, or is it a mandatory requirement?

Thank you

Alpaslan

  • Note: Only L1 memory is used for cache.

  • You should only need to invoke cache coherence operations if code on one core is interacting with some other entity in the system -- hardware or another core -- that is modifying the same data (or other resource).  Alternatively, even on one core, something may be invalidating the cache rather than doing a writeback-invalidate -- so that the modified counter never leaves the cache.  If your code is running on a single core, I would check the writeback-vs-invalidate question.

    Variables should be aligned to the largest used cache line (64 bytes for L1D, 128 bytes if L2 is used) to avoid sharing cache lines between unrelated data.  If variables A and B are mapped to the same cache line, and updated by unrelated tasks on different cores, then core 1 might modify variable A while core 2 is modifying variable B.  Without some kind of serialization in the software, either core could write back the cache line without seeing the other core's change.  By moving the variables to different cache lines, one can avoid this extraneous synchronization.
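    The point about unrelated variables sharing a cache line can be sketched in portable C11. This is only a minimal illustration, not TI-specific code: the 128-byte line size is the L2 figure quoted above, and the names `padded_counter_t`, `core_counters`, `packed_counters`, and `same_cache_line` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte line (the L2 figure from the post).
 * Use 64 if only L1D caches the data. */
#define CACHE_LINE_BYTES 128u

/* Hypothetical per-core counter. Aligning the member to the line size
 * also rounds the struct size up to one full line. */
typedef struct {
    _Alignas(CACHE_LINE_BYTES) volatile uint32_t value;
} padded_counter_t;

static_assert(sizeof(padded_counter_t) == CACHE_LINE_BYTES,
              "each counter should occupy exactly one cache line");

/* One counter per core: each element lives in its own line. */
padded_counter_t core_counters[2];

/* Unpadded layout for comparison: both words start on a line boundary
 * and are adjacent, so they are guaranteed to share one line. */
_Alignas(CACHE_LINE_BYTES) uint32_t packed_counters[2];

/* Returns nonzero when both addresses fall in the same cache line. */
int same_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / CACHE_LINE_BYTES) ==
           ((uintptr_t)b / CACHE_LINE_BYTES);
}
```

    With this layout, `same_cache_line(&packed_counters[0], &packed_counters[1])` is true, while the two elements of `core_counters` never share a line. On the TI compiler the same placement is typically requested with `#pragma DATA_ALIGN`; check the compiler guide for the exact form.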

    Michael

  • Michael, thank you for your reply.

    I am working with the 6678, but the counter variable is a local variable of code running on a single core (core0). Since it is manipulated only by code running on that core, I expected it not to suffer from cache issues. In the memory browser I observe that whenever I increment the counter, only the cache is updated, not DDR memory (which is acceptable). But on the next run, the value of the counter is taken from DDR memory. As a result it is never incremented and always remains 0. When I use cache invalidate and cache writeback, the problem is solved.

    So it seems that cache invalidate and writeback are needed even for variables that are manipulated only by code running on the same core. This does not sound right to me. Is there any explanation for the phenomenon I have summarized?

    Alpaslan

  • Alpaslan,

    Thanks for the post.

    Please refer to the threads below, which may help you solve the cache coherency issue.

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/259851.aspx

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/275398.aspx

  • Alpaslan,

    For a single core performing all the accesses to/from an external memory location, the coherency is maintained.  If the data has been read from DDR and is cached in L1D, then subsequent accesses will be to/from the cached location in L1D up to the time that it gets Evicted or Invalidated from L1D.  

    If it's evicted, for example because the cache line is needed to cache another location, then the data is written back to DDR.

    However, if it's invalidated by cache operations (i.e. your code invalidated it), then it's simply no longer valid, any modified data that was not written back is lost, and the next access will come from the DDR memory location.

    Please make sure you're not invalidating the address range and then trying to access it.  If you're trying to perform cache coherence operations, you want to perform a Writeback before performing an Invalidate so that any valid locations are written back to DDR before invalidation.
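    The difference between the cases above can be shown with a toy model. This is a sketch under stated assumptions, not the C66x hardware or any TI API: a single cached word backed by a single "DDR" word, with hypothetical names (`toy_cache_t`, `toy_increment`, `run_invalidate_only`, and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: one cache line holding one word of "DDR". */
typedef struct {
    uint32_t ddr;    /* value stored in external memory */
    uint32_t line;   /* value held in the cache line */
    bool     valid;  /* line currently holds a copy of ddr */
    bool     dirty;  /* line was modified and not yet written back */
} toy_cache_t;

/* Read through the cache: a miss fetches the word from DDR. */
static uint32_t toy_read(toy_cache_t *c)
{
    if (!c->valid) {
        c->line = c->ddr;
        c->valid = true;
        c->dirty = false;
    }
    return c->line;
}

/* Read-modify-write: the new value stays in the cache (dirty). */
static void toy_increment(toy_cache_t *c)
{
    uint32_t v = toy_read(c);
    c->line = v + 1u;
    c->dirty = true;
}

/* Writeback: copy a dirty line out to DDR. */
static void toy_writeback(toy_cache_t *c)
{
    if (c->valid && c->dirty) {
        c->ddr = c->line;
        c->dirty = false;
    }
}

/* Invalidate: drop the line; dirty data is discarded. */
static void toy_invalidate(toy_cache_t *c)
{
    c->valid = false;
    c->dirty = false;
}

/* Single core, no cache operations: the counter stays coherent. */
uint32_t run_no_cache_ops(unsigned runs)
{
    toy_cache_t c = {0};
    for (unsigned i = 0; i < runs; i++) toy_increment(&c);
    return toy_read(&c);
}

/* Invalidate without writeback: every increment is thrown away. */
uint32_t run_invalidate_only(unsigned runs)
{
    toy_cache_t c = {0};
    for (unsigned i = 0; i < runs; i++) {
        toy_increment(&c);
        toy_invalidate(&c);
    }
    return toy_read(&c);
}

/* Writeback before invalidate: the increments reach DDR. */
uint32_t run_writeback_then_invalidate(unsigned runs)
{
    toy_cache_t c = {0};
    for (unsigned i = 0; i < runs; i++) {
        toy_increment(&c);
        toy_writeback(&c);
        toy_invalidate(&c);
    }
    return toy_read(&c);
}
```

    In this model, five runs with no cache operations or with writeback-then-invalidate leave the counter at 5, while five runs with invalidate alone leave it at 0, which matches the symptom of a counter that never advances.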

    Best Regards,

    Chad

  • Chad,

    "For a single core performing all the accesses to/from an external memory location, the coherency is maintained." This matches my expectation: for a single core running from DDR memory, cache coherency for variables residing in DDR is maintained without performing INV and WB operations.

    But I have seen the following situation with the setup below:

    • There is a counter variable whose physical location is in DDR memory.
    • There is a periodic function that runs at 50 Hz. In every run, it increments the counter and posts a semaphore.
    • There is a task that waits for the semaphore and does some work whenever the semaphore is released.

    I put a breakpoint in the periodic function before the counter variable is incremented. Every time the breakpoint is hit, the value of the counter is zero. In the periodic function the counter is incremented by 1; it is incremented in the cache, not in physical memory (DDR). This is expected behavior. But on the next run of the periodic function, it is still zero. It looks like a cache coherency problem, because when I apply INV and WB operations, it is corrected. Yet we assumed that no cache coherency maintenance is needed on a single core...

    So this phenomenon could affect all the other variables as well...

    Thank you

    Alpaslan


  • Alpaslan,

    You've stated this before, but it's not clear as to what specifically is doing the writing.  Please confirm that it's only CorePac writes and not some other IP, and it's from the same CorePac.

    Also, you say that for the next run of the periodic function it remains zero.  Have you verified that the code which does the increment is indeed executing (i.e. single-step through the assembly and see that the update occurs)?  This is most likely where the error would be.

    Best Regards,

    Chad