66AK2H14: DDR and cacheability attribute

Part Number: 66AK2H14

Hello,

I have a question to help me understand how the DDR (or the DDR memory controller) works.

Let’s say that we have a KeyStone I or KeyStone II processor.

Let’s suppose we use L2 memory as RAM and not cache.

For these architectures, from my understanding and tests:

The cache (L1D) is of type “write around”

Which to me means that:

- If an address is not in the cache, the data is written to the address and not to the cache.

- A subsequent read at that address causes a “cache miss” (the data is not in the cache), and the data is fetched from physical memory.

Q1) Is this understanding correct?

Now let’s move to two use cases:

a)

- I have disabled cacheability for the DDR address range (through the MAR registers).

- I never read data from DDR; I only write data.

- I measure the throughput: it is low, a couple hundred MB/s.

b)

- I have enabled cacheability for the DDR address range (through the MAR registers).

- I never read data from DDR. I make sure the DDR address range is not in the cache, and I only write data to the DDR address range.

- I measure the throughput: it is much better, several thousand MB/s.
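
For reference, the MAR setting that distinguishes a) from b) can be sketched as plain address arithmetic. This is a host-side illustration only, not the code used for the measurements. The helper names are hypothetical, and the MAR0 address (0x01848000), the 16 MB granularity, the PC bit position (bit 0) and the DDR base address (0x80000000) are assumptions based on the C66x CorePac and KeyStone documentation; they should be checked against the device data manual.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (verify against the CorePac guide and device data manual). */
#define MAR_BASE_ADDR   0x01848000u  /* assumed address of MAR0 */
#define MAR_COUNT       256u         /* MAR0..MAR255 */
#define MAR_REGION_BITS 24u          /* each MAR covers 2^24 bytes = 16 MB */
#define MAR_PC_BIT      0x1u         /* "permit copies": region is cacheable */

/* Index of the MAR that controls the 16 MB region containing addr. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_REGION_BITS);
}

/* Memory-mapped address of MARn (hypothetical helper). */
uint32_t mar_reg_addr(unsigned n)
{
    assert(n < MAR_COUNT);
    return MAR_BASE_ADDR + 4u * n;
}

/* Index of the last MAR needed to cover [start, start + size). */
unsigned mar_last_index(uint32_t start, uint32_t size)
{
    assert(size > 0u);
    uint64_t last = (uint64_t)start + size - 1u;
    assert(last <= 0xFFFFFFFFu);   /* range must stay in the 32-bit space */
    return mar_index((uint32_t)last);
}

/* New MAR value with the PC bit set or cleared, other bits preserved. */
uint32_t mar_with_cacheable(uint32_t mar_value, int cacheable)
{
    return cacheable ? (mar_value | MAR_PC_BIT) : (mar_value & ~MAR_PC_BIT);
}
```

Under these assumptions, a 512 MB DDR window starting at 0x80000000 is covered by MAR128 to MAR159; case a) corresponds to PC = 0 in those registers and case b) to PC = 1.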

 

In b) as no data is in the cache and because the cache is of type “write around”, I write directly to DDR memory. So the cache is not used. However, the performance is very different.

Q2) How do you explain such a big difference in performance?

Thank you

Regards

Clement

  • Clement FR said:
    Let’s suppose we use L2 memory as RAM and not cache.

    Based on this comment, my assumption is that we're discussing the 66x DSP specifically.

    Clement FR said:
    The cache (L1D) is of type “write around”

    No.  Please see the C66x Cache User's Guide:

    http://www.ti.com/lit/sprugy8

    All cache in the 66x is "write-back cache".  I strongly recommend carefully reviewing Table 1-1 Cache Terms and Definitions.  In addition to understanding "write back" vs "write through" (and vs "write around" which isn't covered), you should be sure to understand the difference between "read allocate" and "write allocate".  For the types of scenarios you're discussing, understanding those definitions is important.

    Clement FR said:

    In b) as no data is in the cache and because the cache is of type “write around”, I write directly to DDR memory. So the cache is not used. However, the performance is very different.

    Q2) How do you explain such a big difference in performance?

    This comes up so often that I have authored a wiki page on the topic:

    http://processors.wiki.ti.com/index.php/Common_Issue_Resulting_in_Slow_External_Memory_Performance

    Best regards,
    Brad

  • Hello Brad

    Yes, for now I'm only considering accesses from the C66x DSP cores (not the ARM cores).

    OK the cache is not write around but "read allocate" and "write back".

    I have brushed up on the definitions.
    Thank you for the link.

    In the scenario discussed above, I don't do a single "read" from DDR.
    So the cache, being "read allocate", is not involved.
    Said differently: all write operations to DDR go through the write buffer and on to the DDR memory.

    In this specific scenario, with no reads, why and where does the "cacheability" bit make a difference?
    I want to understand where in the SoC architecture it has an impact. Is it in the DDR memory controller? Elsewhere?

    What really happens under the hood? In your article you mention "the CPU will be stalled". In this scenario I don't get why and how (b) is less stalled than (a).

    Best
    Clement

  • Clement FR said:
    OK the cache is not write around but "read allocate" and "write back".

    In the Cache User Guide, see Table 1-6 L2 Cache Characteristics.  The L2 cache is read and write allocate.
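
    To make the allocation terms concrete, here is a toy model in C that is not taken from the user's guide. It is a sketch of a generic direct-mapped write-back cache whose write-miss behavior can be switched between "write allocate" and "no write allocate". The type and function names, the 64-byte line and the 8-line size are illustrative assumptions and do not describe the actual C66x L1D or L2 hardware.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define LINE_BYTES 64u  /* toy values, not C66x parameters */
    #define NUM_LINES  8u

    typedef struct {
        bool     valid[NUM_LINES];
        bool     dirty[NUM_LINES];
        uint32_t tag[NUM_LINES];
        bool     write_allocate;      /* allocate a line on a write miss? */
        unsigned line_fills;          /* lines fetched from the next level */
        unsigned line_writebacks;     /* dirty lines evicted to the next level */
        unsigned passthrough_writes;  /* write misses forwarded without allocating */
    } toy_cache;

    void toy_init(toy_cache *c, bool write_allocate)
    {
        assert(c != NULL);
        memset(c, 0, sizeof *c);
        c->write_allocate = write_allocate;
    }

    static unsigned toy_slot(uint32_t addr) { return (addr / LINE_BYTES) % NUM_LINES; }
    static uint32_t toy_tag(uint32_t addr)  { return addr / LINE_BYTES / NUM_LINES; }

    static bool toy_hit(const toy_cache *c, uint32_t addr)
    {
        unsigned s = toy_slot(addr);
        return c->valid[s] && c->tag[s] == toy_tag(addr);
    }

    /* Bring the line holding addr into the cache, writing back a dirty victim. */
    static void toy_fill(toy_cache *c, uint32_t addr)
    {
        unsigned s = toy_slot(addr);
        if (c->valid[s] && c->dirty[s])
            c->line_writebacks++;
        c->valid[s] = true;
        c->dirty[s] = false;
        c->tag[s]   = toy_tag(addr);
        c->line_fills++;
    }

    void toy_read(toy_cache *c, uint32_t addr)
    {
        if (!toy_hit(c, addr))
            toy_fill(c, addr);             /* read allocate */
    }

    void toy_write(toy_cache *c, uint32_t addr)
    {
        if (!toy_hit(c, addr)) {
            if (!c->write_allocate) {
                c->passthrough_writes++;   /* miss: data goes to the next level */
                return;
            }
            toy_fill(c, addr);             /* miss: allocate, then update the line */
        }
        c->dirty[toy_slot(addr)] = true;   /* write back: next level updated on eviction */
    }
    ```

    In this model, sixteen consecutive 4-byte writes to a line that is not cached produce sixteen pass-through writes when write allocate is off, but a single line fill and no pass-through writes when it is on.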

    On a related note, I'm sorry to be skipping over a lot of your questions.  However, given that so far they've been based off an incorrect premise, I think it's necessary for you to re-evaluate and/or re-phrase the questions accordingly.