CCS/TDA2: DDR, L2, and L1 speed comparison

Part Number: TDA2

Tool/software: Code Composer Studio

From spna165.pdf: L1 cache runs at 600 MHz, L2 cache at 300 MHz, and external memory at about 100 MHz.

So I tested this by running the same function, changing only the location of the input buffer, and comparing the cycle counts.

This is the linker command file:


-stack 0x4000
-heap 0x2000000

MEMORY {
   L1P_SRAM           : origin = 0x00E00000,  len = 0x8000
   L1D_SRAM           : origin = 0x00F00000,  len = 0x8000     /* 32 KB: full L1D declared as SRAM */
   // L1D_CACHE       : origin = 0x00F04000,  len = 0x4000     /* 16 KB cache */
   // L1D_CACHE       : origin = 0x00F00000,  len = 0x8000     /* 32 KB cache */
   L2_SRAM            : origin = 0x00800000,  len = 0x48000    /* 288 KB: full L2 declared as SRAM */
   // L2_CACHE        : origin = 0x00828000,  len = 0x20000    /* L2 cache, if configured as 128 KB */
   DSP2_L2_SRAM       : origin = 0x40800000,  len = 0x48000
   SL2_SRAM           : origin = 0x5B000000,  len = 0x40000
   EXT_MEM_CACHE      : origin = 0x80000000,  len = 0x06000000 /* DSP Used cachable area */
   EXT_MEM_heap       : origin = 0x86000000,  len = 0x02000000 /* DSP heap, cacheable area */
}

SECTIONS
{
  vectors   :> EXT_MEM_CACHE
  .cio      :> EXT_MEM_CACHE
  .bss      :> EXT_MEM_CACHE   // usually reserves space for uninitialized variables
  .text     :> EXT_MEM_CACHE   // contains executable code
  .cinit    :> EXT_MEM_CACHE
  .const    :> EXT_MEM_CACHE
  .far      :> EXT_MEM_CACHE
  .fardata  :> EXT_MEM_CACHE   // usually contains initialized data
  .neardata :> EXT_MEM_CACHE   // usually contains initialized data
  .rodata   :> EXT_MEM_CACHE
  .sysmem   :> EXT_MEM_CACHE
  .switch   :> EXT_MEM_CACHE
  .L2SramSect   :> L2_SRAM
  .stack        :> L2_SRAM
  .heap         :> EXT_MEM_heap
}

input = (int *)0x00F00000;     L1D:   1300000 cycles

input = (int *)0x00800000;     L2:    1300000 cycles

input = (int *)0x80000000;     DDR:  12000000 cycles

The result: L1 and L2 take the same number of cycles, while DDR takes far more.

So the question is: why is L1 not faster than L2?

  • The cycle count depends heavily on what the function does. To measure actual access cycles, you need to write a specific test case.

    For measuring L1D performance,

    1. Partition L1D as SRAM and cache.
    2. Declare an array (say 1 KB long) and place it in L1D SRAM using #pragma DATA_SECTION.
    3. Verify in the .map file that the array is placed in the L1D SRAM section.
    4. Write a function that reads the array in a loop and accumulates its contents. Return the accumulated value from the function.
    5. Profile the function.

    For measuring L2 performance,

    1. Partition L1D and L2 as SRAM and cache.
    2. Declare an array (say 1 KB long) and place it in L2 SRAM using #pragma DATA_SECTION.
    3. Verify in the .map file that the array is placed in the L2 SRAM section.
    4. Write a function that reads the array in a loop and accumulates its contents. Return the accumulated value from the function.
    5. Profile the function.

    For measuring DDR performance,

    1. Partition L1D and L2 as SRAM and cache.
    2. Make the DDR space cacheable by setting the appropriate MAR bits.
    3. Declare an array (say 1 KB long) and place it in DDR (for example, EXT_MEM_CACHE in the linker command file above) using #pragma DATA_SECTION.
    4. Verify in the .map file that the array is placed in the DDR section.
    5. Write a function that reads the array in a loop and accumulates its contents. Return the accumulated value from the function.
    6. Profile the function.
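
To make the MAR step concrete, here is a minimal sketch of how the MAR that covers a given DDR address can be located. It assumes the usual C66x CorePac layout (MAR0 at 0x01848000, one 32-bit register per 16 MB window, cacheability controlled by the PC bit in bit 0); please check these values against the CorePac documentation for your device. The names `mar_index`, `mar_reg_addr`, and `mar_enable_cache` are hypothetical helpers, not part of any TI library, and in a real project a driver call from CSL or SYS/BIOS would normally be used instead of raw register writes.

```c
#include <stdint.h>

/* Assumption: C66x CorePac layout. MAR0 is at 0x01848000, each MAR is
 * 4 bytes wide and controls one 16 MB window of the address space. */
#define MAR_BASE_ADDR  0x01848000u
#define MAR_PC_BIT     0x1u          /* PC = "permit copies" (cacheable) */

/* Index of the MAR that governs the given 32-bit physical address. */
uint32_t mar_index(uint32_t addr)
{
    return addr >> 24;               /* 16 MB = 2^24 bytes per window */
}

/* Memory-mapped address of that MAR. */
uint32_t mar_reg_addr(uint32_t addr)
{
    return MAR_BASE_ADDR + 4u * mar_index(addr);
}

/* Hypothetical helper: mark every 16 MB window that overlaps
 * [start, start + len) as cacheable. Only touches hardware when built
 * with the TI compiler; elsewhere it is a no-op so the file still builds.
 * MAR writes generally require supervisor mode. */
void mar_enable_cache(uint32_t start, uint32_t len)
{
#ifdef __TI_COMPILER_VERSION__
    uint32_t i;
    uint32_t first = mar_index(start);
    uint32_t last  = mar_index(start + len - 1u);

    for (i = first; i <= last; i++) {
        volatile uint32_t *mar =
            (volatile uint32_t *)(MAR_BASE_ADDR + 4u * i);
        *mar |= MAR_PC_BIT;
    }
#else
    (void)start;
    (void)len;
#endif
}
```

With the linker command file from the question, EXT_MEM_CACHE starts at 0x80000000, which under these assumptions falls in MAR128, so `mar_enable_cache(0x80000000u, 0x06000000u)` would cover MAR128 through MAR133.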

    Hope this shows the difference between L1D, L2, and DDR performance. The results can vary with the L1D and L2 cache sizes, and with whether DDR is cached or not.
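
The steps above could be sketched roughly as follows for the L2 case. This is only an illustration under some assumptions: the section name `.L2SramSect` is taken from the linker command file in the question, while `BUF_WORDS`, `l2_buf`, `sum_buffer`, `read_cycles`, `cycle_counter_start`, and `profile_sum` are made-up names. On the TI compiler the cycle count comes from the TSCL register declared in `c6x.h`; the `clock()` fallback is there only so the file compiles off-target and does not give meaningful cycle counts.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_WORDS 256                 /* 256 x 4 bytes = 1 KB */

#ifdef __TI_COMPILER_VERSION__
#include <c6x.h>                      /* declares TSCL */
/* Place the buffer in the output section that the linker command file
 * maps to L2_SRAM. For L1D or DDR, use a section mapped to that memory. */
#pragma DATA_SECTION(l2_buf, ".L2SramSect")
#else
#include <time.h>                     /* host-only fallback timer */
#endif
int32_t l2_buf[BUF_WORDS];

/* TSCL starts counting on the first write to it; the value is ignored. */
void cycle_counter_start(void)
{
#ifdef __TI_COMPILER_VERSION__
    TSCL = 0;
#endif
}

uint32_t read_cycles(void)
{
#ifdef __TI_COMPILER_VERSION__
    return TSCL;
#else
    return (uint32_t)clock();         /* not cycles; build-only fallback */
#endif
}

/* Read every word and accumulate. The volatile qualifier keeps the
 * compiler from dropping the loads; returning the sum keeps the loop live. */
int32_t sum_buffer(const volatile int32_t *buf, size_t n)
{
    int32_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += buf[i];
    }
    return acc;
}

/* Time one pass over the buffer; the sum is handed back via *result. */
uint32_t profile_sum(const volatile int32_t *buf, size_t n, int32_t *result)
{
    uint32_t t0, t1;

    t0 = read_cycles();
    *result = sum_buffer(buf, n);
    t1 = read_cycles();
    return t1 - t0;                   /* unsigned math handles wrap-around */
}
```

A typical use would be to call `cycle_counter_start()` once, fill `l2_buf`, and then call `profile_sum(l2_buf, BUF_WORDS, &sum)` a few times. The first pass includes cache misses if L1D cache is enabled, so it may be worth comparing the first and later passes separately. After building, the .map file should show `l2_buf` at an address inside L2_SRAM.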
  • Have you confirmed that you can correctly read and write data in L1 and L2? I ask because access to these memories depends on the configuration. At reset, L1D is configured as all cache and L2 is configured as all SRAM, so unless you change the configuration, L1D is not accessible as memory-mapped RAM. If you are seeing the same cycle count for L1D and L2, it might be because the L1D access is not actually reading or writing the L1D RAM. Without seeing your code, I can only ask the question. I would also expect the L1D access to be faster than the L2 access.
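
One way to check the L1D configuration mentioned above is to read the L1DCFG register and decode its mode field. The sketch below assumes the C66x CorePac layout (L1DCFG at 0x01840040, L1DMODE in bits 2:0, 32 KB of L1D in total) and the usual mode encoding (0 = no cache, 1 = 4 KB, 2 = 8 KB, 3 = 16 KB, 4 and above = 32 KB); please verify these against the CorePac documentation. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumptions: C66x CorePac register address and a 32 KB L1D. */
#define L1DCFG_ADDR   0x01840040u
#define L1D_TOTAL_KB  32u

/* Decode the L1DMODE field (bits 2:0) into the L1D cache size in KB. */
uint32_t l1d_cache_kb(uint32_t l1dcfg)
{
    uint32_t mode = l1dcfg & 0x7u;

    if (mode == 0u) {
        return 0u;                    /* cache disabled, all SRAM */
    }
    if (mode >= 4u) {
        return L1D_TOTAL_KB;          /* all cache, no SRAM left */
    }
    return 2u << mode;                /* 1 -> 4, 2 -> 8, 3 -> 16 */
}

/* Whatever is not cache remains addressable as L1D SRAM. */
uint32_t l1d_sram_kb(uint32_t l1dcfg)
{
    return L1D_TOTAL_KB - l1d_cache_kb(l1dcfg);
}

/* Read the live register. Only meaningful on the DSP itself, so the
 * hardware access is compiled in only for the TI compiler. */
uint32_t l1d_read_cfg(void)
{
#ifdef __TI_COMPILER_VERSION__
    return *(volatile uint32_t *)L1DCFG_ADDR;
#else
    return 0u;                        /* placeholder off-target */
#endif
}
```

If `l1d_sram_kb(l1d_read_cfg())` returns 0 on the target, then a pointer such as `(int *)0x00F00000` is not reaching L1D SRAM, which would be consistent with the identical L1 and L2 cycle counts in the question.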