Can the entire C6748 L2 RAM be used as cache?

Good day experts,

I am using the C6748 DSP together with DSP/BIOS 5.42.1.09 in CCS 5.4.

I am desperately trying to configure the entire 256K L2 internal RAM of the C6748 as cache to improve our system performance. Can DSP/BIOS use L3 RAM instead?

Is this possible at all, and if it is, how can I achieve this?

Thanks in advance!

Regards

  Reinier

  • Reinier,

    Yes, BIOS is not required to run from L2 memory.  By default I believe all the sections are placed into L2 memory, so you would have to move them to L3, DDR, or whatever other memory is available on your board.  Hopefully you are familiar with gconf (the graphical configuration tool).  You should be able to open your config file, go into the MEM module and place all sections into the new memory segment, and then use the Cache module to set L2 to 256K cache.

    Judah
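
For reference, the L2 cache size selected in the configuration tool corresponds to the L2MODE field of the L2CFG register on C64x+/C674x cores. The following is a minimal sketch of that size-to-field mapping in plain C, so it can be checked on a host machine. The helper names `l2mode_for_kbytes` and `l2_sram_left_kbytes` are hypothetical and not part of DSP/BIOS, and the encoding should be confirmed against the C674x megamodule reference guide before relying on it.

```c
/* Hypothetical helper (not a DSP/BIOS API): map an L2 cache size in KB
 * to the 3-bit L2MODE field value. Returns -1 for unsupported sizes. */
int l2mode_for_kbytes(unsigned kbytes)
{
    switch (kbytes) {
    case 0:   return 0;  /* all of L2 is SRAM, no L2 cache */
    case 32:  return 1;
    case 64:  return 2;
    case 128: return 3;
    case 256: return 4;  /* on a 256 KB part this leaves no L2 SRAM */
    default:  return -1;
    }
}

/* L2 SRAM left on a part with 256 KB of L2 (as on the C6748),
 * or -1 if the requested cache size is not supported. */
int l2_sram_left_kbytes(unsigned cache_kbytes)
{
    if (l2mode_for_kbytes(cache_kbytes) < 0) {
        return -1;
    }
    return 256 - (int)cache_kbytes;
}
```

With 256K selected, `l2_sram_left_kbytes(256)` returns 0, which is why every BIOS section has to be moved out of L2 first.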

  • Hi Judah,

    Thank you for the information. I managed to place all of the BIOS sections in DDR, except for a few sections that TI recommends should be in internal RAM (L3).

    However, for some reason this did not have a significant impact on the performance of my code. We have ported our system from the C6418 to the C6748 and in many instances we see close to a factor of 2 reduction in performance on the C6748. This was quite unexpected! I played around with a few compiler options (i.e. -O3 optimization, no symbolic debug information) and this did not make a significant improvement.

    I have already clocked the C6748 at its maximum CPU clock speed of 456 MHz and the DDR memory at 150 MHz.

    Changing the actual code is a last resort, for various reasons. Is there some other way to improve performance? Maybe some other chip/peripheral configurations that can be tweaked?

    Thanks in advance!

    EDIT: 

    I tried some benchmarking to determine where my bottleneck lies, and I came across some interesting findings. I wrote a simple complex matrix multiplication function, which I executed in a non-DSP/BIOS environment with the following parameters:

    - 11x11 complex matrix multiplication

    - Code and data in DDR

    - Cache turned on (128K)

    - Compiler optimization (-o3)

    The average execution time for a single matrix multiplication was in the order of 75us.

    I repeated the same test in the DSP/BIOS environment, where the matrix multiplication was performed in the only task. The parameters were the same as above. In this case the average execution time increased to 640us per matrix multiplication (about 8.5 times longer!).

    What is happening here? Surely DSP/BIOS does not generate so much overhead??
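
For readers who want to reproduce a comparable measurement, here is a minimal sketch of an 11x11 complex matrix multiplication in portable C. It is not the poster's benchmark code, which was not shared. The element type `cplx`, the function name `cmat_mul`, and the row-major flat layout are assumptions made for illustration, and the timing wrapper is omitted because it is platform specific.

```c
#include <stddef.h>

#define MAT_N 11  /* matrix dimension from the benchmark description */

/* Hypothetical element type: single-precision complex number. */
typedef struct {
    float re;
    float im;
} cplx;

/* c = a * b for MAT_N x MAT_N complex matrices stored row-major in
 * flat arrays of MAT_N * MAT_N elements. c must not alias a or b. */
void cmat_mul(const cplx *a, const cplx *b, cplx *c)
{
    for (size_t i = 0; i < MAT_N; i++) {
        for (size_t j = 0; j < MAT_N; j++) {
            float re = 0.0f;
            float im = 0.0f;
            for (size_t k = 0; k < MAT_N; k++) {
                const cplx x = a[i * MAT_N + k];
                const cplx y = b[k * MAT_N + j];
                /* (x.re + i x.im) * (y.re + i y.im) */
                re += x.re * y.re - x.im * y.im;
                im += x.re * y.im + x.im * y.re;
            }
            c[i * MAT_N + j].re = re;
            c[i * MAT_N + j].im = im;
        }
    }
}
```

Running the same function in both environments, with the three arrays placed in the same memory segment each time, isolates the effect of the memory and cache configuration from any difference in the code itself.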

  • Did you enable the MAR bits for the external memory?  You need to enable the cache, and you also need to set the MAR bit for each 16MB range of external memory so that the memory actually gets cached.

    Judah
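
To make the 16MB granularity concrete, the sketch below computes which MAR registers cover a given address range. It is plain C that runs on a host and does not touch any hardware. The helper names are hypothetical. The MAR0 address of 0x01848000 and the DDR base of 0xC0000000 used in the note that follows are taken from memory of the C674x documentation and should be verified against the datasheet for your device.

```c
#include <assert.h>
#include <stdint.h>

#define MAR_BASE       0x01848000u  /* assumed address of MAR0; verify for your part */
#define MAR_RANGE_LOG2 24           /* each MAR covers 2^24 bytes = 16 MB */

/* Index n of the MARn register that covers a given byte address. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_RANGE_LOG2);
}

/* Memory-mapped address of MARn (registers are 32 bits wide). */
uint32_t mar_reg_addr(unsigned n)
{
    return MAR_BASE + 4u * n;
}

/* Number of MAR registers needed to cover [start, start + len). */
unsigned mar_count(uint32_t start, uint32_t len)
{
    assert(len > 0u);
    uint32_t last = start + (len - 1u);  /* last byte of the region */
    return mar_index(last) - mar_index(start) + 1u;
}
```

Assuming DDR starts at 0xC0000000, a 128MB DDR region maps to MAR192 through MAR199, so eight MAR bits must be set for all of it to be cacheable.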

  • Hi Judah,

    Yes, I did enable the MAR bits in the TCONF script.

    I noticed that the L3 RAM on the C6748 appears to be slower than the external DDR memory. Is this to be expected? I can't seem to find more information on the specific 128KB L3 on-chip RAM of the C6748, can you maybe provide me with some more detail? 

  • Sorry, I don't have any additional information on the L3 memory.  I looked in the datasheet but it did not offer any additional information.  You might have to ask a hardware expert on one of the device forums.  My thought is that it should be faster than DDR memory, so if it's showing up as slower, that doesn't sound right to me.

    Is it possible for you to post your BIOS project here?  I'm curious what else, if anything, is running in your BIOS program.

    Judah