C6748 Silicon errata 2.0.17

I need to select a workaround for the C6748 2.0.17 silicon errata.  Method 1 seems to be the easiest; however, we use a bunch of 3rd party SW routines and I have no idea what tool version they used to compile their code.  In order for Method 1 to work, I would need to make sure everything used the --C64p_dma_l1d_workaround compiler flag.  Correct?

Method 6 also seems easy (configure entire L2 as RAM).  What are the drawbacks to having L2 configured to all RAM, and yet have L1P / L1D configured as 100% Cache?  Will this setup even work?

What is the most common way people are handling this errata?

Thanks, Dean

 

  • Dean,

    Dean Hofstetter said:
    What is the most common way people are handling this errata?

    I doubt we would know which methods are most common. My recommendation would be Method 1, but your issues with 3P SW may be a roadblock.

    Dean Hofstetter said:
    In order for Method 1 to work, I would need to make sure everything used the --C64p_dma_l1d_workaround compiler flag.  Correct?

    Yes. Can you ask the 3P SW vendor(s) to recompile with this switch turned on? If not, this is not a good choice.

    Dean Hofstetter said:
    What are the drawbacks to having L2 configured to all RAM, and yet have L1P / L1D configured as 100% Cache?  Will this setup even work?

    The drawback is that you do not get the advantage of the extra cache space. The advantage is that you can use the RAM for buffer space and manually (EDMA3) manage it, which could be a performance improvement, depending on your system.

    Yes, it will work to have L2 set to 100% RAM and L1D & L1P set to 100% cache. They are all three independent in terms of configuration and operation (other than this issue).
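As a rough illustration of that independence: each cache level has its own size-mode field (L2MODE in L2CFG, L1PMODE in L1PCFG, L1DMODE in L1DCFG). The sketch below only computes the field values for a given cache size; it does not touch any registers. The helper names are hypothetical, and the encodings are assumed from the C674x megamodule documentation, so please check them against the reference guide for your device before relying on them.

```c
/* Hypothetical helpers (not a TI API): map a cache size in KB to the
 * 3-bit mode field. Encodings are an assumption taken from the C674x
 * megamodule documentation. Each returns -1 for an unsupported size. */

/* Used for both L1PMODE and L1DMODE (32 KB is the C6748 maximum). */
int l1_mode_for_kb(int kb)
{
    switch (kb) {
    case 0:  return 0;   /* cache disabled, all of L1 is RAM */
    case 4:  return 1;
    case 8:  return 2;
    case 16: return 3;
    case 32: return 4;   /* 100% cache */
    default: return -1;
    }
}

/* Used for L2MODE (the C6748 has 256 KB of L2). */
int l2_mode_for_kb(int kb)
{
    switch (kb) {
    case 0:   return 0;  /* 100% RAM, no L2 cache */
    case 32:  return 1;
    case 64:  return 2;
    case 128: return 3;
    case 256: return 4;  /* 100% cache */
    default:  return -1;
    }
}
```

Under these assumptions, the setup discussed here (L2 all RAM, L1P and L1D all cache) would correspond to `l2_mode_for_kb(0)`, `l1_mode_for_kb(32)` and `l1_mode_for_kb(32)`, each written to its own register.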

    But you could also consider Method 5 if you do not want to use L2 as RAM.

    Regards,
    RandyP

     

  • Randy,

    Thanks for the feedback.  I've discovered that using the compiler flag won't work (for us anyway).  The 3rd party SW suppliers make a library, and then don't touch it if they don't have to.  For example, they can make a library for the C64x+ core, and that will run on a lot of different DSPs (including the C6748, which runs C64x+ code).  We have 3rd party SW from three different suppliers.  So the compiler switch is out.

    Setting L2 to either 100% Cache or 100% RAM seems our best option.  I'm leaning toward 100% Cache, and using the 128 KB of internal RAM that the C6748 has to store critical code / data.  What is the penalty difference between L2 and the internal RAM?  Example:  cache miss in L1P, but the code exists in L2 RAM, vs. cache miss in L1P, cache miss in L2, code exists in internal RAM.  I also assume accesses to the internal RAM will be a lot lot lot better than going out to DDR2.

    Thanks, Dean

  • Dean,

    Without a specific application reason to use the L2 memory as RAM, 100% cache would be the easiest solution. L2 cache augments the capacity of the L1D/L1P caches by holding more memory locations and being a 4-way cache instead of 2-way or 1-way (direct mapped) for L1D/L1P, respectively. For many DSP algorithms, linear program code will benefit from the larger cache and blocks of data will benefit, too.
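To make the associativity point concrete, here is a small sketch of how a set-associative cache picks a set for an address. The function names are hypothetical, and the line sizes used in the note after the code (32 bytes for L1P, 64 bytes for L1D, 128 bytes for L2) are assumed from the C674x cache documentation rather than taken from this thread.

```c
#include <stdint.h>

/* Number of sets = cache size / (ways * line size). */
uint32_t cache_sets(uint32_t size_bytes, uint32_t ways, uint32_t line_bytes)
{
    return size_bytes / (ways * line_bytes);
}

/* Set that an address maps to: drop the offset within the line,
 * then wrap around the number of sets. */
uint32_t cache_set_index(uint32_t addr, uint32_t line_bytes, uint32_t sets)
{
    return (addr / line_bytes) % sets;
}
```

With those assumed line sizes, a 32 KB direct-mapped L1P has 1024 sets, a 32 KB 2-way L1D has 256 sets, and a 256 KB 4-way L2 has 512 sets. Two code addresses exactly 32 KB apart land in the same L1P set and evict each other there, since a direct-mapped cache holds only one line per set, whereas L2 as cache can keep up to four lines that share a set.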

    I do not know of specific benchmarks that show the cycle-count differences that you ask for. There may be some topics on the TI Wiki Pages or someone may have written on this already on the forum, so I will recommend you search those resources. When I do searches like this, I always find something interesting to read and end up learning a lot even before I find a specific answer.

    I wish we had a better name for the On-chip RAM at 0x8000 0000. A quick scan of the datasheet did not tell me anything about the performance tradeoffs, but that might be in another document. Sorry I am just making recommendations rather than doing the searches for you, but I am travelling and it is late and I wanted to at least get a comment back to you in terms of my opinions, for what they are worth.

    I agree that On-chip RAM accesses should be better than going out to DDR2, but it might only be "better" and not a "lot lot lot better". Mostly it would depend on random vs. burst accesses and page hits. The only way to figure this out is to run some timing analysis of your own that would model the types of operations that you would be doing. I recommend using the TSCL/TSCH time-stamp counter for your benchmarks, if you choose to do that; it is easy to use and very accurate. Be sure to enable caching for the On-chip RAM and for DDR2, but it might be interesting to see the performance differences with caching turned off or with L2 configured as 100% RAM.
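A minimal sketch of such a timing test is below, assuming the TI C6000 compiler, where `c6x.h` declares the `TSCL` control register and `_TMS320C6X` is predefined. The `clock()` fallback is only a stand-in so the code also builds on a host, and `sum_buffer` / `time_sum` are hypothetical names for a placeholder workload, not anything from the errata or a TI library.

```c
#include <stdint.h>

#ifdef _TMS320C6X
#include <c6x.h>   /* TI compiler header that declares TSCL / TSCH */
/* Any write to TSCL enables the counter; the written value is ignored. */
static void ticks_start(void) { TSCL = 0; }
static uint32_t ticks_now(void) { return TSCL; }
#else
#include <time.h>  /* host stand-in, much coarser than TSCL */
static void ticks_start(void) { }
static uint32_t ticks_now(void) { return (uint32_t)clock(); }
#endif

/* Placeholder workload: read every stride-th word (stride must be >= 1).
 * A larger stride touches fewer words per cache line. */
uint32_t sum_buffer(const volatile uint32_t *buf, uint32_t words, uint32_t stride)
{
    uint32_t i, sum = 0;
    for (i = 0; i < words; i += stride)
        sum += buf[i];
    return sum;
}

/* Returns the elapsed ticks for one pass over the buffer. The sum is
 * stored through 'out' so the result of the pass is visible to the caller. */
uint32_t time_sum(const volatile uint32_t *buf, uint32_t words,
                  uint32_t stride, uint32_t *out)
{
    uint32_t t0, t1;
    ticks_start();
    t0 = ticks_now();
    *out = sum_buffer(buf, words, stride);
    t1 = ticks_now();
    return t1 - t0;  /* unsigned subtraction tolerates one 32-bit wrap */
}
```

One way to use it would be to place the same buffer in L2 RAM, in the On-chip RAM, and in DDR2 through your linker command file, then compare the tick counts for a first (cold) pass and a second (warm) pass at a few strides.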

    Regards,
    RandyP