Cache Coherency during PCIe access of MSMC-SRAM

Is cache coherency preserved when PCIe accesses cacheable sections of MSMC-SRAM?

Three examples

   1. An external host (over PCIe) reads a 32-bit value from MSMC.

   2. A DSP core (over PCIe) initiates a 32-bit write from MSMC to an external address.

   3. Same as #2 except the write is performed using DMA.

 

 

  • I don't think PCIe has different rules than any other peripheral: data needs to be written back in order for a peripheral to read the correct values. I know this is the case for EDMA and SRIO.

    When data located in the MSMCSRAM region is cached, it lives in the core-local L1 cache (and not in the actual SRAM region until it is written back) -- but peripherals typically read directly from the explicit address without interacting with the core or its caches.
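
    To make the stale-read case concrete, here is a minimal sketch in plain C. It is a toy model, not device code: the struct and every function name in it (toy_system, core_write, peripheral_read, toy_writeback) are made up for illustration, and a single cache line stands in for the core-local L1D. On a real C66x the write-back step would be done with the cache write-back calls from TI's Chip Support Library (CACHE_wbL1d is the one I would look at first) rather than a memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Toy model only: one write-back cache line in front of a small shared SRAM.
 * All names are hypothetical and are not part of any TI API. */
#define LINE_WORDS 16u
#define SRAM_WORDS 64u

typedef struct {
    uint32_t sram[SRAM_WORDS];  /* stands in for MSMC SRAM, what a peripheral sees */
    uint32_t line[LINE_WORDS];  /* stands in for one core-local L1D line */
    uint32_t line_base;         /* word index in sram[] where the cached line starts */
    int      valid;             /* line holds a copy of sram[line_base...] */
    int      dirty;             /* line was modified and not yet written back */
} toy_system;

void toy_init(toy_system *s)
{
    memset(s, 0, sizeof *s);
}

/* Copy the dirty line (if any) back to SRAM. */
void toy_writeback(toy_system *s)
{
    if (s->valid && s->dirty) {
        memcpy(&s->sram[s->line_base], s->line, sizeof s->line);
        s->dirty = 0;
    }
}

/* Make sure the line containing word idx is cached (idx must be < SRAM_WORDS). */
void toy_fill(toy_system *s, uint32_t idx)
{
    uint32_t base = idx - (idx % LINE_WORDS);
    if (s->valid && s->line_base == base) {
        return;                 /* hit */
    }
    toy_writeback(s);           /* evict the old line first */
    memcpy(s->line, &s->sram[base], sizeof s->line);
    s->line_base = base;
    s->valid = 1;
    s->dirty = 0;
}

/* Core accesses go through the cache line. */
void core_write(toy_system *s, uint32_t idx, uint32_t value)
{
    toy_fill(s, idx);
    s->line[idx % LINE_WORDS] = value;
    s->dirty = 1;
}

uint32_t core_read(toy_system *s, uint32_t idx)
{
    toy_fill(s, idx);
    return s->line[idx % LINE_WORDS];
}

/* Peripheral accesses (PCIe, EDMA, SRIO in this model) bypass the cache. */
uint32_t peripheral_read(const toy_system *s, uint32_t idx)
{
    return s->sram[idx];
}
```

    In this model, after core_write(&s, 5, 0xCAFEF00D) the core reads back the new value, but peripheral_read(&s, 5) still returns the old SRAM contents until toy_writeback(&s) has run.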

  • Thanks to twentz for answering the question and answering it clearly.

    Cache coherency is an involved question. If you want to understand more details about caching in the C66x Keystone architecture devices, I would like to refer you to some specific written material and some additional training material.

    You are already aware of the C66x DSP Cache User's Guide, which is the best source of details on how the cache operates. It is written from the perspective of the C66x CorePac and not the perspective of the C6678 or C6670 device, meaning that it considers "external memory" to be anything outside of the CorePac, including the MSMC SRAM and DDR3. Appendix B "C66x DSP Cache Coherence" is a very technical document that covers every scenario concerning L1P/L1D/L2 cache coherency with respect to various bus masters that may write to or read from memory that may be cached. To understand some of the terminology, such as "higher-level memory" and "snoop-read", I had to search the rest of the User's Guide and read those sections, too. Fully understanding the operation of the caches will require reading and understanding all of this User's Guide.

    In the Training section of TI.com, there is a training video set for the C66x SOC architecture. It may be helpful for you to review all of the modules. In particular, the CorePac & Memory Subsystem Module may help you understand some of the features of the memory and cache. You can find the complete video set here.

    In the CorePac & Memory Subsystem training video, slide 19 says that one feature that is not supported in the MSMC SRAM is "Cache coherency between L1/L2 caches in CorePac cores and MSMC memory".

    Regards,
    RandyP

     

    If you need more help, please reply back. If twentz has answered the question, please click Verify Answer on that post, above.

  • Does TI have a recommended way to perform cache-coherent accesses from the host processor (say, via PCIe), for example for debugging purposes?

    For simplicity, say it is possible to identify the specific core that has the data cached. Is it possible for the host to initiate an L1D cache flush?

  • Sanjay,

    My recommendation is to not ever try that. The only time it is safe for the host to control the cache of a DSP core is when that DSP core is waiting during device initialization and bootloading.

    You will have to study the device datasheet to see if the host has access to the cache control registers, but I cannot imagine a situation that would be improved by having an outside influencer directly manipulating a DSP core's cache registers. The dangers are too great.

    If you need the host to see some data that the DSP is caching, set up another buffer and have the DSP put a copy of its data there for the host to access.
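
    One way the "another buffer" idea could look is sketched below in plain C. Everything in it is an assumption for illustration: the host_snapshot layout, the function names, and the even/odd sequence counter (a seqlock-style check, which is my addition and not something from the device documentation). The DSP publishes a copy of its working data, and the host only accepts a copy if the counter was even and unchanged around its read.

```c
#include <stdint.h>

#define SNAP_WORDS 8u

/* Hypothetical layout of a buffer the host is allowed to read over PCIe.
 * seq is even when the data is stable and odd while the DSP is updating it. */
typedef struct {
    volatile uint32_t seq;
    volatile uint32_t data[SNAP_WORDS];
} host_snapshot;

/* DSP side: copy SNAP_WORDS words of working data into the shared buffer. */
void dsp_publish(host_snapshot *snap, const uint32_t *src)
{
    uint32_t i;

    snap->seq = snap->seq + 1u;         /* odd: update in progress */
    for (i = 0u; i < SNAP_WORDS; i++) {
        snap->data[i] = src[i];
    }
    snap->seq = snap->seq + 1u;         /* even: snapshot complete */

    /* Assumption: on real hardware, if this buffer is cacheable it would
     * also have to be written back here (and the ordering of the seq and
     * data writes enforced) before the host can see a consistent copy.
     * This single-threaded sketch does not model that. */
}

/* Host side: returns 1 and fills dst if a consistent snapshot was read,
 * 0 if the DSP was updating the buffer or it changed during the copy. */
int host_try_read(const host_snapshot *snap, uint32_t *dst)
{
    uint32_t i;
    uint32_t before = snap->seq;

    if (before & 1u) {
        return 0;                       /* update in progress, try again later */
    }
    for (i = 0u; i < SNAP_WORDS; i++) {
        dst[i] = snap->data[i];
    }
    return snap->seq == before;         /* unchanged counter: copy is consistent */
}
```

    The host would call host_try_read() in a retry loop and never touch the DSP's cache registers; all cache maintenance stays on the DSP side.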

    Regards,
    RandyP

     

    If you need more help, please reply back. If this answers the question, please click Verify Answer, below.

  • thanks for the quick reply!

    couple of points:

    • for debugging we may need access to all of DSP memory and it would be hard to target specific data.
    • when you say 'setup another buffer and have the DSP put a copy of its data there for the host to access', are you saying we should reserve a non-cached area that the DSP can put a copy in?
  • Sanjay,

    CCS and JTAG are the most powerful tools for debug. An external host reading memory while the DSP is running is not as robust, but of course you can do some things through the external host. Just keep in mind that cache coherency will be an issue to work around.

    Do not use the PCIe to force cache coherency operations. It can be (so it will be) very bad for your system.

    The "setup another buffer" was just an idea to give you another direction to start thinking about. I was not saying a non-cached area, but that is a good idea, so I succeeded in getting you to think up great thoughts. Keep going, and come up with more ideas. Share some with us while you work on this, please.

    Regards,
    RandyP