AM5728: DSP access latencies

Part Number: AM5728

I am considering a DSP core application that accesses data from external memory, applies a computation, and then generates control output on the McSPI (L4) module. The McSPI will be written at 10-microsecond intervals.

Assume that the computation takes less than 1 microsecond and that data is fetched in small chunks (100 bytes). What level of determinism can I depend on when accessing external memory and the SPI peripheral? In other words, what are the maximum latencies in accessing external memory and SPI?

We will have the ARM MPU running its own control application that accesses various L4 peripherals.

How does L3_MAIN interconnect’s “QoS management for real-time hardware operators, while maintaining optimal memory latency for CPU access to memory resources” (TRM 14.2.1) support this scenario?

What are the recommended design/implementation patterns for this type of operation?

  • The AM57x team has been notified. They will respond here.
  • There are two segments of the latency path:
    1. DSP CorePac: we need to determine if we will use DSP master writes to the SPI, or use the built-in DMA;
    2. L3 interconnect: the QoS configuration is managed by a combination of bandwidth regulators, bandwidth limiters, and MFLAGs. See Sec 14.2.3.3-14.2.3.5 for these settings.
    Also, can you confirm whether the data used for the computation is prepared by another processor core, i.e., is it already cache-coherent with the DSP?
    regards
    Jian
  • Thank you for your post.

    Could you elaborate a bit more on the expected latency differences between "DSP master write to the SPI" and "built-in DMA"?

    What magnitude of access latency (min and max), in terms of clock cycles, should I anticipate in both cases?

    The TRM sections you pointed to (14.2.3.3-14.2.3.5) seem pretty complicated.

    With default settings, how much variance in external memory access latency (by the DSP) should I expect?

    The data used by the DSP for computation IS prepared by another processor core, but all at once before the DSP starts processing it.

    The entire data set will NOT fit in the DSP cache.

    Thank you Jian.

  • Let me check whether there is any measured data for SPI write-through, in terms of magnitude/range, with default settings.
    I will assume:
    1. The DSP will master write to the SPI region;
    2. The SPI region will be configured as non-cacheable peripheral space and will bypass the MMU.

    Regarding your description of fetching data to the DSP from external memory, I want to clarify further:
    1. You mentioned that data is brought in as 100-byte chunks; are you considering using the 288KB SRAM in the DSPSS?
    2. What DSP L2 configuration (split between cache and SRAM) do you have in mind?

    jian
  • The assumptions you proposed regarding DSP writes to the SPI are acceptable.

    Regarding DSP data fetch from external memory:

    The entire data set that must be processed in a time-deterministic fashion does not fit into the DSP's SRAM.

    For this reason, I am thinking that the data will be consumed directly from external memory without first being copied to the DSP's SRAM.

    However, if you can suggest a design where big blocks of data can be moved from external memory to DSP SRAM while the DSP is reading and processing 100-byte chunks out of SRAM, I would very much like to hear about it. In this case, the important thing is that once DSP processing starts, data must be available within a known maximum latency, i.e., the (background) data move from external memory to SRAM must not introduce additional delay for the DSP's access to SRAM.
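
    For illustration, the kind of design asked about here is often called ping-pong (double) buffering: one SRAM buffer is filled in the background while the DSP consumes 100-byte chunks from the other. The sketch below is a host-runnable model only, not TI driver code. `dma_start_copy`, `dma_wait`, `process_chunk`, and the block size of 64 chunks are hypothetical names and values chosen for this example; the DMA is simulated with a synchronous `memcpy`, and whether a real background transfer delays the DSP's SRAM accesses would still have to be measured on the device.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define CHUNK_BYTES      100u  /* chunk size from the question */
    #define CHUNKS_PER_BLOCK 64u   /* assumption: block size picked for illustration */
    #define BLOCK_BYTES      (CHUNK_BYTES * CHUNKS_PER_BLOCK)

    /* Two buffers that would be placed in DSP L2 SRAM on the real device. */
    static uint8_t l2_buf[2][BLOCK_BYTES];

    /* Hypothetical stand-ins for a DMA driver. In this host model the copy
     * completes synchronously; on hardware it would run in the background. */
    static void dma_start_copy(uint8_t *dst, const uint8_t *src, size_t n)
    {
        memcpy(dst, src, n);
    }

    static void dma_wait(void)
    {
        /* Real code would poll or pend on transfer completion here. */
    }

    /* Placeholder for the sub-microsecond computation: sums the chunk bytes. */
    static uint32_t process_chunk(const uint8_t *chunk)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i < CHUNK_BYTES; i++)
            sum += chunk[i];
        return sum;
    }

    /* Consume n_blocks blocks from external memory through the two buffers. */
    uint32_t process_stream(const uint8_t *ext, size_t n_blocks)
    {
        uint32_t acc = 0;

        assert(ext != NULL || n_blocks == 0);
        if (n_blocks == 0)
            return 0;

        /* Prime the first buffer before time-critical processing starts. */
        dma_start_copy(l2_buf[0], ext, BLOCK_BYTES);
        dma_wait();

        for (size_t b = 0; b < n_blocks; b++) {
            unsigned cur = (unsigned)(b & 1u);

            /* Start filling the other buffer with the next block. */
            if (b + 1 < n_blocks)
                dma_start_copy(l2_buf[cur ^ 1u],
                               ext + (b + 1) * BLOCK_BYTES, BLOCK_BYTES);

            /* Meanwhile, process the current buffer chunk by chunk. */
            for (size_t c = 0; c < CHUNKS_PER_BLOCK; c++)
                acc += process_chunk(&l2_buf[cur][c * CHUNK_BYTES]);

            /* The next block must have arrived before the buffers swap. */
            if (b + 1 < n_blocks)
                dma_wait();
        }
        return acc;
    }
    ```

    In this model the DSP only ever reads from the buffer that is not being written, so the worst-case wait collapses into the single `dma_wait()` at each buffer swap. The block size would need to be chosen so that the transfer finishes well before the current block is consumed.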

    Regarding DSP L2 configuration between cache and SRAM, I am hoping that you can suggest one given our use case.

    Thank you very much Jian.