TMS320F28379D: Theoretical maximum read and write speed of the EMIF connected to ASRAM

Part Number: TMS320F28379D

Hello Community,

I am trying to calculate the theoretical maximum read and write speed of the EMIF of the TMS320F28379D when it is connected to an ASRAM. The calculation should use only the timing constraints on the controller side and ignore any ASRAM timing constraints.

I am making the following assumptions:

  • Clock running @ 200 MHz
  • EMIF clock @ 200 MHz --> E = 5 ns
  • DMA will be used; 4 cycles / word = 20 ns [datasheet Rev. J, chapter 6.8]
  • Busy / wait mode is not used

Using the formulas from table 5-38 [datasheet Rev. J] and looking at figures 5-21 and 5-23, there are a few things I do not understand.

1. Read timing (figure 5-21)

The read cycle time tc(EMRCYCLE) is number 3 and calculated using the formula:

tc = (RS+RST+RH+2)*E-3

With RS and RH at least 1 cycle each, RST at least 4 cycles, and E = 5 ns, this gives:

tc = (1+4+1+2)*5 - 3 = 37 ns

In figure 5-21, tc(EMRCYCLE) (number 3) spans (4 || 8 || 6 || 29) + 10 + (5 || 9 || 7 || 30), where numbers 4 to 9, 29, and 30 are at least 2 ns each and number 10 is at least 19 ns. This results in a total of 2 + 19 + 2 = 23 ns.

    1. Where does the difference of 14 ns (37 ns <--> 23 ns) come from?
    2. When the strobe mode is active, the numbers 4 and 5 are at least -3 ns. How can a wait time span be negative?

2. Write timing (figure 5-23)

The write cycle time tc(EMWCYCLE) is number 15 and calculated using the following formula:

tc = (WS+WST+WH+2)*E-3

With WS, WH and WST at least 1 cycle and E = 5 ns this gives:

tc = (1+1+1+2)*5 - 3 = 22 ns
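
The two cycle-time results above can be re-checked with a few lines of Python. This is only a sketch of the arithmetic in this post: the function name `min_cycle_time_ns` and its parameters are made up for illustration, and the constants +2 and -3 are taken from the formulas as quoted here, not re-verified against the datasheet.

```python
def min_cycle_time_ns(setup, strobe, hold, e_ns=5.0, extra_cycles=2, offset_ns=-3.0):
    """Minimum cycle time per the quoted formula: (S + ST + H + 2) * E - 3."""
    return (setup + strobe + hold + extra_cycles) * e_ns + offset_ns

# Read: RS = 1, RST = 4, RH = 1 (minimum values assumed in this post)
read_tc = min_cycle_time_ns(setup=1, strobe=4, hold=1)

# Write: WS = 1, WST = 1, WH = 1 (minimum values assumed in this post)
write_tc = min_cycle_time_ns(setup=1, strobe=1, hold=1)

print(read_tc, write_tc)  # 37.0 22.0
```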

In figure 5-23, tc(EMWCYCLE) (number 15) spans (16 || 18 || 20 || 22) + 24 + (17 || 19 || 21 || 23), where numbers 16 to 23 are at least 2 ns each and number 24 is at least 4 ns. This results in a total of 2 + 4 + 2 = 8 ns.

    1. Where does the difference of 14 ns (22 ns <--> 8 ns) come from?

3. Calculate data transfer rate

 

Calculation of the data transfer rate using the cycle times calculated in 1) and 2) and adding the bus turnaround time of 2 ns:

                        read                  write
                        37 + 2                22 + 2
  cycle time [ns]       39        39          24        24
  data width [bit]      16        32          16        32
  bit rate [Mbit/s]     410.26    820.51      666.67    1333.33
  data rate [MB/s]      51.28     102.56      83.33     166.67
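
The bit rates and data rates listed above follow directly from the cycle time and the bus width. A minimal sketch of that conversion, assuming one bus-width transfer per cycle and decimal units (1 MB = 10^6 bytes); the helper name `transfer_rates` is hypothetical:

```python
def transfer_rates(cycle_ns, width_bits):
    """Return (Mbit/s, MB/s) for one transfer of width_bits every cycle_ns."""
    mbit_per_s = width_bits / cycle_ns * 1000.0  # bits per ns -> Mbit/s
    mbyte_per_s = mbit_per_s / 8.0
    return mbit_per_s, mbyte_per_s

results = {}
for name, cycle_ns in (("read", 37 + 2), ("write", 22 + 2)):
    for width in (16, 32):
        mbit, mbyte = transfer_rates(cycle_ns, width)
        results[(name, width)] = (round(mbit, 2), round(mbyte, 2))
        print(f"{name:5s} {width:2d} bit: {mbit:8.2f} Mbit/s {mbyte:7.2f} MB/s")
```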

In appendix A.1 in "Design and Usage Guidelines for the C2000 EMIF" (SPRAC96A), the asynchronous test configuration uses the fastest possible values for read / write setup, strobe and hold times. These values are the same as assumed in 1) and 2) to calculate the minimum cycle time.

In figures 5 and 6, with a 16-bit data bus, the DMA reaches a throughput of approximately 80 MB/s (write) and 47 MB/s (read).

In figures 8 and 9, with a 16-bit data bus, the CPU reaches a maximum throughput of approximately 78 MB/s (write) and 55 MB/s (read).

The measured values for the DMA are slightly below the theoretically calculated values, while the CPU seems to read a bit faster than calculated; but never mind. The question here is:

    1. Why is the data throughput so low for a 32-bit data bus? I would expect double the data rate of the 16-bit data bus. Looking at figures 5, 6 and 8, 9, the speed increase is only about 38% (CPU write), 44% (DMA write), 53% (DMA read) and 80% (CPU read).
      With a maximum speed of 4 cycles / word, the DMA has a maximum throughput of 200 MB/s @ 200 MHz and should be fast enough.
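
The 200 MB/s DMA ceiling quoted above can be reproduced as follows. This sketch assumes that "word" means one 32-bit transfer, which is what the 200 MB/s figure implies; the function name `dma_ceiling_mb_s` is hypothetical.

```python
def dma_ceiling_mb_s(clock_hz, cycles_per_word, word_bytes):
    """Upper bound on DMA throughput in MB/s (1 MB = 10^6 bytes)."""
    words_per_s = clock_hz / cycles_per_word
    return words_per_s * word_bytes / 1e6

ceiling_32 = dma_ceiling_mb_s(200e6, 4, 4)  # 32-bit words
ceiling_16 = dma_ceiling_mb_s(200e6, 4, 2)  # 16-bit words, for comparison
print(ceiling_32, ceiling_16)  # 200.0 100.0
```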
  • Yannik Sch. said:
    1. Read timing (figure 5-21)
    Where does the difference of 14 ns (37 ns <--> 23 ns) come from?
    When the strobe mode is active, the numbers 4 and 5 are at least -3 ns. How can a wait time span be negative?

    2. Write timing (figure 5-23)

    Where does the difference of 14 ns (22 ns <--> 8 ns) come from?

    Be careful not to confuse the digital state machine cycles of EMIF operations with the propagation timings of electrical signals as they travel to/from the device pins.

    The RS, RST, RH, and TA variables represent digital state machine cycles. The E variable is used to convert the digital cycles to ns units.  The +/- numbers following the digital cycle counts represent variability in signal propagation.  The propagation delay numbers are meant to describe the relative placement of signal transitions, not the signal durations.  The signal durations are determined by the digital state machine.

    For example, parameter 8 is the amount of time that the address pins are valid before OE is asserted low.  The -3ns MIN value means that the valid time could be 3ns shorter than ideal (meaning that the address pins may be invalid for 3ns after CS is asserted).  The +2ns MAX value means that the valid time could be 2ns longer than ideal (meaning that the address pins may be valid 2ns prior to when CS is asserted).
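
One way to read the MIN/MAX values described above is as a window around the ideal duration set by the state machine. The sketch below assumes that parameter 8 has an ideal duration of RS * E and, for the second call, that the ideal offset in strobe mode is 0 cycles; both are assumptions made for illustration, and the helper name is hypothetical. The -3 ns and +2 ns deviations are the ones named in this reply.

```python
def timing_window_ns(cycles, e_ns, dev_min_ns, dev_max_ns):
    """Return (min, ideal, max) for a timing whose ideal value is cycles * e_ns."""
    ideal = cycles * e_ns
    return ideal + dev_min_ns, ideal, ideal + dev_max_ns

# Parameter 8 with RS = 1 and E = 5 ns (assumed ideal duration: RS * E)
window = timing_window_ns(cycles=1, e_ns=5.0, dev_min_ns=-3.0, dev_max_ns=2.0)
print(window)  # (2.0, 5.0, 7.0)

# With an ideal offset of 0 cycles, the minimum is just the deviation,
# which is how a negative MIN value can appear in a timing table.
zero_window = timing_window_ns(cycles=0, e_ns=5.0, dev_min_ns=-3.0, dev_max_ns=2.0)
print(zero_window)  # (-3.0, 0.0, 2.0)
```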

    Yannik Sch. said:
    3. Calculate data transfer rate

    Why is the data throughput so low for a 32-bit data bus? I would expect double the data rate of the 16-bit data bus. Looking at figures 5, 6 and 8, 9, the speed increase is only about 38% (CPU write), 44% (DMA write), 53% (DMA read) and 80% (CPU read).
    With a maximum speed of 4 cycles / word, the DMA has a maximum throughput of 200 MB/s @ 200 MHz and should be fast enough.

    In short, the EMIF was integrated into F2837x in a manner that is inefficient.  The bridge between the DMA/CPU/CLA and EMIF does not allow for maximum data throughput.

  • Hello tlee,

    Thanks for your help and explanation. Why is this explanation about the propagation timings not part of the datasheet?
    What about the differences mentioned in 1.a. and 2.a.? Excluding the -3 ns propagation timings still results in a difference of 40 ns - 30 ns = 10 ns.
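
The 14 ns gap can be split into a part that comes from the cycle counts and a part that comes from the deviations. This is a sketch only: the per-stage minimum values (2, 19, 2 ns for read and 2, 4, 2 ns for write) are the ones quoted in the original question, and the function name `decompose` is hypothetical.

```python
E_NS = 5.0

def decompose(stage_cycles, stage_min_ns, extra_cycles=2, cycle_offset_ns=-3.0):
    """Split (formula minimum - summed stage minimums) into two parts."""
    ideal_stages = sum(c * E_NS for c in stage_cycles)
    ideal_cycle = (sum(stage_cycles) + extra_cycles) * E_NS
    formula_min = ideal_cycle + cycle_offset_ns
    summed_min = sum(stage_min_ns)
    cycle_part = ideal_cycle - ideal_stages  # from the "+2" cycles
    deviation_part = cycle_offset_ns - (summed_min - ideal_stages)
    return formula_min, summed_min, cycle_part, deviation_part

read = decompose((1, 4, 1), (2.0, 19.0, 2.0))
write = decompose((1, 1, 1), (2.0, 4.0, 2.0))
print(read)   # (37.0, 23.0, 10.0, 4.0)
print(write)  # (22.0, 8.0, 10.0, 4.0)
```

In both cases 10 ns come from the two extra cycles in the formula and 4 ns from adding up three separate stage deviations instead of the single -3 ns applied to the whole cycle.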
  • Yannik,

    In my previous post, I used "propagation timings" as a generic way of referring to deviation from ideal. It is not an actual parameter to be defined. I think the intent of the Tc parameter was to communicate the potential deviation from ideal (like a SETUP or HOLD time parameter), but it may just be confusing instead.

    For your question regarding the difference in read cycle times, I believe that you are accumulating the deviations from each EMIF stage (RS, RST, RH) as independent parameters. In actuality, deviations from one stage will be largely offset by deviations in the other stages. For example, a shorter EMxOE low width in the RST stage will result in longer EMxOE high widths in the RS and/or RH stages.
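
The offsetting effect described above can be illustrated with edge positions instead of stage widths. This is a minimal sketch with made-up numbers: the edge times assume RS = 1, RST = 4, RH = 1 at E = 5 ns, and the 1 ns shift of the EMxOE falling edge is purely hypothetical.

```python
def stage_widths(edges_ns):
    """Widths between consecutive signal edges."""
    return [b - a for a, b in zip(edges_ns, edges_ns[1:])]

# Edges: start of RS, EMxOE falls, EMxOE rises, end of RH
ideal_edges = [0.0, 5.0, 25.0, 30.0]
shifted_edges = [0.0, 6.0, 25.0, 30.0]  # EMxOE falls 1 ns late (hypothetical)

ideal = stage_widths(ideal_edges)
shifted = stage_widths(shifted_edges)
print(ideal, sum(ideal))      # [5.0, 20.0, 5.0] 30.0
print(shifted, sum(shifted))  # [6.0, 19.0, 5.0] 30.0
```

The strobe stage gets 1 ns shorter and the setup stage 1 ns longer, so the total span is unchanged; summing the worst case of every stage independently therefore underestimates the cycle time.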

    -Tommy