DM8168 Memory Bandwidth

We are using DDR3 memory running at 1600MHz with CAS latency of 6 and are finding that the bandwidth is not amazing.

Using the "bandwidth" application (http://zsmith.co/bandwidth.html) we only get sequential reads up to 6.8 GB/s and not the 12.8 GB/s we would expect for 2 EMIFs, each at 6.4 GB/s. Does anyone else find this? Does TI publish DDR3 memory bandwidth results?

Thanks,
Ralph
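
For reference, the arithmetic behind the 6.4 GB/s and 12.8 GB/s figures can be sketched as below. This is only a back-of-the-envelope calculation: the 32-bit bus width per EMIF is an assumption chosen to match the 6.4 GB/s figure quoted above, and the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(transfers_per_s, bus_width_bits, num_emifs=1):
    """Theoretical peak in GB/s: transfers/s * bytes per transfer * number of EMIFs."""
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_s * bytes_per_transfer * num_emifs / 1e9

# Assumption: each EMIF is 32 bits wide and runs at 1600 MT/s.
per_emif = peak_bandwidth_gb_s(1600e6, 32)      # one EMIF
both = peak_bandwidth_gb_s(1600e6, 32, 2)       # two EMIFs combined

measured = 6.8  # GB/s, sequential reads reported by the "bandwidth" tool
efficiency = measured / both

print(f"per EMIF: {per_emif} GB/s, both: {both} GB/s, "
      f"measured/peak: {efficiency:.0%}")
```

On these assumptions the measured 6.8 GB/s is roughly half of the combined theoretical peak, and slightly above the peak of a single EMIF.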

  • The two banks of memory are interleaved on coarse boundaries. The memory never behaves as 64-bit memory. What this dual-memory architecture does do is allow multiple memory clients to access memory at the same time when they access different banks, eliminating 50% of contentions. It's discussed in the TRM.

    You also have to figure in all the cycles (command, pre-charge, etc) before the data actually comes or goes. Sounds like you're actually doing pretty well.

    I note that most ARM processors are running 32-bit DDR2-800 or less, and it's not clear to us how memory-bound they are for real applications.

    Given the '8168 errata (the last one for 2.0), you probably don't want to push pixels with the processor if you're using the HDVPSS anyway.

    Careful. We think the CPU clock on the DDR3 EVMs is only 800 MHz.

    I don't work for TI, so if I'm wrong, I hope someone will set me straight.
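
A minimal sketch of what interleaving on coarse boundaries means for address mapping. The 512-byte granule and both function names are hypothetical, picked only for illustration; the real interleave granularity is a programmable setting and should be taken from the TRM.

```python
INTERLEAVE_BYTES = 512  # hypothetical granule size, not taken from the TRM

def emif_for_address(addr, granule=INTERLEAVE_BYTES):
    """Which of the two EMIFs (0 or 1) serves this byte address."""
    return (addr // granule) % 2

def emifs_touched(start, length, granule=INTERLEAVE_BYTES):
    """Set of EMIFs that a contiguous access [start, start + length) lands on."""
    first = start // granule
    last = (start + length - 1) // granule
    return {g % 2 for g in range(first, last + 1)}

# A short burst (for example one 64-byte cache line) stays on a single EMIF,
# so an individual access never sees a combined 64-bit-wide memory.
print(emifs_touched(0, 64))

# A long sequential stream alternates between the two EMIFs.
print([emif_for_address(a) for a in range(0, 2048, 512)])

# Two clients working in different granules can be served in parallel.
print(emif_for_address(0x0000), emif_for_address(0x0200))
```

In this model, two clients contend only when their current accesses fall on the same EMIF, which for unrelated addresses happens about half the time.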

  • Hi Herb,

    Thanks for the information. I'll have a read of the TRM.

    Update: I've read the TRM and it says in section 4.1.2:

    "Ability to interleave the DDR data between two EMIF banks, using the programmable multi-zone
    DRAM memory mapping. This increases the memory throughput by a factor of 2. Up-to four unique
    memory sections supported."

    So I'm not sure where you read about the contention as this paragraph suggests otherwise.

    Would be good to hear a response from TI as well.

    Ralph

  • Ralph,

    Memory bandwidth is not a single-point item. The architecture of a device like the DM8168 is broad in scope and is designed to support a very large number of bus masters operating at the same time.

    If your application is as singular as the "bandwidth" application that you ran, then that is the right number for you to base your application trade-offs on. But then, it would really mean that you need to choose a different device because you are ignoring a lot of processing elements.

    I have an application that will be on a Wiki App Note sometime this year (when I find the time to get it productized) that runs on the C674x and uses multiple EDMA3 channels to exercise the DDR3, and that application reaches 80% utilization of the DDR3 bandwidth when running at 1600 MT/s (not MHz, MTransfers/sec). It is not ready for release; the comments are wrong and do not explain how to run it, so you will just have to trust me for now.

    My belief is that most applications will benefit from using the 2 EMIFs in interleaved mode, and that is how the test I mentioned was run. But you can also use them in non-interleaved mode to allow two different masters to have separate 6.4 GB/s ports to DDR3. The problem is that you are more likely to have dead time on both EMIFs in that case, so the total bandwidth will be used less efficiently. With them in interleaved mode, two masters may have to share the 2 EMIFs but either should average faster transfers in interleaved mode. I have not done studies on that, but it is my belief that the normal multi-processor use of the DM8168 or any of its spin-offs will run best with the 2 EMIFs interleaved.

    Regards,
    RandyP
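
The dead-time argument above can be illustrated with a toy model. This is a sketch under strong assumptions, not a model of the real hardware: time is divided into slots, each EMIF moves `cap` units per slot, DRAM command overhead is ignored, and in interleaved mode a single master is assumed able to use both EMIFs. The function name and the traffic pattern are invented for the example.

```python
def slots_to_drain(arrivals, interleaved, cap=1):
    """Count time slots until both masters' traffic has been moved.

    arrivals: list of (units_for_master0, units_for_master1) per slot.
    interleaved: True if both masters share both EMIFs, False if each
                 master is tied to its own EMIF.
    """
    backlog = [0, 0]
    slot = 0
    while slot < len(arrivals) or any(backlog):
        if slot < len(arrivals):
            backlog[0] += arrivals[slot][0]
            backlog[1] += arrivals[slot][1]
        if interleaved:
            budget = 2 * cap
            # Fair share first: each master gets up to one EMIF's worth.
            for m in (0, 1):
                take = min(backlog[m], cap)
                backlog[m] -= take
                budget -= take
            # Any capacity left over goes to whoever still has work.
            for m in (0, 1):
                take = min(backlog[m], budget)
                backlog[m] -= take
                budget -= take
        else:
            # Each master can only use its own EMIF.
            for m in (0, 1):
                backlog[m] -= min(backlog[m], cap)
        slot += 1
    return slot

# Bursty traffic: master 0 bursts in slot 0, master 1 bursts in slot 2.
arrivals = [(4, 0), (0, 0), (0, 4), (0, 0)]
total_units = sum(a + b for a, b in arrivals)

for mode in (False, True):
    slots = slots_to_drain(arrivals, interleaved=mode)
    utilization = total_units / (slots * 2)
    print(f"interleaved={mode}: {slots} slots, utilization {utilization:.0%}")
```

With this particular pattern the non-interleaved setup leaves one EMIF idle while the other works through a burst, so the same traffic takes 6 slots instead of 4. Real results depend on the actual traffic mix and on how much of both EMIFs a single master can drive.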

  • Thanks for the post. We are using interleaved mode but I see that we have to assume that because the architecture is optimised for multiple bus masters, we aren't going to see maximum theoretical data rate for DDR.

    Ralph