QDMA and memory bandwidth

 

Hi,

 

I bought a DSK6455 revision 2 and it shows one strange behavior.

Its memory bandwidth is too low. For example, when I copy data from IRAM to DDR2 the memory bandwidth is near 1000 MB/s, and when I copy data from DDR2 to IRAM it is near 1900 MB/s. All data and code are placed in IRAM.

 

I have attached the test project.

In this project I copy 32768 bytes from DDR2 to IRAM and from IRAM to DDR2.

It takes 36000 and 20000 cycles, respectively.
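For readers checking the arithmetic, here is a minimal sketch of how a cycle count converts to bandwidth. The CPU clock is not stated in this thread, so the 1.2 GHz value in the usage note is an assumption, as is the convention 1 MB = 10^6 bytes. The function name `bandwidth_mb_per_s` is made up for illustration.

```c
#include <stdint.h>

/* Convert a measured copy time into MB/s (1 MB = 1e6 bytes, an assumption).
 * bytes  : number of bytes copied
 * cycles : CPU cycles the copy took
 * cpu_hz : CPU clock in Hz (not given in the thread; caller must supply it)
 * Returns -1.0 if cycles is zero or cpu_hz is not positive. */
double bandwidth_mb_per_s(uint32_t bytes, uint32_t cycles, double cpu_hz)
{
    if (cycles == 0u || cpu_hz <= 0.0) {
        return -1.0;
    }
    double seconds = (double)cycles / cpu_hz;
    return ((double)bytes / seconds) / 1.0e6;
}
```

With an assumed 1.2 GHz clock, `bandwidth_mb_per_s(32768, 36000, 1.2e9)` gives about 1092 MB/s and `bandwidth_mb_per_s(32768, 20000, 1.2e9)` gives about 1966 MB/s, which is in line with the "near 1000" and "near 1900" figures quoted above.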

I copy the data using the CSL_DAT library, which works through QDMA.

Could you please explain this difference in memory bandwidth?

 

DAT_TEST.ZIP
  • Here are a couple of things to try that will help you determine the amount of memory bandwidth you are getting and the cost of software overhead:

    1. Measure the time to copy 16384 bytes, and the time to copy 49152 bytes. Find the deltas between the three counts (these two plus your 32768-byte measurement) and see how much memory bandwidth was used for the delta count.
    2. Figure out which QDMA channel PARAM is being used. Then, after calling CSL_DAT once (which sets up all the PARAM for that channel), try writing 0x00000001 to the last word of the 8-word PARAM set and measure the time for that transfer. Assuming that CSL_DAT does not close everything out, this should work and will remove the software overhead.
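The delta idea in step 1 can be sketched as follows. Subtracting two measurements cancels the fixed setup cost, so the slope reflects the transfer itself. This is only an illustration: the function names are hypothetical, the CPU clock must be supplied by the caller, and 1 MB is taken as 10^6 bytes.

```c
#include <stdint.h>

/* Bandwidth (MB/s) computed from the difference between two measurements,
 * so that a fixed per-call overhead cancels out.
 * Returns -1.0 if the inputs cannot give a meaningful slope. */
double delta_bandwidth_mb_per_s(uint32_t bytes_small, uint32_t cycles_small,
                                uint32_t bytes_large, uint32_t cycles_large,
                                double cpu_hz)
{
    if (bytes_large <= bytes_small || cycles_large <= cycles_small ||
        cpu_hz <= 0.0) {
        return -1.0;
    }
    double d_bytes  = (double)(bytes_large - bytes_small);
    double d_cycles = (double)(cycles_large - cycles_small);
    return (d_bytes * cpu_hz / d_cycles) / 1.0e6;
}

/* Estimated fixed overhead in cycles: what is left of the small measurement
 * after removing the per-byte cost implied by the slope.
 * Returns -1.0 on the same invalid inputs as above. */
double fixed_overhead_cycles(uint32_t bytes_small, uint32_t cycles_small,
                             uint32_t bytes_large, uint32_t cycles_large)
{
    if (bytes_large <= bytes_small || cycles_large <= cycles_small) {
        return -1.0;
    }
    double cycles_per_byte = (double)(cycles_large - cycles_small) /
                             (double)(bytes_large - bytes_small);
    return (double)cycles_small - cycles_per_byte * (double)bytes_small;
}
```

As a made-up example (not measured data): if 16384 bytes took 11000 cycles and 49152 bytes took 31000 cycles, the slope is 32768 bytes per 20000 cycles and the fixed overhead works out to 1000 cycles.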

    In the Embedded Processing Wiki, there is one topic where the author shows how to write directly to EDMA registers to do transfers. It is not as high-level as we usually like to work, but it is efficient. Check out this topic at http://www.tiexpressdsp.com/index.php/Programming_EDMA_without_EDMA3LLD_package and other topics are listed at http://www.tiexpressdsp.com/index.php/Category:DSPBIOS .
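To make the "last word of the 8-word PARAM set" concrete, here is a rough sketch of the address arithmetic. The EDMA3CC base address and the PaRAM offset below are assumptions taken from memory of the C6455 memory map and should be checked against the device data manual. The macro and function names are invented for this example and are not CSL symbols.

```c
#include <stdint.h>

/* Assumed values, verify against the C6455 documentation. */
#define EDMA3CC_BASE_ASSUMED  0x02A00000u  /* assumed EDMA3CC register base  */
#define PARAM_RAM_OFFSET      0x4000u      /* assumed PaRAM offset from base */
#define PARAM_SET_BYTES       32u          /* 8 words x 4 bytes per set      */
#define PARAM_LAST_WORD       7u           /* word 7 is the last word (CCNT) */

/* Byte address of one 32-bit word inside a PaRAM set. */
uint32_t param_word_addr(uint32_t cc_base, uint32_t set_num, uint32_t word)
{
    return cc_base + PARAM_RAM_OFFSET + set_num * PARAM_SET_BYTES + word * 4u;
}
```

On the target, the write described in step 2 would then be a volatile 32-bit store of 0x00000001 to `param_word_addr(EDMA3CC_BASE_ASSUMED, n, PARAM_LAST_WORD)`, where `n` is whichever PARAM set CSL_DAT was configured to use. This only triggers a transfer if the QDMA channel's trigger word is the last word, as the reply above assumes.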

  • Can you change the TC from 0 to 2 and re-run the tests, please?

    Also, your numbers do not match up, so please restate them after the new test to make sure we understand which is faster.

  • I have done everything according to your recommendations. Unfortunately, the memory bandwidth has not changed.

    Can you explain why I get different results with the same TC and the same configuration?

    Can you run this test and publish the results you get? Or can you create your own test with high memory bandwidth? I will find my mistake if I see any project where the memory transfer from DDR2 to IRAM is near 1900 MB/s.

    DAT_TEST.rar
  • In your first posting, you stated

    Mr.Juri Temnikov said:
    and when I copy data from DDR2 to IRAM the memory bandwidth is near 1900MB/s.

    You are already meeting the performance that you request. In what way am I confused, sir?

  • I am terribly sorry. I made a mistake in my first post. When I copy data from IRAM to DDR2 the memory bandwidth is near 1900 MB/s, and when I copy data from DDR2 to IRAM the memory bandwidth is near 1000 MB/s. But the data transfer speed should be the same in both cases. Why are they different? Maybe my GEL file is incorrect?

  • The choice of Transfer Controller (TC) is critical. If you are using EDMA3TC0 or EDMA3TC1, the performance you are reporting is the expected behavior. If you use EDMA3TC2 or EDMA3TC3, then it is expected that you will get performance in the 1900 MB/s range for both directions.

  • Thank you. I have found my mistake. You were right, I used TC0. How can I switch the TC with the CSL_DAT module? I tried to use the EDMA LLD, but it uses TC0 too. I haven't found how the Transfer Controller is selected. Could you please describe how I can select the Transfer Controller when I use the LLD library?

     

  • In DAT_TEST.c from your first posting above, please change the following line

        datSetup.priority = CSL_DAT_PRI_0;

    to

        datSetup.priority = CSL_DAT_PRI_2;

    Even though the struct field is called "priority" and the constant has "PRI" in its name, this field is written to the QDMAQNUM register to set the CC queue number. Each queue corresponds to a single TC; by default, and in normal use, that is the TC with the matching number.

    CSL_DAT_PRI_N will select TCN for the QDMA channel used for the DAT module.
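As a rough illustration of what ends up in QDMAQNUM, the sketch below updates the queue number for one QDMA channel in a register image. The field layout (one 4-bit slot per QDMA channel, with a 3-bit queue number in the low bits of the slot) is an assumption from memory of the EDMA3 documentation, not something stated in this thread. `qdmaqnum_set_queue` is a hypothetical helper, not a CSL function.

```c
#include <stdint.h>

/* Return a copy of a QDMAQNUM register value with the queue number for one
 * QDMA channel replaced.
 * Assumed layout: channel n occupies bits [4n+2 : 4n].
 * The value is returned unchanged if channel is outside 0..7. */
uint32_t qdmaqnum_set_queue(uint32_t reg, unsigned channel, unsigned queue)
{
    if (channel > 7u) {
        return reg;
    }
    unsigned shift = channel * 4u;
    reg &= ~((uint32_t)0x7u << shift);          /* clear the old queue number */
    reg |= ((uint32_t)(queue & 0x7u)) << shift; /* insert the new one         */
    return reg;
}
```

Under that assumption, moving QDMA channel 0 from queue 0 to queue 2 turns a register value of 0x00000000 into 0x00000002, which is the effect the change from CSL_DAT_PRI_0 to CSL_DAT_PRI_2 is described as having.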