Measuring EDMA3 transfer period

Other Parts Discussed in Thread: OMAP-L138

Config=LogicPC Experimenter kit with OMAP-L138 SOM. Running DSP/BIOS 5.x

I have implemented an EDMA3 ping pong transfer from McASP to/from the C6748 in which I have an algorithm processing the audio data. The algorithm exists in a TSK that is held off by two SEM_pend's. These SEMs are posted by Tcc interrupts that fire after 256 bytes (128 left + 128 right) are transferred to & from the McASP. Here is a code snippet of the TSK. 

===========================================

void procBuffTsk(void)
{
    int16_t pingPong = PONG;

    while (1)
    {
        SEM_pend(&rcvBuffReady, SYS_FOREVER);
        SEM_pend(&xmtBuffReady, SYS_FOREVER);

        STS_delta(&STS_sample_period, CLK_gethtime());
        STS_set(&STS_benchmark, CLK_gethtime());

        for (i=0; i<x; i++)
        {
            memcpy(gBufferXmt[pingPong], gBufferRcv[pingPong], BUFFSIZE*sizeof(int16_t));  //test algorithm
            STS_delta(&STS_benchmark, CLK_gethtime());
        }

============================================

I have placed an STS_delta at the beginning of the TSK in an attempt to measure the time between DMA transfers. However, the number of CPU cycles appears to be approximately half of what I would expect. Here are my calculations and measurements.
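For readers who want to see the measurement idea in isolation, here is a minimal sketch of "time between successive events" written in plain C, independent of DSP/BIOS. The type `period_tracker` and the function `period_update` are hypothetical names invented for this illustration; they are not part of the STS or CLK APIs, and the timestamp is assumed to come from a free-running 32-bit counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a DSP/BIOS type): remembers the previous timestamp. */
typedef struct {
    uint32_t prev;    /* timestamp seen on the previous call */
    int      primed;  /* 0 until the first timestamp has been recorded */
} period_tracker;

/*
 * Returns the number of ticks since the previous call, or 0 on the first call.
 * Unsigned 32-bit subtraction gives the right answer even if the counter
 * wrapped around once between the two calls.
 */
static uint32_t period_update(period_tracker *t, uint32_t now)
{
    uint32_t delta = 0u;

    assert(t != NULL);
    if (t->primed) {
        delta = (uint32_t)(now - t->prev);
    }
    t->prev = now;
    t->primed = 1;
    return delta;
}
```

Calling `period_update` once per unblocking of the task, with the current timer value, would yield the period between buffer-complete events in timer ticks.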

256 samples per DMA transfer (128 left + 128 right) / 48,000 samples per second = 5.333 ms per DMA transfer

If CPU clock = 300 MHz, then gethtime() should return 5.333 ms / (1/300e6 s) = 1,600,000 clock cycles.

However, my STS(sample_period, gethtime()) is returning ~800,000 clock cycles, or approximately half of what I am expecting. Can anyone spot my error? I would appreciate someone doing a sanity check on my assumptions and math.

Thx

MikeH

 (sorry for the mangled formatting of the post. the forum editor is giving me fits....:(

 
 

  • What do you get 48000 times per second?

  • Randy,

    48000 is the sample rate of the McASP, so I get 48000 16-bit audio samples per second from the Rx port of the McASP. As we discussed on another thread, I then EDMA 4 bytes per sample period (32-bit words) to my buffer 256 times per DMA cycle (Acount=4, Bcount=256, Bindex=2, Ccount=1), so that my rx buffer contains 128 left channel 16-bit samples and 128 right channel 16-bit samples, interleaved (which is what my algorithm wants to see). As you previously suggested, by using Bindex=2 the buffer is packed with 16-bit samples.

    Quite frankly, I am struggling to understand how to use the same EDMA3 parameters on the Tx side. If I use Acount=4, Bcount=256, Ccount=1, and Bindex=2, I will be sending 4 bytes (32 bits) to the McASP, which it wants to see. But two of the bytes are from the next (adjacent) sample. I am currently working on a way to copy these samples to another buffer that holds two additional "dummy" bytes instead of sending bytes from the next sample.

    At any rate, back to the original subject of this thread. In my view, the time period between EDMA3 transfers should be (1/48000)*256/(1/300MHz)=1,600,000 clock cycles....?

    Thx,

    MikeH

  • MikeH 2 said:
    48000 is the sample rate of the McASP, so I get 48000 16-bit audio samples per second from the Rx port of the McASP.

    One 16-bit audio sample = what? One pair of left+right 16-bit samples, or 1 left 16-bit sample or 1 right 16-bit sample? Or is it one pair of left+right 8-bit samples?

    In other words, when the McASP Receive event occurs and triggers the EDMA to do a transfer, what data is available at that time? A pair of left+right channels or a single left or single right sample? And, how wide is one sample from left or from right?

    The answer to this is determined by what the external A/D delivers, the format of what the A/D delivers, and the configuration of the McASP.

    MikeH 1 said:
    These SEMs are posted by Tcc interrupts that fire after 256 bytes (128 left + 128 right) are transferred to & from the McASP.

    This implies that 1 left sample = 1 byte and 1 right sample = 1 byte.

    MikeH 2 said:
    my rx buffer contains 128 left channel 16-bit samples and 128 right channel 16-bit samples, interleaved

    This says that 1 left sample is 2 bytes and 1 right sample is 2 bytes.

    MikeH 2 said:
    back to the original subject of this thread. In my view, the time period between EDMA3 transfers should be (1/48000)*256/(1/300MHz)=1,600,000 clock cycles.

    My best guess is that you are measuring elapsed time correctly. And I also believe you are not making arithmetic errors in calculating the expected elapsed time.

    But since your numbers are inconsistent and you have not shown the values you program into the EDMA PARAM, I really cannot tell you where your error is.

    Show me the 8 32-bit hex words from the DMA Channel's active PARAM set (from right after programming them but before starting the McASP running), and from the Link PARAM set(s) if you are using them.

  • Randy,

    Sorry for being vague. Let me try to clarify.

    Receive - I am reading consecutive 16-bit audio samples (LRLR) from the McASP. However, as you know, one must actually read 4 bytes (32-bit words) when using DMA. So, in order to read my desired 128 16-bit samples per DMA transfer, I have set Acount=4, Bcount=256, Bindex=2, Ccount=1, Cindex=0. As discussed, this provides 128 packed 16-bit samples of LRLR audio in the receive buffer.

    Transmit - After processing the samples in my algorithm, I then DMA the 128 16-bit samples (LRLR audio) from my tx buffer to the McASP using EDMA3. So I set Acount=4, Bcount=256, Bindex=2, Ccount=1, Cindex=0. Since I am sending 4 bytes (Acount=4) per transfer, this means that two erroneous adjacent bytes are sent to the McASP. But I have programmed the McASP to ROR 16-bits and use 2 slots. This essentially masks the unwanted 2 bytes.

    The above settings work well and provide a ping-pong buffered audio stream to the McASP.

    RandyP said:

    Show me the 8 32-bit hex words from the DMA Channel's active PARAM set (from right after programming them but before starting the McASP running), and from the Link PARAM set(s) if you are using them.

    Here are the params after programming and before enabling DMA transfer.

     

    I appreciate your taking a look. Let me know if you see something obvious, or need further clarification, or would like me to email a copy of my project to you.

    Thx

    MikeH

     

  • It looks like you expect a McASP receive event for each 16-bit sample. Is this correct?

    How often do you expect to get a left-channel 16-bit sample? 48000 times per second?

    How often do you expect to get a right-channel 16-bit sample? 48000 times per second?

    How often do you expect to get McASP receive events to trigger one DMA transfer?

    Will one event trigger the DMA channel to transfer one sample or one LR pair?

    Would you still plug the same numbers into the equation for calculating the time between DMA events? Or for the entire "frame" to be collected?

  • RandyP said:

    It looks like you expect a McASP receive event for each 16-bit sample. Is this correct?

    Correct. I receive an AREVT0 for each 16-bit sample, but transfer 4 8-bit bytes (Acount=4).

    RandyP said:

    How often do you expect to get a left-channel 16-bit sample? 48000 times per second?

    Yes.

    RandyP said:

    How often do you expect to get a right-channel 16-bit sample? 48000 times per second?

    Yes, but interleaved with left-channel samples.

    RandyP said:

    How often do you expect to get McASP receive events to trigger one DMA transfer?

    A 4-byte transfer occurs every 1/48000 seconds (Acount=4), and 256 4-byte transfers occur every 1/48000 * 256 seconds (Bcount=256).

    RandyP said:

    Will one event trigger the DMA channel to transfer one sample or one LR pair?

    As currently configured, one AREVT0 event transfers either one left *or* one right 4-byte sample. This continues until there have been 256 LRLR samples (128 left interleaved with 128 right) transferred. At that point the Transfer Completion Interrupt posts a SEM to my "Task" to begin algorithm processing, the EDMA links to the Pong params, and the Pong buffer transfer begins.

    RandyP said:

    Would you still plug the same numbers into the equation for calculating the time between DMA events? Or for the entire "frame" to be collected?

    I'm not quite sure what you mean with this question. It may be that you are asking about the term "DMA events" as opposed to a "DMA transfer". What I am trying to measure is the entire "transfer" time for 256 samples (128 left + 128 right) to be transferred from the McASP to the RAM buffer. The technique I am using to make this measurement is to insert STS_delta(gethtime()) at the beginning of my algorithm code. The algorithm should begin processing when the transfer completion interrupt posts a SEM to unblock the TSK that holds the algorithm.

    FYI, for my project, I have adapted the TTO's EDMA LL3 example code to use the McASP instead of the McBSP.

    Again, thanks for taking a look at this. It's very difficult to try to explain all of this in text on a forum. I would be happy to discuss in person or via desktop sharing, which I find to be very efficient.

    thx

    MikeH

     

  • I have never liked people like me who answer questions with more questions, but it was your luck to get on the line. Well, I think we are getting close.

    RandyP said:
    How often do you expect to get a left-channel 16-bit sample? 48000 times per second?

    MikeH 3 said:
    Yes.

    This is correct. If you were to draw the left-channel samples being received on graph paper, it might look something like _-___-___-___-___. How much time is there between these pulses that come in 48000 times per second, meaning between the left-channel samples being received?

    RandyP said:
    How often do you expect to get a right-channel 16-bit sample? 48000 times per second?

    MikeH 3 said:
    Yes, but interleaved with left-channel samples.

    This is also correct. If you were to draw the right-channel samples being received on graph paper, it would also look something like _-___-___-___-___. How much time is there between these pulses that come in 48000 times per second, meaning between the right-channel samples being received?

    RandyP said:
    How often do you expect to get McASP receive events to trigger one DMA transfer?

    MikeH 3 said:
    A 4-byte transfer occurs every 1/48000 seconds (Acount=4), and 256 4-byte transfers occur every 1/48000 * 256 seconds (Bcount=256).

    This is incorrect. Line up the two pulse streams from above to show how left samples get received and then how right samples are received in between the left samples. Does this make the left samples come in slower? Is the time between successive left channel samples the same as in the first pulse stream you drew?

    I do not know which number you need to change, but one of them is off by a factor of 2. And this is why your calculation is wrong.

    And in the end, it does not matter as long as your data is coming in and going out okay.

    Is your data coming in okay?

    Is your data going out okay?

  • Randy,

    RandyP said:

    This is incorrect. Line up the two pulse streams from above to show how left samples get received and then how right samples are received in between the left samples. Does this make the left samples come in slower? Is the time between successive left channel samples the same as in the first pulse stream you drew?

    A picture is worth a thousand keystrokes...:)

    From sprufx4, page 18, figure 1.8:

    You are absolutely correct (oh wise professor...:). The actual "packet rate" is 48000 * 2 channels = 96000 samples/sec, or 1/96000 = 10.4uS per sample period. Since there are 256 samples, the entire buffer fill time is 10.4uS * 256 = 2.67mS. With a CPU clock period of 3.3nS, the total number of clock cycles between DMA transfers is 2.67e-3 / 3.3e-9 ≈ 800,000, which is what I am measuring with gethtime()!!
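The arithmetic above can be written as a small C helper. This is only a sketch under the thread's own assumptions: one McASP receive event per 16-bit sample, two channels at 48 kHz each, 256 events per buffer, and a timer that counts at the 300 MHz CPU clock. The function name `cycles_per_buffer` is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Timer ticks needed to collect one buffer.
 * The event rate is the per-channel sample rate times the number of channels,
 * because left and right samples arrive interleaved, one event each.
 */
static uint64_t cycles_per_buffer(uint32_t sample_rate_hz,
                                  uint32_t channels,
                                  uint32_t events_per_buffer,
                                  uint32_t timer_hz)
{
    uint64_t events_per_sec = (uint64_t)sample_rate_hz * channels;

    assert(events_per_sec > 0u);
    return ((uint64_t)events_per_buffer * timer_hz) / events_per_sec;
}
```

With `channels = 2` the helper gives 800,000 ticks for 256 events at 300 MHz; with `channels = 1` it gives the 1,600,000 from the original estimate, which is where the factor of 2 came from.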

    RandyP said:

    Is your data coming in okay?

    Is your data going out okay?

    Unfortunately, only my test algorithm fits within this window. My actual algorithm takes too long. But that will be the subject of another thread......

    Thanks for pointing out the error in my calculations, Master Pro.

    Patient Young Grasshopper

     

  • MikeH,

    It sounds like you have had your system running for a while. Excellent job.

    Thanks for making me laugh with your humor.

    Be sure you keep in mind that the receive buffer needs to be extended by at least 2 bytes. This is because of the fact that you are reading in 4 bytes each time but only incrementing the DST address by 2. The very last 16-bit sample will be written to the right place, but the next 16-bit location after it will have the high half-word of the McASP data port written there.
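To make the 2-byte overrun concrete, here is a minimal host-side model of the receive transfer, not actual EDMA3 driver code. It assumes what the thread describes: each event copies Acount bytes from the data port and then advances the destination by Bindex bytes, and the wanted 16-bit sample sits in the low half of each 32-bit word. The names `edma_dst_bytes_needed` and `edma_model_rx` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One past the highest destination byte offset written: the buffer size needed. */
static size_t edma_dst_bytes_needed(size_t acnt, size_t bcnt, size_t dst_bidx)
{
    assert(acnt > 0u && bcnt > 0u);
    return (bcnt - 1u) * dst_bidx + acnt;
}

/*
 * Model of the transfer: port_words holds the acnt bytes the data port
 * delivers at each of the bcnt events, back to back. Each event writes
 * acnt bytes at the current destination, then the destination moves by
 * dst_bidx, so later events overwrite the unwanted upper bytes of earlier ones.
 */
static void edma_model_rx(uint8_t *dst, const uint8_t *port_words,
                          size_t acnt, size_t bcnt, size_t dst_bidx)
{
    size_t b;

    for (b = 0u; b < bcnt; b++) {
        memcpy(dst + b * dst_bidx, port_words + b * acnt, acnt);
    }
}
```

For Acount=4, Bcount=256, Bindex=2 the first helper returns 514 bytes, which is 2 bytes more than 256 packed 16-bit samples occupy. Only the final event's upper half-word survives in the buffer, in the location just past the last sample.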

    Be sure to use the optimizer. The Debug Configuration is easy to debug but terrible in performance. There is a lot of optimization advice in the Compiler documentation, the TI Wiki Pages, and the forum.

    Best of luck to you, and regards,
    RandyP

  • RandyP said:
    Be sure you keep in mind that the receive buffer needs to be extended by at least 2 bytes.

    Yes, simply....

    =================================

    int16_t gBufferXmt[2][BUFFSIZE];   // Transmit PING & PONG buffers, must hold 32-bit words for McASP
    int16_t gBufferRcv[2][BUFFSIZE+1]; // Receive PING & PONG buffers

    ===================================

    .....works well.

     

    RandyP said:
    Be sure to use the optimizer.

    Yes, trying to crank thru optimization now (now that I know what the actual time window is...:),

    RandyP said:
    There is a lot of optimization advice in the Compiler documentation, the TI Wiki Pages, and the forum.

    Yes, have been going through these recommendations, but when the algorithm takes longer to execute than the time between DMA transfers, DSP/BIOS gets very upset and the STS values are suspect. I will post the details of this on another thread since this may be an even bigger challenge than what I have been through so far.

    Thanks again!

    MikeH