C6748 Starterware McASPPlayBack Example question

The McASP Playback example in the Starterware 1.20 uses 3 audio sample buffers. Can someone explain why there are 3? Why not just 2 buffers in a ping-pong fashion?

  • Also, this app will run once after a power-on reset, but will not run after a warm-restart. Can the Starterware team comment on why this is happening, as well as the buffer architecture question above?

    FYI, I am using the C6748 LCDK platform and CCSv5.3.

  • Hello Mike,

    BTW I don't work for TI but I'm a fellow user like yourself.

    I actually fixed this to work with 2 instead of 3 buffers and it worked just as well.

    There are some other issues I found with the example as well as the mcasp.c library file:

    1) For txDefaultPar and rxDefaultPar, aCnt is set to BYTES_PER_SAMPLE, but it should always be sizeof(int): the DMA controller always delivers 4-byte integer data, which has to be handled 4 bytes at a time regardless of WORD_SIZE and NUM_I2S_CHANNELS. If you start changing WORD_SIZE, you'll see the example crash. Similarly, the transmitter source bIdx and the receiver destination bIdx need to change to BYTES_PER_SAMPLE, and the corresponding bCnt values need to change to NUM_SAMPLES_PER_AUDIO_BUF * NUM_I2S_CHANNELS. This is just the beginning of what needs to be fixed in the McASPPlayBack example... It looks like it was written just to the point of working, without any concern for robustness.

    2) The mcasp.c file formats the data wrong. There are 2 issues:

    I) The RROT (receiver rotate right) and XROT (transmitter rotate right) settings should be fixed to properly right-justify the data. The original code sets both of these to WORD_SIZE. The receiver should shift by 32-WORD_SIZE and the transmitter should shift by WORD_SIZE&0x1C (the 0x1C handles the special case of 32-bit audio, where no shift is necessary). The only reason the code runs is that WORD_SIZE is 16 bits, which is the only case where 32-WORD_SIZE = WORD_SIZE.

    (wordSize >> 2) needs to change to ((32-wordSize) >> 2) under McASPTxFmtI2SSet

    (wordSize >> 2) needs to change to ((wordSize >> 2)&7) under McASPRxFmtI2SSet
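As a quick arithmetic check on the two replacement expressions above, the sketch below evaluates them next to the original expression. The helper names are hypothetical (they are not StarterWare functions), and the sketch assumes what the `>> 2` and `& 7` in the quoted code imply: the rotate field counts in 4-bit steps and is 3 bits wide.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not part of mcasp.c. */

/* Expression quoted above for McASPTxFmtI2SSet. */
static unsigned rot_field_tx(unsigned wordSize)
{
    assert(wordSize >= 8 && wordSize <= 32 && wordSize % 4 == 0);
    return (32u - wordSize) >> 2;
}

/* Expression quoted above for McASPRxFmtI2SSet. */
static unsigned rot_field_rx(unsigned wordSize)
{
    assert(wordSize >= 8 && wordSize <= 32 && wordSize % 4 == 0);
    return (wordSize >> 2) & 7u;
}

/* Expression the original code used for both directions. */
static unsigned rot_field_original(unsigned wordSize)
{
    return wordSize >> 2;
}
```

For a 16-bit word all three expressions give 4, which matches the observation that the shipped example only works at that size. For 8-bit and 24-bit words the two replacement expressions give different values (6 and 2, then 2 and 6), and for 32-bit words the original expression gives 8, which no longer fits a 3-bit field, while both replacements give 0.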

    To TI people: How do we fix the libraries? When I fix included libraries, they don't recompile with the project.

    II) The zero data padding makes it impossible to do any mathematical manipulation on the data. The negative data needs to be sign extended with 1's and the positive data needs to be sign extended with 0's. So then the following need to change:

    MCASP_TX_PAD_WITH_0 needs to change to MCASP_TX_PAD_WITH_PBIT(wordSize-1) under McASPTxFmtI2SSet

    MCASP_RX_PAD_WITH_0 needs to change to MCASP_RX_PAD_WITH_PBIT(wordSize-1) under McASPRxFmtI2SSet
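To see why the padding choice matters for arithmetic, here is a minimal software sketch of what padding with bit wordSize-1 achieves: the sign bit of the narrow sample is copied into all the upper bits of the 32-bit word. The function name is hypothetical, and this only models the result in C; it says nothing about how the McASP hardware produces it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend a right-justified wordSize-bit sample
 * held in a 32-bit word. Upper bits of 'raw' are ignored. */
static int32_t sign_extend(uint32_t raw, unsigned wordSize)
{
    assert(wordSize >= 1 && wordSize <= 32);
    if (wordSize == 32) {
        return (int32_t)raw;            /* nothing to extend */
    }
    uint32_t mask = (1u << wordSize) - 1u;   /* keep the low wordSize bits */
    uint32_t sign = 1u << (wordSize - 1);    /* position of the sign bit   */
    raw &= mask;
    /* Flipping the sign bit and subtracting it back propagates it upward. */
    return (int32_t)((raw ^ sign) - sign);
}
```

With zero padding, a 16-bit sample of 0xFFFF reads back as 65535 in a 32-bit int; with sign padding it reads back as -1, which is what mathematical processing of the samples needs.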

  • Elliot,

    Thank you for the detailed reply! I am off to another project for the moment, but will try your fixes when I return.

    Did you experience the warm-restart issue I mentioned above? I've determined that none of the EDMA transfers are generating interrupts, but have not figured out why. I would like to use some of this code as the basis for a C6748 project, but as you have pointed out, it seems a bit unreliable.

    Thanks again for sharing your solutions.

  • Hello MikeH,

    I had the same warm restart issue. The only way I could get the code to run again is by completely stopping and restarting debug mode.

    The interrupts seem to be working fine too. The EDMA generates an interrupt once all NUM_SAMPLES_PER_AUDIO_BUF samples have been transferred, but not on each individual data transfer. One thing that seemed really strange is that the code uses an infinite polling loop to determine when a buffer has been received, when the same thing could easily be done in the EDMA receive interrupt.

    I agree with you that I'm concerned about the reliability of the libraries and examples when they are incorporated into other projects.

  • Hey Elliot,

    I'm also trying to get that example to work. I'm a DSP software guy and I would just like to be able to get at the data easily, so I would rather not have to learn about the inner workings of McASP, I2C, and EDMA if I can avoid it. Can you post your working code? I just want to be able to read in and put out the data at a variety of typical audio sample sizes and rates. Thanks.

    Dave

  • Elliot,

    Thanks for the input. Have you measured the input-to-output audio delay in the original project and in yours?

    I found it strange, to say the least, that the measured delay in the original project corresponds to the length of only one NUM_SAMPLES_PER_AUDIO_BUF buffer, not (as I would have guessed) four: two for input and two for output.

    Additionally, restarting debug mode without the .gel file doesn’t restart the application.

    I’m experimenting with the Zoom LogicPD OMAPL138EVM, CCS 5.4, and the newest StarterWare 1.10.04.01.

    BR
    GenPol

  • Hello GenPol,

    The way TI wrote the code, NUM_SAMPLES_PER_AUDIO_BUF is actually a number of bytes, so if you account for 2 bytes left + 2 bytes right = 4 total, that explains the factor of 4 difference.

    As for the restart issue: the audio code is a DMA process that runs in the background, so even if the processor stops, the audio pass-through will continue.

    I'm a bit annoyed that nobody from TI has mentioned even a peep on this thread frankly. So we're supposed to help everyone out here and struggle with code while TI doesn't even lift a finger to help out?

    Elliot

  • Hello Elliot,

    I would invite you to take a look at the thread I opened:

    http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/284560.aspx

    You will see the same reaction to the reported bug: leave me alone, I'll inform you ASAP, etc.

    I suspect this is the work of a third-party or remote contract team that has no interest in really representing Texas Instruments, only in getting paid for freshly baked CheapWare.

    I’m wondering why Texas Instruments allows it.

    Kindly
    GenPol

  • I have encountered many bugs in TI's libraries in the past few years. It has been quite frustrating.

    This is the most outrageous one so far. For any FIR more than 128 taps, it is fundamentally wrong and can't possibly work, yet they even documented benchmarks...

  • Hi

    I've been struggling with the example too. Being new to Texas and DSPs, I thought that when I turned the example into a SYS/BIOS program and added a few new tasks, I'd broken it. Got it working by increasing the buffer sizes and adding Cache_inv() before reading the rxBuffer and Cache_wb() after writing to the txBuffers. Took me a while to realise the example must run with caches turned off.

     

    Next I tried to re-program the DMA to separate out the left and right channels rather than store them interleaved. This is where it starts going badly wrong for me.

    Thanks Elliot for posting your corrections. I don't know if you are still monitoring this thread, but something still puzzles me a lot. I assumed  BYTES_PER_SAMPLE would be 2 which seems reasonable for 16-bit data. But it is actually 4! You say the EDMA always reads 32 bits from the McASP, so for a while I was thinking that the McASP must pack a left and right sample into a 32-bit word for reading in a single DMA cycle. That would be efficient! But there's nothing about that in the McASP user guide. So now I'm back to thinking it reads and stores 32 bits per sample, of which only 16 contain data. The buffers are twice the size they need to be, and so is 'BIDX.

    To make it easier to debug, I wrote some sine waves into a buffer and memcpy those to the txBuffers instead of copying the rxBuffer across. This appears to confirm my suspicion that the samples in the rx and txBuffers are stored as 32 bit ints. It outputs the proper noises when I make BYTES_PER_SAMPLE equal to 2 but when I make it 4 the noises are double frequency and only last for half the time they are supposed to - clearly it's skipping alternate samples.

    I daresay I'll get it all unravelled eventually, but it would be great if someone could confirm that the McASP/DMA doesn't pack two 16-bit words into a single 32-bit read operation.

     

    Thanks

    Roy

  • I don't know if anyone's reading this, but for completeness' sake:

    I've got it working now, separating left and right channels, with these EDMA settings

    ACNT = BYTES_PER_SAMPLE = 2

    BCNT = NUM_I2S_CHANNELS = 2

    'BIDX = NUM_SAMPLES_PER_AUDIO_BUF * BYTES_PER_SAMPLE / NUM_I2S_CHANNELS  (to do channel sorting)

    CCNT = NUM_SAMPLES_PER_AUDIO_BUF / NUM_I2S_CHANNELS

    'CIDX = -(NUM_SAMPLES_PER_AUDIO_BUF * BYTES_PER_SAMPLE / NUM_I2S_CHANNELS) * (NUM_I2S_CHANNELS-1) + BYTES_PER_SAMPLE

    I also set the BCNTRLD value to 2, same as BCNT, which seemed like a safe guess, but I haven't worked out what that does yet. (Still learning.)
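The channel-sorting settings above can be checked with a small software model. This is only a sketch under stated assumptions: it assumes an A-synchronized transfer (one ACNT-byte array per event, with CIDX applied from the start of the last array of a frame), it reads the incoming samples from a memory array instead of the McASP receive register, and the function name and the tiny buffer size are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_SAMPLE          2
#define NUM_I2S_CHANNELS          2
#define NUM_SAMPLES_PER_AUDIO_BUF 8   /* total samples, both channels; small for illustration */

/* Software model only (hypothetical helper, no EDMA registers involved):
 * 'src' is the stream L0, R0, L1, R1, ... as it would arrive event by event;
 * 'dst' receives all left samples followed by all right samples. */
static void model_deinterleave(const int16_t *src, int16_t *dst)
{
    const int acnt = BYTES_PER_SAMPLE;
    const int bcnt = NUM_I2S_CHANNELS;
    const int ccnt = NUM_SAMPLES_PER_AUDIO_BUF / NUM_I2S_CHANNELS;
    const int bidx = NUM_SAMPLES_PER_AUDIO_BUF * BYTES_PER_SAMPLE / NUM_I2S_CHANNELS;
    const int cidx = -bidx * (NUM_I2S_CHANNELS - 1) + BYTES_PER_SAMPLE;

    const uint8_t *in  = (const uint8_t *)src;
    uint8_t       *out = (uint8_t *)dst;
    int off = 0;                        /* destination byte offset */

    for (int c = 0; c < ccnt; c++) {
        for (int b = 0; b < bcnt; b++) {
            assert(off >= 0 &&
                   off + acnt <= NUM_SAMPLES_PER_AUDIO_BUF * BYTES_PER_SAMPLE);
            memcpy(out + off, in, (size_t)acnt);   /* one event: one array */
            in += acnt;
            /* After the last array of a frame step by CIDX, otherwise by BIDX. */
            off += (b == bcnt - 1) ? cidx : bidx;
        }
    }
}
```

With these values the destination offsets run 0, 8, 2, 10, 4, 12, 6, 14, so the left samples fill the first half of the buffer and the right samples the second half. On BCNTRLD: in A-synchronized mode the controller reloads BCNT from BCNTRLD at the end of each frame, so setting it equal to BCNT is consistent with the inner loop above restarting at the full channel count.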

     

    The other question, about packing two 16-bit samples into a single cycle, is also answered, I think: it doesn't do that. By inspecting the rxBuf, I can see that the DMA is storing 4 bytes per sample, but 2 of those bytes are always zero, so if you view the buffer as 16-bit ints, with the original interleaved data storage settings, the buffer contains

    left0, 0, right0, 0, left1, 0, right1, 0, left2, 0, ...
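Given that layout, pulling usable 16-bit samples out of the original interleaved buffer is a matter of taking the low half of each 32-bit slot. The sketch below assumes one 32-bit slot per sample in left, right order with the sample in the low 16 bits, as described above; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split interleaved 32-bit slots (left, right, left, ...)
 * into separate 16-bit channel arrays. Only the low 16 bits of each slot
 * are treated as sample data. */
static void unpack_slots(const uint32_t *slots, size_t numFrames,
                         int16_t *left, int16_t *right)
{
    assert(slots != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < numFrames; i++) {
        left[i]  = (int16_t)(uint16_t)(slots[2 * i]     & 0xFFFFu);
        right[i] = (int16_t)(uint16_t)(slots[2 * i + 1] & 0xFFFFu);
    }
}
```

This costs a copy on the CPU, which is the trade-off against reprogramming the EDMA to do the sorting.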

     

    Roy

     

  • Roy,

    I assure you that people are reading this. Please keep posting your findings. It has been very helpful, especially in light of TI's apathy on this subject.

    FYI, there is some very good training info on EDMA3 in some of the DSP/BIOS workshop material. The material is useful even if you are not using DSP/BIOS. Here is a link to one of the documents:

    http://processors.wiki.ti.com/images/5/5e/EDMA3_LLD.pdf

    Thanks again for the posts.

  • Also, here is an on-line training module that has been very helpful to me in the past.

    http://learningmedia.ti.com/public/c6474/C64x_Edma/index.html

  • Thanks for those links Mike. They look pretty useful. There's masses of stuff on the wiki, but it's not easy to find what you need. Or to know whether it has been superseded!

    Thanks again
    Roy

     

  • One more thing.

    If you decide to change BYTES_PER_SAMPLE to 2, you will save memory because the buffers are exactly the right size to hold NUM_SAMPLES_PER_AUDIO_BUF 16-bit ints. But, because the EDMA stores 4 bytes every time, we have a problem. Mostly, the unwanted 2 bytes are overwritten by the following sample, but the last sample runs over the end of the buffer and clobbers whatever happens to be next in memory. An expedient solution is to make all the txBuf and rxBuf buffers slightly bigger.
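How much bigger follows from simple arithmetic. Taking the observation above at face value (a 4-byte store at every 2-byte step), the sketch below computes how many bytes the transfer actually touches; the helper name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes touched when 'numSamples' stores of
 * 'storeSize' bytes each are placed 'stride' bytes apart, starting at 0. */
static size_t bytes_touched(size_t numSamples, size_t stride, size_t storeSize)
{
    assert(numSamples > 0 && stride > 0 && storeSize > 0);
    /* The last store begins at (numSamples - 1) * stride. */
    return (numSamples - 1) * stride + storeSize;
}
```

For example, with 8 samples, a stride of 2 and 4-byte stores, 18 bytes are touched while the buffer only holds 16, so under this assumption each buffer needs 2 spare bytes at the end.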

     

    Roy.

  • If you guys are still tracking this thread, please take a look at this new thread on this issue.

    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/p/376334/1324877.aspx#1324877

  • Mike,

    Thanks for the update. As for me, I decided long ago not to use StarterWare because of its lack of transparency and its instability, and planned to develop my own McASP+EDMA3 drivers. In short: in the latest driver version I use triple input and triple output buffers, working with 3-channel chaining and the AFIFO.

    Unfortunately, I’ve run into some trouble this way as well, again related to the information TI provides. If you are interested and could contribute something, please take a look at this:

    http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/369281.aspx

    Kindly
    GenPol