De-interleaving audio data while using multiple serializers

Brian Flinn


I need to de-interleave audio data using the EDMA and the McASP interface (OMAP-L138).

There are 2 transmit serializers on the McASP, and my data is in this form:

chan1[8] = {L1, L2, L3, ..., L8}

chan2[8] = {R1, R2, R3, ..., R8}

chan3[8] = {L1, L2, L3, ..., L8}

chan4[8] = {R1, R2, R3, ..., R8}

The above needs to go to the serializers in this form:

Chan1_2_3_4 xmt buffer[8*4] = {L1, L1, R1, R1, L2, L2, R2, R2, ...}
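
For reference, the target ordering can be written as a plain CPU loop. This is only a sketch for checking a DMA setup against, under the assumption that serializer 0 carries chan1/chan2 and serializer 1 carries chan3/chan4; the function name and constants are illustrative and not from any TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SERIALIZERS 2   /* assumption: two transmit serializers, as in the post */
#define SLOTS_PER_FRAME 2   /* left and right time slots */

/*
 * Build the transmit buffer from the per-channel buffers.
 * chan[] is ordered as in the post (assumed mapping):
 *   chan[0] = serializer 0 left,  chan[1] = serializer 0 right,
 *   chan[2] = serializer 1 left,  chan[3] = serializer 1 right.
 * Output order per frame: ser0 L, ser1 L, ser0 R, ser1 R.
 */
void interleave_for_mcasp(const int32_t *const chan[], size_t frames, int32_t *xmt)
{
    assert(chan != NULL && xmt != NULL);
    size_t out = 0;
    for (size_t k = 0; k < frames; k++)
        for (size_t slot = 0; slot < SLOTS_PER_FRAME; slot++)
            for (size_t ser = 0; ser < NUM_SERIALIZERS; ser++)
                xmt[out++] = chan[ser * SLOTS_PER_FRAME + slot][k];
}
```

The innermost loop runs over serializers because each McASP transmit event needs one word per serializer for the current time slot.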

From a previous thread

http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/153541.aspx?PageIndex=2

>
There's a lot of ways you could do this, but I think I like this method the best.

Solution: Use one EDMA channel per serializer

This solution ignores the fact that multiple channels exist. Here are the few modifications:

For channels 1 - "n-1":

    * ITCCHEN=1
    * TCCHEN=1
    * TCC = "next channel"
    * A-sync
    * Early completion

For channel n:

    * ITCCHEN=0
    * TCCHEN=0
    * TCC = interrupt bit in IPR that you will look for
    * A-sync
    * Normal completion

Let's use your case of 2 transmit serializers. So in this case the McASP DMA event would cause a channel to run as usual. That channel would transfer a single data element (i.e. an A-sync transfer). Of course, we are required to transfer 2 elements since we have 2 serializers. To accomplish this, we chain to a second channel which also transfers a single element. In this way we have transferred our required number of samples but now we can do indexing the way we did in the single-serializer case.
>

and

>
Yes, there are a lot of ways to slice the issue:

   1. Use one parameter set per serializer and chain them together.
   2. Use one parameter set per sample in your buffer and link them together.
   3. Deinterleave after the fact by chaining to a second parameter set that does the deinterleaving for you in one shot and then generates the interrupt.

>
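
The per-channel settings listed in the first quote (one EDMA channel per serializer, chained) could be expressed as an OPT word builder along these lines. It is a sketch only: the bit positions are the EDMA3 OPT field positions as I understand them and should be checked against the device documentation, `chained_opt`, `ch[]`, and `irq_tcc` are hypothetical names, and setting TCINTEN on the last channel is an assumption inferred from "interrupt bit in IPR".

```c
#include <assert.h>
#include <stdint.h>

/* EDMA3 OPT field positions (verify against the device documentation). */
#define OPT_TCCMODE_EARLY (1u << 11)   /* early completion */
#define OPT_TCC_SHIFT     12
#define OPT_TCC_MASK      (0x3Fu << OPT_TCC_SHIFT)
#define OPT_TCINTEN       (1u << 20)   /* final-transfer interrupt enable */
#define OPT_TCCHEN        (1u << 22)   /* final-transfer chaining enable */
#define OPT_ITCCHEN       (1u << 23)   /* intermediate-transfer chaining enable */

/*
 * OPT word for channel idx (0-based) in a chain of n channels, one per serializer.
 * ch[] holds the (hypothetical) EDMA channel numbers in chain order;
 * irq_tcc is the IPR bit the CPU will look for.
 * SYNCDIM is left at 0, i.e. A-sync.
 */
uint32_t chained_opt(const uint8_t ch[], unsigned n, unsigned idx, uint8_t irq_tcc)
{
    assert(ch != 0 && n > 0 && idx < n);
    if (idx + 1 < n) {
        /* Channels 1 .. n-1: chain to the next channel, early completion. */
        return OPT_ITCCHEN | OPT_TCCHEN | OPT_TCCMODE_EARLY
             | (((uint32_t)ch[idx + 1] << OPT_TCC_SHIFT) & OPT_TCC_MASK);
    }
    /* Channel n: no chaining, normal completion, interrupt when the buffer is done. */
    return OPT_TCINTEN | (((uint32_t)irq_tcc << OPT_TCC_SHIFT) & OPT_TCC_MASK);
}
```

Adding a serializer then means adding one more channel number to `ch[]`; only the last entry keeps the interrupt.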

I was wondering what the trade-offs are, as I need the most flexible / expandable solution so that I can add x more serializers to the transmitter in the future.

Thanks

Brian

over 13 years ago

kcastille over 13 years ago


#1:

The most obvious limitation here is that the number of channels required is equal to the number of serializers in use (let's call that N). Does the EDMA have enough channels for your max case, and what else may need EDMA channels in your system?

Also, performance may be an issue as the number of serializers/channels grows. Each time an event is triggered, the EDMA CC will submit N Transfer Requests to the EDMA TC (1 TR per channel; 1 triggered directly by the event and N-1 chain triggers). By contrast (as in #3), if a single channel were servicing all N serializers, then a single Transfer Request would be submitted, and the EDMA TC can handle that one larger TR much more efficiently than N smaller TRs.

Further, in this approach, you are not able to use the McASP FIFO which may be more necessary as the number of channels increases.

#2:

I could be missing something, but I don't think this approach works. A "Link" copies from one PaRAM set onto the current PaRAM set when the BCNT/CCNT is exhausted. No address updates are done on the "linked from" PaRAM set. So I don't think you can get the deinterleaving you're after (unless you get the CPU involved to babysit the address updates in the PaRAM...).

#3. IMO, this is the most attractive solution, especially considering the problems with #1.

To achieve the deinterleaving across N serializers, you may be able to use "self-chaining" to do this in one shot (I haven't studied it closely, though). Recall that only A-sync and AB-sync transfers are directly supported. To achieve a "logical" ABC-sync transfer, each "intermediate" transfer completion (using ITCCHEN) re-triggers the current channel so that all CCNT blocks are transferred.

I'm not sure (and am doubtful) as to whether you can achieve deinterleaving across all streams using a single EDMA channel. If not, then you can have "link parameter sets" for each serializer: the completion of serializer0 would link to PaRAM(serializer1) and do the same thing. You can even "chain trigger" across the link sets by setting TCCHEN=1 for all but the final serializer (which instead sets TCINTEN=1 to generate the interrupt to the host).

Regards
Kyle
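
The per-serializer idea in #3 can be checked on a host with a small software model of an AB-sync transfer. The sketch below applies it to the transmit direction of the original question (building the serializer-ordered buffer from four channel buffers). Assumptions: 32-bit samples, the four channel buffers are contiguous in memory in the order chan1..chan4, and `param_t`, `run_ab_sync`, and `swizzle_for_tx` are illustrative names rather than a real driver API. The loop over CCNT stands in for the ITCCHEN self-chaining, and the loop over serializers stands in for the link/chain from one set to the next.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAMES 8   /* samples per channel, as in the post */
#define WORD   4   /* bytes per sample (assumed 32-bit slots) */

/* Simplified model of one PaRAM set; field names follow EDMA3 terminology. */
typedef struct {
    const uint8_t *src;
    uint8_t *dst;
    uint16_t acnt, bcnt, ccnt;
    int16_t src_bidx, dst_bidx, src_cidx, dst_cidx;
} param_t;

/* Run a whole AB-sync transfer: CCNT frames of BCNT arrays of ACNT bytes.
 * In hardware each frame needs its own trigger; self-chaining would supply it. */
void run_ab_sync(param_t p)
{
    assert(p.src != NULL && p.dst != NULL);
    for (uint16_t c = 0; c < p.ccnt; c++) {
        for (uint16_t b = 0; b < p.bcnt; b++)
            memcpy(p.dst + b * p.dst_bidx, p.src + b * p.src_bidx, p.acnt);
        p.src += p.src_cidx;   /* AB-sync: CIDX is measured from the frame start */
        p.dst += p.dst_cidx;
    }
}

/* Interleave 4 contiguous channel buffers into xmt, one PaRAM set per serializer. */
void swizzle_for_tx(const int32_t *chan, int32_t *xmt)
{
    for (int ser = 0; ser < 2; ser++) {
        param_t p = {
            .src = (const uint8_t *)(chan + 2 * ser * FRAMES), /* this serializer's left buffer */
            .dst = (uint8_t *)&xmt[ser],
            .acnt = WORD, .bcnt = FRAMES, .ccnt = 2,           /* CCNT: left, then right */
            .src_bidx = WORD,          .dst_bidx = 4 * WORD,   /* next sample / next frame slot */
            .src_cidx = FRAMES * WORD, .dst_cidx = 2 * WORD,   /* left -> right */
        };
        run_ab_sync(p);
    }
}
```

In this model, adding a serializer adds one more set and widens `dst_bidx` and `dst_cidx`; the per-set structure stays the same.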

Brad Griffis over 13 years ago in reply to kcastille


kcastille said:

#1:

Each time an event is triggered, the EDMA CC will submit N Transfer Requests to the EDMA TC (1 TR per channel; 1 triggered directly by the event and N-1 chain triggers)

I agree and tried to offset that partially by using "early completion" on the first N-1 TRs. That way the spacing between TRs is minimized.

kcastille said:

#2:

I could be missing something, but I don't think this approach works. A "Link" copies from one PaRAM set onto the current PaRAM set when the BCNT/CCNT is exhausted. No address updates are done on the "linked from" PaRAM set. So I don't think you can get the deinterleaving you're after (unless you get the CPU involved to babysit the address updates in the PaRAM...).

This approach might work well if you have a large number of serializers, e.g. 16, but very small sample buffers (e.g. < 8 samples), as is the case for Brian's application, which is low-latency audio. Implicit in this approach is that ACNT=4, BCNT=num_serializers and CCNT=2. The problem with doing the channel sorting for multiple serializers (and stereo data on each serializer) is that we do not have enough indexing to support it. The BIDX indicates the stride from serializer1_left to serializer2_left, etc. The CIDX is the offset from serializer1_left to serializer1_right. If there were a DIDX, we would use it to apply a negative offset back to serializer1_left. Since no such index exists, we can instead use a separate parameter set per sample and hard-code the source address.

Positives of this approach:

    * No "CPU babysitting"
    * Utilizes only a single channel/event, i.e. all the serializers are serviced in one shot

Negatives of this approach:

    * Very quickly consumes many parameter RAMs. Double for ping-pong buffering!
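
The linked-set scheme described above (ACNT=4, BCNT=num_serializers, CCNT=2, one parameter set per sample) can be modeled in software to check the indexing. This is a host-side sketch, not driver code. It assumes two serializers, contiguous channel buffers in the order chan1..chan4, and indices expressed in 32-bit words rather than bytes; `param_t`, `build_sets`, and `run_events` are hypothetical names, and `port[]` simply records what would be written to the McASP data port.

```c
#include <assert.h>
#include <stdint.h>

#define FRAMES  8   /* samples per channel, as in the post */
#define NUM_SER 2   /* transmit serializers */

/* Reduced PaRAM model: one 32-bit word per array (ACNT = 4 bytes). */
typedef struct {
    const int32_t *src;   /* hard-coded start: serializer 0, left, sample k */
    int bcnt, ccnt;       /* BCNT = serializers, CCNT = 2 (left, right) */
    int bidx, cidx;       /* source strides, in words */
    int link;             /* index of the next set, -1 = end of buffer */
} param_t;

/* One set per sample k, linked in order. */
void build_sets(const int32_t *chan, param_t sets[FRAMES])
{
    assert(chan != 0 && sets != 0);
    for (int k = 0; k < FRAMES; k++) {
        sets[k].src  = chan + k;
        sets[k].bcnt = NUM_SER;
        sets[k].ccnt = 2;
        sets[k].bidx = 2 * FRAMES;   /* serializer 0 left -> serializer 1 left */
        sets[k].cidx = FRAMES;       /* serializer 0 left -> serializer 0 right */
        sets[k].link = (k + 1 < FRAMES) ? k + 1 : -1;
    }
}

/* Service McASP transmit events until the chain of links ends.
 * Returns the number of events serviced. */
int run_events(const param_t sets[], int32_t *port)
{
    param_t cur = sets[0];   /* working copy, like the channel's active PaRAM */
    int events = 0, out = 0;
    for (;;) {
        for (int b = 0; b < cur.bcnt; b++)   /* AB-sync: BCNT arrays per event */
            port[out++] = cur.src[b * cur.bidx];
        events++;
        cur.src += cur.cidx;
        if (--cur.ccnt == 0) {
            if (cur.link < 0)
                break;
            cur = sets[cur.link];            /* link: reload the next set */
        }
    }
    return events;
}
```

With 8 samples per channel this uses 8 sets for one buffer, which illustrates how quickly the PaRAM usage grows with buffer length.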

kcastille said:
#3. IMO, this is the most attractive solution, especially considering the problems with #1.

My main issue with #3 is that the channel interleaving/deinterleaving all occurs at the end. That is, after we've received the last sample of data, the CPU still needs to wait for this "extra" transfer that moves the data all around. This objection might be purely academic, i.e. the number of cycles the EDMA spends swizzling data is probably insignificant. In solution #1 the swizzling is "distributed" throughout the entire audio frame reception, so once that last data comes in the CPU can get right to it. This probably is not an issue for small data buffers like Brian's, but with huge audio buffers the extra wait becomes longer. Of course, during this extra time the CPU is free to do "other stuff", so as long as the audio processing is not the only thing going on in the system, this is still an efficient implementation.
