ABC EDMA transfer

Jay Gowdy

I would like to copy every other byte in a 480 row, 752 column image using EDMA.

All of the working examples I have that do this set up an AB-synchronized transfer with ACNT=1, BCNT=356, CCNT=480, SRCBIDX=2, DSTBIDX=1, and the TCINTEN and ITCINTEN options set so that an interrupt fires after every row; the interrupt handler then re-triggers the transfer until all 480 rows have been dealt with.
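For reference, the per-row behaviour of such a setup can be modelled on the CPU. The sketch below is a software model only, not driver code: the `ParamModel` struct, the function names, and the use of SRCCIDX/DSTCIDX as the row pitch are assumptions made for illustration (the C indexes are not stated above). It shows that each trigger of an AB-synchronized set moves one row of BCNT arrays, so CCNT triggers are needed in total.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the PaRAM fields discussed in this thread.
 * src/dst are byte offsets into the model buffers, not bus addresses. */
typedef struct {
    uint32_t src, dst;
    uint16_t acnt, bcnt, ccnt;
    int16_t  srcbidx, dstbidx;   /* byte step between arrays in a frame  */
    int16_t  srccidx, dstcidx;   /* byte step between frames (AB-sync)   */
} ParamModel;

/* Model ONE trigger of an AB-synchronized transfer: move BCNT arrays of
 * ACNT bytes, then advance SRC/DST by the C index and decrement CCNT.
 * Returns the number of triggers still required. */
static unsigned model_ab_trigger(ParamModel *p, const uint8_t *src_mem,
                                 uint8_t *dst_mem)
{
    assert(p->ccnt > 0);
    for (int32_t b = 0; b < p->bcnt; ++b) {
        for (int32_t a = 0; a < p->acnt; ++a) {
            dst_mem[(int32_t)p->dst + b * p->dstbidx + a] =
                src_mem[(int32_t)p->src + b * p->srcbidx + a];
        }
    }
    p->src += (uint32_t)p->srccidx;
    p->dst += (uint32_t)p->dstcidx;
    p->ccnt--;
    return p->ccnt;
}

/* Re-trigger until CCNT reaches zero, as the ISR does in the per-row
 * scheme, and report how many triggers were needed. */
static unsigned model_copy_image(ParamModel p, const uint8_t *src_mem,
                                 uint8_t *dst_mem)
{
    unsigned triggers = 0;
    while (p.ccnt > 0) {
        model_ab_trigger(&p, src_mem, dst_mem);
        ++triggers;
    }
    return triggers;
}
```

With ACNT=1, SRCBIDX=2, DSTBIDX=1 and CCNT=480, `model_copy_image` needs 480 triggers, which is exactly the one-interrupt-per-row cost of the ISR-driven scheme.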

This has always struck me as horribly inefficient (with 480 interrupts needed to make an image copy happen).

I thought I had an answer with QDMA, by setting up a QDMA transfer whose trigger word was the CCNT. Unfortunately, it does not appear that the EDMA system decrementing the CCNT itself will cause the QDMA to retrigger, as I end up with just one "row" done at a time. If I manually touch the CCNT field with the CPU (without actually changing the value, just writing the current one), then the next "row" goes, so I would have to go back to using ITCINTEN and retriggering for each row.

Is there any "standard" pattern for doing something like this that only generates a single interrupt for the operation, as I have the need for this kind of efficient ABC transfer OVER and OVER in my applications?

Thanks,

Jay

over 15 years ago

0 Mukul Bhatnagar over 15 years ago

TI__Guru* 85095 points

There can be different ways to accomplish this, depending on your data, event synchronization, application needs, etc.

One way to trigger the AB-synchronized transfer without manually re-triggering it in an ISR is to "chain to self". This is not usually documented in the user guide, but several users have used it without issues.

For this, set up the OPT field of your DMA channel to chain to itself, with intermediate transfer complete chaining enabled (ITCCHEN=1) and TCC set to the DMA channel's own number. After you manually trigger the first transfer, each subsequent transfer is started by the chained event generated for the same channel (the one specified by TCC).
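As a rough sketch of what that OPT value might look like, the helper below packs the relevant bits. The bit positions are the ones I recall from the EDMA3 OPT register layout and should be checked against the user guide for your device; the macro and function names are hypothetical, and TCINTEN is included on the assumption that a single completion interrupt is wanted at the end.

```c
#include <assert.h>
#include <stdint.h>

/* OPT bit positions as recalled from the EDMA3 register layout.
 * Assumption: verify against the user guide for your device. */
#define OPT_SYNCDIM    (1u << 2)    /* 1 = AB-synchronized                  */
#define OPT_TCC_SHIFT  12u          /* 6-bit transfer completion code       */
#define OPT_TCC_MASK   (0x3Fu << OPT_TCC_SHIFT)
#define OPT_TCINTEN    (1u << 20)   /* interrupt on final completion        */
#define OPT_ITCINTEN   (1u << 21)   /* interrupt on every intermediate TR   */
#define OPT_TCCHEN     (1u << 22)   /* chain on final completion            */
#define OPT_ITCCHEN    (1u << 23)   /* chain on every intermediate TR       */

/* Build an OPT word that makes `channel` chain to itself after each
 * intermediate transfer and raise one interrupt when the last one ends.
 * ITCINTEN is deliberately left clear so no per-row interrupt is raised. */
static uint32_t opt_self_chain(unsigned channel)
{
    assert(channel < 64u);
    return OPT_SYNCDIM
         | ((uint32_t)channel << OPT_TCC_SHIFT)
         | OPT_ITCCHEN
         | OPT_TCINTEN;
}
```

The returned word would be written to the OPT field of the channel's PaRAM set before the first manual trigger; how that write is done depends on the driver or register header in use.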

An ABC transfer can also be accomplished by chaining an additional AB-synchronized channel for the C dimension. Alternatively, if the next 480 rows are transferred with the same source and destination, you could create a link to self, as illustrated in the EDMA user guide:

http://focus.ti.com/lit/ug/sprugp9b/sprugp9b.pdf

Figure 9 (page 31)

To trigger automatically after the link update, you can additionally set the transfer complete chaining enable bit (TCCHEN=1).
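The reload behaviour can be pictured with a small software model. This is only a sketch under stated assumptions: it assumes a second "reload" PaRAM set that holds the original parameters and whose LINK points at itself, it uses array indexes where the hardware uses PaRAM address offsets, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_NULL 0xFFFFu   /* modelled as "no link" */

/* Hypothetical, reduced PaRAM set: only the fields that matter here. */
typedef struct {
    uint32_t src, dst;
    uint16_t ccnt;
    uint16_t link;   /* model: index of the set to reload from */
} ParamSet;

/* Model the link update that happens once the active set is exhausted:
 * the set named by LINK is copied over the active set. With a null link
 * there is nothing to reload, so the model leaves the set untouched. */
static void model_link_update(ParamSet *param_ram, unsigned active)
{
    uint16_t link = param_ram[active].link;
    assert(param_ram[active].ccnt == 0);
    if (link != LINK_NULL) {
        param_ram[active] = param_ram[link];
    }
}
```

Because the reload set links to itself, the active set comes back with the original SRC, DST, CCNT and the same LINK after every image, so the channel is ready for the next trigger without CPU intervention.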

Hope this helps.
Regards

Mukul

0 Jay Gowdy over 15 years ago in reply to Mukul Bhatnagar

Prodigy 20 points

Mukul,

Self-chaining with ITCCHEN=1 and TCINTEN=1 worked like a charm: all the rows were transferred and I only had one completion interrupt at the end just like I wanted.

BTW, it was interesting to me that on the ARM I saw a 15-25% speed-up of various ABC EDMA transfers vs. re-enabling every C transfer inside an interrupt (as measured by clock cycles) while on the DSP I saw a 3-4% speed-up of the same EDMA transfers. Also, I experimented with setting the TCCMODE to early, and did not see any change in the speed for these particular transfers.

Thanks very much,

Jay

0 Mukul Bhatnagar over 15 years ago in reply to Jay Gowdy

TI__Guru* 85095 points

Jay

Good to hear that you have it working more as per your expectations.

Jay Gowdy said:
BTW, it was interesting to me that on the ARM I saw a 15-25% speed-up of various ABC EDMA transfers vs. re-enabling every C transfer inside an interrupt (as measured by clock cycles) while on the DSP I saw a 3-4% speed-up of the same EDMA transfers.

Interesting data points. What OS are you using on the ARM and DSP sides? I wonder whether the interrupt latency differed between the cores to begin with, which would explain the difference in % savings in CPU loading.

Jay Gowdy said:
Also, I experimented with setting the TCCMODE to early, and did not see any change in the speed for these particular transfers.

The differences are not going to be significant, especially if your assessment is based on a standalone, single EDMA transfer scenario. They might be more noticeable in a loaded system, where other events in the queue slip in between these chained events/transfers, or where there are other contenders for the destination memory of these transfers, since either can delay the completion status that triggers the subsequent chained event. Setting TCCMODE to early completion essentially generates the chained trigger event as soon as the transfer request packet has been submitted from the CC to the TC, whereas with normal completion the chained event is triggered once the data has landed, or almost landed, at the intended end point (destination).

Regards

Mukul
