TMS320C6748: EDMA3 / compiler optimization

Part Number: TMS320C6748

I've run into an interesting situation. EDMA3 (QDMA in my case) does not work as expected when the code is optimized at level 3 with speed 5. I am using QDMA to read the same 16-bit port 16 times and transfer the readings to memory.

When code is compiled (with debug on) with optimization turned off, QDMA works as expected.

When code is compiled with optimization set as described above, QDMA does not work correctly the first time.  Specifically, instead of signalling completion when all 16 values are transferred, the QDMA signals completion after the first value read/transfer.

What might be happening here?

The compiler version is 8.3.3

  • Hi,

    Can you share which Processor SDK RTOS version you are using?

    Best Regards,
    Yordan

  • Hi,

    Are you able to provide a snapshot of the EDMA code, including how the ParamSet is configured? I want to decode the OPT field, and I also want to see the source and destination address alignment, ACNT, BCNT, etc. How do you trigger the QDMA?

    Regards, Eric 

  • //=============================================================================

    // PaRAM Block

    #define PaRAM_START 0x01c04000
    #define PaRAM_SET_0 0x4000
    #define PaRAM_SET_1 (PaRAM_SET_0 + 32)
    #define PaRAM_SET_2 (PaRAM_SET_1 + 32)
    #define RF_IN (SOC_GPIO_0_REGS + 0x70)
    #define NO_LINK 0XFFFF

    #define QCHMAP0 0x01c00200
    #define QER 0x01c01080
    #define QEER 0x01c01084
    #define QEECR 0x01c01088
    #define QEESR 0x01c0108c

    #define IER 0x01c01050
    #define IECR 0x01c01058
    #define IESR 0x01c01060
    #define IPR 0x01c01068
    #define ICR 0x01c01070
    #define QDMA_READY (HWREG(IPR) & 0x00000001)

    struct __attribute__((__packed__)) T_PaRAM {
    // keep declaration order below!
    uint32 opt;
    uint32 src;
    uint16 acnt;
    uint16 bcnt;
    uint32 dst;
    int16 srcbidx;
    int16 dstbidx;
    uint16 link;
    uint16 bcntrld;
    int16 srccidx;
    int16 dstcidx;
    uint16 ccnt;
    uint16 rsvd;
    };

    int16 DATA_READINGS[16];

    #pragma LOCATION(data_PaRAM, PaRAM_START)
    struct __attribute__((__packed__)) T_PaRAM data_PaRAM; //Param Set 0

    //=============================================================================

    bool QDMA_Done() { // DMA transfer complete?
    return (QDMA_READY != 0);
    }

    //=============================================================================

    void StartQDMA() {

    //clear interrupt status
    HWREG(ICR) = 0x00000001;
    //set interrupt enable
    HWREG(IESR) = 0x00000001;
    // trigger DMA
    data_PaRAM.opt = 0x0010000c; //TCINTEN=1, STATIC=1, AB synchronization
    }

    //=============================================================================

    void InitQDMA() { //set up DMA

    //clear interrupt status
    HWREG(ICR) = 0x00000001;
    //enable QDMA Ch 0
    HWREG(QEESR) = 0x00000001;
    // initialize PaRAM Set 0
    data_PaRAM.src = RF_IN;
    data_PaRAM.acnt = 2;
    data_PaRAM.bcnt = 16;
    data_PaRAM.srcbidx = 0;
    data_PaRAM.dstbidx = 2;
    data_PaRAM.bcntrld = 0;
    data_PaRAM.srccidx = 0;
    data_PaRAM.dstcidx = 0;
    data_PaRAM.ccnt = 1;
    // adjust destination address per EDMA3
    data_PaRAM.dst = (uint32) DATA_READINGS + 0x11000000;
    data_PaRAM.link = NO_LINK;
    // trigger word; must be the last assignment statement in the function
    //set interrupt enable
    HWREG(IESR) = 0x00000001;
    data_PaRAM.opt = 0x0010000c; //TCINTEN=1, STATIC=1, AB synchronization
    }
    //=============================================================================

    At initialization, InitQDMA() is called.

    In the main loop, the following is executed:

    if (QDMA_Done())  {

    // execute required stuff here

    ......

    // start next 16 sample acquisition via QDMA

    StartQDMA(); //trigger DMA action

    // do other stuff here
    rfout = AD4_GetRFOutAverage();
    ....
    }
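The struct in the post above maps a C structure directly onto a PaRAM set and relies on OPT being written last as the trigger word. As a minimal sketch, assuming a little-endian target and standard `<stdint.h>` types, the layout can be checked with `offsetof`, and the fields can be written through a `volatile` pointer so that the compiler must emit the stores in program order. The names `param_set_t`, `param_layout_ok` and `param_load_and_trigger` are hypothetical and do not appear in the thread, and the thread does not establish that store reordering caused the reported behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* One 32-byte PaRAM set; field order mirrors the struct in the post.
 * The type name is hypothetical. */
typedef struct {
    uint32_t opt;
    uint32_t src;
    uint16_t acnt;
    uint16_t bcnt;
    uint32_t dst;
    int16_t  srcbidx;
    int16_t  dstbidx;
    uint16_t link;
    uint16_t bcntrld;
    int16_t  srccidx;
    int16_t  dstcidx;
    uint16_t ccnt;
    uint16_t rsvd;
} param_set_t;

/* Returns 1 when the compiler laid the struct out without padding. */
int param_layout_ok(void)
{
    return sizeof(param_set_t) == 32
        && offsetof(param_set_t, opt) == 0
        && offsetof(param_set_t, src) == 4
        && offsetof(param_set_t, acnt) == 8
        && offsetof(param_set_t, bcnt) == 10
        && offsetof(param_set_t, dst) == 12
        && offsetof(param_set_t, srcbidx) == 16
        && offsetof(param_set_t, link) == 20
        && offsetof(param_set_t, srccidx) == 24
        && offsetof(param_set_t, ccnt) == 28;
}

/* Fills the set using the values from InitQDMA() in the post.
 * Every store goes through a volatile pointer, so the stores are
 * issued in the order written and OPT (the trigger word) is last. */
void param_load_and_trigger(volatile param_set_t *p,
                            uint32_t src, uint32_t dst, uint32_t opt)
{
    p->src     = src;
    p->acnt    = 2;        /* 2 bytes per element */
    p->bcnt    = 16;       /* 16 elements per trigger */
    p->srcbidx = 0;        /* re-read the same port */
    p->dstbidx = 2;        /* advance one int16 per element */
    p->bcntrld = 0;
    p->srccidx = 0;
    p->dstcidx = 0;
    p->ccnt    = 1;
    p->dst     = dst;
    p->link    = 0xFFFF;   /* NO_LINK in the post */
    p->opt     = opt;      /* trigger word, written last */
}
```

On the target, the pointer would presumably be formed from the PaRAM address used in the post, for example `(volatile param_set_t *)0x01c04000`; on a host build the function can be exercised against an ordinary local struct.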

  • Here is what the buffer looks like when the problem occurs:

    0x00810198 DATA_READINGS
    0x00810198 10382 0 0 0 0 0 0
    0x008101A6 0 0 0 0 0 0 0
    0x008101B4 0 0

  • Hi,

    The QDMA setup looks right: A-B sync, with source and destination addresses in incrementing mode, a source index of 0 (essentially no increment), and a destination index of 2 bytes. Each trigger moves 16 x 2 bytes from a FIFO. Bit 20 enables the transfer completion interrupt, and the STATIC bit is set for QDMA.


    How fast does new data arrive in the GPIO FIFO? Is it possible that the FIFO data has not yet arrived for reads 2 through 16, so that you get zeros with optimization on? If you measure the time spent from StartQDMA() to QDMA_Done(), do you see that execution is much faster?

    What if you use A-sync only and enable the intermediate transfer completion interrupt (bit 21)? Do you see 16 interrupts in both the optimized and non-optimized cases?

    Regards, Eric 
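The OPT decoding described in this reply can be reproduced with a few shifts and masks. The sketch below assumes the usual EDMA3 OPT bit positions (SAM bit 0, DAM bit 1, SYNCDIM bit 2, STATIC bit 3, TCC bits 17:12, TCINTEN bit 20, ITCINTEN bit 21); the type `opt_fields_t` and the function `decode_opt` are hypothetical helpers, not part of any TI library.

```c
#include <stdint.h>

/* Selected fields of the EDMA3 OPT word (hypothetical helper type). */
typedef struct {
    unsigned sam;        /* bit 0: 0 = incrementing source address */
    unsigned dam;        /* bit 1: 0 = incrementing destination address */
    unsigned syncdim;    /* bit 2: 0 = A-sync, 1 = AB-sync */
    unsigned is_static;  /* bit 3: 1 = PaRAM set is not updated or linked */
    unsigned tcc;        /* bits 17:12: completion code, selects the IPR bit */
    unsigned tcinten;    /* bit 20: final transfer completion interrupt */
    unsigned itcinten;   /* bit 21: intermediate completion interrupt */
} opt_fields_t;

/* Splits an OPT word into the fields listed above. */
opt_fields_t decode_opt(uint32_t opt)
{
    opt_fields_t f;
    f.sam       = (opt >> 0)  & 0x1u;
    f.dam       = (opt >> 1)  & 0x1u;
    f.syncdim   = (opt >> 2)  & 0x1u;
    f.is_static = (opt >> 3)  & 0x1u;
    f.tcc       = (opt >> 12) & 0x3Fu;
    f.tcinten   = (opt >> 20) & 0x1u;
    f.itcinten  = (opt >> 21) & 0x1u;
    return f;
}
```

Applied to the value 0x0010000c from the posted code, this gives SYNCDIM = 1, STATIC = 1, TCINTEN = 1 and TCC = 0, which is consistent with QDMA_READY testing bit 0 of IPR.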

  • 1. The speed of execution should not matter, because QDMA_Done() checks the interrupt pending flag for DMA completion. It seems that, somehow, the interrupt pending flag is erroneously getting set before InitQDMA() is called, and QDMA_Done() is subsequently also called before InitQDMA() is called.

    2. I rearranged the code by inserting a small delay after the init and before starting the QDMA. This seems to have fixed the bug.