C6424 McBSP0 transmit not aligned with externally-driven FSX

My McBSP0 FSX clock is being driven by an external source, which I've verified on an o'scope to be correct.  I'm transmitting a hard-coded pattern via EDMA and I see it being transmitted on DX at random offsets from the FSX pulse. What can cause this?

McBSP0 registers are as follows:

SPCR: 0x02370031

RCR: 0x00011F40

XCR: 0x00011F40

SRGR: 0x01FF0000

MCR: 0x00040004

RCERAB: 0xFFFFFFFF

XCERAB: 0xFFFFFFFF

PCR: 0x00000000

RCERCD-GH: 0x00000000

XCERCD-GH: 0x00000000

Thanks,

Stu

  • Stu,

    It is tough to decipher the bit patterns of the register settings. My C code always uses symbols or at least bit fields with comments to say what the field settings are. If you can post that, it may be easier to get you some useful answers. Instead, you get a few less helpful but probing questions:

    What are the frequencies of FSX and CLKX?

    How big are your word size and frame size?

    Are you sending your data from the EDMA in 32-bit words, or smaller? Some devices require 32-bit reads and writes even for bytes and half-words, but I do not find that statement in the C6424 docs right now.

    Please describe the randomness. Does all the data come out, just randomly presented without regard for FSX? Or is the data shifted by random numbers of bits and some words are lost?

    Regards,
    RandyP

  • >It is tough to decipher the bit patterns of the register settings. My C code always uses symbols or at least bit fields with comments to say what the field settings are. If you can post that, it may be easier to get you some useful answers. Instead, you get a few less helpful but probing questions:

    Ok, here are the bitfields in the McBSP registers:

    SPCR 0x02370031 Memory Mapped Register: Serial Port Control Register

    _RSV 0x0 These bits are not used
    FREE 0x1 - YES Controls Free running during emulation mode
    SOFT 0x0 - NO Controls the emulation while the FREE bit is set
    FRST 0x0 - RESET Controls the Frame Sync Generator Reset
    GRST 0x0 - RESET Controls the Sample Rate Generator Reset
    XINTM 0x3 Transmit Interrupt Mode
    XSYNCERR 0x0 - NO Transmit Synchronization Error
    XEMPTY 0x1 - NO Transmit Shift Register (XSR) Empty
    XRDY 0x1 - YES Transmitter Ready
    XRST 0x1 - ENABLE Transmitter Resets and enables the Transmitter
    DLB 0x0 - OFF Sets Digital Loop Back Mode
    RJUST 0x0 - RZF Enables Receive Sign-Extension and Justification Mode
    CLKSTP 0x0 - DISABLE_00 Enables Clock Stop Mode for SPI mode.
    _RSVD *** These bits are not used
    DXENA 0x0 - OFF Enables extra delay for turn-on time.
    Reserved *
    RINTM 0x3 Receive Interrupt Mode
    RSYNCERR 0x0 - NO Receive Synchronization Error
    RFULL 0x0 - NO Receive Shift Register (RSR) Full
    RRDY 0x0 - NO Receiver Ready
    RRST 0x1 - ENABLE Receiver Resets and enables the Receiver

    RCR 0x00011F40 Memory Mapped Register: Receive Control Register

    RPHASE 0x0 - YES Receive Phase
    RFRLEN2 0x0 Receive Frame Length 2
    RWDLEN2 0x0 - 8BIT Receive Word Length 2
    RCOMPAND 0x0 - MSB Receive Companding Mode.
    RFIG 0x0 - YES Receive Frame Ignore
    RDATDLY 0x1 - 1BIT Receive data delay
    _RSVD * These bits are not used(always cleared)
    RFRLEN1 0x1F Receive Frame Length 1
    RWDLEN1 0x2 Receive Word Length 1
    RWDREVRS 0x0 - DISABLED Receive 32-bit reversal feature.
    _RSVD **** These bits are not used(always cleared)

    XCR 0x00011F40 Memory Mapped Register: Transmit Control Register

    XPHASE 0x0 - YES Transmit Phase
    XFRLEN2 0x0 Transmit Frame Length 2
    XWDLEN2 0x0 - 8BIT Transmit Word Length 2
    XCOMPAND 0x0 - MSB Transmit Companding Mode.
    XFIG 0x0 - YES Transmit Frame Ignore
    XDATDLY 0x1 - 1BIT Transmit data delay
    _RSVD * These bits are not used(always cleared)
    XFRLEN1 0x1F Transmit Frame Length 1
    XWDLEN1 0x2 Transmit Word Length 1
    XWDREVRS 0x0 - DISABLED Transmit 32-bit reversal feature.
    _RSVD **** These bits are not used(always cleared)

    SRGR 0x01FF0000 Memory Mapped Register: Sample Rate Generator Register

    GSYNC 0x0 - FREE Sample rate generator clock synchronization. Only used when the external clock (CLKS) drives the sample rate generator clock (CLKSM=0).
    CLKSP 0x0 - RISING CLKS Polarity Clock Edge Select. Only used when the external clock CLKS drives the sample rate generator clock (CLKSM=0).
    CLKSM 0x0 - CLKS McBSP Sample Rate Generator Clock Mode
    FSGM 0x0 - DXR2XSR Sample Rate Generator Transmit frame synchronization mode. Used when FSXM=1 in PCR.
    FPER 0x1FF Frame Period. This determines when the next frame sync signal should become active. Range: up to 2^12; 1 to 4096 CLKG periods.
    FWID 0x00 Frame Width. Determines the width of the frame sync pulse, FSG, during its active period. Range: up to 2^8; 1 to 256 CLKG periods.
    CLKGDV 0x00 Sample rate generator clock divider. This value is used as the divide-down number to generate the required sample rate generator clock frequency. Default value is 1.

    MCR 0x00000000 Memory Mapped Register: Multi-Channel Register

    _RSVD ***
    DX 0x0 - HIZ
    Reserved *
    XMCME 0x0 - _AtoB
    XPBBLK 0x0 - SF1
    XPABLK 0x0 - SF0
    XCBLK 0x0 - SF0
    XMCM 0x0 - ENNOMASK
    _RSVD *****
    Reserved *
    RMCME 0x0 - _AtoB
    RPBBLK 0x0 - SF1
    RPABLK 0x0 - SF0
    RCBLK 0x0 - SF0
    _RSVD *
    RMCM 0x0 - CHENABLE

    RCERAB 0xFFFFFFFF Memory Mapped Register: Receive Channel Enable Register Partition A/B (RCER(A/B)) registers are used to enable any of the 32 channels for receive

    XCERAB 0xFFFFFFFF Memory Mapped Register: Transmit Channel Enable Register Partition A/B (XCER(A/B)) registers are used to enable any of the 32 channels for transmit

    PCR 0x00000003 Memory Mapped Register: Pin Control Register

    RSV 0x0000 These bits are not used(always cleared)
    IDLE_EN 0x0 - RESET Idle enable bit for the McBSP
    XIOEN 0x0 - SP Transmit General Purpose I/O Mode ONLY when XRST=0 in SPCR
    RIOEN 0x0 - SP Receive General Purpose I/O Mode ONLY when RRST=0 in SPCR
    FSXM 0x0 - EXTERNAL Transmit Frame Synchronization Mode
    FSRM 0x0 - EXTERNAL Receive Frame Synchronization Mode
    CLKXM 0x0 - INPUT Transmitter Clock Mode.
    CLKRM 0x0 - INPUT Receiver Clock Mode
    SCLKME 0x0 - NO Enhanced sample clock mode selection bit.
    CLKS_STAT 0x0 - _0 CLKS pin status. Reflects value on CLKS pin when selected as a general purpose input.
    DX_STAT 0x0 - _0 DX pin status. Reflects value driven on to DX pin when selected as a general purpose output.
    DR_STAT 0x0 - _0 DR pin status. Reflects value on DR pin when selected as a general purpose input.
    FSXP 0x0 - ACTIVEHIGH Transmit Frame Synchronization Polarity
    FSRP 0x0 - ACTIVEHIGH Receive Frame Synchronization Polarity
    CLKXP 0x1 - FALLING Transmit Clock Polarity
    CLKRP 0x1 - RISING Receive Clock Polarity

    >What are the frequencies of FSX and CLKX?

    FSX is 48 kHz, CLKX is 23,810 kHz.

    >How big are your word size and frame size?

    16 bit words, 32 word frames.
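
    These figures can be cross-checked against the XCR value listed above. The following is only a minimal sketch, assuming the usual C6000 McBSP XCR layout (XDATDLY in bits 17:16, XFRLEN1 in bits 14:8, XWDLEN1 in bits 7:5); the type and function names are hypothetical and do not come from any TI header.

```c
#include <stdint.h>

/* Hypothetical container for the phase-1 transmit fields of XCR. */
typedef struct {
    unsigned datdly;   /* XDATDLY: data delay in bit clocks      */
    unsigned frlen1;   /* XFRLEN1: words per frame, minus one    */
    unsigned wdlen1;   /* XWDLEN1: word-length code (see below)  */
} xcr_fields_t;

/* Assumed layout: XDATDLY[17:16], XFRLEN1[14:8], XWDLEN1[7:5]. */
xcr_fields_t xcr_decode(uint32_t xcr)
{
    xcr_fields_t f;
    f.datdly = (xcr >> 16) & 0x3u;
    f.frlen1 = (xcr >> 8)  & 0x7Fu;
    f.wdlen1 = (xcr >> 5)  & 0x7u;
    return f;
}

/* Word-length code to bits: 0=8, 1=12, 2=16, 3=20, 4=24, 5=32.
   Returns 0 for the reserved codes 6 and 7. */
unsigned xcr_word_bits(unsigned wdlen_code)
{
    static const unsigned bits[] = { 8u, 12u, 16u, 20u, 24u, 32u };
    return (wdlen_code < 6u) ? bits[wdlen_code] : 0u;
}
```

    Decoding 0x00011F40 this way gives a 1-bit data delay, XFRLEN1 = 0x1F (32 words per frame) and word-length code 2 (16-bit words).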

    >Please describe the randomness. Does all the data come out, just randomly presented without regard for FSX? Or is the data shifted by random numbers of bits and some words are lost?

    I've found that the apparent randomness is caused by the fact that sometimes a 16-bit word of data is being duplicated (sent out DX twice in a row), which pushes the end of the data stream out past the next FS pulse.  This error accumulates over time to cause a seemingly random offset from FS for the next start-of-frame data.

    The duplicated word is also apparently random: sometimes it's the 5th word, sometimes it's the 13th, etc.  The duplication doesn't happen every frame ... sometimes it's in the 16th frame, sometimes the 29th frame, etc.

    Thanks,

    Stu

  • Stu,

    Am I multiplying something wrong? It looks like if you want 32 words x 16 bits @ 48 kHz, CLKX needs to be at least 24,576 kHz.
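
    The multiplication can be written out as a small sketch. The function name is made up for illustration, and the calculation ignores any idle bits between frames.

```c
#include <stdint.h>

/* Minimum bit-clock rate (Hz) needed to shift one whole frame out
   between consecutive frame-sync pulses. */
uint32_t min_bit_clock_hz(uint32_t words_per_frame,
                          uint32_t bits_per_word,
                          uint32_t frame_rate_hz)
{
    return words_per_frame * bits_per_word * frame_rate_hz;
}
```

    With 32 words, 16 bits and 48000 frames per second this gives 24,576,000 Hz, which is above the 23,810 kHz that was reported. With 31 words it gives 23,808,000 Hz, which just fits under that figure.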

    Could you try a quick test where you change the FrameLen to 31 words and change the EDMA to only send 31 words for each frame?

    If I can help with the EDMA part of that, you are welcome to send the EDMA setup code. Those registers I can disassemble in my head, most of the time.

    Regards,
    RandyP

  • Yes, you're right.  I think I had an aliasing error on the logic analyzer when measuring the CLKX pulse width.  If I measure across multiple clocks and average it, I get 24,576 kHz for CLKX.

    I tried the 31-word test, but I'm not sure what you expected to see.  What I'm seeing is some garbage words after my 31st word, and the alignment relative to FSX is still shifting.  Below is the setup code: I changed BCNT, BCNTRLD, and SRCCIDX (to skip to the next start-of-frame data). Old values are in the comments.

    I noticed in SPRUEN2B that the McBSP will re-transmit the last word in the case of DXR underrun.  Could that be the cause of the duplicated words I'm seeing?  If so, what could prevent EDMA from keeping up with a relatively slow 48 kHz rate?

    Thanks,

    Stu

    void ParamSetup_mcbspTxTransfer(void)
    {

    PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRC =
    (uint32) & (pEdmaRxTxBuffs->edmaTxPingBuff[0]);

    PaRAMentry[PaRAMentryIdx_McBSP0Tx].DST = MCBSP_DXR_REG;

    PaRAMentry[PaRAMentryIdx_McBSP0Tx].ACNT = 2; //bytes
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].BCNT = 31; //32
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].CCNT = 1; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRCBIDX = 2; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].DSTBIDX = 0;

    PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRCCIDX = 4; //0;
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].DSTCIDX = 0;

    PaRAMentry[PaRAMentryIdx_McBSP0Tx].BCNTRLD = 31; //32
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].LINK = PaRAM_offset_McBSP0TxLinkPong & 0xFFFF; 

    //EDMA_OPT
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.SAM = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.DAM = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.SYNCDIM = 0;
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.STATIC = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.FWID = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCCMODE = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCC = Tcc_McBSP0Tx_Ping; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCINTEN = 1; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.ITCINTEN = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCCHEN = 0; 
    PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.ITCCHEN = 0; 

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, PaRAMentryIdx_McBSP0Tx); 

    // PONG setup is identical

    }

  • Stu,

    The 31-word test was primarily based on the wrong CLKX speed, but it also eliminates frame overrun. I assume you also made a change in XFRLEN1 for 31 words.

    When these shifts occur, do you ever see error bits set in the McBSP error registers? If there is a frame underflow, XSYNCERR in SPCR will be set and it will stay set until cleared. The XEMPTY bit gets cleared to 0 on a single-word underflow condition, but it gets set again when the next word is written to the DXR.
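
    A check of those two status bits might look like the sketch below. It assumes XSYNCERR is SPCR bit 19 and XEMPTY is SPCR bit 18 (active low), which agrees with the 0x02370031 dump earlier in the thread; the function names are hypothetical. Because XEMPTY recovers as soon as DXR is written again, a single read can easily miss a one-word underflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SPCR transmit-status bit positions. */
#define SPCR_XSYNCERR_SHIFT 19u
#define SPCR_XEMPTY_SHIFT   18u  /* active low: 0 means XSR ran empty */

/* True if the sticky transmit sync-error flag is set. */
bool spcr_tx_sync_error(uint32_t spcr)
{
    return ((spcr >> SPCR_XSYNCERR_SHIFT) & 1u) != 0u;
}

/* True if XSR is currently empty (underflow). This is only meaningful
   while the transmitter is out of reset (XRST = 1). */
bool spcr_tx_underflow(uint32_t spcr)
{
    return ((spcr >> SPCR_XEMPTY_SHIFT) & 1u) == 0u;
}
```

    Applied to the SPCR value posted earlier (0x02370031), both checks come back false.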

    Underflow would definitely explain having a word repeated. And underflow is a common concern in systems. Unfortunately, the C6424 does not have a deeper FIFO to allow you to write a bunch of words less often. Instead, you have to get a 16-bit half-word written out every 16/24,576 ms, or 651 ns, which is pretty fast.

    One solution, if this is the problem, is to evaluate everything that is happening through your EDMA and arrange for only short, real-time things like the McBSP to be assigned to a selected Transfer Controller. Many long transfers can put in a delay that could jeopardize the McBSP servicing requirement.

    Another solution, if this is the problem, is to pack your 16-bit samples into 32-bit words and transmit 16 32-bit words so that the bit stream looks exactly the same. This gives you twice as much time to service the McBSP event. This can be accomplished a couple of ways depending on your data structure and the data buffer-filling algorithm.

    I do not know of a third solution.

    To see if underflow is the problem, change the word size to 32 bits, and change the number of words per frame to 16. You could leave the EDMA the same and just know that your data is spread out wider than it should be. Or you could change the EDMA setup so it just sends every other 16-bit sample. If the shifting ends, then the two solutions above may be the right things to do.
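
    For this test, the new XCR value could be composed from named fields rather than hand-built hex. This is a sketch only, assuming the XCR layout XDATDLY[17:16], XFRLEN1[14:8], XWDLEN1[7:5] and word-length code 5 for 32-bit words; the helper name is hypothetical.

```c
#include <stdint.h>

/* Build a single-phase XCR value from readable parameters.
   words_per_frame is 1..128; wdlen_code is 0=8, 1=12, 2=16,
   3=20, 4=24, 5=32 bits. All other XCR fields are left at 0. */
uint32_t xcr_single_phase(unsigned datdly,
                          unsigned words_per_frame,
                          unsigned wdlen_code)
{
    return ((uint32_t)(datdly & 0x3u) << 16) |
           ((uint32_t)((words_per_frame - 1u) & 0x7Fu) << 8) |
           ((uint32_t)(wdlen_code & 0x7u) << 5);
}
```

    The original setup (1-bit delay, 32 words, 16-bit) reproduces 0x00011F40, and the test setup (1-bit delay, 16 words, 32-bit) comes out as 0x00010FA0.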

    Regards,
    RandyP

  • Hi Randy,

    I was not seeing any XSYNCERRs when this problem occurred.  However, changing the xfer to 32-bits did eliminate the shifting problem.  Only problem now is that the data is in the wrong order word-wise (we're running little-endian).  For example, 16-bit audio samples stored in memory as ABCD are being transmitted over McBSP as BADC.

    I saw an earlier post by you on EDMA byte-order swapping that suggested some multiple-stage chained xfers to implement byte reordering.  Is this still your recommended approach?  I'm concerned that if EDMA can't keep up with a 48 kHz data rate while doing single 16-bit xfers, will it be able to keep up while doing 3 stages of xfers?

    Right now, there's nothing else running in this system ... just this single EDMA-based McBSP driver.

    I'm really surprised that EDMA can't keep up with 48 kHz at 16 bits/xfer.  As you mentioned, it has 651 ns to complete the xfer.  My understanding is the EDMA module runs at 1/3 DSP clk, which in our case is 700/3=233.333 MHz (or 4.286 ns/clk).  This gives 151 clock cycles to move 16 bits.  That's not enough?
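
    The budget above can be reproduced with a short sketch. The function names are illustrative; the 24,576 kHz bit clock and the 700/3 MHz EDMA clock are the figures quoted in this thread.

```c
/* Time in nanoseconds between consecutive word transfers. */
double word_period_ns(unsigned bits_per_word, double bit_clock_hz)
{
    return (double)bits_per_word * 1e9 / bit_clock_hz;
}

/* Number of whole cycles of clock_hz that fit into period_ns. */
unsigned long cycles_in(double period_ns, double clock_hz)
{
    return (unsigned long)(period_ns * clock_hz / 1e9);
}
```

    For 16-bit words the period is about 651 ns, and 151 whole cycles of a 233.333 MHz clock fit into it.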

    Thanks,

    Stu

  • Stu,

    The good news is that it is very likely you have now identified what the problem is. There is no bad news, just the opportunity to find the best solution. Three things before we start talking about that:

    a. Byte swapping:

    To clarify, your 32-bit word at address 0x0000 with value shown as ABCD has those bytes at byte addresses 3210, respectively, right?

    Since your PARAM for 16-bit data was setup for packed data (ch 0, ch 1, ch 2, ...), the first sample going out was CD from byte addresses 10. Did this go out as DC on the serial line? Then the second sample was AB from byte addr 32.

    I could have understood if the 32-bit version went out as CDAB where the half-words were swapped, but swapping the bytes inside a 16-bit value does not make sense. I do not know how the McBSP would be capable of this. Can you verify this as the binary bits in the order they go out?

    b. byte-order swapping:

    Whether it is byte-order swapping or half-word swapping that needs to be done, the EDMA3 can do this. The most efficient way to do it is in one big chunk copying the data for an entire buffer instead of doing it for each 32-bit word one-at-a-time. The more efficient method would be to redefine where the algorithms store the data for the various output channels so that whichever channel was written to byte offset 32*2*n+0 gets written to 32*2*n+2 and whichever channel was written to byte offset 32*2*n+2 gets written to 32*2*n+0, and so on. But a DMA channel can be setup to do that swapping separately if that is the best choice.

    Since you are most likely using ping-pong buffers, this swapping channel would be run once the next buffer is full and ready to be sent out. It would be the last step before the data is actually ready to be sent out. The depth of the buffer may affect how we write this.

    c. 151 DSP/3 clock cycles:

    This is easily plenty of time to service the McBSP. If there was nothing else going on to interfere with it, then every transfer would work every time. But if there is a 1KB transfer in front of it, that might use up a bunch of those clocks. And if memory serves, the McBSP requires the new data to get there 3 CLKX cycles before the next event (DXR->XSR transfer), which cuts the number of available clocks even more.

    Unfortunately, this is a common problem for servicing real-time peripherals. It is solvable, and that is where we need to look now.

    Improving the servicing of a real-time peripheral:

    Since this was a random event, it was a problem just waiting to fail. Then when something else got in the way, either ahead of the McBSP Tx in the Transfer Controller, or conflicting with its access of the data from the memory buffer, or conflicting with its access to the McBSP output port (least likely if you are using the data port address), suddenly the data gets missed.

    By moving to 32-bit data, we at least double the amount of time that the EDMA3 has to respond to the McBSP Tx event. If that is good enough, then we are set. But there is still the possibility that you have other, longer delays that could fight with the EDMA3 or memory endpoint or McBSP and still cause a stall long enough to hit this problem again.
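
    To put rough numbers on that, here is a sketch of the usable service window per word. It takes the three-CLKX setup margin recalled from memory under item c as an assumption rather than a datasheet value, and the function name is hypothetical.

```c
/* Usable time (ns) to deliver the next word when it must arrive
   setup_clks bit clocks before the next DXR-to-XSR copy.
   Returns 0 if the margin consumes the whole word time. */
double service_window_ns(unsigned bits_per_word,
                         unsigned setup_clks,
                         double bit_clock_hz)
{
    if (setup_clks >= bits_per_word) {
        return 0.0;
    }
    return (double)(bits_per_word - setup_clks) * 1e9 / bit_clock_hz;
}
```

    At 24,576 kHz with a 3-clock margin, the window is roughly 529 ns for 16-bit words and roughly 1180 ns for 32-bit words, so the 32-bit case is somewhat better than double.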

    Hopefully this simple solution will work. If we need to add more margin, the things to consider will be:

    1. Put the data buffers in L1D or L2 SRAM instead of in DDR2 memory. This is the second-best solution after moving to 32-bit data, mainly because it can be easy if the internal memory is available. This solves the problem if the stall is due to memory endpoint conflicts, usually DDR2. Sections 2.8.1-2.8.3 in the DDR2 User's Guide talk about the possible read/write command conflicts that can occur.

    2. Keep one of the three TCs set aside for short real-time access like these for the McBSP data servicing. Depending on how much short real-time activity you have and how much DMA traffic you have, this could be an easy solution or it could be a difficult one. It will solve the problem if TC conflicts are causing the problem.

    Regards,
    RandyP

  • Hey Randy,

    Re a.: I apologise for not being precise in my description.  In my example of ABCD, each letter represents one 16-bit sample (A is the first sample, B is the second, etc.).  

    Stu

  • Stu,

    This sounds more like what I was assuming, except that data ABCD are 16-bits instead of 8-bits. So A is at byte offsets 10, B is at 32, C is at 54, and D is at 76, right?

    Since the McBSP is configured to send ms-bit first, bit 31 of a 32-bit value will go first, and that is the msb of B in byte position 3. So the solution for this is the solution b in my post above. And the best choice performance-wise is for the writing algorithm to put the half-words into the alternate half-word offset in the ping-pong output buffer. Or, if you use EDMA to build the output buffer from 32 different buffers, that step could be modified slightly to get the output buffer in the right arrangement.
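
    In software, the rearrangement amounts to exchanging each adjacent pair of 16-bit samples before the buffer is handed to the 32-bit transfer. The sketch below is one way to do it, not necessarily how the buffer-filling code in this system is organized; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap adjacent 16-bit samples in place: [A,B,C,D] -> [B,A,D,C].
   A trailing odd sample, if any, is left untouched. */
void swap_halfword_pairs(uint16_t *samples, size_t count)
{
    size_t i;
    for (i = 0; i + 1 < count; i += 2) {
        uint16_t tmp = samples[i];
        samples[i] = samples[i + 1];
        samples[i + 1] = tmp;
    }
}

/* The 32-bit value a little-endian read sees for two adjacent
   16-bit samples: the second sample lands in the upper half. */
uint32_t le_word_from_pair(uint16_t first, uint16_t second)
{
    return ((uint32_t)second << 16) | first;
}
```

    After the swap, A sits in the upper half of the first 32-bit word, so with ms-bit-first transmission A goes out before B.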

    Regards,
    RandyP

  • Hi Randy,

    Getting back to the issue of word-order swapping using EDMA:

    I have the system working as desired using software word-swapping.  For performance reasons, I'd like EDMA to do this for me.  I've implemented a 3-stage chaining algorithm as follows:

    1.  McBSP TX event triggers the first PaRAMset (for channel 2) which moves the high word (offset 0 in my output ping buffer) into the low word of a temp buffer (at byte offset 2), for each 32-bit word in the buffer.  This PaRAMset is chained to the second stage PaRAMset.

    2. The second stage PaRAMset (which uses channel 4) moves the low word of the output ping  buffer to the high word of the temp buffer, for each word in the buffer.  This PaRAMset is chained to the third stage PaRAMset.

    3.  The third stage PaRAMset (which uses channel 5) moves the entire temp buffer out to the McBSP DXR.

    All three PaRAMsets are also linked to equivalent pong buffer PaRAMsets.  The code that configures these PaRAMsets is at the end of this post.
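
    The data movement of stages 1 and 2 can be modeled on a host PC to check the index arithmetic. This is a software sketch only: it mimics one ACNT x BCNT copy with byte strides and does not model events, chaining, or linking. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Model of one 2-D EDMA3 copy: bcnt arrays of acnt bytes each,
   with byte strides srcbidx/dstbidx between consecutive arrays. */
void model_ab_copy(uint8_t *dst, const uint8_t *src,
                   unsigned acnt, unsigned bcnt,
                   int srcbidx, int dstbidx)
{
    unsigned b;
    for (b = 0; b < bcnt; b++) {
        memcpy(dst + (long)b * dstbidx, src + (long)b * srcbidx, acnt);
    }
}

/* Stages 1 and 2: fill tmp with every pair of 16-bit samples from tx
   exchanged. n_samples must be even. */
void model_swap_stages(uint16_t *tmp, const uint16_t *tx,
                       unsigned n_samples)
{
    /* Stage 1: tx byte offset 0 -> tmp byte offset 2, stride 4. */
    model_ab_copy((uint8_t *)tmp + 2, (const uint8_t *)tx,
                  2u, n_samples / 2u, 4, 4);
    /* Stage 2: tx byte offset 2 -> tmp byte offset 0, stride 4. */
    model_ab_copy((uint8_t *)tmp, (const uint8_t *)tx + 2,
                  2u, n_samples / 2u, 4, 4);
}
```

    Feeding it {1, 2, 3, 4} leaves {2, 1, 4, 3} in the temp buffer, which is what stage 3 would then send to DXR as 32-bit words.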

    When I enable this setup, I find that the RX ping and pong buffers are unexpectedly incorrect.  I see that the RX data is misaligned within the ping and pong buffers, as if the RX event is not firing in sync with the FSR or we aren't servicing McBSP0 DRR quickly enough or something.   I tried reading DRR 32 bits at a time, but that didn't help this time.  If I simply disable the McBSP0 TX event in the EER, the RX data looks fine.

    In your experience, how can enabling chaining on channels 2, 4, & 5 affect the behavior of channel 3?

    Thanks,

    Stu

    void ParamSetup_mcbspRxTransfer(void)
    {
    PaRAM_SET PaRAMset;

    PaRAMset.SRC = MCBSP_DRR_REG;
    PaRAMset.DST = (uint32) & (pEdmaRxTxBuffs->edmaRxPingBuff[0][0]);
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES);
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 0;
    PaRAMset.DSTBIDX = 2;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = 0;
    PaRAMset.LINK = PaRAM_offset_McBSP0RxLinkPong & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Rx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0; //Intermediate Transfer Completion Interrupt Disable
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0; //Intermediate Transfer Completion Chaining Disable

    // Program the Rx channel PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0Rx, &PaRAMset);

    /////// PING
    // Same thing in the Rx Ping PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0RxLinkPing, &PaRAMset);

    //////// PONG
    PaRAMset.SRC = MCBSP_DRR_REG;
    PaRAMset.DST = (uint32) & (pEdmaRxTxBuffs->edmaRxPongBuff[0][0]);
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES);
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 0;
    PaRAMset.DSTBIDX = 2;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = 0;
    PaRAMset.LINK = PaRAM_offset_McBSP0RxLinkPing & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0;
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Rx_Pong;
    PaRAMset.OPT.TCINTEN = 1;
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0;
    PaRAMset.OPT.ITCCHEN = 0;

    // Program the Rx Pong PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0RxLinkPong, &PaRAMset); //PaRAMentryIdx_McBSP0RxLinkPong = 1
    }

    void
    ParamSetup_mcbspTxTransfer(void)
    {
    PaRAM_SET PaRAMset;

    #ifdef EDMA_WORD_SWAPPING

    // ******************** Ping HiWord PaRAM_SET **********************************************************
    /////// word-swap high word into tmp buffer

    PaRAMset.SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPingBuff[0]);

    PaRAMset.DST = (uint32) &edmaTmpBuffer[1];

    PaRAMset.ACNT = 2; //bytes
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongHiWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxLoWord;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0;

    // Program the Tx channel PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, &PaRAMset);
    // Same thing in Tx Ping Hi-word PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingHiWord, &PaRAMset);

    // ******************** Ping LoWord PaRAM_SET **********************************************************
    /////// word-swap low word into tmp buffer
    PaRAMset.SRC = (uint32) (& (pEdmaRxTxBuffs->edmaTxPingBuff[0]))+2;
    PaRAMset.DST = (uint32) &edmaTmpBuffer[0];
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongLoWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxXmit;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingLoWord, &PaRAMset);
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLoWord, &PaRAMset);

    // ******************** Ping XMIT PaRAM_SET **********************************************************
    PaRAMset.SRC = (uint32) edmaTmpBuffer;
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongXmit & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingXmit, &PaRAMset);
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxXmit, &PaRAMset);

    // ******************** Pong HiWord PaRAM_SET **********************************************************
    /////// word-swap high word into tmp buffer
    PaRAMset.SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPongBuff[0]);
    PaRAMset.DST = (uint32) &edmaTmpBuffer[1];
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingHiWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxLoWord;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongHiWord, &PaRAMset);

    // ******************** Pong LoWord PaRAM_SET **********************************************************
    /////// word-swap low word into tmp buffer
    PaRAMset.SRC = (uint32) (& (pEdmaRxTxBuffs->edmaTxPongBuff[0]))+2;
    PaRAMset.DST = (uint32) edmaTmpBuffer;
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingLoWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxXmit;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0; //Intermediate Transfer Completion Interrupt Disable
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0; //Intermediate Transfer Completion Chaining Disable

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongLoWord, &PaRAMset);

    // ******************** Pong XMIT PaRAM_SET **********************************************************
    PaRAMset.SRC = (uint32) edmaTmpBuffer;
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingXmit & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Pong;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongXmit, &PaRAMset);

    #else

    PaRAMset.SRC = (uint32)&(pEdmaRxTxBuffs->edmaTxPingBuff[0][0]);
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES) >>1;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = PaRAMset.ACNT;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = PaRAMset.BCNT;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPong & 0xFFFF;

    //TPCC_OPT
    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, &PaRAMset);

    ////////// PING
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPing, &PaRAMset);


    ///////// PONG
    PaRAMset.SRC = (uint32)&(pEdmaRxTxBuffs->edmaTxPongBuff[0][0]);
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES) >> 1;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = PaRAMset.ACNT;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = PaRAMset.BCNT;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPing & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Pong;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPong, &PaRAMset);

    #endif // EDMA_WORD_SWAPPING
    }

    void PaRAM_RegSetup(int32 PaRAM_offset, PaRAM_SET *pPaRAM)
    {
    REG_SETVAL((EDMA_PaRAM_SRCm + PaRAM_offset),pPaRAM->SRC);
    REG_SETVAL((EDMA_PaRAM_DSTm + PaRAM_offset),pPaRAM->DST);
    REG_SETVAL((EDMA_PaRAM_ABCNTm + PaRAM_offset),((pPaRAM->ACNT) + (pPaRAM->BCNT << 16)));
    REG_SETVAL((EDMA_PaRAM_BIDXm + PaRAM_offset),((pPaRAM->SRCBIDX) + (pPaRAM->DSTBIDX << 16)));
    REG_SETVAL((EDMA_PaRAM_CIDXm + PaRAM_offset),((pPaRAM->SRCCIDX) + (pPaRAM->DSTCIDX << 16)));
    REG_SETVAL((EDMA_PaRAM_CCNTm + PaRAM_offset),pPaRAM->CCNT);
    REG_SETVAL((EDMA_PaRAM_LNKm + PaRAM_offset),((pPaRAM->BCNTRLD << 16) | (pPaRAM->LINK)));
    REG_SETVAL((EDMA_PaRAM_OPTm + PaRAM_offset), ((pPaRAM->OPT.SAM) +
    (pPaRAM->OPT.DAM << 1) +
    (pPaRAM->OPT.SYNCDIM << 2) +
    (pPaRAM->OPT.STATIC << 3) +
    (pPaRAM->OPT.FWID << 8) +
    (pPaRAM->OPT.TCCMODE << 11) +
    (pPaRAM->OPT.TCC << 12) +
    (pPaRAM->OPT.TCINTEN << 20) +
    (pPaRAM->OPT.ITCINTEN << 21) +
    (pPaRAM->OPT.TCCHEN << 22) +
    (pPaRAM->OPT.ITCCHEN << 23)));

    }

  • Stu,

    Bear with me because this is a long thread and I am just looking at the latest post instead of going back and re-reading the entire thing. So if I ask some dumb questions, that will be why.

    You have some things right as far as chaining and linking, but I would like to step back and offer some suggestions for the whole flow.

    First, I understand the reason for doing this: when you configure McBSP0 Tx for a 32-bit transmit word, the two 16-bit half-words within each word come out swapped from the order you need. You want the EDMA to do the half-word swapping so that your algorithm does not have to be edited just to work around this system performance issue.

    Second, I believe the McBSP0 Rx side also needs to be configured for 32-bit receives, for the same system performance reasons that led you to move Tx to the 32-bit word length. If you configure the McBSP0 Rx side for a 32-bit word length, do the 16-bit half-words also arrive swapped from the order you want?

    Opinion 1: You do not want to do the half-word swapping every time you get a McBSP0 Tx event. The only thing that should happen in response to the Tx event on channel 2 is that a new 32-bit word is written to the DXR, with its half-words already in the order you expect on the serial line.

    Opinion 2: The way you are doing the half-word swapping is very efficient. It would be possible to do it with one channel chaining to itself but there would be nothing to gain other than using fewer channels. Your method is probably faster than how I have often done it with 3D transfers. It definitely improves the EDMA channel thrashing over some other methods.
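
    For reference, here is a minimal software model of the half-word swap being discussed. It is only a sketch: the function names are made up for illustration, and the two-pass loop assumes each swap channel moves one 16-bit half-word per 32-bit word (something like ACNT = 2 with SRCBIDX = DSTBIDX = 4), which may not match the exact PaRAM settings used earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two 16-bit half-words of one 32-bit word. */
static uint32_t swap_halfwords(uint32_t w)
{
    return (w << 16) | (w >> 16);
}

/*
 * Model of two chained transfers that together swap a whole buffer.
 * Each pass copies one half-word per 32-bit word. The passes read from
 * src and write to a separate dst, which is why a full tmp buffer is
 * needed: swapping in place this way would overwrite unread data.
 */
static void swap_buffer_two_pass(const uint16_t *src, uint16_t *dst,
                                 size_t words)
{
    size_t i;

    assert(src != dst);

    /* Pass 1 (first swap channel): half-word at byte offset 2 of each
     * source word goes to byte offset 0 of the destination word. */
    for (i = 0; i < words; i++)
        dst[2 * i] = src[2 * i + 1];

    /* Pass 2 (second swap channel): half-word at byte offset 0 of each
     * source word goes to byte offset 2 of the destination word. */
    for (i = 0; i < words; i++)
        dst[2 * i + 1] = src[2 * i];
}
```

    Running the same swap twice returns the original data, so the same model describes the Rx direction.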

    So, channel 2 is the McBSP0 Tx event channel and it needs to be set up like this for the ping side:

    // ******************** Ping XMIT PaRAM_SET **********************************************************
    PaRAMset.SRC = (uint32) edmaTmpBufferTxPing;  // need a full tmp buffer for both ping and pong
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongXmit & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingXmit, &PaRAMset);
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxXmit, &PaRAMset);

    You can take care of the pong side, but in particular there needs to be a full tmp buffer for both ping and pong.

    Channel 4 can stay as you have it now, apart from the tmp buffer. Ch 4 will do the Hi half-word transfers through the whole buffer and then chain to Channel 5. Channel 5 can also stay as you have it now, apart from the tmp buffer, but it will no longer chain anywhere or generate an interrupt, so clear the TCCHEN bit in Ch 5's OPT.
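
    As a rough sketch of that last step, the helper below clears the chaining and completion-interrupt enables in an OPT word. The bit positions (TCINTEN at bit 20, TCCHEN at bit 22) are the ones PaRAM_RegSetup() uses in this thread; the function names are hypothetical, and clearing TCINTEN as well is an assumption based on Ch 5 not needing to raise an interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as used by PaRAM_RegSetup() earlier in this thread. */
#define OPT_TCINTEN_SHIFT 20u
#define OPT_TCCHEN_SHIFT  22u

/* Clear a single bit of an OPT word. */
static uint32_t opt_clear_bit(uint32_t opt, unsigned shift)
{
    assert(shift < 32u);
    return opt & ~(UINT32_C(1) << shift);
}

/*
 * OPT for the last channel in the swap chain: it should neither chain
 * to another channel nor raise a completion interrupt.
 */
static uint32_t opt_for_last_swap_channel(uint32_t opt)
{
    opt = opt_clear_bit(opt, OPT_TCCHEN_SHIFT);
    opt = opt_clear_bit(opt, OPT_TCINTEN_SHIFT);
    return opt;
}
```

    All other OPT fields pass through unchanged, so this can be applied to whatever value Ch 5 is using today.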

    The data flow for the transmit side is:

    • You do some processing to fill the pEdmaRxTxBuffs->edmaTxPingBuff.
    • When that processing is completed, manually trigger Channel 4 to start the half-word swapping which fills edmaTmpBufferTxPing.
    • Channel 4 will chain to Channel 5 to continue the half-word swapping to complete edmaTmpBufferTxPing.
    • The processing plus swapping must be done before the Tx side has finished sending out all of the Tx pong data.
    • When the Tx pong data is finished, Channel 2 will be reloaded through its link to move data from edmaTmpBufferTxPing to DXR.
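
    The deadline in the list above can be estimated with a little arithmetic. This is a sketch under stated assumptions: one frame goes out per FSX pulse, no frames are dropped, and the number of frames per buffer and the FSX rate are whatever your system uses (neither value is given in this thread). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Microseconds the McBSP needs to send one ping or pong buffer,
 * assuming one frame per FSX pulse at a steady rate of fsx_hz.
 */
static uint64_t tx_buffer_time_us(uint32_t frames_per_buffer, uint32_t fsx_hz)
{
    assert(fsx_hz != 0u);
    return ((uint64_t)frames_per_buffer * 1000000u) / fsx_hz;
}

/*
 * Non-zero if processing plus half-word swapping finishes before the
 * other buffer has been fully transmitted.
 */
static int swap_meets_deadline(uint64_t process_us, uint64_t swap_us,
                               uint32_t frames_per_buffer, uint32_t fsx_hz)
{
    return (process_us + swap_us) <
           tx_buffer_time_us(frames_per_buffer, fsx_hz);
}
```

    For example, with 80 frames per buffer and an 8 kHz FSX (illustrative numbers only), the processing and swapping together must take less than 10 ms.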

    I think this will solve the Rx corruption which is due to too much activity on each Tx event.

    If you need to do the half-word swapping on the Rx side, a similar process will be needed, but in reverse. The McBSP0 Rx event on Channel 3 will copy data from DRR to a temp buffer such as edmaTmpBufferRxPing. When the whole RxPing buffer is filled, Channel 3 will chain to something like Channel 6 to do the Hi half-word swapping, Ch 6 will chain to Channel 7 for the Lo half-word swapping, and Ch 7 will then generate an interrupt using TCC=Tcc_McBSP0Rx_Ping.

    I hope this makes some sense with how I have written it. I believe this will work for you, whether you need the Rx swapping or not.

    Regards,
    RandyP

  • Ok, I implemented the design you laid out (including the 32-bit Rx), but am still seeing misalignment/skewing in the Rx buffer when the Tx word-swap chain is manually triggered. I tried moving the word-swap to channels 28 & 29, thinking that maybe getting onto a different Transfer Controller would help, but (a) I don't know which TCs handle which channels, and (b) it didn't change anything.

    Any other ideas?

    Thanks,

    Stu

  • Stu,

    You should be using CSL to initialize the EDMA3 at the beginning of your program. If you are doing everything through direct register writes, it is much more difficult to get everything initialized correctly or cleanly. [end of lecture.]

    That said, the DMAQNUMn registers are what you use to set the Channel Controller's Queue number for each DMA channel. Each 4-bit group in DMAQNUMn contains a 3-bit field that sets the Queue number, which corresponds directly to the TC number. You can set this field to 001b for channels 28 & 29 and see if that improves the Rx performance.
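
    A minimal sketch of that field arithmetic, assuming eight channels per DMAQNUMn register with channel k's 3-bit field starting at bit 4 * (k % 8). The helper names are hypothetical, and the functions only compute values; writing the result back to the actual register is left to your own register-access macro.

```c
#include <assert.h>
#include <stdint.h>

/* Index n of the DMAQNUMn register holding this channel's queue field. */
static unsigned dmaqnum_index(unsigned channel)
{
    assert(channel < 64u);
    return channel / 8u;
}

/* Bit offset of the channel's 3-bit queue field within DMAQNUMn. */
static unsigned dmaqnum_shift(unsigned channel)
{
    return (channel % 8u) * 4u;
}

/* Return reg with the channel's queue field replaced by queue. */
static uint32_t dmaqnum_set(uint32_t reg, unsigned channel, unsigned queue)
{
    unsigned shift = dmaqnum_shift(channel);

    assert(queue < 8u);
    reg &= ~(UINT32_C(7) << shift);   /* clear the old 3-bit field */
    reg |= (uint32_t)queue << shift;  /* insert the new queue number */
    return reg;
}
```

    With this layout, channels 28 and 29 both land in DMAQNUM3, at bit offsets 16 and 20, so setting both to queue 1 on a zeroed register gives 0x00110000.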

    Regards,
    RandyP

  • Stu,

    Can you describe the misalignment/skewing? Does it look like dropped data?

    Regards,
    RandyP