My McBSP0 FSX clock is being driven by an external source, which I've verified on an o'scope to be correct. I'm transmitting a hard-coded pattern via EDMA and I see it being transmitted on DX at random offsets from the FSX pulse. What can cause this?
McBSP0 registers are as follows:
SPCR: 0x02370031
RCR: 0x00011F40
XCR: 0x00011F40
SRGR: 0x01FF0000
MCR: 0x00040004
RCERAB: 0xFFFFFFFF
XCERAB: 0xFFFFFFFF
PCR: 0x00000000
RCERCD-GH: 0x00000000
XCERCD-GH: 0x00000000
Thanks,
Stu
Stu,
It is tough to decipher the bit patterns of the register settings. My C code always uses symbols or at least bit fields with comments to say what the field settings are. If you can post that, it may be easier to get you some useful answers. Instead, you get a few less helpful but probing questions:
What are the frequencies of FSX and CLKX?
How big are your word size and frame size?
Are you sending your data from the EDMA in 32-bit words, or smaller? Some devices require 32-bit reads and writes even for bytes and half-words, but I do not find that statement in the C6424 docs right now.
Please describe the randomness. Does all the data come out, just randomly presented without regard for FSX? Or is the data shifted by random numbers of bits and some words are lost?
Regards,
RandyP
>It is tough to decipher the bit patterns of the register settings. My C code always uses symbols or at least bit fields with comments to say what the field settings are. If you can post that, it may be easier to get you some useful answers. Instead, you get a few less helpful but probing questions:
Ok, here are the bitfields in the McBSP registers:
SPCR 0x02370031 Memory Mapped Register: Serial Port Control Register
_RSV 0x0 These bits are not used
FREE 0x1 - YES Controls Free running during emulation mode
SOFT 0x0 - NO Controls the emulation while the FREE bit is set
FRST 0x0 - RESET Controls the Frame Sync Generator Reset
GRST 0x0 - RESET Controls the Sample Rate Generator Reset
XINTM 0x3 Transmit Interrupt Mode
XSYNCERR 0x0 - NO Transmit Synchronization Error
XEMPTY 0x1 - NO Transmit Shift Register (XSR) Empty
XRDY 0x1 - YES Transmitter Ready
XRST 0x1 - ENABLE Transmitter Resets and enables the Transmitter
DLB 0x0 - OFF Sets Digital Loop Back Mode
RJUST 0x0 - RZF Enables Receive Sign-Extension and Justification Mode
CLKSTP 0x0 - DISABLE_00 Enables Clock Stop Mode for SPI mode.
_RSVD *** These bits are not used
DXENA 0x0 - OFF Enables extra delay for turn-on time.
Reserved *
RINTM 0x3 Receive Interrupt Mode
RSYNCERR 0x0 - NO Receive Synchronization Error
RFULL 0x0 - NO Receive Shift Register (RSR) Empty
RRDY 0x0 - NO Receiver Ready
RRST 0x1 - ENABLE Receiver Resets and enables the Receiver
RCR 0x00011F40 Memory Mapped Register: Receive Control Register
RPHASE 0x0 - YES Receive Phase
RFRLEN2 0x0 Receive Frame Length 2
RWDLEN2 0x0 - 8BIT Receive Word Length 2
RCOMPAND 0x0 - MSB Receive Companding Mode.
RFIG 0x0 - YES Receive Frame Ignore
RDATDLY 0x1 - 1BIT Receive data delay
_RSVD * These bits are not used (always cleared)
RFRLEN1 0x1F Receive Frame Length 1
RWDLEN1 0x2 Receive Word Length 1
RWDREVRS 0x0 - DISABLED Receive 32-bit reversal feature.
_RSVD **** These bits are not used (always cleared)
XCR 0x00011F40 Memory Mapped Register: Transmit Control Register
XPHASE 0x0 - YES Transmit Phase
XFRLEN2 0x0 Transmit Frame Length 2
XWDLEN2 0x0 - 8BIT Transmit Word Length 2
XCOMPAND 0x0 - MSB Transmit Companding Mode.
XFIG 0x0 - YES Transmit Frame Ignore
XDATDLY 0x1 - 1BIT Transmit data delay
_RSVD * These bits are not used (always cleared)
XFXLEN1 0x1F Transmit Frame Length 1
XWDLEN1 0x2 Transmit Word Length 1
XWDREVRS 0x0 - DISABLED Transmit 32-bit reversal feature.
_RSVD **** These bits are not used (always cleared)
SRGR 0x01FF0000 Memory Mapped Register: Sample Rate Generator Register
GSYNC 0x0 - FREE Sample rate generator clock synchronization. Only used when the external clock (CLKS) drives the sample rate generator clock (CLKSM=0).
CLKSP 0x0 - RISING CLKS Polarity Clock Edge Select. Only used when the external clock CLKS drives the sample rate generator clock (CLKSM=0).
CLKSM 0x0 - CLKS McBSP Sample Rate Generator Clock Mode
FSGM 0x0 - DXR2XSR Sample Rate Generator Transmit frame synchronization mode. Used when FSXM=1 in PCR.
FPER 0x1FF Frame Period. This determines when the next frame sync signal should become active. Range: up to 2^12; 1 to 4096 CLKG periods.
FWID 0x00 Frame Width. Determines the width of the frame sync pulse, FSG, during its active period. Range: up to 2^8; 1 to 256 CLKG periods.
CLKGDV 0x00 Sample rate generator clock divider. This value is used as the divide-down number to generate the required sample rate generator clock frequency. Default value is 1.
MCR 0x00000000 Memory Mapped Register: Multi-Channel Register
_RSVD ***
DX 0x0 - HIZ
Reserved *
XMCME 0x0 - _AtoB
XPBBLK 0x0 - SF1
XPABLK 0x0 - SF0
XCBLK 0x0 - SF0
XMCM 0x0 - ENNOMASK
_RSVD *****
Reserved *
RMCME 0x0 - _AtoB
RPBBLK 0x0 - SF1
RPABLK 0x0 - SF0
RCBLK 0x0 - SF0
_RSVD *
RMCM 0x0 - CHENABLE
RCERAB 0xFFFFFFFF Memory Mapped Register: Receive Channel Enable Register Partition A/B (RCER(A/B)) registers are used to enable any of the 32 channels for receive
XCERAB 0xFFFFFFFF Memory Mapped Register: Transmit Channel Enable Register Partition A/B (XCER(A/B)) registers are used to enable any of the 32 channels for transmit
PCR 0x00000003 Memory Mapped Register: Pin Control Register
RSV 0x0000 These bits are not used (always cleared)
IDLE_EN 0x0 - RESET Idle enable bit for the McBSP
XIOEN 0x0 - SP Transmit General Purpose I/O Mode ONLY when XRST=0 in SPCR
RIOEN 0x0 - SP Receive General Purpose I/O Mode ONLY when RRST=0 in SPCR
FSXM 0x0 - EXTERNAL Transmit Frame Synchronization Mode
FSRM 0x0 - EXTERNAL Receive Frame Synchronization Mode
CLKXM 0x0 - INPUT Transmitter Clock Mode.
CLKRM 0x0 - INPUT Receiver Clock Mode
SCLKME 0x0 - NO Enhanced sample clock mode selection bit.
CLKS_STAT 0x0 - _0 CLKS pin status. Reflects value on CLKS pin when selected as a general purpose input.
DX_STAT 0x0 - _0 DX pin status. Reflects value driven on to DX pin when selected as a general purpose output.
DR_STAT 0x0 - _0 DR pin status. Reflects value on DR pin when selected as a general purpose input.
FSXP 0x0 - ACTIVEHIGH Transmit Frame Synchronization Polarity
FSRP 0x0 - ACTIVEHIGH Receive Frame Synchronization Polarity
CLKXP 0x1 - FALLING Transmit Clock Polarity
CLKRP 0x1 - RISING Receive Clock Polarity
>What are the frequencies of FSX and CLKX?
FSX is 48 KHz, CLKX is 23,810 KHz.
>How big are your word size and frame size?
16 bit words, 32 word frames.
>Please describe the randomness. Does all the data come out, just randomly presented without regard for FSX? Or is the data shifted by random numbers of bits and some words are lost?
I've found that the apparent randomness is caused by the fact that sometimes a 16-bit word of data is being duplicated (sent out DX twice in a row), which pushes the end of the data stream out past the next FS pulse. This error accumulates over time to cause a seemingly random offset from FS for the next start-of-frame data.
The duplicated word is also apparently random: sometimes it's the 5th word, sometimes it's the 13th, etc. The duplication doesn't happen every frame ... sometimes it's in the 16th frame, sometimes the 29th frame, etc.
Am I multiplying something wrong? It looks like if you want 32 words x 16 bits @ 48 KHz, CLKX needs to be at least 24,576 KHz.
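For anyone who wants to check that multiplication, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it ignores the data delay and any idle bits between frames, so treat it as a lower bound only.

```c
#include <stdint.h>

/* Minimum bit clock (Hz) needed to shift out one full frame per
 * frame-sync period, assuming no idle bits between frames. */
static uint32_t min_bit_clock_hz(uint32_t words_per_frame,
                                 uint32_t bits_per_word,
                                 uint32_t frame_rate_hz)
{
    return words_per_frame * bits_per_word * frame_rate_hz;
}

/* Number of whole words that fit in one frame period at a given bit clock. */
static uint32_t words_that_fit(uint32_t bit_clock_hz,
                               uint32_t bits_per_word,
                               uint32_t frame_rate_hz)
{
    return bit_clock_hz / (bits_per_word * frame_rate_hz);
}
```

With the figures from this thread, min_bit_clock_hz(32, 16, 48000) gives 24,576,000 Hz, and words_that_fit(23810000, 16, 48000) gives 31, which is why a 31-word frame is a reasonable test if the measured 23,810 KHz were real.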
Could you try a quick test where you change the FrameLen to 31 words and change the EDMA to only send 31 words for each frame?
If I can help with the EDMA part of that, you are welcome to send the EDMA setup code. Those registers I can disassemble in my head, most of the time.
Yes, you're right. I think I had an aliasing error on the logic analyzer when measuring the CLKX pulse width. If I measure across multiple clocks and average it, I get 24,576 KHz for CLKX.
I tried the 31-word test, but I'm not sure what you expected to see. What I'm seeing is some garbage words after my 31st word, and the alignment relative to FSX is still shifting. Below is the setup code: I changed BCNT, BCNTRLD, and SRCCIDX (to skip to the next start-of-frame data). Old values are in the comments.
I noticed in SPRUEN2B that the McBSP will re-transmit the last word in the case of DXR underrun. Could that be the cause of the duplicated words I'm seeing? If so, what could prevent EDMA from keeping up with a relatively slow 48 KHz rate?
void ParamSetup_mcbspTxTransfer(void){
PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPingBuff[0]);
PaRAMentry[PaRAMentryIdx_McBSP0Tx].DST = MCBSP_DXR_REG;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].ACNT = 2; //bytes
PaRAMentry[PaRAMentryIdx_McBSP0Tx].BCNT = 31; //32
PaRAMentry[PaRAMentryIdx_McBSP0Tx].CCNT = 1;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRCBIDX = 2;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].DSTBIDX = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].SRCCIDX = 4; //0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].DSTCIDX = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].BCNTRLD = 31; //32
PaRAMentry[PaRAMentryIdx_McBSP0Tx].LINK = PaRAM_offset_McBSP0TxLinkPong & 0xFFFF;
//EDMA_OPT
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.SAM = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.DAM = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.SYNCDIM = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.STATIC = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.FWID = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCCMODE = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCC = Tcc_McBSP0Tx_Ping;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCINTEN = 1;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.ITCINTEN = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.TCCHEN = 0;
PaRAMentry[PaRAMentryIdx_McBSP0Tx].OPT.ITCCHEN = 0;
// PaRAM Register settings
PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, PaRAMentryIdx_McBSP0Tx);
// PONG setup is identical
}
The 31-word test was primarily based on the wrong CLKX speed, but it also eliminates frame overrun. I assume you also made a change in XFXLEN1 for 31 words.
When these shifts occur, do you ever see error bits set in the McBSP error registers? If there is a frame underflow, XSYNCERR in SPCR will be set and it will stay set until cleared. The XEMPTY bit gets cleared to 0 on a single-word underflow condition, but it gets set again when the next word is written to the DXR.
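As a rough illustration of checking those bits with symbols instead of raw hex, here is a small sketch. The bit positions are assumptions taken from the SPCR field decode posted earlier in this thread (they reproduce 0x02370031), and the struct and function names are hypothetical; please confirm the layout against the McBSP guide for your device.

```c
#include <stdint.h>

/* Assumed SPCR bit positions; consistent with the decode of 0x02370031
 * posted above, but verify against the device documentation. */
#define SPCR_RRST      (1u << 0)
#define SPCR_RRDY      (1u << 1)
#define SPCR_RFULL     (1u << 2)
#define SPCR_RSYNCERR  (1u << 3)
#define SPCR_XRST      (1u << 16)
#define SPCR_XRDY      (1u << 17)
#define SPCR_XEMPTY_N  (1u << 18)  /* active low: 0 means XSR is empty */
#define SPCR_XSYNCERR  (1u << 19)

typedef struct {
    int tx_enabled;     /* XRST = 1 */
    int tx_sync_error;  /* XSYNCERR = 1, stays set until cleared */
    int tx_underflow;   /* transmitter enabled but XSR reported empty */
} mcbsp_tx_status;

/* Decode the transmit-side status flags from a raw SPCR value. */
static mcbsp_tx_status decode_tx_status(uint32_t spcr)
{
    mcbsp_tx_status s;
    s.tx_enabled    = (spcr & SPCR_XRST) != 0;
    s.tx_sync_error = (spcr & SPCR_XSYNCERR) != 0;
    s.tx_underflow  = s.tx_enabled && (spcr & SPCR_XEMPTY_N) == 0;
    return s;
}
```

Because the underflow flag recovers as soon as the next word is written to the DXR, a single read like this can easily miss it; polling in a tight loop or sampling from the Tx completion interrupt gives a better chance of catching it.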
Underflow would definitely explain having a word repeated. And underflow is a common concern in systems. Unfortunately, the C6424 does not have a deeper FIFO to allow you to write a bunch of words less often. Instead, you have to get a 16-bit half-word written out every 16/24,576 ms, or 651 ns, which is pretty fast.
One solution, if this is the problem, is to evaluate everything that is happening through your EDMA and arrange for only short, real-time things like the McBSP to be assigned to a selected Transfer Controller. Many long transfers can introduce a delay that could jeopardize the McBSP servicing requirement.
Another solution, if this is the problem, is to pack your 16-bit samples into 32-bit words and transmit 16 32-bit words so that the bit stream looks exactly the same. This gives you twice as much time to service the McBSP event. This can be accomplished a couple of ways depending on your data structure and the data buffer-filling algorithm.
I do not know of a third solution.
To see if underflow is the problem, change the word size to 32 bits, and change the number of words per frame to 16. You could leave the EDMA the same and just know that your data is spread out wider than it should be. Or you could change the EDMA setup so it just sends every other 16-bit sample. If the shifting ends, then the two solutions above may be the right things to do.
Hi Randy,
I was not seeing any XSYNCERRs when this problem occurred. However, changing the xfer to 32-bits did eliminate the shifting problem. Only problem now is that the data is in the wrong order word-wise (we're running little-endian). For example, 16-bit audio samples stored in memory as ABCD are being transmitted over McBSP as BADC.
I saw an earlier post by you on EDMA byte-order swapping that suggested some multiple-stage chained xfers to implement byte reordering. Is this still your recommended approach? I'm concerned that if EDMA can't keep up with a 48 KHz data rate while doing single 16-bit xfers, will it be able to keep up while doing 3 stages of xfers?
Right now, there's nothing else running in this system ... just this single EDMA-based McBSP driver.
I'm really surprised that EDMA can't keep up with 48 KHz at 16bits/xfer. As you mentioned, it has 651ns to complete the xfer. My understanding is the EDMA module runs at 1/3 DSP clk, which in our case is 700/3=233.333 MHz (or 4.286ns/clk). This gives 151 clock cycles to move 16 bits. That's not enough?
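The budget arithmetic in this post can be sketched as below. The helper names are illustrative, and the 700/3 MHz EDMA clock is the figure assumed in the post rather than something confirmed from the datasheet.

```c
#include <stdint.h>

/* Time between DXR refills: one word period, in nanoseconds. */
static double word_period_ns(uint32_t bits_per_word, double bit_clock_hz)
{
    return (double)bits_per_word * 1e9 / bit_clock_hz;
}

/* The same window expressed in cycles of a clock running at clock_hz. */
static double window_cycles(double period_ns, double clock_hz)
{
    return period_ns * clock_hz / 1e9;
}
```

For 16-bit words at 24,576 KHz this gives about 651 ns, or roughly 151 cycles at 700/3 MHz. For 32-bit words the window doubles to about 1302 ns, or roughly 303 cycles. These are upper bounds on the time available; anything queued ahead of the McBSP transfer eats into them.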
The good news is that it is very likely you have now identified what the problem is. There is no bad news, just the opportunity to find the best solution. Two things before we start talking about that:
a. Byte swapping:
To clarify, your 32-bit word at address 0x0000 with value shown as ABCD has those bytes at byte addresses 3210, respectively, right?
Since your PARAM for 16-bit data was setup for packed data (ch 0, ch 1, ch 2, ...), the first sample going out was CD from byte addresses 10. Did this go out as DC on the serial line? Then the second sample was AB from byte addr 32.
I could have understood if the 32-bit version went out as CDAB where the half-words were swapped, but swapping the bytes inside a 16-bit value does not make sense. I do not know how the McBSP would be capable of this. Can you verify this as the binary bits in the order they go out?
b. byte-order swapping:
Whether it is byte-order swapping or half-word swapping that needs to be done, the EDMA3 can do this. The most efficient way to do it with the EDMA3 is in one big chunk, copying the data for an entire buffer instead of doing it for each 32-bit word one-at-a-time. An even more efficient method would be to redefine where the algorithms store the data for the various output channels so that whichever channel was written to byte offset 32*2*n+0 gets written to 32*2*n+2 and whichever channel was written to byte offset 32*2*n+2 gets written to 32*2*n+0, and so on. But a DMA channel can be set up to do that swapping separately if that is the best choice.
Since you are most likely using ping-pong buffers, this swapping channel would be run once the next buffer is full and ready to be sent out. It would be the last step before the data is actually ready to be sent out. The depth of the buffer may affect how we write this.
c. 151 DSP/3 clock cycles:
This is easily plenty of time to service the McBSP. If there was nothing else going on to interfere with it, then every transfer would work every time. But if there is a 1KB transfer in front of it, that might use up a bunch of those clocks. And if memory serves, the McBSP requires the new data to get there 3 CLKX cycles before the next event (DXR->XSR transfer), which cuts the number of available clocks even more.
Unfortunately, this is a common problem for servicing real-time peripherals. It is solvable, and that is where we need to look now.
Improving the servicing of a real-time peripheral:
Since this was a random event, it was a problem just waiting to fail. Then when something else got in the way, either ahead of the McBSP Tx in the Transfer Controller, or conflicting with its access of the data from the memory buffer, or conflicting with its access to the McBSP output port (least likely if you are using the data port address), suddenly the data gets missed.
By moving to 32-bit data, we at least double the amount of time that the EDMA3 has to respond to the McBSP Tx event. If that is good enough, then we are set. But there is still the possibility that you have other, longer delays that could fight with the EDMA3 or memory endpoint or McBSP and still cause a stall long enough to hit this problem again.
Hopefully this simple solution will work. If we need to add more margin, the things to consider will be:
1. Put the data buffers in L1D or L2 SRAM instead of in DDR2 memory. This is the second-best solution after moving to 32-bit data, mainly because it can be easy if the internal memory is available. This solves the problem if the stall is due to memory endpoint conflicts, usually DDR2. Sections 2.8.1-2.8.3 in the DDR2 User's Guide talk about the possible read/write command conflicts that can occur.
2. Keep one of the three TCs set aside for short real-time access like these for the McBSP data servicing. Depending on how much short real-time activity you have and how much DMA traffic you have, this could be an easy solution or it could be a difficult one. It will solve the problem if TC conflicts are causing the problem.
Hey Randy,
Re a.: I apologise for not being precise in my description. In my example of ABCD, each letter represents one 16-bit sample (A is the first sample, B is the second, etc.).
This sounds more like what I was assuming, except that data ABCD are 16-bits instead of 8-bits. So A is at byte offsets 10, B is at 32, C is at 54, and D is at 76, right?
Since the McBSP is configured to send ms-bit first, bit 31 of a 32-bit value will go first, and that is the msb of B in byte position 3. So the solution for this is the solution b in my post above. And the best choice performance-wise is for the writing algorithm to put the half-words into the alternate half-word offset in the ping-pong output buffer. Or, if you use EDMA to build the output buffer from 32 different buffers, that step could be modified slightly to get the output buffer in the right arrangement.
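To make the "alternate half-word offset" idea concrete, here is a minimal sketch of a writer that places sample i at half-word index i ^ 1. The function names are hypothetical, the sample count is assumed to be even, and le_word only models what a 32-bit read of the buffer would return on a little-endian device.

```c
#include <stddef.h>
#include <stdint.h>

/* Store sample i at half-word index i ^ 1, so each pair of 16-bit
 * samples is swapped in memory. count is assumed to be even. */
static void store_swapped(uint16_t *tx_buf, const uint16_t *samples,
                          size_t count)
{
    for (size_t i = 0; i < count; i++) {
        tx_buf[i ^ 1u] = samples[i];
    }
}

/* The 32-bit value a little-endian 32-bit read would fetch for word k:
 * the half-word at the lower address lands in bits 15..0. */
static uint32_t le_word(const uint16_t *buf, size_t k)
{
    return (uint32_t)buf[2 * k] | ((uint32_t)buf[2 * k + 1] << 16);
}
```

With samples A, B, C, D stored this way, the first 32-bit word has A in bits 31..16 and B in bits 15..0, so an MSB-first transmitter sends A before B.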
Getting back to the issue of word-order swapping using EDMA:
I have the system working as desired using software word-swapping. For performance reasons, I'd like EDMA to do this for me. I've implemented a 3-stage chaining algorithm as follows:
1. McBSP TX event triggers the first PaRAMset (for channel 2) which moves the high word (offset 0 in my output ping buffer) into the low word of a temp buffer (at byte offset 2), for each 32-bit word in the buffer. This PaRAMset is chained to the second stage PaRAMset.
2. The second stage PaRAMset (which uses channel 4) moves the low word of the output ping buffer to the high word of the temp buffer, for each word in the buffer. This PaRAMset is chained to the third stage PaRAMset.
3. The third stage PaRAMset (which uses channel 5) moves the entire temp buffer out to the McBSP DXR.
All three PaRAMsets are also linked to equivalent pong buffer PaRAMsets. The code that configures these PaRAMsets is at the end of this post.
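For readers following along, stages 1 and 2 above can be modeled in plain C as shown below. This is only a software sketch of what the two transfers are intended to do with ACNT = 2 and both B-indexes set to 4; the function names are made up, and nothing here touches the real EDMA3 hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software model of one EDMA3 frame: BCNT arrays of ACNT bytes, with the
 * source and destination addresses stepping by SRCBIDX and DSTBIDX. */
static void model_ab_transfer(uint8_t *dst, const uint8_t *src,
                              size_t acnt, size_t bcnt,
                              size_t srcbidx, size_t dstbidx)
{
    for (size_t b = 0; b < bcnt; b++) {
        memcpy(dst + b * dstbidx, src + b * srcbidx, acnt);
    }
}

/* Stages 1 and 2: swap the half-words of each 32-bit word into tmp.
 * halfwords is the buffer length in 16-bit units and must be even. */
static void model_word_swap(uint16_t *tmp, const uint16_t *tx,
                            size_t halfwords)
{
    /* Stage 1: tx[0], tx[2], ... go to tmp[1], tmp[3], ... */
    model_ab_transfer((uint8_t *)&tmp[1], (const uint8_t *)&tx[0],
                      2, halfwords / 2, 4, 4);
    /* Stage 2: tx[1], tx[3], ... go to tmp[0], tmp[2], ... */
    model_ab_transfer((uint8_t *)&tmp[0], (const uint8_t *)&tx[1],
                      2, halfwords / 2, 4, 4);
}
```

Running this on a four-sample buffer {1, 2, 3, 4} leaves {2, 1, 4, 3} in the temp buffer, which is a handy reference to compare against a memory dump of the real temp buffer.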
When I enable this setup, I find that the RX ping and pong buffers are unexpectedly incorrect. I see that the RX data is misaligned within the ping and pong buffers, as if the RX event is not firing in sync with the FSR or we aren't servicing McBSP0 DRR quickly enough or something. I tried reading DRR 32 bits at a time, but that didn't help this time. If I simply disable the McBSP0 TX event in the EER, the RX data looks fine.
In your experience, how can enabling chaining on channels 2, 4, & 5 affect the behavior of channel 3?
void ParamSetup_mcbspRxTransfer(void){
    PaRAM_SET PaRAMset;

    PaRAMset.SRC = MCBSP_DRR_REG;
    PaRAMset.DST = (uint32) & (pEdmaRxTxBuffs->edmaRxPingBuff[0][0]);
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES);
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 0;
    PaRAMset.DSTBIDX = 2;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = 0;
    PaRAMset.LINK = PaRAM_offset_McBSP0RxLinkPong & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Rx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0; //Intermediate Transfer Completion Interrupt Disable
    PaRAMset.OPT.TCCHEN = 0; //0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0; //Intermediate Transfer Completion Chaining Disable

    // Program the Rx channel PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0Rx, &PaRAMset);

    /////// PING
    // Same thing in the Rx Ping PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0RxLinkPing, &PaRAMset);

    //////// PONG
    PaRAMset.SRC = MCBSP_DRR_REG;
    PaRAMset.DST = (uint32) & (pEdmaRxTxBuffs->edmaRxPongBuff[0][0]);
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES);
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 0;
    PaRAMset.DSTBIDX = 2;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = 0;
    PaRAMset.LINK = PaRAM_offset_McBSP0RxLinkPing & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0;
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Rx_Pong;
    PaRAMset.OPT.TCINTEN = 1;
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0;
    PaRAMset.OPT.ITCCHEN = 0;

    // Program the Rx Pong PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0RxLinkPong, &PaRAMset); //PaRAMentryIdx_McBSP0RxLinkPong = 1
}
void ParamSetup_mcbspTxTransfer(void){
    PaRAM_SET PaRAMset;

#ifdef EDMA_WORD_SWAPPING
    // ******************** Ping HiWord PaRAM_SET ********************
    // word-swap high word into tmp buffer
    PaRAMset.SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPingBuff[0]);
    PaRAMset.DST = (uint32) &edmaTmpBuffer[1];
    PaRAMset.ACNT = 2; //bytes
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongHiWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxLoWord;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0;

    // Program the Tx channel PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, &PaRAMset);
    // Same thing in Tx Ping Hi-word PaRAM
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingHiWord, &PaRAMset);

    // ******************** Ping LoWord PaRAM_SET ********************
    // word-swap low word into tmp buffer
    PaRAMset.SRC = (uint32) (& (pEdmaRxTxBuffs->edmaTxPingBuff[0]))+2;
    PaRAMset.DST = (uint32) &edmaTmpBuffer[0];
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongLoWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxXmit;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingLoWord, &PaRAMset);
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLoWord, &PaRAMset);

    // ******************** Ping XMIT PaRAM_SET ********************
    PaRAMset.SRC = (uint32) edmaTmpBuffer;
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongXmit & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingXmit, &PaRAMset);
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxXmit, &PaRAMset);

    // ******************** Pong HiWord PaRAM_SET ********************
    // word-swap high word into tmp buffer
    PaRAMset.SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPongBuff[0]);
    PaRAMset.DST = (uint32) &edmaTmpBuffer[1];
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingHiWord & 0xFFFF;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongHiWord, &PaRAMset);

    // ******************** Pong LoWord PaRAM_SET ********************
    // word-swap low word into tmp buffer
    PaRAMset.SRC = (uint32) & (pEdmaRxTxBuffs->edmaTxPongBuff[0])+2;
    PaRAMset.DST = (uint32) edmaTmpBuffer;
    PaRAMset.ACNT = 2;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 4;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingLoWord & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 1; //ABsync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = chNum_McBSP0TxXmit;
    PaRAMset.OPT.TCINTEN = 0; // Transfer Complete Interrupt Disable
    PaRAMset.OPT.ITCINTEN = 0; //Intermediate Transfer Completion Interrupt Disable
    PaRAMset.OPT.TCCHEN = 1; //Transfer Complete Chaining Enable
    PaRAMset.OPT.ITCCHEN = 0; //Intermediate Transfer Completion Chaining Disable

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongLoWord, &PaRAMset);

    // ******************** Pong XMIT PaRAM_SET ********************
    PaRAMset.SRC = (uint32) edmaTmpBuffer;
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = 4;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPingXmit & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Pong;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPongXmit, &PaRAMset);
#else
    PaRAMset.SRC = (uint32)&(pEdmaRxTxBuffs->edmaTxPingBuff[0][0]);
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES) >> 1;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = PaRAMset.ACNT;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = PaRAMset.BCNT;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPong & 0xFFFF;

    //TPCC_OPT
    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0Tx, &PaRAMset);

    ////////// PING
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPing, &PaRAMset);

    ///////// PONG
    PaRAMset.SRC = (uint32)&(pEdmaRxTxBuffs->edmaTxPongBuff[0][0]);
    PaRAMset.DST = MCBSP_DXR_REG;
    PaRAMset.ACNT = 4;
    PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES) >> 1;
    PaRAMset.CCNT = 1;
    PaRAMset.SRCBIDX = PaRAMset.ACNT;
    PaRAMset.DSTBIDX = 0;
    PaRAMset.SRCCIDX = 0;
    PaRAMset.DSTCIDX = 0;
    PaRAMset.BCNTRLD = PaRAMset.BCNT;
    PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPing & 0xFFFF;

    PaRAMset.OPT.SAM = 0;
    PaRAMset.OPT.DAM = 0;
    PaRAMset.OPT.SYNCDIM = 0; //A-sync
    PaRAMset.OPT.STATIC = 0;
    PaRAMset.OPT.FWID = 0;
    PaRAMset.OPT.TCCMODE = 0;
    PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Pong;
    PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
    PaRAMset.OPT.ITCINTEN = 0;
    PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
    PaRAMset.OPT.ITCCHEN = 0;

    // PaRAM Register settings
    PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPong, &PaRAMset);
#endif // EDMA_WORD_SWAPPING
}
void PaRAM_RegSetup(int32 PaRAM_offset, PaRAM_SET *pPaRAM){
    REG_SETVAL((EDMA_PaRAM_SRCm + PaRAM_offset),pPaRAM->SRC);
    REG_SETVAL((EDMA_PaRAM_DSTm + PaRAM_offset),pPaRAM->DST);
    REG_SETVAL((EDMA_PaRAM_ABCNTm + PaRAM_offset),((pPaRAM->ACNT) + (pPaRAM->BCNT << 16)));
    REG_SETVAL((EDMA_PaRAM_BIDXm + PaRAM_offset),((pPaRAM->SRCBIDX) + (pPaRAM->DSTBIDX << 16)));
    REG_SETVAL((EDMA_PaRAM_CIDXm + PaRAM_offset),((pPaRAM->SRCCIDX) + (pPaRAM->DSTCIDX << 16)));
    REG_SETVAL((EDMA_PaRAM_CCNTm + PaRAM_offset),pPaRAM->CCNT);
    REG_SETVAL((EDMA_PaRAM_LNKm + PaRAM_offset),((pPaRAM->BCNTRLD << 16) | (pPaRAM->LINK)));
    REG_SETVAL((EDMA_PaRAM_OPTm + PaRAM_offset),
        ((pPaRAM->OPT.SAM) +
         (pPaRAM->OPT.DAM << 1) +
         (pPaRAM->OPT.SYNCDIM << 2) +
         (pPaRAM->OPT.STATIC << 3) +
         (pPaRAM->OPT.FWID << 8) +
         (pPaRAM->OPT.TCCMODE << 11) +
         (pPaRAM->OPT.TCC << 12) +
         (pPaRAM->OPT.TCINTEN << 20) +
         (pPaRAM->OPT.ITCINTEN << 21) +
         (pPaRAM->OPT.TCCHEN << 22) +
         (pPaRAM->OPT.ITCCHEN << 23)));
}
Bear with me because this is a long thread and I am just looking at the latest post instead of going back and re-reading the entire thing. So if I ask some dumb questions, that will be why.
You have some things right as far as chaining and linking, but I would like to step back and offer some suggestions to the whole flow.
First, I understand that the reason for doing this is that when you configure McBSP0 Tx for a 32-bit transmit word that the two 16-bit half-words within that are swapped from how you need them to be ordered. And you want to have the EDMA do the byte-swapping for you instead of having your algorithm have to be edited just to support this system performance issue.
Second, for the McBSP0 Rx side I believe you also need to be configured for 32-bit receives for the same system performance reasons as why you moved to the Tx being in 32-bit Tx word length mode. If you configure the McBSP0 Rx side for 32-bit Rx word length, do you also get the 16-bit half-words being swapped from how you want them to be ordered?
Opinion 1: You do not want to do the byte swapping every time you get a McBSP0 Tx event. All you want to happen in response to the Tx event on channel 2 is that a new 32-bit word is sent to the DXR with the half-words in the right order for the serial line as you expect it to be.
Opinion 2: The way you are doing the half-word swapping is very efficient. It would be possible to do it with one channel chaining to itself but there would be nothing to gain other than using fewer channels. Your method is probably faster than how I have often done it with 3D transfers. It definitely improves the EDMA channel thrashing over some other methods.
So, channel 2 is the McBSP0 Tx event channel and it needs to be setup like this for the ping side:
// ******************** Ping XMIT PaRAM_SET ********************
PaRAMset.SRC = (uint32) edmaTmpBufferTxPing; // need a full tmp buffer for both ping and pong
PaRAMset.DST = MCBSP_DXR_REG;
PaRAMset.ACNT = 4;
PaRAMset.BCNT = (NUM_SLOTS * NUM_SAMPLES)/2;
PaRAMset.CCNT = 1;
PaRAMset.SRCBIDX = 4;
PaRAMset.DSTBIDX = 0;
PaRAMset.SRCCIDX = 0;
PaRAMset.DSTCIDX = 0;
PaRAMset.BCNTRLD = (NUM_SLOTS * NUM_SAMPLES)/2;
PaRAMset.LINK = PaRAM_offset_McBSP0TxLinkPongXmit & 0xFFFF;

PaRAMset.OPT.SAM = 0;
PaRAMset.OPT.DAM = 0;
PaRAMset.OPT.SYNCDIM = 0; //A-sync
PaRAMset.OPT.STATIC = 0;
PaRAMset.OPT.FWID = 0;
PaRAMset.OPT.TCCMODE = 0;
PaRAMset.OPT.TCC = Tcc_McBSP0Tx_Ping;
PaRAMset.OPT.TCINTEN = 1; // Transfer Complete Interrupt Enable
PaRAMset.OPT.ITCINTEN = 0;
PaRAMset.OPT.TCCHEN = 0; //Transfer Complete Chaining Disable
PaRAMset.OPT.ITCCHEN = 0;

// PaRAM Register settings
PaRAM_RegSetup(PaRAM_offset_McBSP0TxLinkPingXmit, &PaRAMset);
PaRAM_RegSetup(PaRAM_offset_McBSP0TxXmit, &PaRAMset);
You can take care of the pong side, but in particular there needs to be a full tmp buffer for both ping and pong.
Channel 4 can be like you have it now other than the tmp buffer. Ch 4 will do the Hi half-word transfers through the whole buffer and then chain to Channel 5. Channel 5 can be like you have it now other than the tmp buffer, and now it will not chain anywhere and will not generate an interrupt so clear the TCCHEN bit in Ch 5's OPT.
The data flow for the transmit side is:
I think this will solve the Rx corruption which is due to too much activity on each Tx event.
If you need to do the half-word swapping on the Rx side, then a similar process will be needed, but in reverse. The McBSP0 Rx event on Channel 3 will copy data from DRR to a temp buffer like edmaTmpBufferRxPing, then when the whole RxPing buffer is filled, Channel 3 will chain to something like Channel 6 to do the Hi half-word swapping, Ch 6 will chain to Channel 7 for the Lo half-word swapping, and then Ch 7 will generate an interrupt using TCC=Tcc_McBSP0Rx_Ping.
I hope this makes some sense with how I have written it. I believe this will work for you, whether you need the Rx swapping or not.
Ok, I implemented the design you laid out (including the 32-bit Rx), but am still seeing misalignment/skewing in the Rx buffer when Tx word-swap chain is manually triggered. I tried moving the word-swap to channels 28 & 29, thinking that maybe getting onto a different Transfer Controller would help, but (a) I don't know which TCs handle which channels, and (b) it didn't change anything.
Any other ideas?
You should be using CSL to initialize the EDMA3 at the beginning of your program. If you are doing everything through direct register writes, it is much more difficult to get everything initialized correctly or cleanly. [end of lecture.]
That said, the DMAQNUMn registers are what you use to set the Channel Controller's Queue number for each DMA channel. Every four bits in DMAQNUMn contains a 3-bit field that sets the Queue number, which also corresponds directly with the TC number. You can set this field to 001b for channels 28 & 29 and see if that improves the Rx performance.
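As a rough sketch of that field arithmetic, the helpers below compute which DMAQNUMn register and bit offset a channel uses and return an updated register value. The layout (eight channels per register, one 4-bit slot per channel, queue number in the low 3 bits) follows the description above; the helper names are hypothetical, and the actual register read and write are left out because the addresses are device-specific.

```c
#include <stdint.h>

/* Assumed layout: eight channels per DMAQNUMn register, one 4-bit slot
 * per channel, queue number in the low 3 bits of the slot. */
static unsigned dmaqnum_index(unsigned channel)
{
    return channel / 8u;
}

static unsigned dmaqnum_shift(unsigned channel)
{
    return (channel % 8u) * 4u;
}

/* Return reg_value with the slot for `channel` set to `queue`,
 * leaving the other seven channel slots unchanged. */
static uint32_t dmaqnum_set(uint32_t reg_value, unsigned channel,
                            unsigned queue)
{
    unsigned shift = dmaqnum_shift(channel);
    reg_value &= ~((uint32_t)0x7u << shift);
    reg_value |= (uint32_t)(queue & 0x7u) << shift;
    return reg_value;
}
```

For channels 28 and 29 this selects DMAQNUM3 with shifts of 16 and 20, so starting from 0 and setting both to queue 1 yields 0x00110000. Do the update as a read-modify-write so the queue assignments of channels 24 to 31 that you are not changing are preserved.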
Can you describe the misalignment/skewing? Does it look like dropped data?