SCI using DMA in a polled mode



Our main code for the TMS570 is an event loop, and it got too bogged down interfacing directly with the SCI (due to the 19-25 cycle access time to peripheral registers).  So we implemented DMA to copy the data byte by byte into a buffer.  The idea was to call something in our event loop that would look at the Current Destination Address register (CDADDR) and then read up to that point in the DMA buffer.  It seemed like a simple, elegant solution.
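As a rough sketch of the polling idea described above, the following C fragment turns a destination-address snapshot into a write index and drains the bytes that arrived since the last poll. The names `rx_ring_t`, `rx_ring_head`, `rx_ring_drain` and `RX_BUF_SIZE` are made up for illustration, and the snapshot is passed in as a plain parameter: the sketch assumes the caller has some way of obtaining a current value, which is exactly what this question is about.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u /* hypothetical ring size in bytes */

typedef struct {
    const uint8_t *base; /* start of the DMA destination buffer */
    size_t         size; /* ring length in bytes */
    size_t         tail; /* index of the next byte the event loop has not read */
} rx_ring_t;

/* Convert a destination-address snapshot into a write index in the ring. */
static size_t rx_ring_head(const rx_ring_t *r, uintptr_t dst_addr)
{
    return (size_t)(dst_addr - (uintptr_t)r->base) % r->size;
}

/* Copy out every byte between tail and the snapshot; returns the count. */
static size_t rx_ring_drain(rx_ring_t *r, uintptr_t dst_addr,
                            uint8_t *out, size_t out_cap)
{
    size_t head = rx_ring_head(r, dst_addr);
    size_t n = 0;

    while (r->tail != head && n < out_cap) {
        out[n++] = r->base[r->tail];
        r->tail = (r->tail + 1u) % r->size;
    }
    return n;
}
```

The event loop would call `rx_ring_drain` once per pass. Because only the address snapshot is compared against `tail`, odd and even packet lengths are handled the same way, one byte at a time.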

However, it seems that reading CDADDR does not give us consistent results.  We have the DMA FIFO turned off and are using 2 SCI input ports, each on its own DMA channel.

Right now, we are running a much less smooth design: all 4 DMA interrupts are enabled, with a 4-byte buffer so that they trigger. That makes for a very small buffer and more complication than needed.

The key thing is that we must know when every byte is in - because we have odd and even length data packets coming in - we don't want to have a delay waiting for other bytes to arrive before processing a data packet.

Is there an issue using the CDADDR for indexing in the DMA buffer?  Is there a register we could hit to refresh this value?

Thank you

  • ...also, if someone has a solution that does not use DMA and allows byte granularity with any type of buffer, that would be great.

    The SCI buffer that exists on the device can only be set up for multi-byte buffering at fixed packet sizes, and it seems you must service the interrupt before the next byte comes in. For example, if you set it up for 4 bytes, you had better service it before the 5th comes in.  We looked at using the counter register that shows where the SCI port is writing, but it's not available to the user.  It would be nice to be able to use the SCI buffer as a simple circular buffer using the "3-bit Counter" in Figure 13-8 of the TRM, but that is not possible.

    I don't think there is any way to set up the SCI as a stand-alone interface that allows the user to NOT have to service the interface between consecutive bytes at some point.....??

  • Hi Thomas,

    We have received your post and someone will get back with you soon.  Sorry for the delay.

  • Thank you for looking into this.

    If I had a guess, if the CDADDR is not behaving as expected, maybe it's that the DMA FIFO is not really being disabled (even though we have set the bit) - maybe an order of operations issue setting up the interface?  (although we have tried quite a few orders of setting interfaces up)

  • Thomas,

    The CDADDR field of the working control packet only gets updated once that DMA channel loses arbitration to another DMA channel. I suspect that you have only one DMA channel configured to service the SCI RX event. Is that true?

    Regards, Sunil

  • Sunil,

    We have 2 DMA channels in total, each one attached to one SCI.  So yes, one DMA channel is configured to service the SCI RX event on a given port.

    We saw some text describing something about arbitration - but reading through the document we could not figure out exactly what that meant.  Do you have another document that could explain some of these terms better?  I also thought if we disabled the DMA FIFO that much of this arbitration did not happen..so we did not have to worry about it.

    If disabling the DMA FIFO does not stop the arbitration process, there has to be some register somewhere that tells the DMA channel where to put that next byte....and the DMA channel itself can't be buffering...so the data has to be somewhere..is there another register that tells us what the Current Destination Address is regardless of the arbitration process?

    Thank you,

    Tom

  • I'm having a similar problem with LIN/SCI using the DMA on the TMS570 MCU Kit (TMS570LS3137ZWT Chip). I've spent about a week with it with mixed success.  It seems different from the documentation, which makes it seem like it should be easier.

    It was possible to get the DMA working on the transmit side, but the parameters had to be set for an element size of 1 (byte) and a Frame Size equal to the number of characters sent.  Frame Transfer worked, but Block Transfer didn't seem possible with any combination of parameters.  I guess this is because the UART sends an interrupt after each character is sent (see SPNU499/Figure 16-2).  To trigger the send, three registers had to be set: HWCHENAS_UL, GCHIENAS_UL, and SWCHENAS_UL.  To detect its completion, it was not possible to use the current address (CSADDR) or counter (CTCOUNT) registers, but the end of the send was indicated by the appropriate channel bit in HWCHENAS_UL reading 0.

    I can receive bytes by polling the FLR in a continuous loop,   unsigned int theFLR=sciREG1->FLR; if (theFLR&bit9) .... , and then reading the SCI receive register, whose address had to be hard-coded...  unsigned int rCInt=*((unsigned int *)0xFFF7E434); unsigned char receivedChar=(0xFF)&rCInt;  The byte-level polling is inefficient, and it would be nice to use the receive DMA to fill a buffer.  I cannot get the DMA to fill a receive buffer in memory at all, and I've tried many combinations of DMA parameters and channels.  If there is some source code that works to fill a receive buffer, I would appreciate it.  Also, it would be nice to know if a DMA channel can be chained to itself so that a circular receive queue/buffer could be used.  Maybe two DMA channels could be looped to each other.
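The byte-level polling described above could be wrapped roughly as follows. This is only a sketch against a made-up register view: `sci_regs_t`, `SCI_RX_READY_BIT` and `sci_poll_byte` are hypothetical names, the struct contains just the two registers the post mentions and does not reflect the real SCI register layout, and bit 9 is taken from the post and assumed to be the receiver-ready flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register view: NOT the real SCI layout, only the two
 * registers mentioned in the post, so the logic can be tried off-target. */
typedef struct {
    volatile uint32_t FLR; /* flags register */
    volatile uint32_t RD;  /* receive data register */
} sci_regs_t;

/* "bit9" in the post; assumed here to be the receiver-ready flag. */
#define SCI_RX_READY_BIT (1u << 9)

/* Returns 1 and stores the received byte if one is ready, otherwise 0. */
static int sci_poll_byte(sci_regs_t *sci, uint8_t *out)
{
    if ((sci->FLR & SCI_RX_READY_BIT) == 0u) {
        return 0; /* nothing received since the last read */
    }
    *out = (uint8_t)(sci->RD & 0xFFu); /* keep only the low 8 data bits */
    return 1;
}
```

Passing the register block as a pointer avoids hard-coding the address inside the function and lets the same code run against a fake register block in a host-side test. In that fake, the ready flag has to be cleared by the test itself, since nothing emulates the hardware.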

    A polled SCI/DMA example from TI would be great.

     

  • Dennis - sorry I did not get back to this post right away.  After seeing your post I had planned on it - but got bogged down - was hoping maybe someone else would chime in - but you are stuck with my explanation.

    So, here is the issue - peeling away all the "arbitration" wording - until another DMA channel becomes active and transfers data - the CDADDR does not get updated. (even with DMA FIFO disabled)  The real address (because the bytes have to be somewhere!) is not memory mapped - similar to the 3 bit counter in the SCI port.

    Our solution is to create a dummy DMA channel that just transfers one byte at the end (or beginning) of our event loop; this will serve as the "register we hit to refresh this value".  However, we have not implemented this because our solution with 4 interrupts per 4 bytes (FTC, Half Block & Frame, Last Frame Started and Block Transfer Complete) is actually working well enough to get us through this round of testing.  Even though accessing the DMA registers incurs the same 19-25 cycle penalty (at 140 MHz, an effective access rate of roughly 6 MHz), this is working for us.

    And yes, it appears there is no way to set this serial port up that allows buffering prior to the next byte arriving (even when you set up for 4 byte transfers, you better get in before the 5th byte).  What I would give for a FIFO.

    So, yes, when you take into account buffering, for serial I/O, this ARM processor is about 100 times slower than our SH4 7760 we implemented 8 years ago.

  • Thomas,

    Thanks for the response and the help, I'll try to implement something based on that approach.  Ouch.  We are evaluating this chip for use in a safety-critical environment, so speed is not so much of an issue in our case, but wow.  Thanks again.

  • Tom,

    The DMA control registers PBACSADDR, PBACDADDR, and PBACTC show the current source address, current destination address and current transfer count of the active channel. I believe this should help your application get the information necessary. You can tell the number of pending and active DMA channels by reading the DMASTAT register.

    If there is no case of arbitration between the two channels that service the individual SCI ports, then you can always use these registers to determine the current status of the DMA transfers. If one channel does get arbitrated by the other, then you can use the information from the working control packets.

    Regards, Sunil

  • Sunil,

    The problem with reading all these PB*** registers (and DMASTAT) is that it would have to be an atomic operation, because they could change if an arbitration occurred between the register reads.  The only way we saw to do this would be to suspend DMA and then perform these reads.  Our team agreed that suspending DMA and doing all these additional register reads would be more computationally intensive and more complex than setting up a 3rd DMA channel.

    However, I hate to say we are well over budget on this task, so it does not look like we will get a chance to try out the dummy DMA channel concept.  We are stuck with the 4-byte buffer with 4 interrupts for now. If anyone has tried it, please let us know how it worked out. If it works, I would say that the real "answer" to this question is to create a dummy DMA transfer to refresh the DMA count registers.

    Thanks,

    Tom

  • A fact that might have added to "being over budget" is the misleading text in the TRM, Section "DMA Channel Control Packets". What is written before the first subsection "Initial Source Address" and in the subsections "Current Source Address" etc. purports to describe the Working Packets, while in fact it describes the PBAC* registers.

    My SW Design Spec should be based on the TRM, but is mostly based on my red-framed remarks in it. I hope the Safety Assessor won't ask for a look into that document.
  • Tom,

    I used a simple method to update a local “counter” at the expense of an additional (or a dummy) DMA channel and some memory. This method allows you to chase the circular buffer by polling this counter.

    1. Create a (const) table of sequential numbers [0:n], limited to 8192 entries by the 13-bit FRCNT.

    2. Configure a DMA channel to transfer a byte/word from the table above to a fixed destination (the index “counter”), incrementing only the source.

    3. Chain this DMA channel to your SCI/LIN/whatever RX DMA channel so that it triggers a DMA transfer after each event trigger. Note that this will also induce arbitration, so the working registers are also updated, except at the buffer boundary.

    4. Your polling routine can check the index counter value and process the newly received bytes correctly.
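The steps above can be modelled on a host machine to check the polling logic before touching hardware. The sketch below is a software model only: no DMA registers are programmed, and `dma_model_t`, `model_rx_event`, `poll_new_bytes` and `N_FRAMES` are hypothetical names. It assumes the table holds 1..n, so that the counter ends up holding the number of frames written in the current pass, and that both channels restart from the beginning at the end of a block.

```c
#include <stddef.h>
#include <stdint.h>

#define N_FRAMES 8u /* ring length in frames; the post gives 8192 as the limit */

/* Step 1: constant table of sequential numbers (assumed to be 1..n here). */
static const uint16_t index_table[N_FRAMES] = { 1, 2, 3, 4, 5, 6, 7, 8 };

typedef struct {
    uint8_t           rx_buf[N_FRAMES]; /* destination of the RX channel */
    volatile uint16_t index_counter;    /* fixed destination of the chained channel */
    size_t            frame;            /* models the hardware position; the CPU never reads it */
} dma_model_t;

/* Steps 2 and 3: one RX event moves one byte, then the chained channel
 * copies the matching table entry into the index counter. */
static void model_rx_event(dma_model_t *m, uint8_t byte)
{
    m->rx_buf[m->frame] = byte;
    m->index_counter = index_table[m->frame];
    m->frame = (m->frame + 1u) % N_FRAMES; /* assumed restart at block end */
}

/* Step 4: the polling routine looks only at the index counter. */
static size_t poll_new_bytes(const dma_model_t *m, size_t *tail,
                             uint8_t *out, size_t cap)
{
    size_t head = (size_t)m->index_counter % N_FRAMES;
    size_t n = 0;

    while (*tail != head && n < cap) {
        out[n++] = m->rx_buf[*tail];
        *tail = (*tail + 1u) % N_FRAMES;
    }
    return n;
}
```

One limitation of this model is worth keeping in mind: if a full lap of `N_FRAMES` bytes arrives between two polls, the counter returns to the same value and the poll sees no new data, so the ring has to be sized for the longest gap between polls.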

       

    Please note that this “should” work for any situation where a frame transfer trigger type per hw request is used. You can buffer up to 8192 frames of whatever element type without any cpu overhead!

    You can avoid the index table/counter transfer and just use a “dummy” chained channel to induce arbitration. I don’t remember exactly but, I believe working register will not update on the last frame. It will reflect the correct values on the next event trigger (wraps).

    Chaining another dma channel is the key and no interrupts needed!

     

    Update: Just noticed the original post date, Feb 2012!

    Joe 

  • Hi Joe,
    Thanks for sharing your great idea with the community. I will also try it out myself. There are several forum posts asking the same question. In the past we have recommended adding dummy channels to cause arbitration so that the working control packets get updated.