C6748 UART driver using EDMA for higher performance

Other Parts Discussed in Thread: SYSBIOS

Good day experts,

I was hoping you could provide me with some advice with regards to using the EDMA3 together with the UART on the C6748.

We are currently connecting a C6748 DSP to a C66x DSP through the UART peripheral. These DSPs will be located on the same PCB and connected directly, so RS-232 level conversion is not necessary. Consequently we can use much higher baud rates (> 460800 bps), as we will not be limited by typical PC-based port expanders.

I have implemented our current C6748 UART driver from scratch to meet our high performance demands. Currently the UART receive is interrupt based to ensure that no bytes are missed during reception. I am using the UART FIFO, which triggers an interrupt if more than 8 unread bytes are in the FIFO or a timeout occurs (when fewer than 8 unread bytes are in the FIFO).

Since we have more control over the UART transmission,  I have placed this in a DSP/BIOS task, which can execute when no other higher priority tasks are executing.

This model works very well for our current situation, but for the new configuration with the two DSPs connected via UART at much higher data rates, we foresee that the UART receive interrupt rate on the C6748 will be much higher and consequently reduce the performance of the system because too much time would be spent on servicing the frequent interrupts.
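
To put a rough number on the interrupt load described above, here is a minimal sketch. It assumes 8N1 framing (10 bits on the wire per byte) and a continuous byte stream, neither of which is stated in this post, and the function name is purely illustrative.

```c
#include <stdint.h>

/* Receive interrupts per second for a UART running at `baud` bits/s.
 * bits_per_frame: bits on the wire per byte (10 for 8N1: start + 8 data + stop).
 * trigger_level:  number of bytes in the Rx FIFO that raises one interrupt.
 * Assumes a continuous stream, so time-out interrupts are ignored. */
uint32_t uart_rx_irq_rate(uint32_t baud, uint32_t bits_per_frame,
                          uint32_t trigger_level)
{
    uint32_t bytes_per_second = baud / bits_per_frame;
    return bytes_per_second / trigger_level;
}
```

Under these assumptions, 460800 bps with a trigger level of 8 gives 46080 bytes/s and therefore 5760 receive interrupts per second; the rate scales linearly with the baud rate.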

My question is thus: would we gain much by using EDMA for the C6748 UART driver?

I can remember I used the C6748 BIOSPSP drivers a couple of years ago with EDMA and it simply did not meet our performance requirements and we frequently lost received bytes on continuous data transfers. 

Thanks in advance for your assistance.

  • Hi,

    Thanks for your post.

    There is no doubt that the UART will perform better with EDMA: a DMA transfer will always complete Rx/Tx transactions faster than the CPU interrupt or polling methods. How much better depends on how you configure the UART interrupt registers and the EDMA event queue priorities in the Queue Priority register (QUEPRI).

    In general, the UART with EDMA will perform better than any CPU interrupt-based approach. We recommend that you walk through the UART-EDMA reference code in the TI C6BIOS SDK package, which includes a C6748 StarterWare example, to see how the UART and EDMA are configured for DMA data transfer. You can download the C6BIOS SDK here:

    http://www.ti.com/tool/biossw-c6748

    After installation, you can find the UART EDMA sample project at the following path:

    \Texas Instruments\pdk_C6748_2_0_0_0\C6748_StarterWare_1_20_03_03\build\c674x\cgt_ccs\c6748\lcdkC6748\uart_edma

    You can find the UART EDMA source code at the following path:

    \Texas Instruments\pdk_C6748_2_0_0_0\C6748_StarterWare_1_20_03_03\examples\lcdkC6748\uart_edma

    The UART has several interrupt types. Receiver line status is serviced with the highest priority, followed by receiver data-ready and receiver time-out, and then transmitter holding register empty. We recommend that you check the priority of the interrupt sources and the interrupt type so that you can clear all pending interrupts. With this in place, the UART DMA transfer rate should improve and meet its standard benchmarks.

    For more details, please refer to Tables 30-11 and 30-5 and Section 30.2.8 of the C6748 TRM:

    http://www.ti.com/lit/ug/spruh79a/spruh79a.pdf

    Note: Please check the interrupt ID (INTID) field in the IIR register to determine the interrupt type and whether the FIFO is enabled. Also check the priorities of the interrupts configured in your code.
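
    The INTID check in the note above could look roughly like the sketch below. The bit positions follow the usual 16550-style IIR layout (IPEND in bit 0, INTID in bits 3:1, FIFOEN in bits 7:6) and should be verified against the TRM tables referenced above; the type and function names are hypothetical.

```c
#include <stdint.h>

/* Interrupt sources as encoded in IIR.INTID on a 16550-style UART
 * (assumed layout; check the device TRM). */
enum uart_irq_source {
    UART_IRQ_NONE        = 0, /* IPEND = 1: nothing pending (INTID 0 is reserved) */
    UART_IRQ_THR_EMPTY   = 1, /* transmitter holding register empty */
    UART_IRQ_RX_DATA     = 2, /* receiver data available */
    UART_IRQ_LINE_STATUS = 3, /* receiver line status (highest priority) */
    UART_IRQ_RX_TIMEOUT  = 6  /* character time-out */
};

struct uart_iir_info {
    enum uart_irq_source source;
    int fifo_enabled;          /* 1 when FIFOEN reads back as 3 */
};

/* Decode a raw IIR value that has already been read from the UART. */
struct uart_iir_info uart_decode_iir(uint32_t iir)
{
    struct uart_iir_info info;

    info.fifo_enabled = (((iir >> 6) & 0x3u) == 0x3u);
    if (iir & 0x1u) {
        info.source = UART_IRQ_NONE;  /* IPEND set: no interrupt pending */
    } else {
        info.source = (enum uart_irq_source)((iir >> 1) & 0x7u);
    }
    return info;
}
```

    An ISR would read IIR once, decode it, and loop until the decoded source is UART_IRQ_NONE so that all pending interrupts are cleared.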

    Thanks & regards,
    Sivaraj K

  • Hi Sivaraj,

    Thank you for your detailed response.

    I just have a couple of further questions:

    - I will be using DSP/BIOS for my application, so I am assuming the EDMA3 LLD will be a better option than the EDMA drivers implemented by the C6748 Starterware? In fact I am already using the EDMA3 LLD for the McASP/McBSP BIOSPSP drivers.

    - My current UART driver only uses CPU interrupts for the receiver line status and the data-ready and receiver time-out signals. I am also using the FIFO, and the FIFO Rx trigger level is set to 8 bytes. My UART receive ISR essentially only packs the received bytes into a large circular buffer and advances the buffer write index to indicate that new bytes have been written to the buffer. I then have a DSP/BIOS task which periodically checks if new bytes have been written to the circular buffer and processes them only if there are no higher priority tasks executing.

    By using this approach I can pretty much guarantee that no received bytes (of any variable length) are missed, but I can defer the actual processing of the bytes to whenever the CPU becomes available.

    At the moment it is not immediately apparent to me how I can achieve something similar with the EDMA.

    In a related post below, one of your colleagues explained that the C6748 PSPBIOS EDMA UART driver should be configured for a trigger level of 1 to ensure that any variable length of received bytes are not missed, i.e. the PaRAM sets for DMA should be programmed based on the UART FIFO trigger level.

    http://e2e.ti.com/support/embedded/tirtos/f/355/t/52673.aspx

    If this is the case, does it not mean that the CPU will actually be interrupted more using EDMA, compared to my current approach?

  • Hi,

    Yes, you are right. EDMA3LLD would be the better option.

    EDMA data transfers run in the background, without CPU intervention, according to the priority of their event queue. So, in your case, the CPU will not be interrupted more often when you use EDMA, which means the CPU can work on other tasks while the EDMA services requests from different peripherals through its event queues.

    Thanks & regards,

    Sivaraj K

  • Sivaraj,

    Sorry, I do not quite follow what you are trying to say in that paragraph.

    The actual receiving of UART bytes should be done on a high priority basis, to ensure that no bytes are ever missed even when the CPU load is high while executing something else. For this reason I am currently using a HWI in my non-EDMA UART driver. As I mentioned earlier, my UART Rx ISR is very simple: it basically just copies the available bytes from the FIFO to a circular buffer and manages the buffer so it is possible to determine how many bytes are available and to wrap properly. Under ideal circumstances, the CPU is only interrupted for every 8 bytes received.

    What I want to achieve with using EDMA is to have something like a second level FIFO in external memory, i.e. as any variable length bytes are received, it is simply copied to a buffer in external memory without CPU intervention. However, I am unsure how this external buffer can be "managed" by the EDMA without any CPU intervention, i.e. how is the buffer writing index advanced or when the CPU eventually gets around to processing the received bytes it can somehow determine "Ah, there are 10 new bytes in the buffer" or "Ah, there are 25 bytes but the buffer has wrapped".

    Even a "replicate" of the UART peripheral FIFO would be very useful, e.g. a large 1024-byte FIFO in external memory to which the EDMA copies any received UART bytes, but only at, say, 512-bytes is the CPU interrupted. If less than 512-bytes are present in this buffer, the CPU will not be interrupted but I can "manually" check with a periodic task if there are any unread bytes available.

    Can you please advise if something like this is achievable? 
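
    The bookkeeping asked about above ("there are 10 new bytes", "there are 25 bytes but the buffer has wrapped") can be sketched as follows. This is only the CPU-side arithmetic, under the assumption (not confirmed anywhere in this thread) that the DMA writes into a ring buffer and that its current destination address can be read back, for example from the active PaRAM set. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert the DMA's current destination address into a ring write index.
 * The modulo maps an address of ring_base + ring_size back to index 0. */
size_t ring_write_index(uintptr_t dma_dst_addr, uintptr_t ring_base,
                        size_t ring_size)
{
    return (size_t)(dma_dst_addr - ring_base) % ring_size;
}

/* Bytes written by the DMA that the CPU has not consumed yet.
 * write_idx == read_idx is treated as "empty", so at most
 * ring_size - 1 unread bytes can be represented. */
size_t ring_unread(size_t write_idx, size_t read_idx, size_t ring_size)
{
    return (write_idx >= read_idx) ? (write_idx - read_idx)
                                   : (ring_size - read_idx + write_idx);
}

/* Copy all unread bytes to `out`, handling wrap-around, and advance
 * *read_idx. `out` must hold at least ring_size bytes. Returns the count. */
size_t ring_drain(const uint8_t *ring, size_t ring_size, size_t write_idx,
                  size_t *read_idx, uint8_t *out)
{
    size_t n = ring_unread(write_idx, *read_idx, ring_size);
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = ring[(*read_idx + i) % ring_size];
    }
    *read_idx = (*read_idx + n) % ring_size;
    return n;
}
```

    A periodic task would sample the write index, call ring_drain(), and process the result. The sketch cannot detect the DMA lapping the reader, and on real hardware the cache would need to be invalidated over the ring before the CPU reads it.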

  • Hi,

    Thanks for your update.

    Yes, it is possible for the EDMA to manage the external buffer without CPU intervention. EDMA Rx/Tx requests from the different peripherals are event triggered. In your case, the UART triggers an EDMA Rx event request when it has received data bytes, and the EDMA then copies that data to a buffer in external memory without CPU intervention.

    In other words, the EDMA can copy any received UART bytes to a second-level FIFO buffer in external memory without CPU intervention.

    Thanks & regards,

    Sivaraj K

  • Hi Sivaraj,

    Since I have not explicitly used EDMA peripheral in the past, I am still figuring out how to program it, so at this point I would just like to go a step back and maybe first start with using the EDMA in the simpler case for UART transmissions.

    I have managed to get a very simple example up and running by syncing to the UART Tx EDMA event and then transferring 16 bytes (the depth of the UART FIFO) from external memory to the UART peripheral using AB-sync. I also registered a callback, which successfully executed after the transfer was completed.

    EDIT: what exactly should be performed in the EDMA callback to enable future EDMA transfers on the same channel?

    My question now is, how do I transmit more than 16 bytes with the least possible amount of CPU intervention?

    The simplest way I can think of now is to program the EDMA to transfer the first 16 bytes from external RAM to the UART peripheral, wait for the transfer complete callback to execute, program the EDMA again to transfer the next 16 bytes, and so on until all of the bytes have been transferred.

    With this approach the CPU is interrupted every 16-bytes, so I was wondering if it is somehow possible to do this more efficiently? I saw in the UART EDMA example for SYSBIOS that EDMA channel linking is used, but I have not yet exactly figured out how it works.
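
    For messages longer than the FIFO depth, the split into FIFO-sized frames can be worked out up front. The sketch below only does that arithmetic. It assumes AB-sync with aCnt = 1 and bCnt = 16, so each UART Tx event moves one 16-byte frame and cCnt counts the frames; handling the shorter tail in a second, linked PaRAM set is one possible approach, not something confirmed in this thread. The names are hypothetical.

```c
#include <stdint.h>

#define UART_FIFO_DEPTH 16u

struct tx_plan {
    uint32_t full_frames;  /* events that each move a full FIFO (cCnt when bCnt = 16) */
    uint32_t tail_bytes;   /* bytes left over for one final, shorter frame (0 if none) */
    uint32_t total_events; /* UART Tx events needed for the whole message */
};

/* Split a message of num_bytes into FIFO-sized frames plus an optional tail. */
struct tx_plan plan_uart_tx(uint32_t num_bytes)
{
    struct tx_plan p;

    p.full_frames  = num_bytes / UART_FIFO_DEPTH;
    p.tail_bytes   = num_bytes % UART_FIFO_DEPTH;
    p.total_events = p.full_frames + (p.tail_bytes ? 1u : 0u);
    return p;
}
```

    For a 100-byte message this gives 6 full frames, a 4-byte tail and 7 events in total. If only the final completion interrupt is enabled, the CPU would be interrupted once per message rather than once per frame.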

  • Reinier,

    Please go to the TI Wiki Pages and find the C6000 Embedded Design Workshop, which is a follow-on to the Intro to the TI-RTOS Kernel Workshop. You need to learn the basics and then the details of using the C6748, and this course material is designed to bring you up to the level you need to be. There is material on using the EDMA3 for accessing peripherals. You will learn how to make the EDMA3 fully automated so you will not need to do anything in the callback other than clearing the interrupt from the EDMA3, which should be done automatically in the LLD anyway.

    Regards,
    RandyP

  • Hi Randy,

    Thank you for the links.

    I am very familiar with the C6748, however I have not previously worked with the EDMA peripheral in much detail.

    I went thoroughly through most of the course material, which explains the various ways in which the EDMA can be programmed with the LLD. However, most of the examples provided for using EDMA with peripherals did not really cover a peripheral such as the UART.

    In any case, I think I managed to properly configure the EDMA for UART transmissions, but it appears as though my current problem is more with the UART generating EDMA events. I configure/use the UART+EDMA as follows:

    void Rm_c6748_uart_edma::_init_edma() {
    
    
    	EDMA3_DRV_Result edma3Result = EDMA3_DRV_SOK;  // return value for some driver calls so they can return an error
    
    	EDMA3_RM_EventQueue event_q = 0; // both used below in _requestChannel()
    	EDMA3_RM_TccCallback tcc_cb = &_uart_edma_tx_complete;
    
    
    	_edma_tx_ping_channel = CSL_EDMA3_CHA_UART0_TX;
    	_edma_tx_ping_tcc = CSL_EDMA3_CHA_UART0_TX;
    
    	// get channel for transfer - iChannel
    	edma3Result = EDMA3_DRV_requestChannel(
    			_edma_handle,
    		    &_edma_tx_ping_channel,
    		    &_edma_tx_ping_tcc,
    		    event_q,
    		    tcc_cb,
    		    (void *)this
    		    );
    
    	_default_edma_tx_params.aCnt = 1;
    	_default_edma_tx_params.bCnt = 16;
    	_default_edma_tx_params.bCntReload = 0;
    	_default_edma_tx_params.cCnt = 64;
    	_default_edma_tx_params.destAddr = dest_addr;
    	_default_edma_tx_params.destBIdx = 0;
    	_default_edma_tx_params.destCIdx = 0;
    	_default_edma_tx_params.linkAddr = 0xFFFF;
    	_default_edma_tx_params.opt = 0x01309004;	// AB-sync, TCINTEN enabled, ITCINTEN enabled
    	_default_edma_tx_params.srcAddr = 0x00000000;
    	_default_edma_tx_params.srcBIdx = 1;
    	_default_edma_tx_params.srcCIdx = 16;
    }
    
    void  Rm_c6748_uart_edma::_start_uart_dma_tx(
    		unsigned char *tx_buf_ptr,
    		unsigned int num_bytes_to_tx
    		) {
    
    	CSL_UartRegsOvly uart_regs = (CSL_UartRegsOvly)_uart_base_addr;
    
    	EDMA3_DRV_Result edma3Result = EDMA3_DRV_SOK;
    
    	EDMA3_DRV_PaRAMRegs current_param = _default_edma_tx_params;
    
    	uart_regs->FCR &= ~CSL_UART_FCR_DMAMODE1_MASK;
    
    	tx_bytes_remaining = num_bytes_to_tx;
    
    	current_param.srcAddr = (unsigned int)tx_buf_ptr;
    	current_param.cCnt = tx_bytes_remaining / 16;	// number of 16-byte frames; any remainder is dropped by the integer division
    
    	BCACHE_wb(
    			(Ptr)tx_buf_ptr,
    			tx_bytes_remaining,
    			TRUE
    			);
    
    	// write the prepared PaRAM set to the Tx channel before enabling it
    	EDMA3_DRV_setPaRAM(
    			_edma_handle,
    			_edma_tx_ping_channel,
    			&current_param
    			);
    
    	edma3Result = EDMA3_DRV_enableTransfer(_edma_handle,
    										   _edma_tx_ping_channel,
    										   EDMA3_DRV_TRIG_MODE_EVENT
    										   );
    
    	uart_regs->FCR |= CSL_UART_FCR_DMAMODE1_MASK;
    }
    
    void _uart_edma_tx_complete(
    		unsigned int tcc,
    		EDMA3_RM_TccStatus status,
    		void *app_data
    		) {
    
    	unsigned int key = (unsigned int)_disable_interrupts();
    	Rm_c6748_uart_edma *uart_inst_ptr = (Rm_c6748_uart_edma *)app_data;
    	unsigned int uart_base_addr = uart_inst_ptr->get_uart_base_addr();
    	CSL_UartRegsOvly uart_regs = (CSL_UartRegsOvly)uart_base_addr;
    
    	uart_regs->FCR &= ~CSL_UART_FCR_DMAMODE1_MASK;
    
    	uart_inst_ptr->tx_bytes_remaining -= 16;
    
    	if(uart_inst_ptr->tx_bytes_remaining > 0) {
    
    		uart_regs->FCR |= CSL_UART_FCR_DMAMODE1_MASK;
    	}
    	else {
    
    		SEM_post(uart_inst_ptr->tx_edma_complete_sem_handle);
    	}
    
    	_restore_interrupts(key);
    }

    The problem I am seeing seems to be that the UART Tx EDMA signal is not automatically re-triggered after an EDMA transfer, which means that I have to "manually" trigger it again in the EDMA callback function after every 16 bytes have been transferred to the UART FIFO. This is achieved by first clearing the DMAMODE1 bit in the FCR register and then setting it again.

    It appears as though the C6748 BIOSPSP UART driver that came with the latest C6748 Starterware does something similar.

    Surely there must be a better way to enable automatic re-triggering of this EDMA event?
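
    The OPT word used in the code above, 0x01309004, can be unpacked to see which completion interrupts it requests. The sketch below assumes the standard EDMA3 PaRAM OPT bit positions (SYNCDIM in bit 2, TCC in bits 17:12, TCINTEN in bit 20, ITCINTEN in bit 21, and so on), which should be checked against the TRM. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Selected fields of an EDMA3 PaRAM OPT word (assumed bit positions). */
struct edma_opt_fields {
    unsigned sam;      /* bit 0:  source address mode (0 = increment) */
    unsigned dam;      /* bit 1:  destination address mode (0 = increment) */
    unsigned syncdim;  /* bit 2:  0 = A-sync, 1 = AB-sync */
    unsigned tcc;      /* bits 17:12: transfer completion code */
    unsigned tcinten;  /* bit 20: interrupt on final completion */
    unsigned itcinten; /* bit 21: interrupt on every intermediate completion */
    unsigned tcchen;   /* bit 22: chaining on final completion */
    unsigned itcchen;  /* bit 23: chaining on intermediate completion */
};

struct edma_opt_fields edma_decode_opt(uint32_t opt)
{
    struct edma_opt_fields f;

    f.sam      = opt & 1u;
    f.dam      = (opt >> 1) & 1u;
    f.syncdim  = (opt >> 2) & 1u;
    f.tcc      = (opt >> 12) & 0x3Fu;
    f.tcinten  = (opt >> 20) & 1u;
    f.itcinten = (opt >> 21) & 1u;
    f.tcchen   = (opt >> 22) & 1u;
    f.itcchen  = (opt >> 23) & 1u;
    return f;
}

/* Same OPT word with intermediate completion interrupts switched off,
 * so that only the final frame of the transfer raises an interrupt. */
uint32_t edma_opt_final_irq_only(uint32_t opt)
{
    return opt & ~(UINT32_C(1) << 21);
}
```

    Decoding 0x01309004 this way gives AB-sync, TCC = 9, and both TCINTEN and ITCINTEN set, which is consistent with the callback running after every 16-byte frame. Clearing ITCINTEN yields 0x01109004; whether that alone removes the need to re-trigger the UART event is not established here.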

  • Reinier,

    Can you show some code from the BIOSPSP that clears DMAMODE1? The UART User Guide says this must always be set to 1, and that this needs to be done after any device reset. In fact, it may be that after a device reset that you need to set it to 0 and then set it to 1, then leave it at 1 forever.

    I do not know if this could be part of the problem, but I am concerned if the BIOSPSP is clearing this bit and expecting the EDMA to work with the UART.

    Regards,
    RandyP