QMSS/PKTDMA - TX packet queues and sending packets out

Hi TI Folks,

Duplicate post: I inadvertently posted this on the BIOS forum originally, though it appears to belong on the multicore C66x forum.

On our current TI C64x-based product we use the EDMA engine to send packets off the DSP. The packets for multiple channels are stored in buffers in external memory, and each of these buffers is sent out one at a time, with a DMA wait before the next EDMA send if the previous EDMA has not finished sending its packet from the DSP to an FPGA. These buffers are set up at initialization and are used by each channel for as long as the DSP is running.

Moving forward to our new product, where these packets are sent out through the TX queues using the PKTDMA to the Ethernet switch etc., I see one design issue. I would like these buffers to be fixed per channel for the duration, so at initialization I could grab TX buffer descriptors (by popping them off the TX free descriptor queue) along with buffers to hold the Ethernet packets, push them onto the TX queue, and have them sent to another core or out through the DSP switch. However, when the sending is complete I do not want them popped back to the TX free queue, because I want the buffers to physically stay at the same location for subsequent use by that channel. Can I do that? This avoids transferring these packet buffers from DDR3 for each channel, using the EDMA engine, to a separate set of TX descriptors/buffers that are then pushed to the TX queue for sending via the PKTDMA engine.
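For illustration, this is roughly what I have in mind at initialization (a rough sketch against the QMSS/CPPI LLD calls; NUM_CHANNELS, PKT_BUF_SIZE, chanBuf and chanDesc are names I made up for this example):

#include <ti/drv/qmss/qmss_drv.h>
#include <ti/drv/cppi/cppi_desc.h>

#define NUM_CHANNELS 128
#define PKT_BUF_SIZE 1500

extern uint8_t chanBuf[NUM_CHANNELS][PKT_BUF_SIZE]; /* fixed DDR3 buffers, one per channel */
Cppi_Desc *chanDesc[NUM_CHANNELS];                  /* the descriptor each channel keeps */

void initChannelDescriptors(Qmss_QueueHnd txFdq)
{
    int ch;
    for (ch = 0; ch < NUM_CHANNELS; ch++) {
        /* Pop a free descriptor once; the channel holds on to it from then on */
        Cppi_Desc *desc = (Cppi_Desc *) QMSS_DESC_PTR(Qmss_queuePop(txFdq));
        /* Point the descriptor at the channel's fixed buffer (real code would
           also set the original buffer pointer/length and handle cache coherency) */
        Cppi_setData(Cppi_DescType_HOST, desc, chanBuf[ch], PKT_BUF_SIZE);
        chanDesc[ch] = desc;
    }
}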

Any thoughts?

Thanks, Aamir

  • One way I could do that is to have each of the channels on each of the cores use a unique queue number, so that when the first transmit completes, the descriptor is pushed back to that particular channel's TX FDQ, and the channel subsequently pops the descriptor from its own FDQ. For 8 cores and 128 channels, I would need 1024 queues. I would prefer not to do that, but I want to know if it is the only way.
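
    Roughly, the numbering would be something like this (FDQ_BASE and CHANNELS_PER_CORE are illustrative values, not real queue assignments):

    /* One FDQ per (core, channel) pair: 8 * 128 = 1024 general-purpose queues */
    #define FDQ_BASE          2000  /* first GP queue reserved for these FDQs (made up) */
    #define CHANNELS_PER_CORE 128

    static inline int channelFdqNum(int coreId, int channel)
    {
        return FDQ_BASE + (coreId * CHANNELS_PER_CORE) + channel;
    }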

    Thanks, Aamir

  • Hi Aamir,

    Let me address one of your concerns, which I believe is the primary reason for the question. When packets are pushed/popped from one queue to another queue managed by the Multicore Navigator, the descriptor memory (as well as the payload buffer it contains or points to, monolithic vs. host descriptor) does not move in memory. The only thing that happens is that the PKTDMA of the particular IP being used to move data writes to the QM to update the linking RAM, reflecting the new location (i.e. queue) of the descriptor.

    Having said that, I should also clarify that TX queues are mapped to specific channels of their respective PKTDMAs (i.e. for NetCP, SRIO, QMSS). This means that if you select the same TX queue (e.g. Q640 -> CH0 of the NetCP PKTDMA) as the recycle queue, you would create an "infinite loop" of operation: as long as a TX queue has packets and the channels are active, they will continue to pop and process the descriptors. You will need to recycle the packet to a general-purpose queue. As noted above, this does not move the descriptor and buffer data in memory, so you are not "flooding" the SCR with big memory transactions.

    So I think what you want is to have the cores monitor the recycle queues (the recycle queue can be the same FDQ used before), process the descriptors (there might be protocol-specific information, error flags, etc. that need checking), and then push them back to the TX queues.
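
    As a rough illustration of that flow (sketch only; the queue handles, HOST_DESC_SIZE and the processing step are placeholders, and descriptor cache maintenance is omitted):

    #include <ti/drv/qmss/qmss_drv.h>
    #include <ti/drv/cppi/cppi_desc.h>

    #define HOST_DESC_SIZE 128  /* assumed descriptor size */

    void sendPacket(Qmss_QueueHnd txQ, Qmss_QueueHnd recycleQ,
                    Cppi_Desc *desc, uint32_t payloadLen)
    {
        /* Recycle to a general-purpose queue, never to the TX queue itself */
        Cppi_setReturnQueue(Cppi_DescType_HOST, desc, Qmss_getQueueNumber(recycleQ));
        Cppi_setPacketLen(Cppi_DescType_HOST, desc, payloadLen);
        Qmss_queuePushDescSize(txQ, desc, HOST_DESC_SIZE);  /* hand it to the PKTDMA */
    }

    void pollRecycleQueue(Qmss_QueueHnd recycleQ)
    {
        Cppi_Desc *desc;
        /* Drain whatever the PKTDMA has returned so far */
        while ((desc = (Cppi_Desc *) QMSS_DESC_PTR(Qmss_queuePop(recycleQ))) != NULL) {
            /* Check protocol-specific info, error flags, etc. here; after that the
               descriptor (and its buffer) is free to be refilled and pushed again */
        }
    }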

    Are you planning to use the same packets again and again to send them over Ethernet? A little more information on your application scenario would help steer you in the right direction.

    Regards,

    Javier

  • Javier,

    Thanks for your note. Yes, I realize the actual data is not moved from the buffers once a descriptor is popped and pushed onto a TX queue, but it is copied by the infrastructure PKTDMA, or by another IP's PKTDMA such as the PA's, if it is going out through the NetCP.

    Maybe I was not clear enough in my explanation above, so here is a short summary of how I want to use it. I want to have multiple channels, say 128, each with a buffer that is filled every 10 ms with RTP data to be transmitted out of the DSP device over Ethernet. Once it is transmitted, I can overwrite the data in the buffer with newer data for the next 10 ms and repeat the process. I do this today using the EDMA engine in the C64x processor with these per-channel buffers, and I want to do the same on the C6678. I can do it in two ways.

    Scenario one: pop a descriptor from a TX FDQ, copy the RTP data from the channel buffer to the buffer pointed to by the free descriptor, and push it to the PA TX queue (say Q640). Once the transmit finishes, the descriptor is pushed back to the TX FDQ. In the meantime, I can start processing the second channel's transmit: pop another descriptor, copy the second channel's RTP data from its channel buffer to the buffer pointed to by this free descriptor, and push it to the PA TX queue. We could have a few of these descriptors for the 128 channels, probably not a full 128, because as long as the transmits are fast enough we will not starve the queues (i.e. fail to pop a descriptor because none are free).

    Scenario two: store each channel's RTP data directly in the buffer pointed to by a descriptor from the TX FDQ. Each RTP channel grabs a descriptor at the beginning, has its buffer filled every 10 ms, and then transmits by pushing the descriptor onto the TX queue (Q640, say). The problem is that once the transmit completes, I want subsequent RTP data for this channel to be written to this same buffer, so if I were to pop a descriptor I would want it to be the same descriptor I used in the previous transmit, and I obviously cannot guarantee that. To deal with this, each channel would need its own TX FDQ, so that popping a descriptor for the next transmit always yields the same descriptor. The alternative is that when the first transmit completes and the descriptor is free for reuse, we do not push it back to the TX FDQ at all and instead reuse it directly for the next transmit. So my question is: can I do that somehow? How can I check contents within the descriptor that tell me the previous transmit has completed, so I can overwrite the buffer with more incoming data, instead of the customary way of knowing, which is popping from a TX FDQ?
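
    To make scenario two concrete, something like this is what I am hoping is possible: one shared recycle queue rather than 1024 per-channel FDQs, with each descriptor tagged so the recycle poller knows which channel it belongs to. This is a rough sketch; the inFlight flags are my own bookkeeping, the descriptor size is assumed, and it relies on the descriptors carrying the EPIB/software-info words and those surviving the TX round trip:

    #include <ti/drv/qmss/qmss_drv.h>
    #include <ti/drv/cppi/cppi_desc.h>

    #define NUM_CHANNELS 128

    extern Cppi_Desc *chanDesc[NUM_CHANNELS];  /* grabbed once at initialization */
    volatile uint8_t  inFlight[NUM_CHANNELS];  /* 1 while the PKTDMA owns the descriptor */

    int transmitChannel(Qmss_QueueHnd txQ, Qmss_QueueHnd recycleQ,
                        int ch, uint32_t len)
    {
        if (inFlight[ch])
            return -1;                          /* previous 10 ms packet not out yet */

        Cppi_Desc *desc = chanDesc[ch];
        Cppi_setSoftwareInfo0(Cppi_DescType_HOST, desc, (uint32_t) ch); /* tag with channel id */
        Cppi_setReturnQueue(Cppi_DescType_HOST, desc, Qmss_getQueueNumber(recycleQ));
        Cppi_setPacketLen(Cppi_DescType_HOST, desc, len);
        inFlight[ch] = 1;
        Qmss_queuePushDescSize(txQ, desc, 128); /* 128 = assumed host descriptor size */
        return 0;
    }

    void reapCompletions(Qmss_QueueHnd recycleQ)
    {
        Cppi_Desc *desc;
        while ((desc = (Cppi_Desc *) QMSS_DESC_PTR(Qmss_queuePop(recycleQ))) != NULL) {
            uint32_t ch = Cppi_getSoftwareInfo0(Cppi_DescType_HOST, desc);
            inFlight[ch] = 0;                   /* channel buffer may be overwritten again */
        }
    }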

    Thanks, Aamir

  • Hello Folks at TI,

    Can someone from TI please respond to this? Much appreciated. Thanks, Aamir