
QMSS in Infrastructure Mode

Hello,

I've been studying the Multicore Navigator using the Multicore Navigator User Guide, the CPPI/QMSS LLD spec, and the infrastructure-mode (single-core) example in the "\pdk_C6670_1_0_0_21\packages\ti\drv\qmss\example" folder. Here are my questions:

1) As far as I understand, if we push a descriptor onto the INFRASTRUCTURE_QUEUE, it is automatically transferred to the Rx queue, where we can then pop it and get the data. What I don't understand in the example is the role of the Rx flow: if the Tx channel and Rx channel are hard-wired and the Rx queue automatically receives the transmitted Tx packets, why is the flow needed?

2) OK, so Tx completion queues are used to recycle Tx descriptors. In the example I am referring to, the Tx completion queue is number 1000, but the Tx free descriptor pool is queue 736. Why are the descriptors not recycled back to queue 736, as is done with the Rx free queues (before reception the Rx queue count is 8 and the Rx free queue count is 16; after reception the Rx free queue count is 24 at queue number 737, where the Rx free descriptors are allocated)?

Thanks in advance!!!

Burak

  • The RX channel and TX channel are hard-wired, but the RX flow is needed to direct the descriptors coming in on the RX channel to a queue. As for question 2, I assume the descriptor in the completion queue is automatically recycled to the free queue; I will let a QMSS expert confirm that.

    Thanks,

    Arun.

  • OK, I understand that the Rx flow, as the name indicates, controls the flow between the Rx port and the Rx queues. In the Multicore Navigator User Guide, Section 6.5 "Questions and Answers" on page 6-15, it says: "Note that the Tx descriptor defines the Tx FDQ (and Rx Flow) to be used, and the Rx Flow defines the Rx Q and Rx FDQ." If I look into the example, I can clearly see in rxFlowCfg how the Rx Q and Rx FDQ are set, but what I don't understand is how the Tx descriptor defines the Rx flow. I couldn't find any config parameter that tells the Rx flow "hey, we are operating in loopback mode because this is the QMSS PKTDMA".

    Thanks,

    Burak

  • Hi Burak,

    The TX FDQ is actually defined by the user. It is basically the Q that contains the descriptors with empty buffers ready to be used. The user pops a descriptor from the TX FDQ and fills it up with data. The RX flow used is determined by the fact that the TX Q chosen is tied to a TX channel N, which will trigger a reception on RX channel N and thus will use RX flow N. The return ("recycle") queue for the TX descriptor is set by updating the return-queue field of the TX descriptor used. A CPPI API call such as Cppi_setReturnQueue() before pushing the descriptor will also do that. I have not looked at the QMSS example, but I assume that it might be setting this field to a different queue than the TX FDQ.

    Regards,

    Javier

  • Hello,

    first of all, thank you for your answers! I have almost understood how the Multicore Navigator works. In the descriptor cfg struct, the return queue is set to 1000, although the TX FDQ number is 736. What remains unanswered for me is why a different queue number is used than the FDQ pool, and how the TX completion queue descriptors can be used again. In the example, the queue counts are:

    --------------------------------------------------------

    Before transmission:

    TX Free = 8, RX Free = 24, TX Comp = 0

    After transmitting (before reception):

    TX Free = 0, RX Free = 16, TX Comp = 8

    After reception:

    TX Free = 0, RX Free = 24, TX Comp = 8

    --------------------------------------------------------

    So, the Rx descriptors are recycled back to the RX free pool, but this is not the case for the TX descriptors. If we want to make a new transmission, is there a way to use the TX completion queue as the free TX pool, or do we have to re-allocate the TX FDQ?

    Best Regards,

    Burak

  • Hi Burak,

    Your understanding is correct. Remember, this is just an example; you can set the TX FDQ as your TX Comp Q. Since a transmit operation is initiated by the core, it is software's job to recycle the packets and repopulate the TX FDQ (if you did not set it up to be the TX Comp Q).

    So, then, why use a TX Comp Q different from the TX FDQ? At first glance it does not make sense: you just want to pop the descriptor to inspect it for potential error flags etc., relink the buffer if necessary, and push it to the IP's TX Q. Why an extra push from the TX Comp Q to the FDQ, just to pop it back again and push it to the TX Q?

    Well, imagine a real scenario where you may have a thread constantly generating packets and populating the FDQ, and another (maybe running on another core) popping the FDQ and pushing to the TX Q. You don't want the recycled packets intermixed in the same FDQ with the ready packets; you would need to sort them out and reorganize the FDQ.

    The idea, then, is that you can have a main FDQ which you use to populate all TX & RX FDQs, and for each TX and RX FDQ you can have a respective completion/recycle queue. As an example, you can partition your software into three types of threads:

    1) queue management thread (manages the memory region, and keeps track of the main FDQ)

    2) recycle thread (pops from FDQ or Comp Q, inspects and repopulates TX & RX FDQ) 

    3) TX/RX thread (pops from the TX FDQ and pushes to the TX Q, pops from RX Q and process data)

    You can allocate the threads across the cores and/or instantiate multiple instances of these threads to manage multiple memory regions and PKTDMAs (i.e. QMSS, NetCP, SRIO, etc.).

    Hope this simple example illustrates the concept behind FDQ and completion queues.

      Regards,

    Javier

     

     

  • Hello Javier,

    THANK YOU for your detailed answer! Especially the examples/scenarios cleared all the clouds in my mind. :)

    Best regards,

    Burak

  • You are most welcome. :)

  • Thanks, that makes sense.

  • Javier,

    I had a related question on infrastructure PKTDMA queues across multiple cores. I want to be able to transfer packets between cores using the infrastructure PKTDMA.

    There are 32 channels available for use, and each TX channel is associated with the corresponding RX channel. Does one assign a particular channel (queues 800-831) for use by each core, say 800 for core 0, 801 for core 1, and so on? So if core 0 wants to transmit to core 5, it pops a descriptor from the core 0 TX FDQ (a general-purpose queue), then pushes the packet to queue 805; when that completes, it puts the descriptor back on the core 0 TX FDQ for use in future transmits.

    Thanks, Aamir 

  • Can someone from TI please respond to this. Much appreciated. Thanks, Aamir

  • Hi Aamir,

    I understand it is good to know how the TX channel and RX channel work; especially for debugging purposes, this allows you to know the hardware mechanism in order to isolate issues.

    However, from a software development point of view, the processes/threads only need to be conscious of the queue mechanism. That is the intent of the QMSS/PKTDMA partition. The TX Qs are tied to TX channels, but the RX channels are not tied to any particular queue. The RX flow determines the destination (RX) Q, so you can use any general-purpose queue (or other special queue, i.e. an accumulator Q). So you can give each process/thread 1 TX Q and 1 RX Q, and you can fix these processes/threads to particular cores. That way you end up with each core managing 1-N TX/RX Qs (depending on application latency/throughput needs).

    Regards,

    Javier 

     

  • Javier,

    I think I get the point about the RX channels not being tied to a particular queue and the TX Qs being tied to TX channels, but for my example, how do I go about doing what I want? I have a packet on core 0 that I know needs to be sent to core 5, for example. Do I send it to core 0's TX queue, say 800, and set the RX flow to 5, where 5 corresponds to the RX flow used by core 5, so that the infrastructure PKTDMA determines where it needs to end up based on the flow?

    A little background on what I am trying to do. I am receiving packets via UDP, with a range of ports dedicated to each core, so I have the PA programmed to ensure that packets get directed to the appropriate core. On a given core, after processing, I might decide I need to send a packet to some other port on some other core, or for that matter some other DSP chip. This then gets sent out via the PA and network coprocessor, out through the switch. However, when we know that the destination is within the same DSP chip, I would like to make use of the infrastructure PKTDMA to get the packet to the destination core directly, bypassing the PA for the sake of efficiency.

    Thanks, Aamir

  • Javier,

    I am currently trying to implement QMSS infrastructure core-to-core transfers for multicore.

    I was able to get it to work for a single core, where I did the following steps:

    1) Initialize the QMSS CPDMA with a Cppi_open call.

    2) Open a single Tx channel using Cppi_txChannelOpen with the CPDMA handle from the first step.

    3) Open 32 Rx channels using Cppi_rxChannelOpen.

    4) Do a Qmss_queueOpen for the infrastructure QMSS queue, i.e. 800.

    5) Configure the Rx flow, where rxFlowCfg sets up the Rx free queue to grab a descriptor from and also the Rx queue to put it into. This is done by:

     if ((gRxQmssCommandFlowHnd = Cppi_configureRxFlow (gQmssCpdmaHnd, &rxFlowCfg, &isAllocated)) == NULL)

    6) When I want to send a packet out from core 0 using the infrastructure QMSS, I push to queue 800, i.e. the handle from step 4:

          Qmss_queuePushDescSize (gQmssTxQHnd[0], pCppiDesc, SIZE_HOST_DESC);

    This all works and I am able to send a packet out of core 0 and it is received by my receivepacket function on the same core.

    However, when I move to multicore (two cores, actually) I do the following: 1) as above; 2) as above, but creating two Tx channels; 3) as above; 4) still done on core 0, but now for two queues, 800 and 801; 5) as is.

    When I try running this on the second core, it fails on the if check in step 5, i.e.

     if ((gRxQmssCommandFlowHnd = Cppi_configureRxFlow (gQmssCpdmaHnd, &rxFlowCfg, &isAllocated)) == NULL)

    Every core has its own rxFlowCfg, but the QmssCpdmaHnd should still be the same, so I am not sure why it fails.

    Any ideas?

    Thanks, Aamir