
QMSS in Infrastructure Mode

Hello,

I've been studying the Multicore Navigator using the Multicore Navigator User Guide, the CPPI/QMSS LLD spec, and the infrastructure-mode (single-core) example in the "\pdk_C6670_1_0_0_21\packages\ti\drv\qmss\example" folder. Here are my questions:

1) As far as I understand, if we push a descriptor onto the INFRASTRUCTURE_QUEUE, it is automatically transferred to the Rx queue, where we can then pop it and get the data. What I don't understand in the example is the role of the Rx flow: if the Tx channel and Rx channel are hard-wired and the Rx queue automatically receives the transmitted Tx packets, why is the flow needed?

2) OK, so Tx completion queues are used to recycle Tx descriptors. In the example I am referring to, the Tx completion queue is number 1000, but the Tx free descriptor pool is queue 736. Why are the descriptors not recycled back to queue 736, as is done with the Rx free queues (before reception the Rx queue count is 8 and the Rx free queue count is 16; after reception the Rx free queue count is 24 at queue number 737, where the Rx free descriptors are allocated)?

Thanks in advance!!!

Burak

  • The RX channel and TX channel are hard-wired, but the RX flow is needed to direct the descriptors coming in on the RX channel to a queue. As for question 2, I assume the descriptor in the completion queue is automatically recycled to the free queue; I will let a QMSS expert confirm that.

    Thanks,

    Arun.

  • OK, I understand that the Rx flow, as the name indicates, controls the flow between the Rx port and the Rx queues. In the Multicore Navigator User Guide, Section 6.5 "Questions and Answers" on page 6-15, it says: "Note that the Tx descriptor defines the Tx FDQ (and Rx Flow) to be used, and the Rx Flow defines the Rx Q and Rx FDQ." If I look into the example, I can clearly see in rxFlowCfg how the Rx Q and Rx FDQ are set, but what I don't understand is how the Tx descriptor defines the Rx flow. I couldn't find any config parameter that tells the Rx flow "hey, we are operating in loopback mode because this is the QMSS PKTDMA".

    Thanks,

    Burak

  • Hi Burak,

    The TX FDQ is actually defined by the user. It is basically the Q that contains the descriptors with empty buffers ready to be used. The user pops a descriptor from the TX FDQ and fills it up with data. The RX flow used is determined by the fact that the TX Q chosen is tied to a TX channel N, which will trigger a reception on RX channel N and thus will use RX flow N. The return ("recycle") queue for the TX descriptor is set by updating the return-queue field of the TX descriptor used. A CPPI API call such as Cppi_setReturnQueue() before pushing the descriptor will also do that. I have not looked at the QMSS example, but I assume that it might be setting this field to a different queue than the TX FDQ.

    Regards,

    Javier

  • Hello,

    first of all, thank you for your answers! I have almost understood how the Multicore Navigator works. In the descriptor cfg struct, the return queue is set to 1000, although the TX FDQ number is 736. What remains unanswered for me is why a different queue number is used than the FDQ pool, and how the TX completion queue descriptors can be used again. In the example, the queue counts are:

    --------------------------------------------------------

    Before transmission:

    TX Free = 8, RX Free = 24, TX Comp = 0

    After transmitting (before reception):

    TX Free = 0, RX Free = 16, TX Comp = 8

    After reception:

    TX Free = 0, RX Free = 24, TX Comp = 8

    --------------------------------------------------------

    So, the Rx descriptors are recycled back to the RX free pool, but this is not the case for the TX descriptors. If we want to make a new transmission, is there a way to use the TX completion queue as the free TX pool, or do we have to re-allocate the TX FDQ?

    Best Regards,

    Burak

  • Hi Burak,

    Your understanding is correct. Remember, this is just an example; you can set the TX FDQ as your TX Comp Q. Since a transmit operation is initiated by the core, it is software's job to recycle the packets and repopulate the TX FDQ (if you did not set it up to be the TX Comp Q).

    So, then, why use a TX Comp Q different from the TX FDQ? At first glance it does not make sense: you just want to pop the descriptor to inspect it for potential error flags etc., relink the buffer if necessary, and push it to the IP's TX Q. Why an extra push from the TX Comp Q to the FDQ, just to pop it back again and push it to the TX Q?

    Well, imagine a real scenario where you may have a thread constantly generating packets and populating the FDQ, and another (maybe running on another core) popping the FDQ and pushing to the TX Q. You don't want the recycled packets intermixed in the same FDQ with the ready packets; you would need to sort them out and reorganize the FDQ.

    The idea, then, is that you can have a main FDQ which you use to populate all TX & RX FDQs, and for each TX and RX FDQ you can have a respective completion/recycle queue. As an example, you can partition your software into three types of threads:

    1) queue management thread (manages the memory region, and keeps track of the main FDQ)

    2) recycle thread (pops from FDQ or Comp Q, inspects and repopulates TX & RX FDQ) 

    3) TX/RX thread (pops from the TX FDQ and pushes to the TX Q, pops from RX Q and process data)

    You can allocate the threads across the cores and/or instantiate multiple instances of these threads to manage multiple memory regions and PKTDMAs (i.e. QMSS, NetCP, SRIO, etc.).

    Hope this simple example illustrates the concept behind FDQ and completion queues.

      Regards,

    Javier

     

     

  • Hello Javier,

    THANK YOU for your detailed answer! Especially the examples/scenarios cleared all the clouds in my mind. :)

    Best regards,

    Burak

  • You are most welcome. :)

  • Thanks, that makes sense.

  • Javier,

    I had a related question on infrastructure PKTDMA queues across multiple cores. I want to be able to transfer packets between cores using the infrastructure PKTDMA.

    There are 32 channels available for use, and each TX channel is associated with the corresponding RX channel. Does one assign a particular channel (queues 800-831) for use by each core, say 800 for core 0, 801 for core 1, and so on? So if core 0 wants to transmit to core 5, it pops a descriptor from the core 0 TX FDQ (a general-purpose queue), then pushes the packet to queue 805; when that completes, it puts the descriptor back on the core 0 TX FDQ for use in future transmits.

    Thanks, Aamir 

  • Can someone from TI please respond to this. Much appreciated. Thanks, Aamir

  • Hi Aamir,

    I understand it is good to know how the TX channel and RX channel work; especially for debugging purposes, this allows you to know the hardware mechanism in order to isolate issues.

    However, from a software development point of view, the processes/threads only need to be conscious of the queue mechanism. That is the intent of the QMSS/PKTDMA partition. The TX Qs are tied to TX channels, but the RX channels are not tied to any particular queue. The RX flow determines the destination (RX) Q, so you can use any general-purpose queue (or other special queue, i.e. an accumulator Q). So you can give each process/thread 1 TX Q and 1 RX Q, and you can fix these processes/threads to particular cores. That way you end up with each core managing 1-N TX/RX Qs (depending on application latency/throughput needs).

    Regards,

    Javier 

     

  • Javier,

    I think I get the point about the RX channels not being tied to a particular queue and the TX Qs being tied to TX channels, but for my example, how do I go about doing what I want? I have a packet on core 0 that I know needs to be sent to core 5, for example. Do I send it to core 0's TX queue, say 800, and set the RX flow to 5, where 5 corresponds to the RX flow used by core 5, so that the infrastructure PKTDMA determines where it needs to end up based on the flow?

    A little background on what I am trying to do. I am receiving packets via UDP, with a range of ports dedicated to each core, so I have the PA programmed to ensure that packets get directed to the appropriate core. On a given core, after processing, I might decide I need to send a packet to some other port on some other core, or for that matter some other DSP chip. This then gets sent out via the PA and network coprocessor, out through the switch. However, when we know that the destination is within the same DSP chip, I would like to make use of the infrastructure PKTDMA to get the packet to the destination core directly, bypassing the PA for the sake of efficiency.

    Thanks, Aamir

  • Javier,

    I am currently trying to implement QMSS infrastructure core-to-core transfers for multicore.

    I was able to get it to work for a single core, where I did the following steps:

    1) Initialize the QMSS CPDMA with a Cppi_open call.

    2) Open a single Tx channel using Cppi_txChannelOpen with the CPDMA handle from the first step.

    3) Open 32 Rx channels using Cppi_rxChannelOpen.

    4) Do a Qmss_queueOpen for the infrastructure QMSS queue, i.e. 800.

    5) Configure the Rx flow, where rxFlowCfg sets up the Rx free queue to grab a descriptor from and also the Rx queue to put it into. This is done by:

     if ((gRxQmssCommandFlowHnd = Cppi_configureRxFlow (gQmssCpdmaHnd, &rxFlowCfg, &isAllocated)) == NULL)

    6) When I want to send a packet out from core 0 using the infrastructure QMSS, I push to queue 800, i.e. the handle from step 4:

          Qmss_queuePushDescSize (gQmssTxQHnd[0], pCppiDesc, SIZE_HOST_DESC);

    This all works and I am able to send a packet out of core 0 and it is received by my receivepacket function on the same core.

    However, when I move to multicore (two cores, actually) I do the following: 1) as above; 2) as above, but creating two Tx channels; 3) as above; 4) still done on core 0, but now for two queues, 800 and 801; 5) as is.

    When I try running this on the second core, it fails on the if check in step 5, i.e.

     if ((gRxQmssCommandFlowHnd = Cppi_configureRxFlow (gQmssCpdmaHnd, &rxFlowCfg, &isAllocated)) == NULL)

    Every core has its own rxFlowCfg, but the QmssCpdmaHnd should still be the same, so I am not sure why it fails.

    Any ideas?

    Thanks, Aamir