
[TDA4M] How to design pipelined graphs in the AVP sample

Hi

I have a couple of questions about pipelining in OpenVX.

In the AVP sample, there are calls to the replicate function and the tivxSetNodeParameterNumBufByIndex function, and they look like functions used for pipelining.

But I don't understand exactly how the parameters passed to these tivx functions relate to the number of replicas.

Could you please explain the details in APP_TIDL_AVP?

Best regards,

Yongsig

  • Hi Yongsig,

    The upcoming release will have some details on OpenVX pipelining in general, which should help you understand a few portions of the code.

    For AVP in general, kindly allow me a couple of days to draft a design doc and share it with you.

    Regards,
    Shyam

  • Yongsig,

    I realized that I would require more time to draft a complete design doc, so I will attempt to explain a few concepts right here. Eventually we will have a detailed developer note as part of the user guide.

    Firstly, multi-channel processing has nothing to do with OpenVX pipelining. Any node can be configured to execute in multi-channel mode by using the vxReplicateNode functionality in OpenVX. By replicating a node, the user can configure which input/output parameters of the node need to be replicated. The replication factor is simply the number of channels. This is a very useful way of running an OpenVX node on multiple camera inputs while keeping the graph compact.
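
    To make this concrete, here is a minimal sketch of the replication setup (this is not the actual AVP code; NUM_CH, the exemplar images and the choice of a scale node are placeholders):

        #define NUM_CH  (4)   /* number of camera channels (placeholder value) */

        /* One object array per replicated parameter; in_exemplar/out_exemplar are
         * previously created vx_image objects used only as templates */
        vx_object_array in_arr  = vxCreateObjectArray(context, (vx_reference)in_exemplar,  NUM_CH);
        vx_object_array out_arr = vxCreateObjectArray(context, (vx_reference)out_exemplar, NUM_CH);

        /* Create the node using element 0 of each object array */
        vx_image in0  = (vx_image)vxGetObjectArrayItem(in_arr,  0);
        vx_image out0 = (vx_image)vxGetObjectArrayItem(out_arr, 0);
        vx_node  node = vxScaleImageNode(graph, in0, out0, VX_INTERPOLATION_BILINEAR);

        /* Replicate the input (param 0) and output (param 1) over all NUM_CH
         * channels; the interpolation type (param 2) is shared, not replicated */
        vx_bool replicate[] = { vx_true_e, vx_true_e, vx_false_e };
        vxReplicateNode(graph, node, replicate, 3);

    The replication factor here is the size of the object arrays (NUM_CH), which is why it simply equals the number of channels.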

    Now let's say we have an OpenVX graph with 3 nodes and 3 compute cores, with one node executing on each core. By default the graph execution will be sequential in nature, so even if the nodes are mapped to different cores, each node has to wait till the previous node completes its execution and generates the data required by the next connected node. This is largely a waste of compute because the other cores have to wait (remain idle) till one core finishes.

    In a pipelined execution model, all the cores execute their respective nodes concurrently. The OpenVX framework, with the help of some user inputs, makes sure that the data propagated from one node to the next is up to date so that no compute goes to waste.

    These user inputs are some of the things you mentioned. Apart from creating the graph (connecting the different nodes together), we need to specify two key things:

    1. A graph parameter which can be enqueued/dequeued by the host.

    2. Intermediate buffer depths between successive nodes.

    Any node parameter (input/output parameter) can be made a graph parameter. In the AVP demo, for example, the input to the scaler node is made a graph parameter. This is because the host A72 needs to read a file from the SD card and put it into a buffer in DDR before executing the graph. If the user is interested in writing the output to a file, they can make the final node (mosaic node) output a graph parameter as well and then dump that buffer to a file on the SD card. In the AVP demo, as the output is routed to a display, there is only one graph parameter, which is the input to the scaler node.

    This is achieved by add_graph_parameter_by_node_index(<graph>, <node>, <node parameter index>). We can make a list of such graph parameters and configure the graph schedule by calling vxSetGraphScheduleConfig(). This configures the graph to look out for the parameters which the host will manage (manual enqueue/dequeue).
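
    As a simplified sketch (the parameter index, scaler_node, PIPE_DEPTH and input_refs below are placeholders, not the actual AVP code), the helper and the schedule configuration look something like this:

        /* Helper: expose node parameter <node_parameter_index> of <node> as a graph parameter */
        static void add_graph_parameter_by_node_index(vx_graph graph, vx_node node,
                                                      vx_uint32 node_parameter_index)
        {
            vx_parameter parameter = vxGetParameterByIndex(node, node_parameter_index);
            vxAddParameterToGraph(graph, parameter);
            vxReleaseParameter(&parameter);
        }

        /* Make the scaler node input (assumed to be node parameter 0 here) a graph parameter */
        add_graph_parameter_by_node_index(graph, scaler_node, 0);

        /* Describe the buffer queue for graph parameter 0: input_refs[] holds
         * PIPE_DEPTH pre-created input images that the host will enqueue/dequeue */
        vx_graph_parameter_queue_params_t q_params[1];
        q_params[0].graph_parameter_index = 0;
        q_params[0].refs_list_size        = PIPE_DEPTH;
        q_params[0].refs_list             = (vx_reference *)&input_refs[0];

        /* Configure the graph so that enqueuing these parameters drives execution */
        vxSetGraphScheduleConfig(graph, VX_GRAPH_SCHEDULE_MODE_QUEUE_AUTO, 1, q_params);

    With VX_GRAPH_SCHEDULE_MODE_QUEUE_AUTO the host still enqueues/dequeues the buffers itself; the mode only controls whether the graph is scheduled automatically when buffers are enqueued (in VX_GRAPH_SCHEDULE_MODE_QUEUE_MANUAL the host would also call vxScheduleGraph()).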

    Now we need to set appropriate buffer depths between the nodes so that all nodes can run concurrently on the different cores. As a general rule, for two adjacent nodes to run in parallel on different cores we can set the buffer depth to 2, but the right value is ultimately a function of how fast the producer node generates outputs and how fast the consumer node consumes them. This is done by tivxSetNodeParameterNumBufByIndex(<node>, <node parameter index>, <buffer depth>).

    In our example of 3 nodes, let's say the 3rd node accepts output from both the 1st node and the 2nd node. We would then set up the intermediate buffer depths as:

    tivxSetNodeParameterNumBufByIndex(<node1>, <node1 output parameter index>, 4)

    tivxSetNodeParameterNumBufByIndex(<node2>, <node2 output parameter index>, 2)

    So between node1 and node3 the output buffer has a depth of 4 (can be 3 as well) and between node2 and node3 the buffer depth is 2.

    The user is aware of how the nodes are connected in the graph and can therefore control the buffer depths required between the nodes.
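
    For completeness, here is a rough sketch of what the host-side loop looks like once the graph has been verified (again using the placeholder names from the sketch above):

        /* Prime the pipeline: enqueue all the input buffers for graph parameter 0 */
        for (vx_uint32 i = 0; i < PIPE_DEPTH; i++)
        {
            vxGraphParameterEnqueueReadyRef(graph, 0, (vx_reference *)&input_refs[i], 1);
        }

        /* Steady state: dequeue a buffer the graph has finished with, refill it
         * (e.g. read the next frame from the SD card) and enqueue it again */
        while (keep_running)
        {
            vx_image  in_img;
            vx_uint32 num_refs;

            vxGraphParameterDequeueDoneRef(graph, 0, (vx_reference *)&in_img, 1, &num_refs);

            /* ... fill in_img with the next frame ... */

            vxGraphParameterEnqueueReadyRef(graph, 0, (vx_reference *)&in_img, 1);
        }

        /* Drain the pipeline before releasing resources */
        vxWaitGraph(graph);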

    I hope this brief explanation has answered some of your questions. You can also look at some of the simpler pipelining examples in tiovx to get a better understanding. Please feel free to follow up if something is not clear.

    Regards,
    Shyam