J784S4XEVM: Question about OpenVX pipeline depth

Part Number: J784S4XEVM

Hello,

The TIOVX user guide describes pipeline depth, but I don't understand why it needs multiple graph instances. Is it because only one node in a graph can run at any time?

I think a production line in a factory is a good analogy. Workers are equivalent to nodes or targets. Multiple workers collaborate on the production line to complete a product. They work in pipelined fashion, but there is only one production line instance.

Thanks.

  • Hi,

    The TIOVX user guide describes pipeline depth, but I don't understand why it needs multiple graph instances. Is it because only one node in a graph can run at any time?

    Here, pipelining means that a second frame cannot start graph execution on this same graph until the first graph execution is completed.

    Hence, to increase throughput, we use multiple instances of the graph, so that once a node has processed one frame it can take in the next frame without waiting until all the nodes have completed the first frame.
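
To make the throughput argument concrete, here is a minimal simulation sketch. It is not TIOVX code. The function name `total_time`, the fixed per-node costs, and the rule that frame `f` uses instance `f % pipeline_depth` are all assumptions chosen for illustration.

```python
def total_time(num_frames, node_times, pipeline_depth):
    """Return the time at which the last frame leaves the last node.

    Assumed model (not taken from TIOVX): every node runs on its own
    target and handles one frame at a time, and frame f uses graph
    instance f % pipeline_depth, which becomes free only after the
    previous frame that used it has passed through all nodes.
    """
    node_free = [0] * len(node_times)      # time at which each node is next idle
    instance_free = [0] * pipeline_depth   # time at which each instance is next idle
    finish = 0
    for frame in range(num_frames):
        inst = frame % pipeline_depth
        t = instance_free[inst]            # the frame waits for a free instance
        for n, cost in enumerate(node_times):
            start = max(t, node_free[n])   # and for the node to be idle
            t = start + cost
            node_free[n] = t
        instance_free[inst] = t
        finish = t
    return finish


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, total_time(6, [1, 1, 1], depth))
```

In this model, six frames through three unit-time nodes take 18 time units at depth 1 and 8 time units at depth 3, because at depth 1 two of the three nodes are always idle.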

    You could also refer to the documentation below:

    The OpenVX™ Graph Pipelining, Streaming, and Batch Processing Extension to OpenVX 1.1 and 1.2 (khronos.org)

    Regards,

    Nikhil

  • Hi,

    I did not understand:

        Here, pipelining means that a second frame cannot start graph execution on this same graph until the first graph execution is completed.

    If there are 3 nodes in the graph, could it work like this:

    On T1:
      +---------+      +---------+      +---------+
      |         |      |         |      |         |
      | frame 3 |----->| frame 2 |----->| frame 1 |
      |         |      |         |      |         |
      +---------+      +---------+      +---------+
       Node1(A72)       Node2(C71)       Node3(MCU)
       
    On T2:
      +---------+      +---------+      +---------+
      |         |      |         |      |         |
      | frame 4 |----->| frame 3 |----->| frame 2 |
      |         |      |         |      |         |
      +---------+      +---------+      +---------+
       Node1(A72)       Node2(C71)       Node3(MCU)
       

    If this could work, it would need only one graph instance. Is there any problem with that?

    Thanks.

  • Hi,

    Standard OpenVX does not support pipelining, which means that a graph can be retriggered only when all node executions for one frame are complete. Hence, retriggering individual nodes is not possible.

    Hence, to allow the graph to be retriggered, TI's extension creates multiple instances of the graph (i.e. the pipeline depth), optimally equal to the number of nodes, so that retriggering each graph instance is similar to retriggering each node.

    Hence, we require multiple instances.
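
The T1/T2 diagram from the question can be reproduced with multiple instances. The sketch below is only an illustration: `snapshot` is a hypothetical helper, all nodes are assumed to take one time step, and frame `f` is assumed to use instance `(f - 1) % depth`. The schedule it prints is only valid when `depth` is at least the number of nodes.

```python
def snapshot(step, num_nodes, depth):
    """Return (node, frame, instance) tuples for one time step.

    Nodes, frames, and steps are 1-based; instances are 0-based.
    Assumes unit-time nodes, no stalls, and depth >= num_nodes.
    """
    rows = []
    for node in range(1, num_nodes + 1):
        frame = step - node + 1          # frame f reaches node k at step f + k - 1
        if frame >= 1:
            rows.append((node, frame, (frame - 1) % depth))
    return rows


if __name__ == "__main__":
    for step in (3, 4):                  # correspond to T1 and T2 in the diagram
        print("step", step, snapshot(step, num_nodes=3, depth=3))
```

At step 3 the nodes hold frames 3, 2, and 1, and at step 4 they hold frames 4, 3, and 2, matching the diagram. In each step the three frames in flight use three different instances, which is why a depth equal to the number of nodes is enough to keep every node busy in this model.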

    Regards,

    Nikhil

  • Hi,

    That doesn't sound ideal. Will it waste memory? Why doesn't TI improve its implementation based on the OpenVX function interface?

    Thanks.

  • Hi,

    Compared to not pipelining, yes, more memory is used, both for multiple buffering and for node object descriptors.

    However, deciding when to execute a node, on which buffers, and for which frame requires control logic that dispatches the right buffers at the right time. Scaling from non-pipelined to pipelined execution by replicating the graph/node objects is a scalable and efficient way of reusing the existing logic with minimal memory increase.

    To use the analogy of an assembly line, you could consider each factory worker as a separate node. In the case of 3 nodes, there are still only 3 factory workers whether you pipeline or not. The pipeline depth can be thought of as the number of containers that hold the items being worked on. If you only work on 1 item at a time, you only need one container, and when the item is done, you can reuse it for the next item. If you want to work on 3 at a time, you need 3 containers, reusing the one that finished each time for the next one to start. This container is like a graph context that we call a graph instance.
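
The container analogy can be sketched as a small pool of reusable contexts. This is an illustrative model only: `InstancePool` and its methods are hypothetical names and do not correspond to any TIOVX or OpenVX API.

```python
from collections import deque


class InstancePool:
    """A fixed set of reusable 'containers' (graph instances in the analogy)."""

    def __init__(self, depth):
        self.free = deque(range(depth))  # instance ids that are not in use
        self.in_flight = {}              # frame id -> instance id

    def start(self, frame):
        """Give the frame a free instance, or return None if all are busy."""
        if not self.free:
            return None
        inst = self.free.popleft()
        self.in_flight[frame] = inst
        return inst

    def finish(self, frame):
        """Release the frame's instance so that a later frame can reuse it."""
        inst = self.in_flight.pop(frame)
        self.free.append(inst)
        return inst


if __name__ == "__main__":
    pool = InstancePool(depth=3)
    print([pool.start(f) for f in (1, 2, 3, 4)])  # the fourth frame has to wait
    pool.finish(1)
    print(pool.start(4))                          # reuses the container of frame 1
```

With a depth of 3, the fourth frame cannot start until one of the first three finishes and hands its container back. The number of workers (nodes) never changes; only the number of items in flight does.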

    Hope this helps in clarifying your query.

    Regards,

    Nikhil

  • Hi,

    you said:

    To use the analogy of an assembly line, you could consider each factory worker as a separate node. In the case of 3 nodes, there are still only 3 factory workers whether you pipeline or not. The pipeline depth can be thought of as the number of containers that hold the items being worked on. If you only work on 1 item at a time, you only need one container, and when the item is done, you can reuse it for the next item. If you want to work on 3 at a time, you need 3 containers, reusing the one that finished each time for the next one to start. This container is like a graph context that we call a graph instance.

    But a graph instance looks more like an assembly line, since it has its own nodes and buffers. Maybe my understanding is wrong.

    Thanks.

  • Hi,

    A graph instance cannot be compared to the assembly line, because the next input does not come in until the line is empty and done processing the current product (i.e. the graph can be retriggered only once all the nodes are done with execution).

    Regards,

    Nikhil

  • Hi,

    This is exactly my question. There is only one worker busy on an assembly line, and the others are idle and waiting. Isn’t it a waste of resources?

    Thanks.

  • Hi,

    the graph can be retriggered only once all the nodes are done with the execution

    This is the current behavior of a graph as per the OpenVX standard: it cannot be retriggered until all of its nodes have completed.
    Hence, to extend this implementation to pipelined mode (as node-level triggering is not available), TI's implementation uses multiple graph instances.

    Hence, scaling from non-pipelined to pipelined execution by replicating the graph/node objects is a scalable and efficient way of reusing existing logic with minimal memory increase, and this is what the SDK currently supports.

    Regards,

    Nikhil