TDA4VM: Capture timings different for the same application

Part Number: TDA4VM


Hi TI

I am working on an application with the following perf stats:

GRAPH: app_btc_seg_cam_graph (#nodes = 7, #executions = 371)
NODE: CAPTURE1: capture_node: avg = 20597 usecs, min/max = 116 / 75501 usecs, #executions = 371
NODE: A72-0: colorConv_node: avg = 3204 usecs, min/max = 2953 / 3723 usecs, #executions = 371
NODE: DSS_M2M2: colorConv_NV12_RGBnode: avg = 9599 usecs, min/max = 9298 / 30952 usecs, #executions = 371
NODE: A72-0: Custom : avg = 25857 usecs, min/max = 25372 / 133676 usecs, #executions = 371
NODE: DSS_M2M1: colorConv_RGB_NV12node: avg = 1143 usecs, min/max = 1100 / 2120 usecs, #executions = 371
NODE: VPAC_MSC1: mosaic_node: avg = 1882 usecs, min/max = 1554 / 28407 usecs, #executions = 371
NODE: DISPLAY1: DisplayNode: avg = 10667 usecs, min/max = 93 / 16545 usecs, #executions = 371

PERF: FILEIO: avg = 0 usecs, min/max = 4294967295 / 0 usecs, #executions = 0
PERF: TOTAL: avg = 66663 usecs, min/max = 66028 / 67369 usecs, #executions = 37

PERF: TOTAL: 15. 0 FPS

I then commented out a few nodes in the same app and ran it again:

GRAPH: app_btc_seg_cam_graph (#nodes = 4, #executions = 419)
NODE: CAPTURE1: capture_node: avg = 31976 usecs, min/max = 30285 / 35427 usecs, #executions = 419
NODE: A72-0: colorConv_node: avg = 2316 usecs, min/max = 2260 / 2685 usecs, #executions = 419
NODE: VPAC_MSC1: mosaic_node: avg = 3160 usecs, min/max = 2665 / 33944 usecs, #executions = 419
NODE: DISPLAY1: DisplayNode: avg = 16305 usecs, min/max = 93 / 16651 usecs, #executions = 419

PERF: FILEIO: avg = 0 usecs, min/max = 4294967295 / 0 usecs, #executions = 0
PERF: TOTAL: avg = 66664 usecs, min/max = 66419 / 66924 usecs, #executions = 172

PERF: TOTAL: 15. 0 FPS

I have observed that the time taken for my capture node has increased from 20597 usecs to 31976 usecs.

I have also experimented with the BUFFERQ depth for the custom node that follows the capture node, but the behaviour is still the same.

I have observed the same issue with other applications as well. Please find the perf stats for another application in the text file below:

/cfs-file/__key/communityserver-discussions-components-files/791/ti_5F00_query.txt

What could be the reason for this behaviour? For the application in the .txt file, we observed 5738 usecs for the capture node.

I found a similar post that suggests setting captureObj->params.instCfg[id].numPixels = 1. In which file can I change this?

Could you please advise on what might have gone wrong in this case?

  • Hi,

    This time depends on how many frames the capture node is getting to process. With fewer nodes, it seems that empty buffers become available to be filled at a faster rate, so multiple buffers are pushed to the IP to be filled. With more nodes, the opposite happens.

    captureObj->params.instCfg[id].numPixels = 1 means that you are increasing the number of pixels output per clock cycle. You can find the description in the following reference:

    PDK API Guide for J784S4: Csirx_InstCfg Struct Reference (ti.com)

    Regards,

    Nikhil

  • Hi Nikhil,

    Thank you for the reply.

    I have used a PIPELINE DEPTH of 7 and a BUFQDEPTH of 8, and set the number of buffers for the nodes as below:

    tivxSetNodeParameterNumBufByIndex(obj->colorConvObj.node, 1, 1);
    tivxSetNodeParameterNumBufByIndex(obj->colorConvNV12RGBObj.node, 2, 1);
    tivxSetNodeParameterNumBufByIndex(obj->customobj.Customnode, 1, 1);
    tivxSetNodeParameterNumBufByIndex(obj->colorConvRGBNV12Obj.node, 2, 1);
    tivxSetNodeParameterNumBufByIndex(obj->imgMosaicObj.node, 1, 4);

    Please find the perf stats for this configuration below:

    GRAPH: app_btc_seg_cam_graph (#nodes =   7, #executions =    528)
     NODE:       CAPTURE1:             capture_node: avg =  15785 usecs, min/max =    154 /  47685 usecs, #executions =        528
     NODE:          A72-0:           colorConv_node: avg =   2206 usecs, min/max =   2151 /   3083 usecs, #executions =        528
     NODE:       DSS_M2M2:   colorConv_NV12_RGBnode: avg =   8796 usecs, min/max =   7432 /  32628 usecs, #executions =        528
     NODE:          A72-0:               Custom : avg =  24156 usecs, min/max =  23877 / 133660 usecs, #executions =        528
     NODE:       DSS_M2M1:   colorConv_RGB_NV12node: avg =   2508 usecs, min/max =   1152 /   2927 usecs, #executions =        528
     NODE:      VPAC_MSC1:              mosaic_node: avg =   2255 usecs, min/max =   1653 /  27827 usecs, #executions =        528
     NODE:       DISPLAY1:              DisplayNode: avg =   9068 usecs, min/max =     94 /  33140 usecs, #executions =        528

     PERF:           FILEIO: avg =      0 usecs, min/max = 4294967295 /      0 usecs, #executions =          0
     PERF:            TOTAL: avg =  70381 usecs, min/max =  70012 /  71634 usecs, #executions =         69

     PERF:            TOTAL:   14.20 FPS

    I believe that once the capture time improves, the application can run at a faster FPS. Would changing the number of buffers using tivxSetNodeParameterNumBufByIndex improve the FPS?

    Currently, no matter what experiments I try, I'm only able to achieve around 15 FPS. Is there any way to resolve this?

    Additionally, the per-node average times add up to only 64,774 microseconds, but TOTAL shows 70,381 microseconds.
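
    The 64,774 figure is the plain sum of the seven per-node averages listed above. As a minimal sketch of that arithmetic, the helpers below (hypothetical names, not part of TIOVX) compute both the sum and the largest single average. Assuming the graph is pipelined, as the pipeline depth setting suggests, stages on different targets can overlap, so the sum need not match the TOTAL figure and the slowest stage is only a lower bound on the steady-state period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of the per-node averages: roughly the time one frame would spend
 * in the nodes if they ran strictly one after another. */
uint32_t sum_usecs(const uint32_t *avg, size_t n)
{
    uint32_t total = 0;
    assert(avg != NULL);
    for (size_t i = 0; i < n; i++) {
        total += avg[i];
    }
    return total;
}

/* Largest per-node average. Assumption: in a pipelined graph the stages
 * overlap, so this is a lower bound on the steady-state period. */
uint32_t max_usecs(const uint32_t *avg, size_t n)
{
    uint32_t worst = 0;
    assert(avg != NULL);
    for (size_t i = 0; i < n; i++) {
        if (avg[i] > worst) {
            worst = avg[i];
        }
    }
    return worst;
}
```

    Called on the averages from the log above (15785, 2206, 8796, 24156, 2508, 2255, 9068), sum_usecs returns 64774 and max_usecs returns 24156, which is the Custom node.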

    Regards

    Sithara Tresa Chacko

  • Hi,

    capture_node: avg =  15785 usecs,

    Your capture node seems to be running fine. For 30 FPS, it should be 33 ms, but you are already well below that.
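
    The 33 ms figure is simply the frame period at 30 FPS. A small sketch of that check follows; the function names are hypothetical and the 30 FPS target is taken from the sentence above.

```c
#include <assert.h>
#include <stdint.h>

/* Frame period in microseconds for a target frame rate.
 * Integer division, so the result is truncated. */
uint32_t frame_period_usecs(uint32_t fps)
{
    assert(fps > 0);
    return 1000000u / fps;
}

/* Returns 1 if a node's average time fits within one frame period. */
int fits_frame_budget(uint32_t node_avg_usecs, uint32_t fps)
{
    return node_avg_usecs <= frame_period_usecs(fps);
}
```

    frame_period_usecs(30) is 33333, so the 15785 usecs capture average fits within one frame period.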

     PERF:            TOTAL:   14.20 FPS

    This is being printed from the application. Currently, the only bottleneck I see is Custom : avg =  24156 usecs.

    Regards,

    Nikhil

  • Hi Nikhil

    Thank you for your input.

    I will try to optimise the custom node.

    However, as I mentioned in my first post, even after I removed certain nodes from the graph, as shown below, the application still ran at only 15 FPS. This has confused me.

    In the app below, the colorConv node is a custom node. I used APP_BUFFER_Q_DEPTH 4 and APP_PIPELINE_DEPTH 4, and set:

    tivxSetNodeParameterNumBufByIndex(obj->colorConvObj.node, 1, APP_BUFFER_Q_DEPTH);
    tivxSetNodeParameterNumBufByIndex(obj->imgMosaicObj.node, 1, APP_BUFFER_Q_DEPTH);

    GRAPH: app_btc_seg_cam_graph (#nodes = 4, #executions = 215)
    NODE: CAPTURE1: capture_node: avg = 32002 usecs, min/max = 30395 / 35846 usecs, #executions = 215
    NODE: A72-0: colorConv_node: avg = 2317 usecs, min/max = 2260 / 3373 usecs, #executions = 215
    NODE: VPAC_MSC1: mosaic_node: avg = 3220 usecs, min/max = 2639 / 28075 usecs, #executions = 215
    NODE: DISPLAY1: DisplayNode: avg = 7268 usecs, min/max = 87 / 16570 usecs, #executions = 215

    PERF: FILEIO: avg = 0 usecs, min/max = 4294967295 / 0 usecs, #executions = 0
    PERF: TOTAL: avg = 66663 usecs, min/max = 66404 / 66919 usecs, #executions = 44

    PERF: TOTAL: 15. 0 FPS

    The FPS was still 15. What could explain this, given that all the other nodes are taking less time?

    Regards

    Sithara Tresa Chacko

  • Hi Sithara,

    NODE: CAPTURE1: capture_node: avg = 32002 usecs, min/max = 30395 / 35846 usecs, #executions = 215
    NODE: A72-0: colorConv_node: avg = 2317 usecs, min/max = 2260 / 3373 usecs, #executions = 215
    NODE: VPAC_MSC1: mosaic_node: avg = 3220 usecs, min/max = 2639 / 28075 usecs, #executions = 215
    NODE: DISPLAY1: DisplayNode: avg = 7268 usecs, min/max = 87 / 16570 usecs, #executions = 215

    You can clearly see here that none of the nodes are taking much time, whereas your application is taking 66 msec to execute the graph. Could you please check in the application why it is taking so much time to execute the graph?

    PERF: TOTAL: avg = 66663 usecs
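
    To relate this period to the reported FPS, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the 30 FPS sensor rate is an assumption carried over from the earlier reply, not something the logs confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Frames per second implied by an average loop period in microseconds. */
double fps_from_period(uint32_t period_usecs)
{
    assert(period_usecs > 0);
    return 1e6 / (double)period_usecs;
}

/* How many sensor frame periods elapse during one application iteration.
 * sensor_fps is an assumed value supplied by the caller. */
double sensor_frames_per_iteration(uint32_t period_usecs, double sensor_fps)
{
    return (double)period_usecs * sensor_fps / 1e6;
}
```

    For 66663 usecs this gives about 15.0 FPS. With an assumed 30 FPS sensor, one iteration spans very close to two sensor frame periods, so it may be worth checking whether anything in the application loop waits across two frames per iteration.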

    Regards,

    Nikhil