We are using psdk_rtos_auto_j7_06_02_00_21, and TI has released TIDL and MMALIB patches to us through CDDS.
We have used the following:
TIDL patch : tidl_j7_01_01_01_01
MMA Library : mmalib_01_01_00_02
We are trying to run a (16,16,3) model and have modified our application to support it, following the suggestions from Mr. Shyam at https://e2e.ti.com/support/processors/f/791/p/898935/3365185#pi320966=1
GRAPH: OpenVxGraph (#nodes = 10, #executions = 51)
NODE: CAPTURE1: capture_node: avg = 2519 usecs, min/max = 65 / 57684 usecs, #executions = 51
NODE: DSP-1: colorConvert_node: avg = 12448 usecs, min/max = 12293 / 12608 usecs, #executions = 51
NODE: VPAC_MSC1: ScalerNode: avg = 2553 usecs, min/max = 2514 / 2668 usecs, #executions = 51
NODE: DSP-1: PreProcNode: avg = 32792 usecs, min/max = 32774 / 32851 usecs, #executions = 51
NODE: DSP_C7-1: TIDLNode: avg = 38589 usecs, min/max = 38561 / 38692 usecs, #executions = 51
NODE: DSP-1: tracker_node: avg = 215 usecs, min/max = 192 / 628 usecs, #executions = 51
NODE: DSP-2: DrawBoxDetectionsNode: avg = 3693 usecs, min/max = 3395 / 3748 usecs, #executions = 51
NODE: VPAC_MSC1: MosaicNode: avg = 8437 usecs, min/max = 6067 / 23143 usecs, #executions = 51
NODE: DISPLAY1: DisplayNode: avg = 8352 usecs, min/max = 105 / 16750 usecs, #executions = 51
NODE: DSP-2: op_signal_node: avg = 2058 usecs, min/max = 740 / 2160 usecs, #executions = 51
PERF: FILEIO: avg = 0 usecs, min/max = 4294967295 / 0 usecs, #executions = 0
PERF: TOTAL: avg = 90729 usecs, min/max = 33052 / 103703 usecs, #executions = 54
PERF: TOTAL: 11. 2 FPS
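For anyone comparing runs, here is a minimal Python sketch that ranks nodes by average time and derives FPS from the TOTAL average. It is an offline helper written for this post, not part of the SDK. The names `LOG`, `parse_nodes` and `fps_from_total` are hypothetical, and the regular expressions assume the log keeps the `NODE: <target>: <name>: avg = <n> usecs` layout shown above. Only a four-line excerpt of the log is embedded to keep the example short.

```python
import re

# Excerpt of the perf log from this post (three NODE lines plus the TOTAL line).
LOG = """\
NODE: DSP-1: colorConvert_node: avg = 12448 usecs, min/max = 12293 / 12608 usecs, #executions = 51
NODE: DSP-1: PreProcNode: avg = 32792 usecs, min/max = 32774 / 32851 usecs, #executions = 51
NODE: DSP_C7-1: TIDLNode: avg = 38589 usecs, min/max = 38561 / 38692 usecs, #executions = 51
PERF: TOTAL: avg = 90729 usecs, min/max = 33052 / 103703 usecs, #executions = 54
"""

# Assumed line layout: "NODE: <target>: <node name>: avg = <usecs> usecs, ..."
NODE_RE = re.compile(r"NODE:\s*(\S+):\s*(\S+):\s*avg\s*=\s*(\d+)\s*usecs")
TOTAL_RE = re.compile(r"PERF:\s*TOTAL:\s*avg\s*=\s*(\d+)\s*usecs")


def parse_nodes(log):
    """Return (target, node_name, avg_usecs) tuples, slowest node first."""
    rows = [(t, n, int(a)) for t, n, a in NODE_RE.findall(log)]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def fps_from_total(log):
    """Convert the average TOTAL frame time (usecs) to frames per second."""
    match = TOTAL_RE.search(log)
    return 1_000_000 / int(match.group(1)) if match else None


ranked = parse_nodes(LOG)
fps = fps_from_total(LOG)

for target, name, avg in ranked:
    print(f"{name:20s} {target:10s} {avg / 1000:7.2f} ms")
print(f"FPS from TOTAL avg: {fps:.2f}")
```

With the TOTAL average of 90729 usecs this gives roughly 11.02 FPS, which seems consistent with the `11. 2 FPS` line if that is read as a space-padded 11.02.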
Overall, FPS has degraded by a factor of 3 compared with the application running the (8,8,3) model.
We see that the capture node shows a reduced execution time. Is there a specific reason for that?
On the earlier SDK version, the capture node timing was:
NODE: CAPTURE1: Capture_node: avg = 33228 usecs, min/max = 30035 / 37423 usecs, #executions = 125
We also see that the TIDL node and the PreProc node are taking longer because of the (16,16,3)-based processing.
How can we reduce the processing time of these two nodes?
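To show where the time goes per core, the rough Python sketch below sums the average node times by the target name printed in the log. It assumes the second field of each NODE line is the target the node runs on, and that nodes sharing a target execute one after another within a frame. Both are assumptions about the log, not confirmed behaviour. `NODES` and `load_per_target` are hypothetical names, and the numbers are copied from the perf output in this post.

```python
from collections import defaultdict

# (target, node name, avg usecs) copied from the perf log in this post.
NODES = [
    ("CAPTURE1", "capture_node", 2519),
    ("DSP-1", "colorConvert_node", 12448),
    ("VPAC_MSC1", "ScalerNode", 2553),
    ("DSP-1", "PreProcNode", 32792),
    ("DSP_C7-1", "TIDLNode", 38589),
    ("DSP-1", "tracker_node", 215),
    ("DSP-2", "DrawBoxDetectionsNode", 3693),
    ("VPAC_MSC1", "MosaicNode", 8437),
    ("DISPLAY1", "DisplayNode", 8352),
    ("DSP-2", "op_signal_node", 2058),
]


def load_per_target(nodes):
    """Sum average node times per target (assumes serial execution per target)."""
    totals = defaultdict(int)
    for target, _name, avg_usecs in nodes:
        totals[target] += avg_usecs
    return dict(totals)


per_target = load_per_target(NODES)

# Print targets from most to least loaded.
for target, usecs in sorted(per_target.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{target:10s} {usecs / 1000:7.2f} ms per frame")
```

Under those assumptions, DSP-1 (colorConvert, PreProc and tracker together) carries about 45.5 ms per frame against about 38.6 ms for TIDL on DSP_C7-1, so the PreProc node on DSP-1 looks at least as important to reduce as the TIDL node.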