TDA4VM: TDA4 ISP performance problems

Part Number: TDA4VM

Hi experts,

 Questions:

 1: Is there any debug method to check ISP occupancy and usage?

 2: We use five 2 MP cameras (1920x1200, RAW12). With five channels the frame rate falls below 30 fps, while four channels run normally. By our calculation the bandwidth is sufficient. Could you please explain why? Thank you.
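
The bandwidth estimate mentioned in the question can be reproduced with a short calculation. This is only a sketch of the raw sensor input rate: the function names are hypothetical, and the bits-per-pixel value is an assumption (12 if RAW12 is stored packed, 16 if each sample sits in a 16-bit container). It ignores blanking, VISS output traffic, and any other DDR clients.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels per second produced by n_cams sensors of width x height at fps. */
uint64_t pixel_rate(uint32_t n_cams, uint32_t width, uint32_t height, uint32_t fps)
{
    assert(n_cams > 0 && width > 0 && height > 0 && fps > 0);
    return (uint64_t)n_cams * width * height * fps;
}

/* Bytes per second for a given pixel rate.
 * bits_per_pixel is an assumption about the buffer format:
 * 12 for packed RAW12, 16 for RAW12 in 16-bit containers. */
uint64_t byte_rate(uint64_t pixels_per_sec, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return pixels_per_sec * bits_per_pixel / 8u;
}
```

For five 1920x1200 cameras at 30 fps, `pixel_rate(5, 1920, 1200, 30)` gives 345,600,000 pixels per second, against 276,480,000 for four cameras. At 16 bits per pixel the five-camera raw input is about 691 MB/s.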

  • Hi,

    1: is there any debug method to check the ISP occupancy and usage?

    When you press 'p' on the console, it prints all of these logs. Can you please share this performance log? 

    Regards,

    Brijesh

  • Hi,

    The VISS load is between 46% and 60%. Even at its best, the frame rate does not reach 30 fps.

  • Hi,

    It may return to normal after capture is re-initialized. What is the reason for this?

  • Hi,

    But there are a lot of frame drops on both CSI instances. This typically happens when the next module in the chain is not running in real time or is not returning buffers in time. So can you please share the complete performance stats log, so we can see which module is not running in real time?

    You could also try increasing the number of buffers between CSIRX and VISS and see if it makes a difference.

    Regards,

    Brijesh 
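
The effect described in the reply above (frames dropped at capture when the downstream node does not return buffers in time) can be illustrated with a toy model. This is a minimal sketch, not the CSIRX or VISS driver logic: `count_dropped_frames` and `MAX_BUFS` are hypothetical names, and the model assumes a single FIFO consumer, a fixed capture period, and that a frame is dropped whenever all `num_bufs` buffers are still held downstream.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BUFS 16u

/* Toy model: a frame arrives every period_us and needs a free buffer.
 * One consumer processes accepted frames in FIFO order; the k-th accepted
 * frame takes proc_us[k % proc_len] microseconds. A buffer is returned
 * when its frame finishes processing. Returns the number of dropped frames. */
uint32_t count_dropped_frames(uint32_t num_bufs, uint32_t num_frames,
                              uint32_t period_us,
                              const uint32_t *proc_us, uint32_t proc_len)
{
    uint64_t done_at[MAX_BUFS] = {0}; /* completion time of the last num_bufs accepted frames */
    uint64_t last_done = 0;           /* time at which the consumer becomes idle */
    uint32_t accepted = 0;
    uint32_t dropped = 0;

    assert(num_bufs > 0 && num_bufs <= MAX_BUFS && proc_len > 0);

    for (uint32_t i = 0; i < num_frames; i++) {
        uint64_t now = (uint64_t)i * period_us;
        /* This slot holds the completion time of the frame accepted
         * num_bufs frames ago. Completions are in FIFO order, so if that
         * frame is still in flight, every buffer is busy. */
        uint32_t slot = accepted % num_bufs;
        if (accepted >= num_bufs && done_at[slot] > now) {
            dropped++;
            continue;
        }
        uint64_t start = (last_done > now) ? last_done : now;
        last_done = start + proc_us[accepted % proc_len];
        done_at[slot] = last_done;
        accepted++;
    }
    return dropped;
}
```

In this model, extra buffers absorb bursts: a consumer whose processing time alternates between 10 ms, 10 ms, and 70 ms averages 30 ms per frame, drops frames with one buffer, and drops none with three. Extra buffers do not help a consumer that is slower than the 33.3 ms frame period on average; it still drops frames once the pool fills.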

  • Hi:

    As I understand it, you mean increasing the VISS buffer array, right?

  • Yes, it is most likely due to VISS, but we still need to check the performance numbers to confirm.

  • root@sixi_slave:/opt/data# tda4_performance
    APP: Init ... !!!
    MEM: Init ... !!!
    MEM: Initialized DMA HEAP (fd=4) !!!
    MEM: Init ... Done !!!
    IPC: Init ... !!!
    IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
    APP: Init ... Done !!!
    7976.857258 s: VX_ZONE_INIT:Enabled
    7976.857280 s: VX_ZONE_ERROR:Enabled
    7976.857294 s: VX_ZONE_WARNING:Enabled
    7976.857856 s: VX_ZONE_INIT:[tivxInit:71] Initialization Done !!!
    7976.858166 s: VX_ZONE_INIT:[tivxHostInit:48] Initialization Done for HOST !!!

    Summary of CPU load,
    ====================

    CPU: mpu1_0: TOTAL LOAD = 0. 0 % ( HWI = 0. 0 %, SWI = 0. 0 % )
    CPU: mcu2_0: TOTAL LOAD = 18.54 % ( HWI = 2.20 %, SWI = 0.82 % )
    CPU: c6x_1: TOTAL LOAD = 0. 5 % ( HWI = 0. 2 %, SWI = 0. 1 % )
    CPU: c6x_2: TOTAL LOAD = 0. 5 % ( HWI = 0. 2 %, SWI = 0. 1 % )
    CPU: c7x_1: TOTAL LOAD = 0. 7 % ( HWI = 0. 3 %, SWI = 0. 2 % )


    HWA performance statistics,
    ===========================

    HWA: VISS: LOAD = 47.92 % ( 601 MP/s )
    REMOTE_SERVICE: ERROR: CPU 4 is not enabled or invalid CPU ID


    DDR performance statistics,
    ===========================

    DDR: READ BW: AVG = 744 MB/s, PEAK = 1209 MB/s
    DDR: WRITE BW: AVG = 1620 MB/s, PEAK = 2599 MB/s
    DDR: TOTAL BW: AVG = 2364 MB/s, PEAK = 3808 MB/s


    Detailed CPU performance/memory statistics,
    ===========================================

    CPU: mcu2_0: TASK: IPC_RX: 0. 7 %
    CPU: mcu2_0: TASK: REMOTE_SRV: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_CPU: 6.24 %
    CPU: mcu2_0: TASK: TIVX_NF: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_LDC1: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_MSC1: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_MSC2: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_VISS1: 4. 4 %
    CPU: mcu2_0: TASK: TIVX_CAPT1: 0.75 %
    CPU: mcu2_0: TASK: TIVX_CAPT2: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_DISP1: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_DISP2: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_CSITX: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_CAPT3: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_CAPT4: 0. 0 %
    CPU: mcu2_0: TASK: TIVX_CAPT5: 0. 0 %

    CPU: mcu2_0: HEAP: DDR_SHARED_MEM: size = 8388608 B, free = 8042496 B ( 95 % unused)
    CPU: mcu2_0: HEAP: L3_MEM: size = 131072 B, free = 131072 B (100 % unused)
    CPU: mcu2_0: HEAP: DDR_NON_CACHE_M: size = 65536 B, free = 65536 B (100 % unused)

    CPU: c6x_1: TASK: IPC_RX: 0. 0 %
    CPU: c6x_1: TASK: REMOTE_SRV: 0. 0 %
    CPU: c6x_1: TASK: TIVX_CPU: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_RX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_1: TASK: IPC_TEST_TX: 0. 0 %

    CPU: c6x_1: HEAP: DDR_SHARED_MEM: size = 16777216 B, free = 16774912 B ( 99 % unused)
    CPU: c6x_1: HEAP: L2_MEM: size = 229376 B, free = 229376 B (100 % unused)
    CPU: c6x_1: HEAP: DDR_SCRATCH_MEM: size = 50331648 B, free = 50331648 B ( 14 % unused)

    CPU: c6x_2: TASK: IPC_RX: 0. 0 %
    CPU: c6x_2: TASK: REMOTE_SRV: 0. 0 %
    CPU: c6x_2: TASK: TIVX_CPU: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_RX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c6x_2: TASK: IPC_TEST_TX: 0. 0 %

    CPU: c6x_2: HEAP: DDR_SHARED_MEM: size = 16777216 B, free = 16774912 B ( 99 % unused)
    CPU: c6x_2: HEAP: L2_MEM: size = 229376 B, free = 229376 B (100 % unused)
    CPU: c6x_2: HEAP: DDR_SCRATCH_MEM: size = 50331648 B, free = 50331648 B ( 14 % unused)

    CPU: c7x_1: TASK: IPC_RX: 0. 0 %
    CPU: c7x_1: TASK: REMOTE_SRV: 0. 0 %
    CPU: c7x_1: TASK: TIVX_CPU: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_RX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %
    CPU: c7x_1: TASK: IPC_TEST_TX: 0. 0 %

    CPU: c7x_1: HEAP: DDR_SHARED_MEM: size = 335544320 B, free = 335544320 B ( 10 % unused)
    CPU: c7x_1: HEAP: L3_MEM: size = 3964928 B, free = 3964928 B (100 % unused)
    CPU: c7x_1: HEAP: L2_MEM: size = 491520 B, free = 491520 B (100 % unused)
    CPU: c7x_1: HEAP: L1_MEM: size = 16384 B, free = 16384 B (100 % unused)
    CPU: c7x_1: HEAP: DDR_SCRATCH_MEM: size = 167772160 B, free = 167772160 B ( 23 % unused)


    7976.864485 s: VX_ZONE_INIT:[tivxHostDeInit:56] De-Initialization Done for HOST !!!
    7976.868875 s: VX_ZONE_INIT:[tivxDeInit:111] De-Initialization Done !!!
    APP: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... Done !!!
    IPC: Deinit ... !!!
    IPC: DeInit ... Done !!!
    MEM: Deinit ... !!!
    MEM: Alloc's: 0 alloc's of 0 bytes
    MEM: Free's : 0 free's of 0 bytes
    MEM: Open's : 0 allocs of 0 bytes
    MEM: Deinit ... Done !!!
    APP: Deinit ... Done !!!

  • Hi,

    I found that after opening the 5-channel video stream, the VISS load decreases. Besides VISS, what other parameters could be involved?

  • Hi,

    If you are using the multi-camera example, it also uses LDC and MSC in the chain. LDC is used for distortion correction, and MSC for creating a mosaic from the 5 frames.

    This is why I wanted to check the performance stats. Can you please press 'p' while the use case is running and share the log?

    Regards,

    Brijesh

  • Hi,

    We have not adapted that use case, and modifying it would take a long time. I hope the problem can be resolved directly. If I need to print something, I can port it into my program. Thank you.

  • Hi,

    I checked the DDR status, the VISS status, and the MCU task status. There was nothing abnormal, yet when I started the fifth channel the load decreased. Is there anything else I need to pay attention to? Where is the bottleneck?

  • Hi,

    It is a bit difficult to figure out from the above image. There are just three nodes, and all of them are running in real time, each taking less than 33 ms.

    Could you please add the code below to your application and get us the complete graph performance? You can refer to any of the existing vision apps applications to integrate it.

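    case 'p':
        /* Print the CPU, HWA and DDR statistics */
        appPerfStatsPrintAll();
        /* Print the performance of the graph and its nodes */
        status = tivx_utils_graph_perf_print(obj->graph);
        /* Print the application's total_perf point, then its FPS, and reset it */
        appPerfPointPrint(&obj->total_perf);
        printf("\n");
        appPerfPointPrintFPS(&obj->total_perf);
        appPerfPointReset(&obj->total_perf);
        printf("\n");
        break;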
     

    From the first image that you shared, capture is most likely running fine, but the graph is running at a lower fps because some other component in the graph is slow, so capture is dropping frames.

    What other components are running in the graph? Can we get the performance of these components?

    Regards,

    Brijesh

  • My components are the capture, VISS and AEWB nodes. I took the printing code from the demo and ported it into my program. The FPS and timing measurements from the demo were not ported; we use our own. Do you mean the performance of VISS? We use tda4_performance; is it OK to print the output from that?

  • Hi,

    Have you tried increasing the number of buffers at the input and output of VISS? If yes, what is the current number of buffers with which you are measuring performance?

    Also, what is connected at the output of VISS? Is there a scaler or LDC, or does the VISS output go directly to the display?

    Regards,

    Brijesh

  • Hi,

    We only have 3 nodes. My TI colleague in Shanghai told me that they only have a 25 fps camera. Can you help verify 5 channels at 30 fps (1920x1200) through the ISP?

  • Hi,

    Are you using a single capture node to capture all 5 cameras? Also, have you tried increasing the number of buffers between CSIRX and VISS?

    VISS is still running fine. It is taking 19.8 ms to process 5 cameras, which means 3.96 ms per camera. This seems to match the expected performance.

    Are you using multiple graphs? Can you please explain your setup details?

    Regards,

    Brijesh
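
The timing figures in the reply above can be checked with a few lines of arithmetic. This is only a sketch with hypothetical helper names; it assumes that the 19.8 ms is the total VISS time for one set of 5 frames and that the target is 30 fps.

```c
#include <assert.h>

/* Average processing time per camera when total_ms covers n_cams frames. */
double per_camera_ms(double total_ms, unsigned n_cams)
{
    assert(n_cams > 0);
    return total_ms / n_cams;
}

/* Time available per frame period at the given frame rate. */
double frame_budget_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Fraction of the frame period consumed by total_ms of processing. */
double budget_fraction(double total_ms, double fps)
{
    return total_ms / frame_budget_ms(fps);
}
```

With these inputs, `per_camera_ms(19.8, 5)` is 3.96 ms and `budget_fraction(19.8, 30.0)` is about 0.59. VISS therefore fits inside the 33.3 ms frame period, which is roughly in line with the 46% to 60% VISS load reported earlier in the thread.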

  • Yes, I currently use one capture node to capture all 5 channels. I increased the number of CSIRX buffers, which had no effect. After talking with TI colleagues in Shanghai, I found that the scheduling of data from the csi1 node to the VISS node is slow. I am trying to create two pipelines to process it. I hope TI colleagues can run the same tests in parallel.