TDA4VM: TIVX VISS node cannot run parallelly on TDA4VH with SDK 0805

Part Number: TDA4VM
Other Parts Discussed in Thread: TDA4VH

Hello, TI experts,

I'm working on a TDA4VH development board with SDK 0805. I found that two VISS nodes with different parameters cannot run in parallel, even with SDK 0805.

Here is what I did:

I wrote two programs, each creating a TIVX graph that contains only one VISS node. The graph reads raw images from the local file system and outputs YUV images. The difference between the two programs is that they use raw images of different resolutions and different VISS node parameters.

When program1 sets its node target to "TIVX_TARGET_VPAC_VISS1" and program2 sets its node target to "TIVX_TARGET_VPAC2_VISS1", they run in parallel and work well.

But when both programs use "TIVX_TARGET_VPAC_VISS1" as their node target, they cannot run in parallel. The attachment is the log from "vx_app_arm_remote_log.out", with TIVX_ZONE_INFO enabled for debugging. The log implies that execution is blocked in the process function of the VISS node.

Could you help check this case? And is there any example demonstrating this feature? Thanks.
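
For reference, the two target configurations described above can be sketched as follows. This is only an illustration, not the code from the test application: select_viss_target is a hypothetical helper, and the string values behind the two target macros are assumptions provided so the sketch compiles without the TIOVX headers (in a real build the macros come from the SDK).

```c
#include <string.h>

/* Stand-ins so the sketch compiles without the TIOVX headers. In a real
 * build these macros come from the SDK; the values here are assumptions. */
#ifndef TIVX_TARGET_VPAC_VISS1
#define TIVX_TARGET_VPAC_VISS1  "VPAC_VISS1"
#endif
#ifndef TIVX_TARGET_VPAC2_VISS1
#define TIVX_TARGET_VPAC2_VISS1 "VPAC2_VISS1"
#endif

/* Hypothetical helper: choose the VISS target for a program instance.
 * instance     : 0 for the first program, non-zero for the second
 * share_target : non-zero forces both programs onto the same VISS,
 *                which is the failing case described in this post */
static const char *select_viss_target(int instance, int share_target)
{
    if (share_target != 0 || instance == 0) {
        return TIVX_TARGET_VPAC_VISS1;
    }
    return TIVX_TARGET_VPAC2_VISS1;
}

/* In the application, the returned string would be passed to the node with
 * vxSetNodeTarget(node, VX_TARGET_STRING, target). */
```

With share_target set to 0, the two programs land on different VPAC instances (the working case); with share_target set to 1, both use TIVX_TARGET_VPAC_VISS1 (the case that hangs).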

rproc-log.one-viss-target.log

  • Hi haijun,

    I see a few errors like the one below when submitting frames for the second VISS node. Can you please check why they occur?

    [MCU2_0]    134.722943 s:  VX_ZONE_ERROR:[tivxVpacVissProcess:1050] Failed to Submit Request

    Are there any changes in configuration from one frame to another for the same node? I see that node51 does not run fine for all iterations: it returns the above error a few times and also runs fine a few times. But it looks like the hang happens only when node51 is executed.

    Which parameters are different between the two nodes? Are you using LSC and H3A in the DCC parameters? Have you made sure that the LSC and H3A parameters are different, since the resolution is different?

    Regards,

    Brijesh

  • Also, is it possible to share this test case so that we can check it on the EVM?

    Regards,

    Brijesh

  • I see a few errors like the one below when submitting frames for the second VISS node. Can you please check why they occur?

    This implies that the function "Fvid2_processRequest" returned a failure. I'm not familiar with FVID2, so this is what I want to figure out, too.

    Are there any changes in configuration from one frame to another for the same node?

    No, the node parameters stayed the same during the test.

    But it looks like the hang happens only when node51 is executed.

    According to the log, node16 belongs to the first program and node51 to the second. Once node51 starts running, node16 hangs in the kernel process function. node16 never ran successfully, not even once; this is the key point of the problem.

    Which parameters are different between the two nodes? Are you using LSC and H3A in the DCC parameters? Have you made sure that the LSC and H3A parameters are different, since the resolution is different?

    This is how I created the VISS node; the only input parameters are the configuration and the DCC buffer.

    This is the configuration of the first program

    and this is the second one.

    And the DCC buffers are definitely different.

  • Hi @haijun 

    This implies that the function "Fvid2_processRequest" returned a failure. I'm not familiar with FVID2, so this is what I want to figure out, too.

    Typically, this API returns the error when one of the buffer pointers of the fvid2_frame is set to null. So can you please check that none of them is null? Also, can you please confirm whether the configThroughUdmaFlag flag is set to true when initializing VISS?

    Regards,

    Brijesh
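
    A check along these lines can be sketched as below. This is a minimal illustration under stated assumptions, not driver code: frame_stub is a stand-in that models only an array of plane addresses, and count_null_planes and STUB_MAX_PLANES are hypothetical names introduced for this sketch.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define STUB_MAX_PLANES 6u  /* assumed plane limit, for this sketch only */

    /* Stand-in for the driver's frame structure; only the address array is
     * modeled here. */
    typedef struct {
        uint64_t addr[STUB_MAX_PLANES];
    } frame_stub;

    /* Returns the number of null plane addresses among the first num_planes
     * entries, or -1 if the arguments themselves are invalid. */
    static int count_null_planes(const frame_stub *frame, size_t num_planes)
    {
        size_t i;
        int nulls = 0;

        if (frame == NULL || num_planes > STUB_MAX_PLANES) {
            return -1;
        }
        for (i = 0; i < num_planes; i++) {
            if (frame->addr[i] == 0u) {
                nulls++;
            }
        }
        return nulls;
    }
    ```

    Calling such a helper just before the frame is submitted, and logging a non-zero result, would show whether a null buffer pointer coincides with the "Failed to Submit Request" message.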

  • So can you please check that none of them is null?

    Yes, one of them was set to null. But I don't think it matters, because I sometimes see this error when I run the applications separately, too.

    Can you please confirm whether the configThroughUdmaFlag flag is set to true when initializing VISS?

    It is set to true.

    rproc-log.one-viss-node.1.log

    The log in the original post may be incomplete due to the buffer mode. Attached is another log; its last lines are:

    [MCU2_0]    441.108251 s:  VX_ZONE_INFO:[ownTargetKernelExecute:380] Executing process callback for kernel [com.ti.hwa.vpac_viss]
    [MCU2_0]    441.108445 s: UDMA : ERROR: TR Response not completed!!
    [MCU2_0]    441.108499 s:  VX_ZONE_ERROR:[vhwaVissRestoreCtx:1847] Failed to restore Context !!!

    This error log may help a lot.

  • Hi Haijun,

    Looking at the VISS code, the return value of vhwaVissRestoreCtx is not actually used in the ti-processor-sdk-rtos-j784s4-evm-08_05_00_11\tiovx\kernels_j7\hwa\vpac_viss\vx_vpac_viss_target.c file, so even if this error is printed, it should not cause VISS to halt.

    For the time being, can you please comment out the macro VHWA_VISS_CTX_SAVE_RESTORE_USE_DMA in the ti-processor-sdk-rtos-j784s4-evm-08_05_00_11\tiovx\kernels_j7\hwa\vpac_viss\vx_vpac_viss_target.c file and try again? In that case, the code will use memcpy instead of a DMA copy for the context save/restore.

    Regards,

    Brijesh

  • Dear Brijesh,

    Haijun told me that even after he applied your suggestion, MCU2_0 can still halt. When he runs vx_app_arm_ipc.out, he gets no response from MCU2_0. Here are their test code and data:

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/raw_5F00_images.tgz
    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/viss_5F00_test.tgz

    Their test simply runs ./app_viss.out for the first app and ./app_viss.out 1 for the second app.

    Could you check whether you can reproduce it on your side? Otherwise, shall we have an online chat with Haijun for better communication?

    BR

    Sikai

  • Hi Sikai,

    OK, I will try to check it this week or early next week.

    Regards,

    Brijesh

  • Thanks so much, if you need any extra information please let me know.

    BR

    Sikai

  • Hi Sikai,

    I checked this application on SDK 8.05, and I see it failing after a few iterations. It also returns the error "[MCU2_0]     49.990069 s:  VX_ZONE_ERROR:[tivxVpacVissProcess:1050] Failed to Submit Request". It seems to abort after this error and then hang.

    I will check further and update you.

    Regards,

    Brijesh

  • Hi, Brijesh,

    Is there any update on this?

  • Hi Haijun,

    Not yet, we are still looking into it.

    Rgds,

    Brijesh

  • Haijun, can you please add a command-line argument to your program that sets "obj->viss_params.channel_id" to a different value (0, 1, 2, ...) for each instance, to see if it helps?

    Thanks.
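
    One way to take the channel id from the command line can be sketched as below. This is a minimal illustration, not the test application's code: parse_channel_id and MAX_CHANNEL_ID are hypothetical, and the upper bound is an assumption chosen only for the sketch.

    ```c
    #include <errno.h>
    #include <stdlib.h>

    #define MAX_CHANNEL_ID 7  /* assumed upper bound, for this sketch only */

    /* Hypothetical helper: read the channel id from argv[1].
     * Returns 0 when no argument is given, the parsed id when it is a valid
     * number in [0, MAX_CHANNEL_ID], and -1 otherwise. */
    static int parse_channel_id(int argc, char *argv[])
    {
        char *end = NULL;
        long value;

        if (argc < 2) {
            return 0;
        }
        errno = 0;
        value = strtol(argv[1], &end, 10);
        if (errno != 0 || end == argv[1] || *end != '\0' ||
            value < 0 || value > MAX_CHANNEL_ID) {
            return -1;
        }
        return (int)value;
    }

    /* In the application, the result would then be assigned before the node is
     * created, for example: obj->viss_params.channel_id = (uint32_t)id; */
    ```

    With this, ./app_viss.out would use channel id 0 and ./app_viss.out 1 would use channel id 1.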

  • Hello, 

    I tried this on both J721E and J784S4 with SDK 0805, and it made no difference.

    Here is the terminal output of my test.

    MobaXterm_SerialCOM_20230208_142813.txt

  • Hi Haijun,

    Do you see this issue even on J721E? I thought this was more related to J784s4.

    Regards,

    Brijesh

  • Haijun, from this log, I can't see that the channel id is set differently in the VISS object. Please check. Also, always run the program with channel id 0 first, and then the one with channel id 1.

  • Hello, Xu(SH) Liu and Brijesh,

    I'm sorry that I muddled my application.

    But the test result did not change; this did not work.

    I added this line in my code:

    Here is my terminal output for J721E and J784S4. You can see lines like this:

    app_viss_module.c[87](dbg): viss parameters channel id is 1

    which indicates that the channel id has been changed.

    VISS-multi_params_test-J784s4.txt
    VISS-multi_params_test-J721E.txt

    Many thanks for your support!

  • if (vissObj->o_enable_replicate) {
        vx_bool replicate[] = {vx_false_e, vx_false_e, vx_false_e, vx_true_e, vx_false_e, vx_false_e, vx_true_e, vx_false_e, vx_false_e, vx_true_e, vx_false_e, vx_false_e, vx_false_e};
        status = vxReplicateNode(graph, vissObj->i_node, replicate, 13);
        LOG_IF_FAIL(status);
    }

    I don't know the reason for doing this in your test code, but please remove it and try again.

    Thanks.

  • Yes, if you are using the VISS node for a single channel/camera, please comment out the call to vxReplicateNode. This API is typically used only when the node runs in multi-channel mode.

  • I tried it, and it doesn't help, either.

    Actually, I had also noticed this error, but it does not matter, so I took no action on it. I looked into the function vxReplicateNode: it returned an error when checking the parameters, so the graph was essentially not affected.

  • Yes, I know about this. But as far as this problem is concerned, I don't think this is the key point.

  • It seems to fail even on TDA4VM. On TDA4VM, I see the error "Failed to Submit Request" from VISS. Most likely this comes from a null pointer for the output buffer.

    If I change NUM_BUF to 1, this error goes away, so the output buffers are definitely not being allocated correctly.

    But even with this change, VISS hangs, and I see the error below. It seems that the GLBCE context save/restore is somehow not working.

    Also, some parameters are not restored correctly between the two channels.

    [MCU2_0] 126.914522 s: UDMA : ERROR: TR Response not completed!!
    [MCU2_0] 126.914625 s: VX_ZONE_ERROR:[vhwaVissRestoreCtx:1847] Failed to restore Context !!!

    Let me see which parameter is causing this hang.

    Regards,

    Brijesh

  • Hi,

    The issues occur because GLBCE is not enabled for the AR0233 sensor. GLBCE is a special module that requires configuration in order to be enabled. Since it is disabled for one of the sensors, the extra configuration is not done properly.

    For the time being, can you please enable GLBCE for AR0233? This should solve the issue that you are facing.

    I am trying to see how we can fix this properly in the driver and the OpenVX node.

    Regards,

    Brijesh

  • Hi,

    Can you please apply the attached patches, rebuild the SDK, and check again? With these patches, I at least don't see any hang or crash on the J721E. It should also work on J784S4.

    - Please apply below patch on ti-processor-sdk-rtos-j784s4-evm-08_05_00_11\pdk_j784s4_08_05_00_37 folder 

    /cfs-file/__key/communityserver-discussions-components-files/791/VHWA_5F00_FIX_5F00_For_5F00_GLBCE_5F00_Issue.patch

    - Please apply below patch on ti-processor-sdk-rtos-j784s4-evm-08_05_00_11\tiovx folder.

    /cfs-file/__key/communityserver-discussions-components-files/791/TIOVX_5F00_VHWA_5F00_FIX_5F00_FOR_5F00_GLBCE_5F00_Issue.patch

    Regards,

    Brijesh

  • These two patches work, both on J721E and J784S4, with no change to the channel id or the sensor driver!

    Thanks very much for your support !