
TDA4VH-Q1: EDGE-AI-TOOLS | Model Inference Errors

Part Number: TDA4VH-Q1
Other Parts Discussed in Thread: TDA4VH

Hello Team, 

We have completed model compilation, but when we try to run model inference on the TDA4VH platform we get the following errors:

root@j784s4-evm:/opt/edgeai-tidl-tools# ./bin/Release/ort_main -f model-artifacts/cl-ort-resnet18-v1/artifacts -i test_data/airshow.jpg -a 1

***** Display run Config: start *****
verbose level set to: 3
accelerated mode set to: 1
device mem set to: 1
loop count set to: 1
model path set to:
model artifacts path set to: model-artifacts/cl-ort-resnet18-v1/artifacts
image path set to: test_data/airshow.jpg
device_type set to: cpu
labels path set to: test_data/labels.txt
num of threads set to: 4
num of results set to: 5
num of warmup runs set to: 2

***** Display run Config: end *****
libtidl_onnxrt_EP loaded 0x3ea086f0
terminate called after throwing an instance of 'Ort::Exception'
what(): /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.

Aborted (core dumped)

We tried with Python as well, which generated the following error:

root@j784s4-evm:/opt/edgeai-tidl-tools/examples/osrt_python/ort# python3 onnxrt_ep.py
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']


Running 1 Models - ['cl-ort-resnet18-v1_low_latency']


Running_Model : cl-ort-resnet18-v1_low_latency

libtidl_onnxrt_EP loaded 0x332bea60
Final number of subgraphs created are : 1, - Offloaded Nodes - 52, Total Nodes - 52
APP: Init ... !!!
4194.801576 s: MEM: Init ... !!!
4194.801625 s: MEM: Initialized DMA HEAP (fd=5) !!!
4194.801765 s: MEM: Init ... Done !!!
4194.801785 s: IPC: Init ... !!!
4194.833762 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
4194.841211 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
4194.841307 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
4194.841317 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
4194.841324 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
4194.841883 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0
4194.842005 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1
4194.842104 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2
4194.842223 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3
4194.842239 s: VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
4194.842256 s: VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
4194.865933 s: VX_ZONE_ERROR: [ownContextSendCmd:1001] Command ack message returned failure cmd_status: -1
4194.865979 s: VX_ZONE_ERROR: [ownNodeKernelInit:704] Target kernel, TIVX_CMD_NODE_CREATE failed for node node_136
4194.865988 s: VX_ZONE_ERROR: [ownNodeKernelInit:705] Please be sure the target callbacks have been registered for this core
4194.866003 s: VX_ZONE_ERROR: [ownNodeKernelInit:706] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
4194.866016 s: VX_ZONE_ERROR: [ownGraphNodeKernelInit:793] kernel init for node 0, kernel com.ti.tidl:1:1 ... failed !!!
4194.866056 s: VX_ZONE_ERROR: [ graph_116 ] Node kernel init failed
4194.866066 s: VX_ZONE_ERROR: [ graph_116 ] Graph verify failed
4194.866196 s: VX_ZONE_ERROR: [ownContextSendCmd:1001] Command ack message returned failure cmd_status: -1
4194.866215 s: VX_ZONE_ERROR: [ownNodeKernelInit:704] Target kernel, TIVX_CMD_NODE_CREATE failed for node node_83
4194.866226 s: VX_ZONE_ERROR: [ownNodeKernelInit:705] Please be sure the target callbacks have been registered for this core
4194.866236 s: VX_ZONE_ERROR: [ownNodeKernelInit:706] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
4194.866248 s: VX_ZONE_ERROR: [ownGraphNodeKernelInit:793] kernel init for node 0, kernel com.ti.tidl:1:1 ... failed !!!
4194.866261 s: VX_ZONE_ERROR: [ TIDL subgraph 191 ] Node kernel init failed
4194.866270 s: VX_ZONE_ERROR: [ TIDL subgraph 191 ] Graph verify failed
Traceback (most recent call last):
File "/opt/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py", line 602, in <module>
run_model(model, mIdx)
File "/opt/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py", line 366, in run_model
sess = rt.InferenceSession(
^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 387, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 439, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Create state function failed. Return value:-1

 

How can we run the model using the TIDLExecutionProvider?

Best Regards

Reshma

 

  • Hi Reshma, 

    Could you share your SDK version, model configuration, artifacts, and model so I can investigate and run it on our end?

    Warm regards,

    Christina

  • Hi Christina,

    We are using SDK version 11_01_06_00. Please find below the version summary from the model compilation:

    ============================== [Version Summary] ==============================

    -------------------------------------------------------------------------------
    | TIDL Tools Version | 11_01_06_00 |
    -------------------------------------------------------------------------------
    | C7x Firmware Version | 11_01_06_00 |
    -------------------------------------------------------------------------------
    | Runtime Version | 1.15.0 |
    -------------------------------------------------------------------------------
    | Model Opset Version | 9 |
    -------------------------------------------------------------------------------

    The model is compiled and run from a direct clone of GitHub - TexasInstruments/edgeai-tidl-tools at 11_01_06_00, without any modifications. The model configuration is kept the same as in edgeai-tidl-tools/examples/osrt_python/model_configs.py at 11_01_06_00 · TexasInstruments/edgeai-tidl-tools · GitHub.

    The required directories and model artifacts were copied onto the target device, and the edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py at 11_01_06_00 · TexasInstruments/edgeai-tidl-tools · GitHub script was used to run the example. Pasting the output once again below; a minimal sketch of the failing session-creation call follows the log.

    root@j784s4-evm:/opt/reshma/pre-compiled/examples/osrt_python/ort# python3 onnxrt_ep.py -m cl-ort-resnet18-v1_low_latency
    Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

    Running 1 Models - ['cl-ort-resnet18-v1_low_latency']


    Running_Model : cl-ort-resnet18-v1_low_latency

    libtidl_onnxrt_EP loaded 0x1b279e20
    Final number of subgraphs created are : 1, - Offloaded Nodes - 52, Total Nodes - 52
    APP: Init ... !!!
    4862.965412 s: MEM: Init ... !!!
    4862.965465 s: MEM: Initialized DMA HEAP (fd=5) !!!
    4862.965622 s: MEM: Init ... Done !!!
    4862.965643 s: IPC: Init ... !!!
    4862.993986 s: IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
    4863.003184 s: GTC Frequency = 200 MHz
    APP: Init ... Done !!!
    4863.005382 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
    4863.005401 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
    4863.005411 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
    4863.008029 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0
    4863.008143 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1
    4863.008235 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2
    4863.008322 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3
    4863.008333 s: VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
    4863.008347 s: VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
    4863.040473 s: VX_ZONE_ERROR: [ownContextSendCmd:1001] Command ack message returned failure cmd_status: -1
    4863.040524 s: VX_ZONE_ERROR: [ownNodeKernelInit:704] Target kernel, TIVX_CMD_NODE_CREATE failed for node node_136
    4863.040536 s: VX_ZONE_ERROR: [ownNodeKernelInit:705] Please be sure the target callbacks have been registered for this core
    4863.040546 s: VX_ZONE_ERROR: [ownNodeKernelInit:706] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
    4863.040557 s: VX_ZONE_ERROR: [ownGraphNodeKernelInit:793] kernel init for node 0, kernel com.ti.tidl:1:1 ... failed !!!
    4863.040591 s: VX_ZONE_ERROR: [ graph_116 ] Node kernel init failed
    4863.040600 s: VX_ZONE_ERROR: [ graph_116 ] Graph verify failed
    4863.040718 s: VX_ZONE_ERROR: [ownContextSendCmd:1001] Command ack message returned failure cmd_status: -1
    4863.040734 s: VX_ZONE_ERROR: [ownNodeKernelInit:704] Target kernel, TIVX_CMD_NODE_CREATE failed for node node_83
    4863.040743 s: VX_ZONE_ERROR: [ownNodeKernelInit:705] Please be sure the target callbacks have been registered for this core
    4863.040751 s: VX_ZONE_ERROR: [ownNodeKernelInit:706] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
    4863.040760 s: VX_ZONE_ERROR: [ownGraphNodeKernelInit:793] kernel init for node 0, kernel com.ti.tidl:1:1 ... failed !!!
    4863.040772 s: VX_ZONE_ERROR: [ TIDL subgraph 191 ] Node kernel init failed
    4863.040780 s: VX_ZONE_ERROR: [ TIDL subgraph 191 ] Graph verify failed
    Traceback (most recent call last):
    File "/opt/reshma/pre-compiled/examples/osrt_python/ort/onnxrt_ep.py", line 550, in <module>
    run_model(model, mIdx)
    File "/opt/reshma/pre-compiled/examples/osrt_python/ort/onnxrt_ep.py", line 348, in run_model
    sess = rt.InferenceSession(
    ^^^^^^^^^^^^^^^^^^^^
    File "/usr/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 387, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
    File "/usr/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 439, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
    onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Create state function failed. Return value:-1
    root@j784s4-evm:/opt/reshma/pre-compiled/examples/osrt_python/ort#
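
    For reference, the session creation that fails boils down to roughly the following. This is a minimal sketch based on the public osrt_python examples; the model path and the provider-option keys ('artifacts_folder', 'debug_level') are assumptions, not the exact code in onnxrt_ep.py:

        # Minimal sketch of how the example flow creates the ONNX Runtime session
        # with the TIDL execution provider. Paths and option keys are illustrative
        # assumptions; the real script builds them from model_configs.py.
        import onnxruntime as rt

        tidl_options = {
            "artifacts_folder": "../../../model-artifacts/cl-ort-resnet18-v1_low_latency",  # compiled artifacts copied to the EVM
            "debug_level": "3",  # raise TIDL runtime verbosity while debugging
        }

        sess = rt.InferenceSession(
            "../../../models/public/resnet18.onnx",  # hypothetical model path on the EVM
            providers=["TIDLExecutionProvider", "CPUExecutionProvider"],
            provider_options=[tidl_options, {}],
            sess_options=rt.SessionOptions(),
        )
        print(sess.get_providers())

    This is only to show where the TIDLExecutionProvider and the artifacts folder enter the flow; the failure above happens inside this InferenceSession call.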

    Best Regards
    Reshma

  • Hi Reshma,

    Did you copy over both the model-artifacts directory and the models directory? I tried running it on my side following the instructions on GitHub (https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master?tab=readme-ov-file#benchmark-on-ti-soc) and was able to complete inference on the device, as seen below:

     

    root@j784s4-evm:/opt/edgeai-tidl-tools/examples/osrt_python/ort# python3 ./onnxrt_ep.py -m cl-ort-resnet18-v1_low_latency
    Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
    
    Running 1 Models - ['cl-ort-resnet18-v1_low_latency']
    
    
    Running_Model :  cl-ort-resnet18-v1_low_latency
    
    libtidl_onnxrt_EP loaded 0x3b6e9470
    Final number of subgraphs created are : 1, - Offloaded Nodes - 52, Total Nodes - 52
    APP: Init ... !!!
      1951.309448 s: MEM: Init ... !!!
      1951.309503 s: MEM: Initialized DMA HEAP (fd=5) !!!
      1951.309635 s: MEM: Init ... Done !!!
      1951.309654 s: IPC: Init ... !!!
      1951.336821 s: IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
      1951.344748 s: GTC Frequency = 200 MHz
    APP: Init ... Done !!!
      1951.344846 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
      1951.344856 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
      1951.344865 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
      1951.345502 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0
      1951.345633 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1
      1951.345733 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2
      1951.345833 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3
      1951.345847 s:  VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
      1951.345860 s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
    
     0  23.846985  warplane, military plane
     1  22.416164  aircraft carrier, carrier, flattop, attack aircraft carrier
     2  19.077587  projectile, missile
     3  18.600647  missile
     4  16.215950  airliner
    
    Saving image to  ../../../output_images/
    
    Saving output tensor to  ../../../output_binaries/
    
    
    Completed_Model :     1, Name : cl-ort-resnet18-v1_low_latency                    , Total time :       3.14, Offload Time :       2.75 , DDR RW MBs : 48, Output Image File : py_out_cl-ort-resnet18-v1_low_latency_airshow.jpg, Output Bin File : py_out_cl-ort-resnet18-v1_low_latency_airshow.bin
    
    
    APP: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... Done !!!
      1952.573432 s: IPC: Deinit ... !!!
      1952.574433 s: IPC: DeInit ... Done !!!
      1952.574458 s: MEM: Deinit ... !!!
      1952.574538 s: DDR_SHARED_MEM: Alloc's: 26 alloc's of 17457248 bytes
      1952.574548 s: DDR_SHARED_MEM: Free's : 26 free's  of 17457248 bytes
      1952.574556 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes
      1952.574568 s: MEM: Deinit ... Done !!!
    APP: Deinit ... Done !!!
    

    I also recommend first creating your artifacts using the Docker-based setup instructions found here: https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/advanced_setup.md#docker-based-setup-for-x86_pc

    If this, plus a clean build, doesn't help, please share your model artifacts with me.

    Warm regards,

    Christina

  • Hi Christina,

    I used the prebuilt SDK from PROCESSOR-SDK-RTOS-J784S4 Software development kit (SDK) | TI.com (version 11.00.00.06) for flashing. For the edgeai tools, GitHub - TexasInstruments/edgeai-tidl-tools at 11_01_06_00 is used. I did the setup in a Docker environment and was able to compile and run model inference successfully inside Docker.

    After this, I copied the models and artifacts onto the EVM and tried running, but I get the same error. Please tell me what I am missing here. I cannot upload the artifacts due to IT restrictions; apologies for that. I have followed the exact steps given on the GitHub page.

    Can you please tell me the SDK versions and edgeai-tidl-tools version you have used for testing this?

    Best Regards,

    Reshma

  • Adding to the above,

    We have also tried edgeai-tidl-tools version 11_00_08_00 with the C7x firmware version set to 11_00_08_00, but the results are the same. I have even tried updating the firmware on the target by following edgeai-tidl-tools/docs/update_target.md at master · TexasInstruments/edgeai-tidl-tools · GitHub.

    Please help me resolve this.

  • Hi Reshma,

    You wrote: "... (11.00.00.06 version) for flashing. For edgeai-tools the following version is used GitHub - TexasInstruments/edgeai-tidl-tools at 11_01_06_00."

    For the 11.00.00.06 image, the edgeai-tidl-tools version needs to be 11.00.06.00, or 11.00.08.00 after updating the firmware using the directions you saw on GitHub.

    edgeai-tidl-tools version 11.01.06.00 will only work with Processor SDK Linux 11.01.00.03 or Processor SDK RTOS 11.01.00.04.

    You mentioned earlier that you were using SDK 11.01.06.00; however, there is no SDK of that version. I just wanted to clarify this, as each edgeai-tidl-tools version needs to be paired with the matching SDK version: https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/version_compatibility_table.md
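
    A quick way to cross-check on the target is to print the installed ONNX Runtime version and the registered execution providers. This is a minimal sketch; run it in the same Python environment that onnxrt_ep.py uses:

        # Sanity check on the EVM: the version should match the 1.15.0 runtime
        # reported in your compilation summary, and TIDLExecutionProvider should
        # appear in the provider list.
        import onnxruntime as rt

        print("onnxruntime version :", rt.__version__)
        print("available providers :", rt.get_available_providers())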

    Could you clarify which prebuilt image (version) you used with the 11.00.08.00 tools?

    Warm regards,

    Christina