SK-AM62A-LP: CPP DL inference on board throws exception when running with acceleration enabled

Part Number: SK-AM62A-LP

I'm trying to run a custom ONNX model with a program built from adapting the examples available at https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master/examples/osrt_cpp/ort.

This runs fine with no acceleration ("-a 0"), but when running with "-a 1" it throws the following exception:

terminate called after throwing an instance of 'Ort::Exception'
  what():  /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.

Aborted (core dumped)

This is the same behavior I encountered when running on x86 (I believe I didn't provide the flag there and it just defaulted to 1).
I also tried exporting this variable before running the application:
export TIDL_RT_ONNX_VARDIM=1

I was also able to run this model using the Python API, with acceleration.

  • Hello Jose,

    Please supply the following additional context:

    • Is your application running on the target, or on an x86 host with emulation? Does this happen in one environment but not the other?
      • The target can sometimes fail due to memory allocations that are not modeled in the x86 emulation environment.
    • Which SDK are you using? What is the current tag of the edgeai-tidl-tools repo in your local clone?
    • Can you supply the full log (with any sensitive details omitted)? Please set the debug_level [1] to 1 or 2.

    [1] https://github.com/TexasInstruments/edgeai-tidl-tools/blob/95ba2c7ec62bbedeb637d7a5c0273fcede21cac9/examples/osrt_cpp/ort/onnx_main.cpp#L346 

    BR,
    Reese

  • Hello Reese,

    - This happens both on x86 host and running on the target.

    - I'm using SDK "11_00_07_00". It was updated on the target using this approach: https://github.com/TexasInstruments/edgeai-tidl-tools/blob/11_00_07_00/docs/update_target.md

    - Changing the debug_level to "1" or "2" didn't affect anything, but changing both the Ort::Env log level to verbose and the Ort::SessionOptions severity level to 0 provided some more info. Both outputs look pretty similar to me, but I'm posting them both anyway. I only redacted a couple of paths; any field that doesn't show a value had none in the config. The labels path is the default, but I'm not using it, so that file doesn't exist. I also tried running with device_type set to gpu, and the output is the same.
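
    For reference, this is roughly how I raised the verbosity -- a minimal sketch using the standard ONNXRuntime C++ API, with everything besides the two logging calls elided:

    #include <onnxruntime_cxx_api.h>

    int main() {
      // Environment-wide logging at VERBOSE.
      Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "test");

      // Session-level log severity 0 (= VERBOSE).
      Ort::SessionOptions session_options;
      session_options.SetLogSeverityLevel(0);

      // ... TIDL EP registration and session creation as in the example ...
      return 0;
    }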

    x86 host output
    ***** Display run Config: start *****
    verbose level set to: 1
    accelerated mode set to: 1
    device mem set to: 1
    loop count set to: 1
    model path set to:
    model artifacts path set to: MODEL_ARTIFACTS_PATH
    image path set to: IMAGE_PATH
    device_type set to: cpu
    labels path set to: test_data/labels.txt
    num of threads set to: 4
    num of results set to: 5
    num of warmup runs set to: 2

    ***** Display run Config: end *****
    [09:39:38.000.000000]:INFO:[runInference:0279] accelerated mode
    [09:39:38.000.000092]:INFO:[runInference:0282] artifacts: MODEL_ARTIFACTS_PATH
    2025-08-14 09:39:38.618467384 [I:onnxruntime:, inference_session.cc:284 operator()] Flush-to-zero and denormal-as-zero are off
    2025-08-14 09:39:38.618517598 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
    2025-08-14 09:39:38.618545015 [I:onnxruntime:, inference_session.cc:310 ConstructorCommon] Dynamic block base set to 0
    2025-08-14 09:39:38.648160234 [I:onnxruntime:test, bfc_arena.cc:27 BFCArena] Creating BFCArena for Tidl with following configs: initial_chunk_size_bytes: 1048576 max_dead_bytes_per_chunk: 134217728 initial_growth_chunk_size_bytes: 2097152 memory limit: 18446744073709551615 arena_extend_strategy: 0
    2025-08-14 09:39:38.648212805 [V:onnxruntime:test, bfc_arena.cc:63 BFCArena] Creating 21 bins of max chunk size 256 to 268435456
    2025-08-14 09:39:38.648242160 [I:onnxruntime:test, bfc_arena.cc:27 BFCArena] Creating BFCArena for TidlCpu with following configs: initial_chunk_size_bytes: 1048576 max_dead_bytes_per_chunk: 134217728 initial_growth_chunk_size_bytes: 2097152 memory limit: 18446744073709551615 arena_extend_strategy: 0
    2025-08-14 09:39:38.648259048 [V:onnxruntime:test, bfc_arena.cc:63 BFCArena] Creating 21 bins of max chunk size 256 to 268435456
    libtidl_onnxrt_EP loaded 0x5635c4ed1370
    terminate called after throwing an instance of 'Ort::Exception'
      what():  /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.

    Aborted (core dumped)

    am62a target output
    ***** Display run Config: start *****
    verbose level set to: 1
    accelerated mode set to: 1
    device mem set to: 1
    loop count set to: 1
    model path set to:
    model artifacts path set to: MODEL_ARTIFACTS_PATH
    image path set to: IMAGE_PATH
    device_type set to: cpu
    labels path set to: test_data/labels.txt
    num of threads set to: 4
    num of results set to: 5
    num of warmup runs set to: 2

    ***** Display run Config: end *****
    [23:48:31.000.000000]:INFO:[runInference:0279] accelerated mode
    [23:48:31.000.000115]:INFO:[runInference:0282] artifacts: MODEL_ARTIFACTS_PATH
    1970-01-03 23:48:31.941629815 [I:onnxruntime:, inference_session.cc:284 operator()] Flush-to-zero and denormal-as-zero are off
    1970-01-03 23:48:31.941756785 [I:onnxruntime:, inference_session.cc:292 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
    1970-01-03 23:48:31.941814355 [I:onnxruntime:, inference_session.cc:310 ConstructorCommon] Dynamic block base set to 0
    1970-01-03 23:48:32.053108260 [I:onnxruntime:test, bfc_arena.cc:27 BFCArena] Creating BFCArena for Tidl with following configs: initial_chunk_size_bytes: 1048576 max_dead_bytes_per_chunk: 134217728 initial_growth_chunk_size_bytes: 2097152 memory limit: 18446744073709551615 arena_extend_strategy: 0
    1970-01-03 23:48:32.053204835 [V:onnxruntime:test, bfc_arena.cc:63 BFCArena] Creating 21 bins of max chunk size 256 to 268435456
    1970-01-03 23:48:32.053266650 [I:onnxruntime:test, bfc_arena.cc:27 BFCArena] Creating BFCArena for TidlCpu with following configs: initial_chunk_size_bytes: 1048576 max_dead_bytes_per_chunk: 134217728 initial_growth_chunk_size_bytes: 2097152 memory limit: 18446744073709551615 arena_extend_strategy: 0
    1970-01-03 23:48:32.053298135 [V:onnxruntime:test, bfc_arena.cc:63 BFCArena] Creating 21 bins of max chunk size 256 to 268435456
    libtidl_onnxrt_EP loaded 0x403cd100
    terminate called after throwing an instance of 'Ort::Exception'
      what():  /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.

    Aborted (core dumped)

  • After experimenting with the CPP inference of the TIDL examples, I found that I have the same issue with those, with both of these edgeai-tidl-tools tags: 10_01_04_00 and 11_00_07_00.

    Again, the Python inference works fine in both cases. I assume something must be wrong with my setup somewhere, but it's not completely broken.
     

  • Hi Jose,

    I'm willing to bet something is wrong with the configuration of the TIDL-specific elements of the ONNXRuntime API. Since the debug_level didn't make a difference, it is probably something basic, like a valid path to the model itself or to the model-artifacts directory. In edgeai-tidl-tools, those artifacts would nominally be at edgeai-tidl-tools/model-artifacts/$MODEL_NAME/artifacts and will contain files like the ones below.


    cl-ort-resnet18-v1$ tree -L 2
    
    ├── artifacts
    │   ├── allowedNode.txt
    │   ├── onnxrtMetaData.txt
    │   ├── subgraph_0_tidl_io_1.bin
    │   ├── subgraph_0_tidl_net.bin
    │   └── tempDir
    ├── dataset.yaml
    ├── model
    │   └── resnet18_opset9.onnx
    └── param.yaml

    Everything TIDL needs is in that artifacts directory. The binaries encode the actual network and I/O configuration, and the two TXT files tell ONNXRuntime how to treat the layers within the .onnx file with respect to running them on the C7xMMA accelerator.
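
    If you want to rule out a bad path quickly, a cheap pre-flight check is to confirm those files exist before creating the session. A minimal sketch -- the file names are taken from the tree above, and the subgraph numbering varies per model:

    #include <filesystem>
    #include <iostream>

    // Report any missing TIDL artifact files before session creation.
    // File names match the tree above; subgraph indices vary per model.
    bool check_artifacts(const std::filesystem::path &dir) {
      const char *required[] = {"allowedNode.txt", "onnxrtMetaData.txt",
                                "subgraph_0_tidl_io_1.bin",
                                "subgraph_0_tidl_net.bin"};
      bool ok = true;
      for (const char *f : required) {
        if (!std::filesystem::exists(dir / f)) {
          std::cerr << "missing artifact: " << (dir / f) << "\n";
          ok = false;
        }
      }
      return ok;
    }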

    [23:48:31.000.000115]:INFO:[runInference:0282] artifacts: MODEL_ARTIFACTS_PATH

    Did you edit the log to obfuscate sensitive information, or is that literally what you set the path to when running the application? If the latter, then it is most likely the cause of your error.

    Otherwise, I may need to see a few snippets of code showing how you are configuring the runtime, assuming it is not the same as the ort_main.cpp example code.
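
    For comparison, the TIDL-specific part of the example boils down to the EP registration below. This is a sketch from memory -- the c_api_tidl_options struct and the entry-point name come from TI's ONNXRuntime fork, so please verify them against the headers in your checkout:

    // Sketch of the TIDL EP registration as done in the example (names from
    // memory -- verify against the ONNXRuntime fork headers in your tree).
    c_api_tidl_options *options =
        (c_api_tidl_options *)malloc(sizeof(c_api_tidl_options));
    strcpy(options->artifacts_folder, artifacts_path.c_str());  // .../artifacts
    options->debug_level = 2;  // extra TIDL-side logging
    OrtStatus *status =
        OrtSessionsOptionsAppendExecutionProvider_Tidl(session_options, options);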

    BR,
    Reese

  • Hello Reese,

    I did set a valid path there. Invalid paths result in different errors.

    As I mentioned in a follow-up message, I also have the same issue running the TIDL example, using the commands provided in the examples (which set these paths). This is in a clean edgeai-tidl-tools checkout, tag 10_01_04_00, after the Python example models were compiled.
    Here is a full, non-obfuscated example, with both command and output:


    ./bin/Release/ort_main -f model-artifacts/cl-ort-resnet18-v1/artifacts  -i test_data/airshow.jpg

    ***** Display run Config: start *****
    verbose level set to: 3
    accelerated mode set to: 1
    device mem set to: 1
    loop count set to: 1
    model path set to:
    model artifacts path set to: model-artifacts/cl-ort-resnet18-v1/artifacts
    image path set to: test_data/airshow.jpg
    device_type set to: cpu
    labels path set to: test_data/labels.txt
    num of threads set to: 4
    num of results set to: 5
    num of warmup runs set to: 2

    ***** Display run Config: end *****
    libtidl_onnxrt_EP loaded 0x55b6a0cd8df0
    terminate called after throwing an instance of 'Ort::Exception'
      what():  /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.

    Aborted (core dumped)

  • I've now tried running the examples with the Docker setup, and I still cannot run the acceleration emulation mode in CPP for the TI examples.

    edgeai-tidl-tools tag 10_01_04_00
    export SOC=am62a
    source setup.sh
    source ./setup_env.sh ${SOC}
    mkdir build && cd build
    cmake ../examples && make -j2 && cd ..
    source ./scripts/run_python_examples.sh

    root@e41c1556dbe8:/home/root# ./bin/Release/ort_main -f model-artifacts/cl-ort-resnet18-v1/artifacts  -i test_data/airshow.jpg

    ***** Display run Config: start *****
    verbose level set to: 3
    accelerated mode set to: 1
    device mem set to: 1
    loop count set to: 1
    model path set to:
    model artifacts path set to: model-artifacts/cl-ort-resnet18-v1/artifacts
    image path set to: test_data/airshow.jpg
    device_type set to: cpu
    labels path set to: test_data/labels.txt
    num of threads set to: 4
    num of results set to: 5
    num of warmup runs set to: 2

    ***** Display run Config: end *****
    libtidl_onnxrt_EP loaded 0x556a01a68df0
    terminate called after throwing an instance of 'Ort::Exception'
      what():  /root/onnxruntime/onnxruntime/core/providers/tidl/tidl_execution_provider.cc:94 onnxruntime::TidlExecutionProvider::TidlExecutionProvider(const onnxruntime::TidlExecutionProviderInfo&) status == true was false.


    Are these steps correct? I'm using the Docker setup, so I assume the environment itself is correct. Can you successfully run the CPP examples? Should I try Ubuntu 24.04?

  • Hello Jose,

    I see this error on the target when running this same command, but not when running natively in Ubuntu 22.04 -- the host environment does not run into this issue.

    So you are seeing the same ONNX error on both target and host with the TIDLExecutionProvider targeted via the "-a 1" CLI arg, whereas I only see the behavior on the target.

    Your steps are correct, as is your command. I am not sure of the source of this issue. It is happening within the call from ONNX into the TIDLExecutionProvider -- it is returning false for some reason. Let me seek an answer internally.
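
    In the meantime, you can catch the exception instead of letting it terminate the process; the message is the same, but it avoids the core dump and gives you a place to log extra state. A minimal sketch with the standard C++ API (the wrapper function is illustrative):

    #include <onnxruntime_cxx_api.h>
    #include <iostream>

    // Attempt session creation, reporting an EP failure instead of aborting
    // on an uncaught Ort::Exception. The session is discarded on return;
    // this is just a probe. env and opts come from the caller, as in the
    // example application.
    bool try_create_session(Ort::Env &env, const char *model_path,
                            Ort::SessionOptions &opts) {
      try {
        Ort::Session session(env, model_path, opts);
        return true;
      } catch (const Ort::Exception &e) {
        std::cerr << "session creation failed: " << e.what() << "\n";
        return false;
      }
    }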

    BR,

    Reese

  • Hello Reese,

    Thank you for the feedback.

    I'm running into the error on an Ubuntu 22.04-based VM. This also happens with the Docker setup, using the Dockerfile provided in edgeai-tidl-tools. Of course, the source could very well be tied to the VirtualBox emulation in both cases.

  • I've raised this issue with the development team (TIDL-8013). The source of the error is within our ONNXRuntime fork. If it were part of the arm-tidl portion that interfaces ONNXRuntime to the accelerator via TIDL-RT and TIOVX, then I would have more direct suggestions.

    It is odd that we have different observations for the x86 host side of this. Since it is the same error on target and host in your case, I think it is likely the same issue.

    BR,
    Reese

  • Thank you, Reese.

    Apparently it's possible to perform inference with the pre-built app that uses the low-level API (/opt/tidl_test/TI_DEVICE_armv8_test_dl_algo_host_rt.out), at least for SDK 10.01.00.05.

    TIDL-8013 is an internal ticket, correct? Or is it something I can track?

  • Hi Jose,

    Yes, that binary is a valid way to run the network.

    This uses the TIDLRT interface [1], a lower-level interface that runs networks using only the tidl_net.bin and tidl_io_1.bin files. It does not require the .onnx model file or the other portions of the artifacts. The interface is present in each SDK, and ONNXRuntime uses it under the hood.

    TIDLRT is not appropriate if you have unsupported layers that need to run on the CPU. Otherwise, a fully accelerated network can be run through either the ONNXRuntime or TIDLRT interface.
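
    For reference, the call flow is small. A rough sketch of the create/invoke/delete sequence follows -- the struct and field names are quoted from memory and should be checked against the itidl_rt.h shipped with your SDK, and the caller is assumed to have loaded the two .bin files into memory:

    #include <stdint.h>
    #include "itidl_rt.h"  // TIDLRT C API header from the SDK

    // Minimal TIDLRT run: create from the two artifact binaries, invoke once,
    // tear down. Names are from memory -- verify against your SDK's header.
    int32_t run_once(void *net_buf, void *io_buf,
                     sTIDLRT_Tensor_t *in[], sTIDLRT_Tensor_t *out[]) {
      sTIDLRT_Params_t prms;
      TIDLRT_setParamsDefault(&prms);
      prms.netPtr = net_buf;       // contents of tidl_net.bin
      prms.ioBufDescPtr = io_buf;  // contents of tidl_io_1.bin
      void *handle = NULL;
      int32_t status = TIDLRT_create(&prms, &handle);
      if (status != 0) return status;
      status = TIDLRT_invoke(handle, in, out);
      TIDLRT_delete(handle);
      return status;
    }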

    Correct, TIDL-8013 is an internal ticket.

    [1] https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-j721s2/08_05_00_11/exports/docs/tidl_j721s2_08_05_00_16/ti_dl/docs/user_guide_html/md_tidl_model_import.html

    • The PSDK-RTOS documentation (for TDA4x devices, but still applicable) and ti-firmware-builder (for AM62A) have more up-to-date user guides, but documentation on this interface is generally more limited.

    BR,
    Reese