
[FAQ] EDGE-AI-STUDIO: Is SDK version important for Edge AI and TI Deep learning (TIDL) with C7x on AM6xA SoCs [AM62A, AM67A, AM68A, AM68PA, AM69A]?

Part Number: EDGE-AI-STUDIO
Other Parts Discussed in Thread: AM68A, AM67A, SK-AM62A-LP


We receive frequent queries related to SDK versions and TI Deep Learning (TIDL). Please find a series of relevant FAQ topics below. Note that these topics are most relevant for the Linux / Edge AI SDK on AM6xA SoCs like AM62A and AM68A, which use the C7xMMA AI accelerator with TIDL. Some topics also apply to TDA4x SoCs.

  • In some places, SoCs are referred to by J7x software names, such as j721e, j721s2, j784s4, and j722s

 

Too long; didn’t read (TL;DR):

  • Yes, the SDK version is important for the TIDL stack.
  • A model is compiled into TIDL artifacts (a series of files). These artifacts are linked against and optimized for a specific major.minor SDK version (e.g. 09.02, 10.01) and a specific SoC.
    • These model artifacts will work in an installed SDK of that version.
    • Model artifacts are only usable on the SoC they were compiled for.

 

Queries covered here:

  1. How to know if you have the wrong SDK version vs. TIDL artifacts
  2. Is there any backwards compatibility for TIDL versions within an SDK?
  3. Which libraries within the Edge AI SDK are part of this version-control?
  4. Recommendations for managing multiple TIDL versions for separate projects

 

Note that you may check the SDK version in Edge AI SDKs (AM6xA SoCs, e.g. AM62A) with the EDGEAI_SDK_VERSION environment variable, like so:

echo $EDGEAI_SDK_VERSION
10_00_00

It is often advised to run the TI OpenVX logger in the background when debugging TIDL and other vision-related tasks:

/opt/vx_app_arm_remote_log.out & 

The most up-to-date source for TIDL-related tools, documentation, and other information is the edgeai-tidl-tools repo on Github. Please see the version_compatibility_table.md for the most complete information on SDK versions and supported combinations. Note that this repo is tightly version controlled and has tags associated with an SDK release and the TIDL bugfix release. These releases also explain new features, fixed bugs, and known issues.

e.g. tag 10_00_08_00 --> 10_00 for SDK 10.0 and _08_00 for bugfix release 8
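As a sketch, the tag convention above can be decoded with ordinary string splitting; the tag value below is the example from the text:

```shell
# Decode an edgeai-tidl-tools tag into its SDK and bugfix parts
tag="10_00_08_00"                      # example tag: SDK 10.0, bugfix release 8
sdk="$(echo "$tag" | cut -d_ -f1-2)"   # first two fields -> 10_00 (SDK 10.0)
bugfix="$(echo "$tag" | cut -d_ -f3)"  # third field -> 08 (bugfix release 8)
echo "SDK ${sdk}, bugfix release ${bugfix}"
```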

  • Q1: I compiled my model and it isn’t running on the target.

    I see strange output and don’t know how to interpret it. What do these printouts mean?

    From stdout during model initialization:

    ```

    TIDL_RT_OVX: ERROR: Config file size (94616 bytes) does not match size of sTIDL_IOBufDesc_t (37272 bytes)

    ```

    and from the TI OpenVX logger /opt/vx_app_arm_remote_log.out’s printout:

    ```

    VX_ZONE_ERROR:[tivxKernelTIDLCreate:910] Network version - 0x20240215, Expected version - 0x20240401

    ```

    A1: This means the model artifact files you have are not compatible with the installed SDK.

    To solve this, you may either change the SDK version installed on the device or recompile the model artifacts to target the SDK you have installed.

    The config file size in the first printout refers to a file ending with “tidl_io_1.bin” in your model artifacts. The runtime compares the size of this file to a predefined struct, sTIDL_IOBufDesc_t. The size of this struct has grown as the software matures and more features are added. A larger struct size means the installed SDK is newer than the one your artifacts target. If the struct size is smaller, as shown above, then the installed SDK is older than the version targeted by your artifacts. The printout above shows artifacts from the 10.0 SDK being used on the 9.1 SDK.

    One other possibility is that the provided network artifacts were compiled for a different device, e.g. AM62A vs. AM67A. Depending on which SoC and SDK you are targeting vs. using, the errors may show differently.

     

    Similarly, the VX_ZONE_ERROR indicates a mismatch in the provided network files vs. the version expected by the tools.

    If the Network Version string is 0x00000000, then the model compilation / import process did not complete. This is not a problem on the target device / SDK. Please investigate logs from compilation to diagnose the issue.

     

    Please ensure you are compiling/downloading model artifacts for the right SDK version and SoC.
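As a quick check, you can compare the “Config file size” reported in the first printout against the actual size of the io descriptor file in your artifacts; a minimal sketch, where the artifacts path is an assumed example (substitute your own):

```shell
# List the size of the *_tidl_io_1.bin file(s) in the model artifacts and
# compare against the "Config file size" value in the TIDL_RT_OVX error.
# ARTIFACTS_DIR is an assumed example path -- use your own.
ARTIFACTS_DIR=./model-artifacts/my-model
find "$ARTIFACTS_DIR" -name '*tidl_io_1.bin' -exec stat -c '%n: %s bytes' {} \;
```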

  • Q2: Is there any backwards compatibility for TIDL versions within an SDK?

    I encountered a bug and am waiting on a fix. This fix is available in the latest SDK, but I have designed around the previous SDK version and cannot upgrade the entire SDK. How do I get these bugfixes in my SDK version?

    A2: Beginning in the 10.0 SDK, we support backwards compatibility with the previous SDK release.

    • For TIDL fixes in 10.0 releases, these can be applied to SDK 9.2 or 10.0
    • For TIDL fixes in 10.1 releases, these can be applied to SDK 10.0 or 10.1
    • Bugfix versions found on the edgeai-tidl-tools repo follow this convention based on the bugfix number (e.g. 08 in 10_00_08_00). These are released in pairs:
      • Odd-valued releases are compatible with the previous SDK
      • Even-valued releases are compatible with the current SDK
    • Up until the v9.1 SDK, there is no backwards/cross compatibility with another SDK as it pertains to TIDL. In that case, the only solution is to port the TIDL stack (not covered in this FAQ)
    • See exact details in the version_compatibility_table.md page
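The odd/even convention can be checked directly from the tag's bugfix field; a small sketch, assuming the third field of the tag is the bugfix number as in the 10_00_08_00 example:

```shell
# Determine which SDK a bugfix release targets from its tag.
# Odd bugfix numbers target the previous SDK, even numbers the current SDK.
tag="10_00_07_00"
bugfix=$(echo "$tag" | cut -d_ -f3 | sed 's/^0*//')  # strip leading zeros: 07 -> 7
if [ $((bugfix % 2)) -eq 1 ]; then
    echo "bugfix release $bugfix: compatible with the previous SDK"
else
    echo "bugfix release $bugfix: compatible with the current SDK"
fi
```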

    Enabling backwards compatibility requires the steps described in the update_target.md document. These steps explain how to use the latest bugfix release (odd value) on the previous SDK release by modifying various software components via a supporting script. Components to update include:

    • Open source runtime (OSRT) libraries – tensorflow-lite, onnxruntime, dlr
      • includes headers (.h files) and the main libraries (.so or .a files)
      • python libraries (.whl files) installed via pip through files downloaded from TI
    • TI libraries
      • libraries on Arm that control the accelerator and other components (.so files in Linux)
        • /usr/lib/libtivision_apps.so, /usr/lib/libvx_tidl_rt.so, /usr/lib/libtidl_onnxrt_EP.so, /usr/lib/libtidl_tfl_delegate.so
      • Firmware(s) for C7x (binaries under /lib/firmware)
        • Depending on the SoC, there may be multiple instances of C7x, and therefore multiple firmware binaries

    Note that the firmware and TI libraries provided are compatible with the TI evaluation hardware, like the SK-AM62A-LP starter kit board.

    Firmware and libraries are linked against the memory map used across the entire SoC. For custom hardware using differently sized RAM/DDR or different memory-map partitions, these firmwares must be rebuilt by the user.

    • In this case, it may be necessary to use the PSDK RTOS or firmware-builder software package (see your SoC’s software under the product page) and the latest version of the c7x-mma-tidl and MMALIB packages. Please contact your field/sales representative in this scenario or for porting across more than one major release version.
  • Q3: Which TI libraries in the Edge AI SDK are version-sensitive and relevant here?

    A3: There are several libraries that are tied to the SDK version as part of the Edge AI software stack.

    At the most fundamental level, these are:

    • /usr/lib/libtivision_apps.so
    • /usr/lib/libvx_tidl_rt.so

    These libraries control TIDL and TIOVX, the latter of which manages IPC on the SoC and other fundamental APIs for accessing hardware accelerators on the SoC. These libraries are included within the SDK.

    Also relevant are open source runtime libraries, which will have some dependence on the two above:

    • /usr/lib/libtidl_onnxrt_EP.so
    • /usr/lib/libtidl_tfl_delegate.so

     

    Beyond these .so files included in the SDK, there are several supporting tools and libraries in Linux userspace. These are all included under /opt in the target filesystem and start with ‘edgeai-’.

    These edgeai- projects may also be found on TI’s github or ti.com/cgit.

    While these are not version-locked against an SDK, they are validated and tagged to an SDK to aid tracking and reproducibility. These repos are all referenced from the edgeai-app-stack repo. This repo supports cross-compilation so that user-level gstreamer, tiovx, and other supporting tools may be built together.

     

    Similarly, model training and development tools are version tracked from the edgeai-tensorlab repo. Please note that these model training tools have more external dependencies than other edgeai- tools, and that it is best to start from a clean environment (e.g. docker container or python virtual environment).

    While a trained model is not version-dependent, the TIDL-compiled model is.

  • Q4: I have multiple projects and have a hard time managing the different versions and dependencies. Do you have any recommendations?

    I am finding the versioning to be frustrating and I keep creating dependency conflicts for myself. What's a good way to manage this?

    A4: There are several ways to address this -- please see the recommendations below.

    To start, please understand the following points for context:

    • Models are compiled using a set of tidl_tools, which includes a small collection of binaries, shared object libraries (.SO files), and a few headers.
      • A tidl_tools release is specific to an SDK release version and SoC
        • These tools are downloaded from a TI site during repository setup
      • We release bugfix tidl_tools and edgeai-tidl-tools, and multiple of these are compatible with a single SDK release
        • e.g. 10_00_06_00 and 10_00_08_00 are both usable for SDK 10.0
      • Open Source runtimes like Tensorflow-lite and ONNXRuntime provide APIs that also support TIDL. These are specific to an SDK release version, but not to the SoC
        • These are generally managed through python versioning

    Therefore, while tidl_tools and python versions for OSRT are related, they are not perfectly 1:1. You may use one set of python dependencies for multiple tidl_tools bugfix releases and/or SoC targets.

    Recommendation 1: Use a separate environment for each SDK version X.Y (e.g. 9.2, 10.1), since some dependencies may change. Docker and Python virtual environments are both good choices here.

    • Find Docker information in the edgeai-tidl-tools repo
    • For python, there are multiple virtual environment managers. venv is built into Python and is perfectly acceptable
      • python3 -m venv ./venv-tidl-tools-SDKvX.Y #create venv
      • source ./venv-tidl-tools-SDKvX.Y/bin/activate #activate virtual environment

    Once the virtual environment is created and active, run the typical setup scripts (edgeai-tidl-tools/setup.sh).
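Putting these steps together, a typical per-SDK setup might look like the following sketch; the venv name, SoC, and checkout tag are examples, and setup.sh behavior should be confirmed against the documentation for your release:

```shell
# One environment per SDK version; names below are examples.
python3 -m venv ./venv-tidl-tools-SDKv10.0      # create the venv for SDK 10.0
source ./venv-tidl-tools-SDKv10.0/bin/activate  # activate it
export SOC=am62a                                # target SoC (example)
cd edgeai-tidl-tools                            # repo checked out at the matching tag
source ./setup.sh                               # runs the typical setup for this SDK
```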

    Recommendation 2: If working with multiple tidl_tools directories for multiple SoCs or versions, rename each to designate the tag/version of the tools and the SoC it targets.

    tidl_tools is a generically named directory when the tools are installed, and the name does not indicate the SoC or version string. It is helpful to add these to the directory name.

    The actual name of the tools directory is not important. The path to these tools should be exported as TIDL_TOOLS_PATH.
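For example, the rename and export together; the tag and SoC below are placeholders for your own:

```shell
# Rename the freshly downloaded tidl_tools directory to record its
# tag and target SoC (example values), then point TIDL_TOOLS_PATH at it.
mv tidl_tools tidl_tools_10_00_08_00_am62a
export TIDL_TOOLS_PATH="$(pwd)/tidl_tools_10_00_08_00_am62a"
```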

    Recommendation 3: Create a helper script to setup the environment for TIDL.

    The setup.sh script can do this (if called using ‘source’), but it will also default to downloading and installing other software. The main environment variables of note are:

    • TIDL_TOOLS_PATH,
    • SOC, and
    • LD_LIBRARY_PATH (for python to find local OSRT libraries)

    An example of such a bash script is as follows:

    #!/bin/bash
    source ./venv-tidl-tools-SDKvX.Y/bin/activate  # activate python virtual environment

    export SOC=am62a  # or use an argument
    # Set TIDL_TOOLS_PATH; if there is an argument, assume it is the tidl_tools path; if empty, use the default
    if [ -z "$1" ]; then
        export TIDL_TOOLS_PATH="$(pwd)/tidl_tools"
    else
        export TIDL_TOOLS_PATH="$(pwd)/$1"
    fi
    export LD_LIBRARY_PATH=$TIDL_TOOLS_PATH:$TIDL_TOOLS_PATH/osrt_deps:$LD_LIBRARY_PATH
    
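If the helper script above is saved as, say, setup_tidl_env.sh (a hypothetical name), remember to source it rather than execute it so the exports persist in your shell:

```shell
# setup_tidl_env.sh is a hypothetical name for the helper script above;
# source it so SOC, TIDL_TOOLS_PATH, and LD_LIBRARY_PATH land in the current shell.
source ./setup_tidl_env.sh tidl_tools_10_00_08_00_am62a
echo "$TIDL_TOOLS_PATH"
```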

  • Q5: I am on SDK 9.x, and it seems like all models are failing to compile. I see a strange error message; artifacts are produced, but they cannot be used on the target SoC. What's happening?

    For an error like:

    ------------------ Network Compiler Traces -----------------------------
    NC running for device: 4
    Running with OTF buffer optimizations
    Segmentation fault (core dumped)
    Could not open /CORRECTED_PATH/edgeai-tidl-tools/model-artifacts/cl-ort-resnet18-v1/tempDir/subgraph_0_tidl_net/perfSimInfo.bin
    WARNING: [TIDL_E_DATAFLOW_INFO_NULL] Network compiler returned with error or didn't executed, this model can only be used on PC/Host emulation mode, it is not expected to work on target/EVM.
    

    This is a known issue, found near the end of 2024, affecting SDKs 9.0, 9.1, and 9.2. While the process/API calls exit without error, an important underlying subprocess failed to complete, yielding incomplete model artifacts.

    This behavior has been linked to the Linux kernel version (check yours with `uname -r`) on the development machine.

    We find that the 6.8 kernel (for instance, my machine is 6.8.0-51-generic) shows this behavior, whereas earlier kernels did not: kernel 6.5.5 was not found to have this issue -- compilation completes for the exact same tidl-tools, code, models, etc. This problem is not present in the TIDL tools for the SDK 10.0 and 10.1 versions.

    While we work to address this issue, there are a few workarounds:

    • Downgrade the Linux kernel to 6.5
    • Upgrade your SDK to 10.x or newer, i.e. version 10.0 or 10.1