TDA4VM: Cannot compile any sample model in EdgeAI TIDL Tools using onnxrt_ep.py and tflrt_delegate.py

Part Number: TDA4VM

I have a problem very similar to the one described in "Cannot compile any sample model in EdgeAI TIDL Tools using onnxrt_ep.py - Illegal instruction (core dumped)", except that I set up the SDK natively on Ubuntu 18. The ONNX script still fails with the same error: Illegal instruction (core dumped).

 

When I use tflrt_delegate.py to compile TFLite models, on the other hand, the script freezes with no explanation:

  

With some debug prints, I found that it gets stuck here, at lines 177-178.

  

What puzzles me is that even if I run the script without compilation, so that it enters the other if branches, it still gets stuck there.
As you can see, the files it tries to load are in the right folder:


  

In the post mentioned above, it was suggested to run export TIDL_RT_AVX_REF=0 before the compilation, but that changes nothing.
As a user there suggested, I will try to run everything in Docker with Windows as the host, but let me know if there is something I'm not considering.

  • Hi,

    Could you please let us know which branch you are on?

    You can use the command below:

    git branch

    Regards,

    Pratik

  • You may want to look at https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1220814/tda4vm-bug-reports-for-tidl.

    I got dozens of segfaults because I recompiled TIDL with optimisation using newer tools, and they were caused by these bugs (mostly the missing return statements interacting badly with the compiler's optimisation levels).

  • I'm on 'rel_8_6'

  • Hi,

    Thanks for the confirmation.

    Could you please disable AVX support and check again?

    You can set TIDL_RT_AVX_REF to 0 to disable AVX support, using the command below:

    export TIDL_RT_AVX_REF=0

    Regards,

    Pratik
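
For anyone who launches the example scripts from Python rather than from an interactive shell, the same setting can be passed to a child process. This is only a minimal sketch: the helper names are hypothetical, and the only detail taken from the thread is the variable name TIDL_RT_AVX_REF and the value 0.

```python
import os
import subprocess
import sys


def env_with_avx_disabled(base_env=None):
    """Return a copy of the environment with TIDL_RT_AVX_REF set to "0".

    The current process environment is used when base_env is None.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TIDL_RT_AVX_REF"] = "0"
    return env


def run_with_avx_disabled(script_args):
    """Run a Python script in a child process with the variable set.

    script_args is the script path followed by its arguments,
    for example ["onnxrt_ep.py"]. Returns the child's exit code.
    """
    completed = subprocess.run(
        [sys.executable, *script_args],
        env=env_with_avx_disabled(),
    )
    return completed.returncode
```

Setting the variable in the child's environment avoids the common mistake of exporting it in one terminal and running the script from another.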

  • I tried that and nothing changes; I still get the same outcome with both scripts.

  • On a more general note, following the discussion quoted above, I tried Docker on Windows and on Ubuntu 18.04, and it does not work in either case.
    With Docker on Windows, the setup script fails while installing onnx:

    root@81b10238f65f:/home/root# source ./setup.sh --skip_cpp_deps
    X64 Architecture
    Installing python packages...
    Collecting git+https://github.com/kumardesappan/caffe2onnx (from -r ./requirements_pc.txt (line 12))
      Cloning https://github.com/kumardesappan/caffe2onnx to /tmp/pip-co6ws7zn-build
    Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from -r ./requirements_pc.txt (line 1))
    Collecting pyyaml (from -r ./requirements_pc.txt (line 2))
      Using cached https://files.pythonhosted.org/packages/b3/85/79b9e5b4e8d3c0ac657f4e8617713cca8408f6cdc65d2ee6554217cedff1/PyYAML-6.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    Collecting protobuf==3.8.0 (from -r ./requirements_pc.txt (line 3))
      Using cached https://files.pythonhosted.org/packages/d2/fb/29de8d08967f0cce1bb10b39846d836b0f3bf6776ddc36aed7c73498ca7e/protobuf-3.8.0-cp36-cp36m-manylinux1_x86_64.whl
    Collecting onnx==1.9.0 (from -r ./requirements_pc.txt (line 4))
      Using cached https://files.pythonhosted.org/packages/73/e9/5b953497c0e36df589fc60cc6c6b35a65eb67d9ad1e45a9163663e43426e/onnx-1.9.0.tar.gz
        Complete output from command python setup.py egg_info:
        fatal: not a git repository (or any of the parent directories): .git
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/tmp/pip-build-9alp896m/onnx/setup.py", line 86, in <module>
            assert CMAKE, 'Could not find "cmake" executable!'
        AssertionError: Could not find "cmake" executable!
    
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-9alp896m/onnx/
    Installing python osrt packages...
    protobuf requires Python '>=3.7' but the running Python is 3.6.9
    skipping gcc-arm-9.2-2019.12-x86_64-aarch64-none-linux-gnu download: found /home/root/gcc-arm-9.2-2019.12-x86_64-aarch64-none-linux-gnu

    I have not been able to fix that. It is also odd that protobuf complains about the Python version, because I read that versions up to 3.19.4 should work with Python 3.6. As you can see from the snippet, I tried lowering the version to 3.8.0, with no success.

    The same happens with Docker on Ubuntu 18:

    root@cce65d232e79:/home/root# source ./setup.sh
    X64 Architecture
    Installing python packages...
    Collecting git+https://github.com/kumardesappan/caffe2onnx (from -r ./requirements_pc.txt (line 12))
      Cloning https://github.com/kumardesappan/caffe2onnx to /tmp/pip-71_rn3kt-build
    Collecting numpy (from -r ./requirements_pc.txt (line 1))
      Downloading https://files.pythonhosted.org/packages/45/b2/6c7545bb7a38754d63048c7696804a0d947328125d81bf12beaa692c3ae3/numpy-1.19.5-cp36-cp36m-manylinux1_x86_64.whl (13.4MB)
        100% |################################| 13.4MB 92kB/s
    Collecting pyyaml (from -r ./requirements_pc.txt (line 2))
      Downloading https://files.pythonhosted.org/packages/b3/85/79b9e5b4e8d3c0ac657f4e8617713cca8408f6cdc65d2ee6554217cedff1/PyYAML-6.0-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (603kB)
        100% |################################| 604kB 1.8MB/s
    Collecting protobuf==3.19.4 (from -r ./requirements_pc.txt (line 3))
      Downloading https://files.pythonhosted.org/packages/c6/1c/f18d97fc479b4fb6f72bbb0e41188575362e3bbd31014cf294ef0fdec8bf/protobuf-3.19.4-py2.py3-none-any.whl (162kB)
        100% |################################| 163kB 4.6MB/s
    Collecting onnx==1.9.0 (from -r ./requirements_pc.txt (line 4))
      Downloading https://files.pythonhosted.org/packages/73/e9/5b953497c0e36df589fc60cc6c6b35a65eb67d9ad1e45a9163663e43426e/onnx-1.9.0.tar.gz (9.8MB)
        100% |################################| 9.9MB 125kB/s
        Complete output from command python setup.py egg_info:
        fatal: not a git repository (or any of the parent directories): .git
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/tmp/pip-build-ovum_1rs/onnx/setup.py", line 86, in <module>
            assert CMAKE, 'Could not find "cmake" executable!'
        AssertionError: Could not find "cmake" executable!
    
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-ovum_1rs/onnx/
    Installing python osrt packages...
    protobuf requires Python '>=3.7' but the running Python is 3.6.9
    skipping gcc-arm-9.2-2019.12-x86_64-aarch64-none-linux-gnu download: found /home/root/gcc-arm-9.2-2019.12-x86_64-aarch64-none-linux-gnu
    Installing:onnxruntime
    ln: failed to create symbolic link 'libonnxruntime.so.1.7.0': File exists
    Installing:tflite_2.8
    Installing:opencv
    Installing:dlr
    root@cce65d232e79:/home/root#
    

    (On native Ubuntu I had managed to fix the linker problem with libonnxruntime.so: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1235227/sk-tda4vm-pc-setup-for-model-compilation-error-while-building).

    Basically, I cannot even complete the setup, let alone run the example scripts. Let me know if you have experienced something similar.
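
Both Docker logs above show the same two complaints: the onnx source build cannot find a "cmake" executable, and a later step reports that protobuf requires Python '>=3.7' while the container runs 3.6.9. A small pre-flight check run before setup.sh would surface both at once. This is a sketch under assumptions: the function name is hypothetical, and the default thresholds are simply the ones quoted in the log messages, not an official requirement list for edgeai-tidl-tools.

```python
import shutil
import sys


def preflight(min_python=(3, 7), tools=("cmake",)):
    """Return a list of human-readable problems; an empty list means OK.

    min_python: lowest acceptable (major, minor) interpreter version.
    tools: executables that must be discoverable on PATH.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        problems.append(
            "Python %d.%d found, but >= %d.%d is required"
            % (sys.version_info[0], sys.version_info[1],
               min_python[0], min_python[1])
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append('"%s" executable not found on PATH' % tool)
    return problems
```

Calling preflight() inside the container and printing the result shows whether cmake and a newer Python need to be installed before the pip step is retried.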

  • As per the posted logs:

    "Illegal instruction (core dumped)"

    edgeai-tidl-tools is built using libraries that require AVX/AVX2 support.

    Could you please confirm that your machine supports AVX?

    You can use the commands below:

    grep avx /proc/cpuinfo

    OR

    grep avx2 /proc/cpuinfo

    Regards,

    Pratik
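
Note that grep avx also matches lines that contain avx2, because it is a substring match, so the first command alone cannot tell the two apart. A minimal sketch of a whole-token check in Python follows; the function names are hypothetical, and it assumes the x86 Linux /proc/cpuinfo layout, where feature flags appear on a line starting with "flags".

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used; on a typical machine every
    core reports the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx_support(cpuinfo_text):
    """Report AVX and AVX2 support as whole-token matches."""
    flags = cpu_flags(cpuinfo_text)
    return {"avx": "avx" in flags, "avx2": "avx2" in flags}


def local_avx_support(path="/proc/cpuinfo"):
    """Convenience wrapper that reads the local cpuinfo file (Linux only)."""
    with open(path) as handle:
        return avx_support(handle.read())
```

On a Linux host, print(local_avx_support()) gives one answer for both flags; a result with avx true and avx2 false corresponds to the situation reported in the next reply.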

  •   

    I would say AVX is supported, but AVX2 is not, because the second command prints nothing. Could this be the reason?

  • Hi,

    From the above logs, it looks like AVX2 support is missing on the machine you are currently using.

    Running edgeai-tidl-tools on a machine that supports AVX2 will resolve this issue.

    Regards,

    Pratik

  • Okay, then that must be the reason, thanks.

    Do you know how I can fix the problems I ran into with the Docker setup on Windows? (I checked that the host laptop supports both AVX and AVX2, so that should not be the issue.)

  • Thank you

    Closing this thread.

    Regards,

    Pratik