
TDA4VM: ONNX runtime error while importing the onnx model

Part Number: TDA4VM


Hi Team, 

I am using the ONNX model attached below; please rename the file to pfld.onnx.

This is a face landmarks model taken from open source.

pfld.onnx.txt

The TI SDK version I am using is 8.1.

When I try to import the model using the TIDL-RT import tool, I get errors. Please check the attached log file for details.

In put of TIDL_InnerProductLayer layer needs to be Faltten. Please add Flatten layer to import this mdoels

 

paneesh@awsmblx404bs017:~/compute/middleware/ti-psdkra/tidl_j7/ti_dl/utils/tidlModelImport$ ./out/tidl_model_import.out /data/home/paneesh/compute/middleware/ti-psdkra/tidl_j7/ti_dl/test/testvecs/config/import/public/onnx/tidl_import_face_landmarks.txt
ONNX Model (Proto) File  : ../../test/testvecs/models/public/onnx/facelandmarks/pfld.onnx
TIDL Network File      : ../../test/testvecs/config/tidl_models/onnx/face_landmarks_net.bin
TIDL IO Info File      : ../../test/testvecs/config/tidl_models/onnx/tidl_io_face_landmarks__
Current ONNX OpSet Version   : 9
Could not find const or initializer of layer Reshape_87 !!!
Only float and INT64 tensor is suported
Could not find const or initializer of layer Reshape_90 !!!
Only float and INT64 tensor is suported
Running tidl_optimizeNet
Warning : Merging Pad layer with Average Pooling layer. This is expected to work but  this flow is functionally not validated with ONNX model format.
Warning : Merging Pad layer with Average Pooling layer. This is expected to work but  this flow is functionally not validated with ONNX model format.
printing Current net
    0|TIDL_DataLayer                |                                                  |input_1_original                                  |    0|    0|
    1|TIDL_BatchNormLayer           |input_1_original                                  |input_1                                           |    0|    1|
    2|TIDL_ConvolutionLayer         |input_1                                           |input.4                                           |    1|    2|
    3|TIDL_ReLULayer                |input.4                                           |onnx::Conv_264                                    |    2|    3|
    4|TIDL_ConvolutionLayer         |onnx::Conv_264                                    |input.12                                          |    3|    4|
    5|TIDL_ReLULayer                |input.12                                          |onnx::Conv_267                                    |    4|    5|
    6|TIDL_ConvolutionLayer         |onnx::Conv_267                                    |input.20                                          |    5|    6|
    7|TIDL_ReLULayer                |input.20                                          |onnx::Conv_270                                    |    6|    7|
    8|TIDL_ConvolutionLayer         |onnx::Conv_270                                    |input.28                                          |    7|    8|
    9|TIDL_ReLULayer                |input.28                                          |onnx::Conv_273                                    |    8|    9|
   10|TIDL_ConvolutionLayer         |onnx::Conv_273                                    |input.36                                          |    9|   10|
   11|TIDL_ConvolutionLayer         |input.36                                          |input.44                                          |   10|   11|
   12|TIDL_ReLULayer                |input.44                                          |onnx::Conv_278                                    |   11|   12|
   13|TIDL_ConvolutionLayer         |onnx::Conv_278                                    |input.52                                          |   12|   13|
   14|TIDL_ReLULayer                |input.52                                          |onnx::Conv_281                                    |   13|   14|
   15|TIDL_ConvolutionLayer         |onnx::Conv_281                                    |onnx::Add_431                                     |   14|   15|
   16|TIDL_EltWiseLayer             |input.36                                          |input.60                                          |   10|   16|
   17|TIDL_ConvolutionLayer         |input.60                                          |input.68                                          |   16|   17|
   18|TIDL_ReLULayer                |input.68                                          |onnx::Conv_287                                    |   17|   18|
   19|TIDL_ConvolutionLayer         |onnx::Conv_287                                    |input.76                                          |   18|   19|
   20|TIDL_ReLULayer                |input.76                                          |onnx::Conv_290                                    |   19|   20|
   21|TIDL_ConvolutionLayer         |onnx::Conv_290                                    |onnx::Add_440                                     |   20|   21|
   22|TIDL_EltWiseLayer             |input.60                                          |input.84                                          |   16|   22|
   23|TIDL_ConvolutionLayer         |input.84                                          |input.92                                          |   22|   23|
   24|TIDL_ReLULayer                |input.92                                          |onnx::Conv_296                                    |   23|   24|
   25|TIDL_ConvolutionLayer         |onnx::Conv_296                                    |input.100                                         |   24|   25|
   26|TIDL_ReLULayer                |input.100                                         |onnx::Conv_299                                    |   25|   26|
   27|TIDL_ConvolutionLayer         |onnx::Conv_299                                    |onnx::Add_449                                     |   26|   27|
   28|TIDL_EltWiseLayer             |input.84                                          |input.108                                         |   22|   28|
   29|TIDL_ConvolutionLayer         |input.108                                         |input.116                                         |   28|   29|
   30|TIDL_ReLULayer                |input.116                                         |onnx::Conv_305                                    |   29|   30|
   31|TIDL_ConvolutionLayer         |onnx::Conv_305                                    |input.124                                         |   30|   31|
   32|TIDL_ReLULayer                |input.124                                         |onnx::Conv_308                                    |   31|   32|
   33|TIDL_ConvolutionLayer         |onnx::Conv_308                                    |onnx::Add_458                                     |   32|   33|
   34|TIDL_EltWiseLayer             |input.108                                         |output_1                                          |   28|   34|
   35|TIDL_ConvolutionLayer         |output_1                                          |input.140                                         |   34|   35|
   36|TIDL_ReLULayer                |input.140                                         |onnx::Conv_314                                    |   35|   36|
   37|TIDL_ConvolutionLayer         |onnx::Conv_314                                    |input.148                                         |   36|   37|
   38|TIDL_ReLULayer                |input.148                                         |onnx::Conv_317                                    |   37|   38|
   39|TIDL_ConvolutionLayer         |onnx::Conv_317                                    |input.156                                         |   38|   39|
   40|TIDL_ConvolutionLayer         |input.156                                         |input.164                                         |   39|   40|
   41|TIDL_ReLULayer                |input.164                                         |onnx::Conv_322                                    |   40|   41|
   42|TIDL_ConvolutionLayer         |onnx::Conv_322                                    |input.172                                         |   41|   42|
   43|TIDL_ReLULayer                |input.172                                         |onnx::Conv_325                                    |   42|   43|
   44|TIDL_ConvolutionLayer         |onnx::Conv_325                                    |input.180                                         |   43|   44|
   45|TIDL_ConvolutionLayer         |input.180                                         |input.188                                         |   44|   45|
   46|TIDL_ReLULayer                |input.188                                         |onnx::Conv_330                                    |   45|   46|
   47|TIDL_ConvolutionLayer         |onnx::Conv_330                                    |input.196                                         |   46|   47|
   48|TIDL_ReLULayer                |input.196                                         |onnx::Conv_333                                    |   47|   48|
   49|TIDL_ConvolutionLayer         |onnx::Conv_333                                    |onnx::Add_485                                     |   48|   49|
   50|TIDL_EltWiseLayer             |input.180                                         |input.204                                         |   44|   50|
   51|TIDL_ConvolutionLayer         |input.204                                         |input.212                                         |   50|   51|
   52|TIDL_ReLULayer                |input.212                                         |onnx::Conv_339                                    |   51|   52|
   53|TIDL_ConvolutionLayer         |onnx::Conv_339                                    |input.220                                         |   52|   53|
   54|TIDL_ReLULayer                |input.220                                         |onnx::Conv_342                                    |   53|   54|
   55|TIDL_ConvolutionLayer         |onnx::Conv_342                                    |onnx::Add_494                                     |   54|   55|
   56|TIDL_EltWiseLayer             |input.204                                         |input.228                                         |   50|   56|
   57|TIDL_ConvolutionLayer         |input.228                                         |input.236                                         |   56|   57|
   58|TIDL_ReLULayer                |input.236                                         |onnx::Conv_348                                    |   57|   58|
   59|TIDL_ConvolutionLayer         |onnx::Conv_348                                    |input.244                                         |   58|   59|
   60|TIDL_ReLULayer                |input.244                                         |onnx::Conv_351                                    |   59|   60|
   61|TIDL_ConvolutionLayer         |onnx::Conv_351                                    |onnx::Add_503                                     |   60|   61|
   62|TIDL_EltWiseLayer             |input.228                                         |input.252                                         |   56|   62|
   63|TIDL_ConvolutionLayer         |input.252                                         |input.260                                         |   62|   63|
   64|TIDL_ReLULayer                |input.260                                         |onnx::Conv_357                                    |   63|   64|
   65|TIDL_ConvolutionLayer         |onnx::Conv_357                                    |input.268                                         |   64|   65|
   66|TIDL_ReLULayer                |input.268                                         |onnx::Conv_360                                    |   65|   66|
   67|TIDL_ConvolutionLayer         |onnx::Conv_360                                    |onnx::Add_512                                     |   66|   67|
   68|TIDL_EltWiseLayer             |input.252                                         |input.276                                         |   62|   68|
   69|TIDL_ConvolutionLayer         |input.276                                         |input.284                                         |   68|   69|
   70|TIDL_ReLULayer                |input.284                                         |onnx::Conv_366                                    |   69|   70|
   71|TIDL_ConvolutionLayer         |onnx::Conv_366                                    |input.292                                         |   70|   71|
   72|TIDL_ReLULayer                |input.292                                         |onnx::Conv_369                                    |   71|   72|
   73|TIDL_ConvolutionLayer         |onnx::Conv_369                                    |onnx::Add_521                                     |   72|   73|
   74|TIDL_EltWiseLayer             |input.276                                         |input.300                                         |   68|   74|
   75|TIDL_ConvolutionLayer         |input.300                                         |input.308                                         |   74|   75|
   76|TIDL_ReLULayer                |input.308                                         |onnx::Conv_375                                    |   75|   76|
   77|TIDL_ConvolutionLayer         |onnx::Conv_375                                    |input.316                                         |   76|   77|
   78|TIDL_ReLULayer                |input.316                                         |onnx::Conv_378                                    |   77|   78|
   79|TIDL_ConvolutionLayer         |onnx::Conv_378                                    |input.324                                         |   78|   79|
   80|TIDL_ConvolutionLayer         |input.324                                         |input.332                                         |   79|   80|
   81|TIDL_ReLULayer                |input.332                                         |onnx::Pad_391                                     |   80|   81|
   82|TIDL_PoolingLayer             |input.324                                         |onnx::Reshape_382                                 |   79|   82|
   83|TIDL_ConvolutionLayer         |onnx::Pad_391                                     |input.336                                         |   81|   83|
   84|TIDL_ReshapeLayer             |onnx::Reshape_382                                 |onnx::Concat_388                                  |   82|   84|
   85|TIDL_PoolingLayer             |onnx::Pad_391                                     |onnx::Reshape_393                                 |   81|   85|
   86|TIDL_ReLULayer                |input.336                                         |onnx::Reshape_401                                 |   83|   86|
   87|TIDL_ReshapeLayer             |onnx::Reshape_393                                 |onnx::Concat_399                                  |   85|   87|
   88|TIDL_ReshapeLayer             |onnx::Reshape_401                                 |onnx::Concat_407                                  |   86|   88|
   89|TIDL_ConcatLayer              |onnx::Concat_388                                  |onnx::Gemm_408                                    |   84|   89|
   90|TIDL_InnerProductLayer        |onnx::Gemm_408                                    |409                                               |   89|   90|
   91|TIDL_DataLayer                |409                                               |409                                               |   90|    0|
WARNING: Inner Product Layer Gemm_92's coeff cannot be found(or not match) in coef file, Random coeff will be generated! Only for evaluation usage! Results are all random!
WARNING: Inner Product Layer Gemm_92's coeff cannot be found(or not match) in coef file, Random coeff will be generated! Only for evaluation usage! Results are all random!
In put of TIDL_InnerProductLayer layer needs to be Faltten. Please add Flatten layer to import this mdoels
paneesh@awsmblx404bs017:~/compute/middleware/ti-psdkra/tidl_j7/ti_dl/utils/tidlModelImport$

After this, I tried to import the ONNX model using the TI Edge AI Cloud platform, and I get the following error while creating the ONNX Runtime inference session.

---------------------------------------------------------------------------
RuntimeException                          Traceback (most recent call last)
<ipython-input-10-c8ebf5693570> in <module>
      1 so = rt.SessionOptions()
      2 EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
----> 3 sess = rt.InferenceSession(onnx_model_path ,providers=EP_list, provider_options=[compile_options, {}], sess_options=so)
      4 
      5 input_details = sess.get_inputs()

/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py in __init__(self, path_or_bytes, sess_options, providers, provider_options)
    281 
    282         try:
--> 283             self._create_inference_session(providers, provider_options)
    284         except RuntimeError:
    285             if self._enable_fallback:

/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py in _create_inference_session(self, providers, provider_options)
    313 
    314         # initialize the C++ InferenceSession
--> 315         sess.initialize_session(providers, provider_options)
    316 
    317         self._sess = sess

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: basic_string::_M_create


After the above error, I tried adding some layers to the deny list based on the log:
'deny_list' : "MaxPool, Pad, Gemm"
With this I am able to generate the binary files, but inference fails. Please check the error message below.

2022-09-20 08:59:53.264647513 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Gemm node. Name:'Gemm_94'
Status Message: /home/a0133185/ti/GIT_cloud_build_ta/cloud_build_ta/test/onnxruntime/onnxruntime/core/providers/cpu/math/gemm_helper.h:13
onnxruntime::GemmHelper::GemmHelper(const onnxruntime::TensorShape&, bool, const onnxruntime::TensorShape&, bool, const onnxruntime::TensorShape&)
left.NumDimensions() == 2 || left.NumDimensions() == 1 was false.
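This check fails because Gemm expects a 1-D or 2-D left operand, while the tensor reaching it here is still 4-D once the flattening ops run outside TIDL. A small numpy sketch of the shape requirement, with illustrative dimensions:

```python
import numpy as np

feat = np.zeros((1, 8, 2, 2), dtype=np.float32)  # 4-D feature map reaching the Gemm
W = np.zeros((32, 4), dtype=np.float32)          # Gemm weights, shape (K, N)

# A 4-D left operand fails Gemm's shape check; the equivalent numpy
# matmul raises ValueError for these shapes for the same reason.
try:
    np.matmul(feat, W)
except ValueError:
    pass

# Flattening to (N, C*H*W) first makes the multiply well-defined:
flat = feat.reshape(feat.shape[0], -1)  # (1, 32)
out = flat @ W                          # (1, 4)
print(flat.shape, out.shape)
```

Flattening (or a Reshape to 2-D) ahead of the Gemm is what both the TIDL import error and this runtime check are asking for.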


Could you please help me resolve this issue and port the model to the TDA4?

Thanks and Regards,

Aneesh

  • Hi Team, 

    Could you please help me with this issue?

    Thanks and Regards,

    Aneesh

  • Hi Aneesh,

Can you please share the Python script you are using to run the model through Edge AI?

    Regards,

    Anand

  • Hi Anand,

    Thank you for your reply.

    Please find the attached python script below.

    #!/usr/bin/env python
    # coding: utf-8
    
    # # Custom Model Compilation and Inference using Onnx runtime 
    # 
    # In this example notebook, we describe how to take a pre-trained classification model and compile it using ***Onnx runtime*** to generate deployable artifacts that can be deployed on the target using the ***Onnx*** interface. 
    #  
    #  - Pre-trained model: `resnet18v2` model trained on ***ImageNet*** dataset using ***Onnx***  
    #  
    # In particular, we will show how to
    # - compile the model (during heterogeneous model compilation, layers that are supported will be offloaded to the `TI-DSP` and artifacts needed for inference are generated)
    # - use the generated artifacts for inference
    # - perform input preprocessing and output postprocessing
    # - enable debug logs
    # - use deny-layer compilation option to isolate possible problematic layers and create additional model subgraphs
    # - use the generated subgraphs artifacts for inference
    # - perform input preprocessing and output postprocessing
    #     
    # ## Onnx Runtime based work flow
    # 
    # The diagram below describes the steps for Onnx Runtime based work flow. 
    # 
    # Note:
    #  - The user needs to compile models(sub-graph creation and quantization) on a PC to generate model artifacts.
    #  - The generated artifacts can then be used to run inference on the target.
    # 
    # <img src=docs/images/onnx_work_flow_2.png width="400">
    
    # In[1]:
    
    
    import os
    import tqdm
    import cv2
    import numpy as np
    import onnxruntime as rt
    import shutil
    from scripts.utils import imagenet_class_to_name, download_model
    import matplotlib.pyplot as plt
    from pathlib import Path
    from IPython.display import Markdown as md
    from scripts.utils import loggerWritter
    from scripts.utils import get_svg_path
    import onnx
    
    
    # ## Define utility function to preprocess input images
    # Below, we define a utility function to preprocess images for `resnet18v2`. This function takes a path as input, loads the image and preprocesses it for generic ***Onnx*** inference. The steps are as follows: 
    # 
    #  1. load image
    #  2. convert BGR image to RGB
    #  3. scale image so that the short edge is 112 pixels
    #  4. center-crop image to 112x112 pixels
    #  5. apply per-channel pixel scaling and mean subtraction
    # 
    # 
    # - Note: If you are using a custom model or a model that was trained using a different framework, please remember to define your own utility function.
    
    # In[2]:
    
    
    def preprocess_for_onnx_resent18v2(image_path):
        
        # read the image using openCV
        img = cv2.imread(image_path)
        
        # convert to RGB
        img = img[:,:,::-1]
        
        # This model expects 112x112 input images.
        # The general rule of thumb is to scale
        # the input image while preserving the
        # original aspect ratio so that the
        # short edge is 112 pixels, and then
        # center-crop the scaled image to 112x112
        orig_height, orig_width, _ = img.shape
        short_edge = min(img.shape[:2])
        new_height = (orig_height * 112) // short_edge
        new_width = (orig_width * 112) // short_edge
        img = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_CUBIC)
    
        startx = new_width//2 - (112//2)
        starty = new_height//2 - (112//2)
        img = img[starty:starty+112,startx:startx+112]
        
        # apply scaling and mean subtraction.
        # if your model is built with an input
        # normalization layer, then you might
        # need to skip this
        img = img.astype('float32')
        for mean, scale, ch in zip([128, 128, 128], [0.0078125, 0.0078125, 0.0078125], range(img.shape[2])):
                img[:,:,ch] = ((img.astype('float32')[:,:,ch] - mean) * scale)
        img = np.expand_dims(img,axis=0)
        img = np.transpose(img, (0, 3, 1, 2))
        
        return img
    
    
    # ## Compile the model
    # In this step, we create Onnx runtime with `tidl_model_import_onnx` library to generate artifacts that offload supported portion of the DL model to the TI DSP.
    #  - `sess` is created with the options below to calibrate the model for 8-bit fixed point inference
    #    
    #     * **artifacts_folder** - folder where all the compilation artifacts needed for inference are stored 
    #     * **tidl_tools_path** - os.getenv('TIDL_TOOLS_PATH'), path to `TIDL` compilation tools 
    #     * **tensor_bits** - 8 or 16, is the number of bits to be used for quantization 
    #     * **advanced_options:calibration_frames**  - number of images to be used for calibration
    #      
    #     ``` 
    #     compile_options = {
    #         'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
    #         'artifacts_folder' : output_dir,
    #         'tensor_bits' : 16,
    #         'accuracy_level' : 0,
    #         'advanced_options:calibration_frames' : len(calib_images), 
    #         'advanced_options:calibration_iterations' : 3 # used if accuracy_level = 1
    #     }
    #     ``` 
    #     
    # - Note: The path to `TIDL` compilation tools and `aarch64` `GCC` compiler is required for model compilation, both of which are accessed by this notebook using predefined environment variables `TIDL_TOOLS_PATH` and `ARM64_GCC_PATH`. The example usage of both the variables is demonstrated in the cell below. 
    # - `accuracy_level` is set to 0 in this example. For better accuracy, set `accuracy_level = 1`. This option results in more time for compilation but better inference accuracy. 
    # Compilation status log for accuracy_level = 1 is currently not implemented in this notebook. This will be added in future versions. 
    # - Please refer to TIDL user guide for further advanced options.
    
    # In[3]:
    
    
    
    # calib_images = [
    # 'sample-images/elephant.bmp',
    # 'sample-images/bus.bmp',
    # 'sample-images/bicycle.bmp',
    # 'sample-images/zebra.bmp',
    # ]
    calib_images = [
    'sample-images/1.jpg',
    'sample-images/2.jpg',
    ]
    
    output_dir = 'face_landmarks/onnx'
    onnx_model_path = 'pfld.onnx'
    download_model(onnx_model_path)
    onnx.shape_inference.infer_shapes_path(onnx_model_path, onnx_model_path)
    
    
    # ### Compilation knobs  (optional - In case of debugging accuracy)
    # If model accuracy at 8 bits is not good, users can try compiling the same model at 16 bits with accuracy level 1. This reduces performance, but gives users a good accuracy bar.
    # As a second step, users can try to improve 8-bit accuracy by increasing the number of calibration frames and iterations, to get closer to the 16-bit, accuracy-level-1 results.
    
    # In[4]:
    
    
    #compilation options - knobs to tweak 
    num_bits = 8
    accuracy = 1
    
    
    # ### Layers debug (optional - In case of debugging)
    # Debug_level 3 gives layer information and warnings/errors which can be useful during debug. Users can capture compilation logs by passing a path to the "loggerWritter" helper function.
    # 
    # Another technique is to use deny_list to exclude layers from running on TIDL and create additional subgraphs, in order to isolate issues.
    
    # In[5]:
    
    
    from scripts.utils import loggerWritter
    
    # ensure the log directory exists (Path.mkdir returns None, so there is nothing useful to assign)
    Path("logs").mkdir(parents=True, exist_ok=True)
    
    # stdout and stderr saved to a *.log file.  
    with loggerWritter("logs/custon-model-onnx"):
        
        # model compilation options
        compile_options = {
            'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
            'artifacts_folder' : output_dir,
            'tensor_bits' : num_bits,
            'accuracy_level' : accuracy,
            'advanced_options:calibration_frames' : len(calib_images), 
            'advanced_options:calibration_iterations' : 3, # used if accuracy_level = 1
            'debug_level' : 3,
            'deny_list' : "MaxPool, Pad, Gemm" #Comma separated string of operator types as defined by ONNX runtime, ex "MaxPool, Concat"
        }
    
    
    # In[6]:
    
    
    # create the output dir if not present
    # clear the directory
    os.makedirs(output_dir, exist_ok=True)
    for root, dirs, files in os.walk(output_dir, topdown=False):
        [os.remove(os.path.join(root, f)) for f in files]
        [os.rmdir(os.path.join(root, d)) for d in dirs]
    
    
    # In[7]:
    
    
    so = rt.SessionOptions()
    EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
    sess = rt.InferenceSession(onnx_model_path, providers=EP_list, provider_options=[compile_options, {}], sess_options=so)
    
    input_details = sess.get_inputs()
    
    for num in tqdm.trange(len(calib_images)):
        output = list(sess.run(None, {input_details[0].name : preprocess_for_onnx_resent18v2(calib_images[num])}))[0]
    
    
    # ### Subgraphs visualization  (optional - in case of debugging models and subgraphs)
    # Running the cell below gives links to the complete graph and TIDL subgraph visualizations. This, along with the "deny_list" feature explained above, offers tools for checking and isolating issues in NN model layers.
    
    # In[ ]:
    
    
    subgraph_link = get_svg_path(output_dir)
    for sg in subgraph_link:
        hl_text = os.path.join(*Path(sg).parts[4:])
        sg_rel = os.path.join('../', sg)
        display(md("[{}]({})".format(hl_text,sg_rel)))
    
    
    # ## Use compiled model for inference
    # Then using ***Onnx*** with the ***`libtidl_onnxrt_EP`*** inference library we run the model and collect benchmark data.
    
    # In[ ]:
    
    
    EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
    
    sess = rt.InferenceSession(onnx_model_path, providers=EP_list, provider_options=[compile_options, {}], sess_options=so)
    # Running inference several times to get a stable performance output
    for i in range(5):
        output = list(sess.run(None, {input_details[0].name : preprocess_for_onnx_resent18v2('sample-images/1.bmp')}))
    
    for idx, cls in enumerate(output[0].squeeze().argsort()[-5:][::-1]):
        print('[%d] %s' % (idx, '/'.join(imagenet_class_to_name(cls))))
        
    from scripts.utils import plot_TI_performance_data, plot_TI_DDRBW_data, get_benchmark_output
    stats = sess.get_TI_benchmark_data()
    fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(10,5))
    plot_TI_performance_data(stats, axis=ax)
    plt.show()
    
    tt, st, rb, wb = get_benchmark_output(stats)
    print(f'Statistics : \n Inferences Per Second   : {1000.0/tt :7.2f} fps')
    print(f' Inference Time Per Image : {tt :7.2f} ms  \n DDR BW Per Image        : {rb+ wb : 7.2f} MB')
    
    
    # ## Saving custom artifacts in user's workspace (optional)
    # The custom-artifacts-temp folder is deleted after logging out of TI EdgeAI. Users can copy artifacts into their workspace by running the cell below.
    
    # In[ ]:
    
    
    root_src_dir = output_dir
    root_dst_dir = 'custom-artifacts/onnx/resnet18_opset9.onnx'
    
    for src_dir, dirs, files in os.walk(root_src_dir):
        dst_dir = src_dir.replace(root_src_dir, root_dst_dir, 1)
        if not os.path.exists(dst_dir):
            os.makedirs(dst_dir)
        for file_ in files:
            src_file = os.path.join(src_dir, file_)
            dst_file = os.path.join(dst_dir, file_)
            if os.path.exists(dst_file):
                os.remove(dst_file)
            shutil.copy(src_file, dst_dir)
    
    
    # ## EVM's console logs (optional - in case of inference failure)
    # 
    # To copy console logs from the EVM to the TI EdgeAI Cloud user's workspace, go to "Help -> Troubleshooting -> EVM console log" on TI's EdgeAI Cloud landing page.
    # 
    # Alternatively, from the workspace, open/run evm-console-log.ipynb
    

    Please rename the file from custom-model-onnx.py.txt to custom-model-onnx.py.

    Thanks and Regards,

    Aneesh

  • Hi Anand, 

    Any update on this?

    Thanks and Regards,
    Aneesh

  • Hi Aneesh,

    I am able to reproduce your issue and am looking into it. I was trying to check whether the model works in the best-case (safest) scenario: set deny_list to "Pad, AveragePool, Reshape, Concat, Gemm", i.e. have only Conv and Add layers in the TIDL delegated subgraph. However, I observe a hang in compilation with this scenario as well, so I suspect some other issue here. I am checking this further and will keep you posted.

    Regards,

    Anand

  • Hi Aneesh,

    I suspect the output coming from the Add layer is causing the issue, possibly in the hand-off from TIDL runtime to ONNX runtime, since after delegation one output of the network comes from TIDL and the other from the ARM. Is it possible for you to try re-exporting the model without that output and compiling with "Pad, AveragePool, Reshape, Concat, Gemm" in the deny_list? That would help narrow down the issue. I checked from the TIDL delegation point of view and could not spot any obvious issue.

    Regards,

    Anand

  • Hi Anand, I did not understand the experiment you asked me to do. Could you please elaborate a little more on that?

    Regards,

    Aneesh

  • Hi Aneesh,

    Sure. Is it possible to export the model without this particular output, "output_1", shown in the snapshot below?

    I think the hang is caused due to the presence of this output, just want to confirm my suspicion.

    Regards,

    Anand