AM69A: ONNXRuntime error: Expected shape from model of {1,2,5,5} does not match actual shape of {1,1,1,2,5,5} for output

Konstantin Zaytsev

Part Number: AM69A

I have created a simple model with a single convolution layer that has two kernels, one input channel, and two output channels:

import onnx
import onnx.helper as helper
import onnx.numpy_helper as numpy_helper
import numpy as np
 
# Create a tensor for the weights of the convolutional layer
weights = np.random.randn(2, 1, 9, 1).astype(np.float32)
weights_tensor = numpy_helper.from_array(weights, name='conv1_weight')

# Create the input tensor value info
input_tensor = helper.make_tensor_value_info('input', onnx.TensorProto.FLOAT, [1, 1, 5, 5])

# Create the output tensor value info
output_tensor = helper.make_tensor_value_info('output', onnx.TensorProto.FLOAT, [1, 2, 5, 5])

# Create the node (layer)
conv_node = helper.make_node(
    'Conv',
    inputs=['input', 'conv1_weight'],
    outputs=['output'],
    kernel_shape=[9, 1],
    pads=[4, 0, 4, 0],
    strides=[1, 1]
)

# Create the graph
graph = helper.make_graph(
    [conv_node],
    'conv_graph',
    [input_tensor],
    [output_tensor],
    [weights_tensor]
)
 
# Create the model
model = helper.make_model(graph, producer_name='onnx-example')
 
# Save the model to a file
onnx.save(model, 'single_conv_layer_2_kernels.onnx')
 
print("ONNX model with one convolution layer producing 2 output channels from 1 input channel has been created and saved as 'single_conv_layer_2_kernels.onnx'")
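As a sanity check on the declared output shape, the arithmetic for this layer can be sketched as follows. This is only an illustration: the helper `conv_output_size` is a hypothetical name, not part of onnx, and it applies the standard convolution size formula to the values used in the script above.

```python
def conv_output_size(in_size, kernel, pad_begin, pad_end, stride=1, dilation=1):
    """Spatial output size of a convolution along one axis (floor mode)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_begin + pad_end - effective_kernel) // stride + 1

# Values taken from the script above: input 5x5, kernel 9x1, pads=[4, 0, 4, 0].
# ONNX orders pads as [h_begin, w_begin, h_end, w_end].
out_h = conv_output_size(5, kernel=9, pad_begin=4, pad_end=4)
out_w = conv_output_size(5, kernel=1, pad_begin=0, pad_end=0)

# batch, output channels (two kernels), height, width
output_shape = (1, 2, out_h, out_w)
print(output_shape)  # prints (1, 2, 5, 5)
```

The result agrees with the [1, 2, 5, 5] shape declared for the output tensor, so the 4-D shape in the model itself is consistent.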

During model compilation with C7x offload, I received the following message from ONNXRuntime:

[W:onnxruntime:, execution_frame.cc:835 VerifyOutputSizes] Expected shape from model of {1,2,5,5} does not match actual shape of {1,1,1,2,5,5} for output output

At the same time, there is no ONNXRuntime error when I do inference without c7x offload (ARM only mode). Here is the Python code used for model compilation:

import onnxruntime as rt
import numpy as np

##########################################
###          Script parameters         ###
##########################################

model_path = 'single_conv_layer_2_kernels.onnx'
# EP_list = ['CPUExecutionProvider']
EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
# EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
options = {}
options['artifacts_folder'] = './model-artifacts-dir/'
options['tidl_tools_path'] = '/home/root/tidl_tools'
options['debug_level'] = 1

##########################################

so = rt.SessionOptions()
so.graph_optimization_level = rt.GraphOptimizationLevel.ORT_DISABLE_ALL

# Load the ONNX model
# session = rt.InferenceSession(model_path, providers=EP_list, sess_options=so)
session = rt.InferenceSession(model_path, providers=EP_list, provider_options=[options, {}], sess_options=so)

# Create a random input tensor with the same shape as the input tensor defined in the model
input_data = np.random.rand(1, 1, 5, 5).astype(np.float32)

# Run the model
outputs = session.run(None, {'input': input_data})

# Print the output
print("Model output:", outputs[0])

You can find single_conv_layer_2_kernels.onnx file in the archive:

single_conv_layer_2_kernels.zip

Below is the complete console log for model compilation with debug_level=3:

tidl_tools_path                                 = /home/root/tidl_tools
artifacts_folder                                = ./model-artifacts-dir/
tidl_tensor_bits                                = 8
debug_level                                     = 3
num_tidl_subgraphs                              = 16
tidl_denylist                                   =
tidl_denylist_layer_name                        =
tidl_denylist_layer_type                        =
tidl_allowlist_layer_name                       =
model_type                                      =
tidl_calibration_accuracy_level                 = 7
tidl_calibration_options:num_frames_calibration = 20
tidl_calibration_options:bias_calibration_iterations = 50
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization                         = 2
ONNX QDQ Enabled                                = 0
enable_high_resolution_optimization             = 0
pre_batchnorm_fold                              = 1
add_data_convert_ops                            = 0
output_feature_16bit_names_list                 =
m_params_16bit_names_list                       =
m_single_core_layers_names_list                 =
Inference mode                                  = 0
Number of cores                                 = 1
reserved_compile_constraints_flag               = 1601
partial_init_during_compile                     = 0
ti_internal_reserved_1                          =

========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              10_00_04_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              10_00_02_00             |
-------------------------------------------------------------------------------
|            Runtime Version           |            1.14.0+10000005           |
-------------------------------------------------------------------------------
|          Model Opset Version         |                  18                  |
-------------------------------------------------------------------------------

NOTE: The runtime version here specifies ONNXRT_VERSION+TIDL_VERSION
Ex: 1.14.0+1000XXXX -> ONNXRT 1.14.0 and a TIDL_VERSION 10.00.XX.XX

============================== [Parsing Started] ==============================

[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import]  WARNING: Parameters - Kernel 9x1, Stride 1x1, dilation 1x1, Pad 4x0, Bias 0 in [output] has gone through limited verification
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - Conv,  Node name -  -- [tidl_onnxRtImport_core.cpp, 524]

------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
|          Core           |      No. of Nodes       |   Number of Subgraphs   |
-------------------------------------------------------------------------------
| C7x                     |                       1 |                       1 |
| CPU                     |                       0 |                       x |
-------------------------------------------------------------------------------
Running Runtimes GraphViz - /home/root/tidl_tools/tidl_graphVisualiser_runtimes.out ./model-artifacts-dir//allowedNode.txt ./model-artifacts-dir//tempDir/graphvizInfo.txt ./model-artifacts-dir//tempDir/runtimes_visualization.svg
============================= [Parsing Completed] =============================

TIDL_createStateImportFunc Started:
Compute on node : TIDLExecutionProvider_TIDL_0_0
  0,            Conv, 2, 1, input, output

Input tensor name -  input
Output tensor name - output
In TIDL_onnxRtImportInit subgraph_name=subgraph_0
Layer 0, subgraph id subgraph_0, name=output
Layer 1, subgraph id subgraph_0, name=input
==================== [Optimization for subgraph_0 Started] ====================

In TIDL_runtimesOptimizeNet: LayerIndex = 3, dataIndex = 2
----------------------------- Optimization Summary -----------------------------
--------------------------------------------------------------------------------
|         Layer         | Nodes before optimization | Nodes after optimization |
--------------------------------------------------------------------------------
| TIDL_ConvolutionLayer |                         1 |                        1 |
--------------------------------------------------------------------------------

=================== [Optimization for subgraph_0 Completed] ===================

In TIDL_runtimesPostProcessNet
[TIDL Import]  WARNING: Parameters - Kernel 9x1, Stride 1x1, dilation 1x1, Pad 4x0, Bias 0 in [] has gone through limited verification
************ in TIDL_subgraphRtCreate ************
 The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
 0.0s:  VX_ZONE_INIT:Enabled
 0.2s:  VX_ZONE_ERROR:Enabled
 0.2s:  VX_ZONE_WARNING:Enabled
 0.4412s:  VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
 ============= [Quantization & Calibration for subgraph_0 Started] =============

2024-09-16 11:07:17.981512612 [W:onnxruntime:, execution_frame.cc:835 VerifyOutputSizes] Expected shape from model of {1,2,5,5} does not match actual shape of {1,1,1,2,5,5} for output output
*******   In TIDL_subgraphRtInvoke  ********
   0         1.00000         0.00103         0.99637 6
   1         1.00000        -1.20857         3.14400 6
 Layer,   Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger,    paddingWait,LayerWithoutPad,LayerHandleCopy,   BackupCycles,  RestoreCycles,Multic7xContextCopyCycles,
     1,              0,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              0,              0,              0,              0,              0,              0,
 Sum of Layer Cycles 0
Sub Graph Stats 11.000000 2203.000000 90.000000
*******  TIDL_subgraphRtInvoke done  ********
Model output: [[[[[[ 0.00820085  0.00886303 -0.54941744 -0.16496155 -1.1038206 ]
     [-0.18181708 -0.9722068  -0.7312772  -1.2085706  -0.19888027]
     [ 0.14048597  0.6771251   0.319242   -0.80483294  0.07669117]
     [-0.42998666  0.09482104  0.42958534  0.17192125  0.00922517]
     [ 0.10692582 -0.13820928  0.27003202  1.1216784   0.22250456]]

    [[ 1.5228472  -0.5873079  -0.3525843  -0.3002947   1.2096493 ]
     [ 1.1365042   3.144003    2.0103579   0.6627136   2.2352157 ]
     [ 0.44892883  1.3533359   1.2602484   1.9661076   0.32214844]
     [-0.6333883  -1.034955   -0.25269186  1.7660313   0.21569622]
     [ 1.2422765  -0.08347851 -0.7523635   0.23331156 -0.19755787]]]]]]
************ in TIDL_subgraphRtDelete ************
 MEM: Deinit ... !!!
MEM: Alloc's: 27 alloc's of 115313733 bytes
MEM: Free's : 27 free's  of 115313733 bytes
MEM: Open's : 0 allocs  of 0 bytes
MEM: Deinit ... Done !!!

over 1 year ago

0 Pratik Kedar over 1 year ago

TI__Mastermind 24041 points

Hi,

We have assigned this thread to our analytics expert. Please expect a response from them.

Thanks

0 Asha Bhandarkar over 1 year ago in reply to Pratik Kedar

TI__Genius 10170 points

Hi Konstantin,

Internally, TIDL works with a 6-dimensional tensor layout (1x1xNCHW), which is why that ONNX Runtime warning appears. Otherwise, your compilation logs look fine. Can you clarify whether you are having any issue with inference?
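If the 6-dimensional output is inconvenient on the application side, the two extra leading dimensions can be dropped after inference. The following is a minimal sketch, assuming the extra dimensions are always leading singleton dimensions as in the 1x1xNCHW layout described above; `strip_leading_ones` is a hypothetical helper, not a TIDL or ONNX Runtime function.

```python
def strip_leading_ones(shape, target_rank=4):
    """Drop leading size-1 dimensions so the shape has `target_rank` dims.

    Raises ValueError if the shape is too short or if any dimension that
    would be dropped is not 1, since dropping it would lose data.
    """
    shape = tuple(shape)
    extra = len(shape) - target_rank
    if extra < 0 or any(dim != 1 for dim in shape[:extra]):
        raise ValueError(f"cannot reduce shape {shape} to rank {target_rank}")
    return shape[extra:]

# Shape reported in the warning for the TIDL output.
print(strip_leading_ones((1, 1, 1, 2, 5, 5)))  # prints (1, 2, 5, 5)
```

With a NumPy array `out` returned by `session.run`, `out.reshape(strip_leading_ones(out.shape))` would then give the 4-D NCHW tensor that the model declares.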

Best,

Asha

0 Konstantin Zaytsev over 1 year ago in reply to Asha Bhandarkar

Prodigy 50 points

Hi Asha,

Yes, inference does not work after compilation and exits with an error. Here is the full inference log:

libtidl_onnxrt_EP loaded 0x5621dfd9c190
artifacts_folder                                = ./model-artifacts-dir/
debug_level                                     = 3
target_priority                                 = 0
max_pre_empt_delay                              = 340282346638528859811704183484516925440.000000
Final number of subgraphs created are : 1, - Offloaded Nodes - 1, Total Nodes - 1
In TIDL_createStateInfer
Compute on node : TIDLExecutionProvider_TIDL_0_0
************ in TIDL_subgraphRtCreate ************
 Invoke  : ERROR: Unable to open network file ./model-artifacts-dir//subgraph_0_tidl_net.bin
************ in TIDL_subgraphRtDelete ************
 Traceback (most recent call last):
  File "/home/root/custom_models/derivative_model/py/infer_onnx_convolution_test_2_kernels.py", line 24, in <module>
    session = rt.InferenceSession(model_path, providers=EP_list, provider_options=[options, {}], sess_options=so)
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 362, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 410, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Create state function failed. Return value:-1

And here is what the artifacts directory looks like after compilation:

root@1ae02087d69a:/home/root/custom_models/model-artifacts-dir# ls
allowedNode.txt  onnxrtMetaData.txt  tempDir
root@1ae02087d69a:/home/root/custom_models/model-artifacts-dir# cd tempDir/
root@1ae02087d69a:/home/root/custom_models/model-artifacts-dir/tempDir# ls
graphvizInfo.txt               subgraph_0_tidl_io_1.bin                subgraph_0_tidl_net.bin.svg
runtimes_visualization.svg     subgraph_0_tidl_net.bin                 subgraph_0_tidl_net.bin_netLog.txt
subgraph_0_calib_raw_data.bin  subgraph_0_tidl_net.bin.layer_info.txt
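The listing shows subgraph_0_tidl_net.bin only under tempDir, while the inference log looks for it directly in the artifacts folder. A small pre-flight check along these lines can catch that before creating the inference session. This is a sketch: `missing_network_files` is a hypothetical helper, and the `subgraph_<i>_tidl_net.bin` naming for indices other than 0 is an assumption extrapolated from the error message.

```python
import os

def missing_network_files(artifacts_folder, num_subgraphs=1):
    """Return the network files the runtime would fail to open.

    Only the top level of the artifacts folder is checked, because that is
    where the inference log above looks for subgraph_0_tidl_net.bin.
    The file name pattern for other subgraph indices is an assumption.
    """
    missing = []
    for i in range(num_subgraphs):
        path = os.path.join(artifacts_folder, f"subgraph_{i}_tidl_net.bin")
        if not os.path.isfile(path):
            missing.append(path)
    return missing

# Folder name taken from the scripts in this thread.
for path in missing_network_files('./model-artifacts-dir/'):
    print("Missing network file (compilation may be incomplete):", path)
```

An empty result does not prove the compilation was correct; it only rules out the specific "Unable to open network file" failure.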

I used the following Python code for inference:

import onnxruntime as rt
import numpy as np

##########################################
###          Script parameters         ###
##########################################

model_path = 'single_conv_layer_2_kernels.onnx'
# EP_list = ['CPUExecutionProvider']
# EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
options = {}
options['artifacts_folder'] = './model-artifacts-dir/'
options['tidl_tools_path'] = '/home/root/tidl_tools'
options['debug_level'] = 3

##########################################

so = rt.SessionOptions()
so.graph_optimization_level = rt.GraphOptimizationLevel.ORT_DISABLE_ALL

# Load the ONNX model
# session = rt.InferenceSession(model_path, providers=EP_list, sess_options=so)
session = rt.InferenceSession(model_path, providers=EP_list, provider_options=[options, {}], sess_options=so)

# Create a random input tensor with the same shape as the input tensor defined in the model
input_data = np.random.rand(1, 1, 5, 5).astype(np.float32)

# Run the model
outputs = session.run(None, {'input': input_data})

# Print the output
print("Model output:", outputs[0])

Also, I want to draw your attention to the fact that the Quantization & Calibration step did not finish during model compilation and the Memory Planning step never started. That is, there is a Quantization & Calibration Started message:

 ============= [Quantization & Calibration for subgraph_0 Started] =============

but there is no corresponding Quantization & Calibration Completed message. Please see the compilation log in my original message. Is this expected in my case?
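This kind of gap can be detected automatically by scanning a saved compilation log for stage markers. The sketch below assumes every stage is reported as `[<name> Started]` and `[<name> Completed]`, as the Parsing and Optimization stages are in the log from the original message; `unfinished_stages` is a hypothetical helper.

```python
import re

# Matches markers such as "[Parsing Started]" or
# "[Optimization for subgraph_0 Completed]".
STAGE_MARKER = re.compile(r"\[([^\[\]]+) (Started|Completed)\]")

def unfinished_stages(log_text):
    """Return stage names that have a Started marker but no Completed one."""
    started, completed = [], set()
    for name, state in STAGE_MARKER.findall(log_text):
        if state == "Started":
            started.append(name)
        else:
            completed.add(name)
    return [name for name in started if name not in completed]

# Excerpt of the stage markers from the compilation log in this thread.
sample_log = """\
============================== [Parsing Started] ==============================
============================= [Parsing Completed] =============================
==================== [Optimization for subgraph_0 Started] ====================
=================== [Optimization for subgraph_0 Completed] ===================
 ============= [Quantization & Calibration for subgraph_0 Started] =============
"""
print(unfinished_stages(sample_log))
# prints ['Quantization & Calibration for subgraph_0']
```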

0 Asha Bhandarkar over 1 year ago in reply to Konstantin Zaytsev

TI__Genius 10170 points

Hi Konstantin,

Thanks for the additional information. I'll be able to look into this more next week. Expect a response on 9/25.

Best,

Asha

+1 Konstantin Zaytsev over 1 year ago in reply to Asha Bhandarkar

Prodigy 50 points

Hi Asha,

I have solved my issue. The problem was that I needed to run the session calibration_frames times when compiling the model. The default number of calibration frames is 20 (the compilation log above reports it as tidl_calibration_options:num_frames_calibration = 20), but I ran the session only once. Below is the compilation script that worked for me. It may be worth mentioning the calibration_frames requirement in the Colab notebook (https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/examples/jupyter_notebooks/colab/infer_ort.ipynb) or creating a separate Colab notebook with a compilation script.

import onnxruntime as rt
import numpy as np

##########################################
###          Script parameters         ###
##########################################

model_path = 'single_conv_layer_2_kernels.onnx'
# EP_list = ['CPUExecutionProvider']
EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
# EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
options = {}
options['artifacts_folder'] = './model-artifacts-dir/'
options['tidl_tools_path'] = '/home/root/tidl_tools'
options['debug_level'] = 3
calibration_frames = 20

##########################################

so = rt.SessionOptions()

# Load the ONNX model
# session = rt.InferenceSession(model_path, providers=EP_list, sess_options=so)
session = rt.InferenceSession(model_path, providers=EP_list, provider_options=[options, {}], sess_options=so)

# Compilation
for _ in range(calibration_frames):
    input_data = np.random.rand(1, 1, 5, 5).astype(np.float32)
    outputs = session.run(None, {'input': input_data})

# # Execution
# input_data = np.random.rand(1, 1, 5, 5).astype(np.float32)
# outputs = session.run(None, {'input': input_data})

# Print the output
print("Model output:", outputs[0])

Also, it would be nice to have a corresponding error message during compilation when the number of session runs does not equal the calibration_frames option value.

0 Asha Bhandarkar over 1 year ago in reply to Konstantin Zaytsev

TI__Genius 10170 points

Hi Konstantin,

The Colab notebooks are largely for reference. The standard compilation scripts that we ask customers to use are the ones in https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master/examples/osrt_python, such as onnxrt_ep.py. These scripts have default options already set (see common_utils.py), which can then be tweaked.

I would recommend trying your model evaluation with these.

Best,

Asha
