I have created a simple model with a single convolution layer: one input channel, two 9x1 kernels, and two output channels:
import onnx
import onnx.helper as helper
import onnx.numpy_helper as numpy_helper
import numpy as np
# Create a tensor for the weights of the convolutional layer
weights = np.random.randn(2, 1, 9, 1).astype(np.float32)
weights_tensor = numpy_helper.from_array(weights, name='conv1_weight')
# Create the input tensor value info
input_tensor = helper.make_tensor_value_info('input', onnx.TensorProto.FLOAT, [1, 1, 5, 5])
# Create the output tensor value info
output_tensor = helper.make_tensor_value_info('output', onnx.TensorProto.FLOAT, [1, 2, 5, 5])
# Create the node (layer)
conv_node = helper.make_node(
    'Conv',
    inputs=['input', 'conv1_weight'],
    outputs=['output'],
    kernel_shape=[9, 1],
    pads=[4, 0, 4, 0],
    strides=[1, 1]
)
# Create the graph
graph = helper.make_graph(
    [conv_node],
    'conv_graph',
    [input_tensor],
    [output_tensor],
    [weights_tensor]
)
# Create the model
model = helper.make_model(graph, producer_name='onnx-example')
# Save the model to a file
onnx.save(model, 'single_conv_layer_2_kernels.onnx')
print("ONNX model with one convolution layer producing 2 output channels from 1 input channel has been created and saved as 'single_conv_layer_2_kernels.onnx'")
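As a sanity check on the declared output shape, the standard Conv output-size arithmetic can be worked through for this layer. The sketch below is only an illustration and not part of the original script: `conv_output_shape` is a hypothetical helper name, and it assumes a 2-D convolution with group=1 and the ONNX pads order [top, left, bottom, right].

```python
def conv_output_shape(input_shape, weight_shape, pads, strides, dilations=(1, 1)):
    """Expected NCHW output shape of a 2-D Conv with group=1 (illustrative helper)."""
    n, c_in, h, w = input_shape
    c_out, c_in_w, k_h, k_w = weight_shape
    if c_in != c_in_w:
        raise ValueError("input channels must match weight channels when group=1")

    # ONNX stores 2-D pads as [H_begin, W_begin, H_end, W_end].
    pad_top, pad_left, pad_bottom, pad_right = pads

    def out_dim(size, kernel, pad_begin, pad_end, stride, dilation):
        # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
        effective_kernel = dilation * (kernel - 1) + 1
        return (size + pad_begin + pad_end - effective_kernel) // stride + 1

    out_h = out_dim(h, k_h, pad_top, pad_bottom, strides[0], dilations[0])
    out_w = out_dim(w, k_w, pad_left, pad_right, strides[1], dilations[1])
    return [n, c_out, out_h, out_w]


# Values taken from the model definition above.
shape = conv_output_shape(
    input_shape=[1, 1, 5, 5],
    weight_shape=[2, 1, 9, 1],
    pads=[4, 0, 4, 0],
    strides=[1, 1],
)
print(shape)
```

With these parameters the height is (5 + 4 + 4 - 9) // 1 + 1 = 5 and the width is (5 + 0 + 0 - 1) // 1 + 1 = 5, so the 4-D shape [1, 2, 5, 5] declared for `output` is consistent with the layer definition.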
During model compilation with C7x offload, I received the following warning from ONNX Runtime:
[W:onnxruntime:, execution_frame.cc:835 VerifyOutputSizes] Expected shape from model of {1,2,5,5} does not match actual shape of {1,1,1,2,5,5} for output output
There is no such ONNX Runtime message when I run inference without C7x offload (ARM-only mode). Here is the Python code used for model compilation:
import onnxruntime as rt
import numpy as np
##########################################
### Script parameters ###
##########################################
model_path = 'single_conv_layer_2_kernels.onnx'
# EP_list = ['CPUExecutionProvider']
EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
# EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']
options = {}
options['artifacts_folder'] = './model-artifacts-dir/'
options['tidl_tools_path'] = '/home/root/tidl_tools'
options['debug_level'] = 1
##########################################
so = rt.SessionOptions()
so.graph_optimization_level = rt.GraphOptimizationLevel.ORT_DISABLE_ALL
# Load the ONNX model
# session = rt.InferenceSession(model_path, providers=EP_list, sess_options=so)
session = rt.InferenceSession(model_path, providers=EP_list, provider_options=[options, {}], sess_options=so)
# Create a random input tensor with the same shape as the input tensor defined in the model
input_data = np.random.rand(1, 1, 5, 5).astype(np.float32)
# Run the model
outputs = session.run(None, {'input': input_data})
# Print the output
print("Model output:", outputs[0])
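The two shapes in the warning, {1,2,5,5} and {1,1,1,2,5,5}, appear to differ only by two extra leading dimensions of size 1. The sketch below checks that on the shapes alone. It is an illustration under that assumption, not a confirmed explanation of the TIDL behaviour, and `strip_leading_singletons` is a hypothetical helper name.

```python
import math


def strip_leading_singletons(shape, target_rank):
    """Drop leading size-1 dimensions until the shape has target_rank dimensions."""
    shape = list(shape)
    while len(shape) > target_rank and shape[0] == 1:
        shape.pop(0)
    return shape


expected = [1, 2, 5, 5]        # shape declared in the ONNX model
actual = [1, 1, 1, 2, 5, 5]    # shape reported in the warning

# Both shapes describe the same number of elements.
same_size = math.prod(expected) == math.prod(actual)

# Removing the extra leading 1s recovers the declared shape.
reduced = strip_leading_singletons(actual, len(expected))

print(same_size, reduced == expected)
```

If both checks hold, the returned array has the same element count as the declared 4-D output, so something like `outputs[0].reshape(1, 2, 5, 5)` would give it the declared shape. This only shows that the data is compatible; it does not say why the TIDL provider reports a 6-D output.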
The single_conv_layer_2_kernels.onnx file is in the attached archive:
single_conv_layer_2_kernels.zip
Below is the complete console log for model compilation with debug_level=3:
tidl_tools_path = /home/root/tidl_tools
artifacts_folder = ./model-artifacts-dir/
tidl_tensor_bits = 8
debug_level = 3
num_tidl_subgraphs = 16
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 20
tidl_calibration_options:bias_calibration_iterations = 50
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
ONNX QDQ Enabled = 0
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 0
output_feature_16bit_names_list =
m_params_16bit_names_list =
m_single_core_layers_names_list =
Inference mode = 0
Number of cores = 1
reserved_compile_constraints_flag = 1601
partial_init_during_compile = 0
ti_internal_reserved_1 =
========================= [Model Compilation Started] =========================
Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning
============================== [Version Summary] ==============================
-------------------------------------------------------------------------------
| TIDL Tools Version | 10_00_04_00 |
-------------------------------------------------------------------------------
| C7x Firmware Version | 10_00_02_00 |
-------------------------------------------------------------------------------
| Runtime Version | 1.14.0+10000005 |
-------------------------------------------------------------------------------
| Model Opset Version | 18 |
-------------------------------------------------------------------------------
NOTE: The runtime version here specifies ONNXRT_VERSION+TIDL_VERSION
Ex: 1.14.0+1000XXXX -> ONNXRT 1.14.0 and a TIDL_VERSION 10.00.XX.XX
============================== [Parsing Started] ==============================
[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import] WARNING: Parameters - Kernel 9x1, Stride 1x1, dilation 1x1, Pad 4x0, Bias 0 in [output] has gone through limited verification
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - Conv, Node name - -- [tidl_onnxRtImport_core.cpp, 524]
------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
| Core | No. of Nodes | Number of Subgraphs |
-------------------------------------------------------------------------------
| C7x | 1 | 1 |
| CPU | 0 | x |
-------------------------------------------------------------------------------
Running Runtimes GraphViz - /home/root/tidl_tools/tidl_graphVisualiser_runtimes.out ./model-artifacts-dir//allowedNode.txt ./model-artifacts-dir//tempDir/graphvizInfo.txt ./model-artifacts-dir//tempDir/runtimes_visualization.svg
============================= [Parsing Completed] =============================
TIDL_createStateImportFunc Started:
Compute on node : TIDLExecutionProvider_TIDL_0_0
0, Conv, 2, 1, input, output
Input tensor name - input
Output tensor name - output
In TIDL_onnxRtImportInit subgraph_name=subgraph_0
Layer 0, subgraph id subgraph_0, name=output
Layer 1, subgraph id subgraph_0, name=input
==================== [Optimization for subgraph_0 Started] ====================
In TIDL_runtimesOptimizeNet: LayerIndex = 3, dataIndex = 2
----------------------------- Optimization Summary -----------------------------
--------------------------------------------------------------------------------
| Layer | Nodes before optimization | Nodes after optimization |
--------------------------------------------------------------------------------
| TIDL_ConvolutionLayer | 1 | 1 |
--------------------------------------------------------------------------------
=================== [Optimization for subgraph_0 Completed] ===================
In TIDL_runtimesPostProcessNet
[TIDL Import] WARNING: Parameters - Kernel 9x1, Stride 1x1, dilation 1x1, Pad 4x0, Bias 0 in [] has gone through limited verification
************ in TIDL_subgraphRtCreate ************
The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.2s: VX_ZONE_ERROR:Enabled
0.2s: VX_ZONE_WARNING:Enabled
0.4412s: VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
============= [Quantization & Calibration for subgraph_0 Started] =============
2024-09-16 11:07:17.981512612 [W:onnxruntime:, execution_frame.cc:835 VerifyOutputSizes] Expected shape from model of {1,2,5,5} does not match actual shape of {1,1,1,2,5,5} for output output
******* In TIDL_subgraphRtInvoke ********
0 1.00000 0.00103 0.99637 6
1 1.00000 -1.20857 3.14400 6
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,Multic7xContextCopyCycles,
1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 11.000000 2203.000000 90.000000
******* TIDL_subgraphRtInvoke done ********
Model output: [[[[[[ 0.00820085 0.00886303 -0.54941744 -0.16496155 -1.1038206 ]
[-0.18181708 -0.9722068 -0.7312772 -1.2085706 -0.19888027]
[ 0.14048597 0.6771251 0.319242 -0.80483294 0.07669117]
[-0.42998666 0.09482104 0.42958534 0.17192125 0.00922517]
[ 0.10692582 -0.13820928 0.27003202 1.1216784 0.22250456]]
[[ 1.5228472 -0.5873079 -0.3525843 -0.3002947 1.2096493 ]
[ 1.1365042 3.144003 2.0103579 0.6627136 2.2352157 ]
[ 0.44892883 1.3533359 1.2602484 1.9661076 0.32214844]
[-0.6333883 -1.034955 -0.25269186 1.7660313 0.21569622]
[ 1.2422765 -0.08347851 -0.7523635 0.23331156 -0.19755787]]]]]]
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 27 alloc's of 115313733 bytes
MEM: Free's : 27 free's of 115313733 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!