Part Number: J721EXSOMXEVM
Greetings,
I have been trying to compile a fully convolutional neural network for text recognition using edgeai-tidl-tools. The model runs fine when only CPUExecutionProvider is in the EP_list, but compilation fails with a segmentation fault when we use TIDLExecutionProvider.
Here are the compile options:
output_dir = 'fcnn-artifacts'
num_bits = 8
accuracy = 1
onnx_model_path = 'fcnn.onnx'
compile_options = {
    'tidl_tools_path': os.environ['TIDL_TOOLS_PATH'],
    'artifacts_folder': output_dir,
    'tensor_bits': num_bits,
    'accuracy_level': accuracy,
    'advanced_options:calibration_frames': len(calib_images),
    'advanced_options:calibration_iterations': 3,  # used if accuracy_level = 1
    'debug_level': 3,
    # Comma separated string of operator types as defined by ONNX runtime, ex "MaxPool, Concat"
    'deny_list': ""
}
Here are the logs during the segmentation fault:
Running shape inference on model ../../../models/public/vgg_fcn_text_opset11.onnx
tidl_tools_path = /home/root/tidl_tools
artifacts_folder = ../../../model-artifacts//fcnn/
tidl_tensor_bits = 16
debug_level = 3
num_tidl_subgraphs = 16
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 3
tidl_calibration_options:bias_calibration_iterations = 5
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 3
output_feature_16bit_names_list =
m_params_16bit_names_list =
reserved_compile_constraints_flag = 1601
ti_internal_reserved_1 =
****** WARNING : Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options******
Supported TIDL layer type --- Reshape -- model/block1_conv1/BiasAdd__6
Supported TIDL layer type --- Conv -- model/block1_conv1/BiasAdd
Supported TIDL layer type --- Relu -- model/block1_conv1/Relu
Supported TIDL layer type --- Conv -- model/block1_conv2/BiasAdd
Supported TIDL layer type --- Relu -- model/block1_conv2/Relu
Supported TIDL layer type --- MaxPool -- model/block1_pool/MaxPool
Supported TIDL layer type --- Conv -- model/block2_conv1/BiasAdd
Supported TIDL layer type --- Relu -- model/block2_conv1/Relu
Supported TIDL layer type --- Conv -- model/block2_conv2/BiasAdd
Supported TIDL layer type --- Relu -- model/block2_conv2/Relu
Supported TIDL layer type --- MaxPool -- model/block2_pool/MaxPool
Supported TIDL layer type --- Conv -- model/block3_conv1/BiasAdd
Supported TIDL layer type --- Relu -- model/block3_conv1/Relu
Supported TIDL layer type --- Conv -- model/block3_conv2/BiasAdd
Supported TIDL layer type --- Relu -- model/block3_conv2/Relu
Supported TIDL layer type --- Conv -- model/block3_conv3/BiasAdd
Supported TIDL layer type --- Relu -- model/block3_conv3/Relu
Supported TIDL layer type --- MaxPool -- model/block3_pool/MaxPool
Supported TIDL layer type --- Transpose -- Transpose__70
Unsupported (import) TIDL layer type for ONNX op type --- Shape
Unsupported (TIDL check) TIDL layer type --- Gather
Supported TIDL layer type --- Cast -- model/reshape/Shape__46
Unsupported slice - axis parameters, in Slice -- model/reshape/strided_slice
Unsupported (TIDL check) TIDL layer type --- Slice
Unsupported (TIDL check) TIDL layer type --- Concat
Supported TIDL layer type --- Cast -- model/reshape/Reshape__55
Segmentation fault (core dumped)
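The operator types flagged as unsupported in the log above are the natural candidates for the 'deny_list' option, which takes a comma-separated string of ONNX operator types. Below is a minimal sketch of collecting them from the log text. The helper name build_deny_list and the regular expression are illustrative assumptions based only on the log lines shown here; they are not part of edgeai-tidl-tools, and the log wording may differ between releases.

```python
import re

# Hypothetical helper (not part of edgeai-tidl-tools). The pattern is an
# assumption derived from the log lines in this post, e.g.
#   "Unsupported (TIDL check) TIDL layer type --- Gather"
#   "Unsupported (import) TIDL layer type for ONNX op type --- Shape"
_UNSUPPORTED = re.compile(
    r"Unsupported \((?:import|TIDL check)\) TIDL layer type"
    r"(?: for ONNX op type)? --- (\w+)"
)


def build_deny_list(log_text: str) -> str:
    """Return the unsupported op types as a comma-separated string,
    de-duplicated and kept in the order they first appear in the log."""
    seen = []
    for op_type in _UNSUPPORTED.findall(log_text):
        if op_type not in seen:
            seen.append(op_type)
    return ", ".join(seen)


# Excerpt of the import log from this post.
sample_log = (
    "Supported TIDL layer type --- Transpose -- Transpose__70 "
    "Unsupported (import) TIDL layer type for ONNX op type --- Shape "
    "Unsupported (TIDL check) TIDL layer type --- Gather "
    "Supported TIDL layer type --- Cast -- model/reshape/Shape__46 "
    "Unsupported slice - axis parameters, in Slice -- model/reshape/strided_slice "
    "Unsupported (TIDL check) TIDL layer type --- Slice "
    "Unsupported (TIDL check) TIDL layer type --- Concat "
)

deny_list = build_deny_list(sample_log)
print(deny_list)
```

The resulting string can be assigned to 'deny_list' in compile_options. Note that this denies every node of each listed type, not only the ones in the reshape branch.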
When we put certain layers into the deny list, float inference fails in a Concat node and the run again ends in a segmentation fault:
********** Frame Index 1 : Running float inference **********
2024-01-04 05:21:57.262513053 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Concat node. Name:'model/reshape/Reshape/shape_Concat__54' Status Message: /home/a0496663/work/edgeaitidltools/rel90/onnx/onnxruntime_bit/onnxruntime/onnxruntime/core/providers/cpu/tensor/concat.cc:72 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForCompute(onnxruntime::OpKernelContext*, const std::vector<const onnxruntime::Tensor*>&, onnxruntime::Prepare&) const inputs_n_rank == inputs_0_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 4 got: 1
Traceback (most recent call last):
File "/home/root/examples/osrt_python/ort/fcnn_onnxrt_ep.py", line 266, in <module>
run_model(model, mIdx)
File "/home/root/examples/osrt_python/ort/fcnn_onnxrt_ep.py", line 215, in run_model
preds = sess.run([output_name], {input_name: test_img})
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Concat node. Name:'model/reshape/Reshape/shape_Concat__54' Status Message: /home/a0496663/work/edgeaitidltools/rel90/onnx/onnxruntime_bit/onnxruntime/onnxruntime/core/providers/cpu/tensor/concat.cc:72 onnxruntime::common::Status onnxruntime::ConcatBase::PrepareForCompute(onnxruntime::OpKernelContext*, const std::vector<const onnxruntime::Tensor*>&, onnxruntime::Prepare&) const inputs_n_rank == inputs_0_rank was false. Ranks of input data are different, cannot concatenate them. expected rank: 4 got: 1
************ in TIDL_subgraphRtDelete ************
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 54 alloc's of 183110924 bytes
MEM: Free's : 54 free's of 183110924 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!
************ in TIDL_subgraphRtDelete ************
Segmentation fault (core dumped)
We are unable to get past these internal compilation errors, even though the model is primarily convolutional, with a few extra operators introduced by the conversion from TensorFlow to ONNX. Please advise on how to resolve this, as it is critical to our pipeline.
Regards,
Vaibhav Kashera