Converting DeepLabv3-Mobilenet throws error



I am trying to convert a pre-trained DeepLabv3-Mobilenet model to the TIDL framework, but the process raises an error.

The error reported is: ERROR: TIDL_E_QUANT_STATS_NOT_AVAILABLE] tidl_quant_stats_tool.out fails to collect dynamic range.

The CNN model was obtained from this link, which belongs to the TI EdgeAI Model Zoo.

I am using ti-processor-sdk-rtos-j721e-evm-08_00_00_12 which is installed in $PSDKR_PATH.

The steps for reproducing the error from a fresh install follow:

$ cd $PSDKR_PATH/vision_apps
$ make tiovx
$ cd $PSDKR_PATH/tidl_j7_08_00_00_10
$ make
$ wget http://software-dl.ti.com/jacinto7/esd/modelzoo/latest/models/vision/segmentation/ade20k32/edgeai-tv/deeplabv3plus_mobilenetv2_edgeailite_512x512_20210308.onnx -O $PSDKR_PATH/tidl_j7_08_00_00_10/ti_dl/test/testvecs/models/public/onnx/deeplabv3plus_mobilenetv2_edgeailite.onnx

Edit $PSDKR_PATH/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/models.py by adding the following lines to the models data-structure:

models = {
  # ...
  'deeplabv3plus_mobilenetv2_edgeailite' : {
    'model_path' : os.path.join(models_base_path, 'deeplabv3plus_mobilenetv2_edgeailite.onnx'),
    'dataset_list' : os.path.join(dataset_base,'tflite-test-data/tidl-dataset-lite/ADEChallengeData2016Val/seg_val_list.txt'),
    'mean': [123.675, 116.28, 103.53],
    'std' : [0.017125, 0.017507, 0.017429],
    'num_images' : numImages,
    'num_classes': 19,
    'model_type': 'seg'
  },
  # ...
}
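
A note on the 'mean' and 'std' values in this entry: the 'std' numbers look like reciprocal standard deviations (scale factors) rather than standard deviations, since 1/0.017125 is roughly 58.39. The sketch below only checks that arithmetic and shows the normalization I assume is applied, (pixel - mean) * scale. The helper names implied_std and normalize_pixel are hypothetical and are not taken from the TIDL test scripts.

```python
# Values copied from the models.py entry above.
MEAN = [123.675, 116.28, 103.53]
SCALE = [0.017125, 0.017507, 0.017429]  # stored under the 'std' key


def implied_std(scale):
    """Return the standard deviations implied by reciprocal scale factors."""
    return [1.0 / s for s in scale]


def normalize_pixel(rgb, mean=MEAN, scale=SCALE):
    """Hypothetical helper: apply (value - mean) * scale per channel.

    Assumption: this mirrors how the preprocessing uses 'mean' and 'std';
    it is not copied from the TIDL scripts.
    """
    return [(v - m) * s for v, m, s in zip(rgb, mean, scale)]


if __name__ == "__main__":
    # Standard deviations implied by the scale factors, rounded for display.
    print([round(s, 2) for s in implied_std(SCALE)])
    # A pixel equal to the per-channel mean normalizes to zero.
    print(normalize_pixel(MEAN))
```

If the script instead divided by 'std', values this small would blow the inputs up by a factor of several thousand, so it may be worth confirming which convention the preprocessing code follows.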

Edit $PSDKR_PATH/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt_ep.py so that only this model is selected:

models = ['deeplabv3plus_mobilenetv2_edgeailite']

Finally, run:

$ cd $PSDKR_PATH/tidl_j7_08_00_00_10/ti_dl/test/onnxrt
$ source prepare_model_compilation_env.sh
$ python onnxrt_ep.py -c

Which outputs:

Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['deeplabv3plus_mobilenetv2_edgeqilite']

Running_Model :  deeplabv3plus_mobilenetv2_edgeqilite  

 0.0s:  VX_ZONE_INIT:Enabled
 0.143s:  VX_ZONE_ERROR:Enabled
 0.146s:  VX_ZONE_WARNING:Enabled

Preliminary subgraphs created = 1 
Final number of subgraphs created are : 1, - Offloaded Nodes - 120, Total Nodes - 120 
Compile TIDLExecutionProvider_TIDL_0_0
Input tensor name -  input.1 
Output tensor name - 566 

**********  Frame Index 1 Running float import and float inference **********
INFORMATION: [TIDL_ResizeLayer] 571 Any resize ratio which is power of 2 and greater than 4 will be placed by combination of 4x4 resize layer and 2x2 resize layer. For example a 8x8 resize will be replaced by 4x4 resize followed by 2x2 resize.
INFORMATION: [TIDL_ResizeLayer] 576 Any resize ratio which is power of 2 and greater than 4 will be placed by combination of 4x4 resize layer and 2x2 resize layer. For example a 8x8 resize will be replaced by 4x4 resize followed by 2x2 resize.
WARNING: [TIDL_E_DATAFLOW_INFO_NULL] ti_cnnperfsim.out fails to allocate memory in MSMC. Please look into perfsim log. This model can only be used on PC emulation, it will get fault on target.
****************************************************
**          3 WARNINGS          0 ERRORS          **
****************************************************

**********  Frame Index 2 Running float inference - currFrameIdx <= numFramesCalibration : subgraph id **********

**********  Frame Index 3 Running fixed point mode for calibration : subgraph id **********

~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

 *****************   Calibration iteration number 0 completed ************************ 


~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

 *****************   Calibration iteration number 1 completed ************************ 

~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

 *****************   Calibration iteration number 2 completed ************************ 
 
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

 *****************   Calibration iteration number 3 completed ************************ 
 
 ~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~

Processing config file #0 : /home/cnh/TDA4VM/ti-processor-sdk-rtos-j721e-evm-08_00_00_12/tidl_j7_08_00_00_10/ti_dl/test/onnxrt/onnxrt-artifacts/deeplabv3plus_mobilenetv2_edgeqilite/tempDir/566_tidl_io_.qunat_stats_config.txt 
Illegal instruction

  *****************   Calibration iteration number 4 completed ************************ 
 
------------------ Network Compiler Traces -----------------------------
successful Memory allocation
substitute string tidl_net_ not found
INFORMATION: [TIDL_ResizeLayer] 571 Any resize ratio which is power of 2 and greater than 4 will be placed by combination of 4x4 resize layer and 2x2 resize layer. For example a 8x8 resize will be replaced by 4x4 resize followed by 2x2 resize.
INFORMATION: [TIDL_ResizeLayer] 576 Any resize ratio which is power of 2 and greater than 4 will be placed by combination of 4x4 resize layer and 2x2 resize layer. For example a 8x8 resize will be replaced by 4x4 resize followed by 2x2 resize.
ERROR: TIDL_E_QUANT_STATS_NOT_AVAILABLE] tidl_quant_stats_tool.out fails to collect dynamic range. Please look into quant stats log. This model will get fault on target.
****************************************************
**          2 WARNINGS          1 ERRORS          **
****************************************************
 13.869465s:  VX_ZONE_ERROR:[tivxAlgiVisionCreate:333] Calling ialg.algAlloc failed with status = -1115
 13.869657s:  VX_ZONE_ERROR:[ownContextSendCmd:785] Command ack message returned failure cmd_status: -1
 13.869730s:  VX_ZONE_ERROR:[ownContextSendCmd:819] tivxEventWait() failed.
 13.869737s:  VX_ZONE_ERROR:[ownNodeKernelInit:538] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
 13.869740s:  VX_ZONE_ERROR:[ownNodeKernelInit:539] Please be sure the target callbacks have been registered for this core
 13.869743s:  VX_ZONE_ERROR:[ownNodeKernelInit:540] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
 13.869749s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl ... failed !!!
 13.869754s:  VX_ZONE_ERROR:[vxVerifyGraph:2044] Node kernel init failed
 13.869756s:  VX_ZONE_ERROR:[vxVerifyGraph:2098] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
 13.874551s:  VX_ZONE_ERROR:[tivxAlgiVisionCreate:333] Calling ialg.algAlloc failed with status = -1115
 13.874805s:  VX_ZONE_ERROR:[ownContextSendCmd:785] Command ack message returned failure cmd_status: -1
 13.875053s:  VX_ZONE_ERROR:[ownContextSendCmd:819] tivxEventWait() failed.
 13.875059s:  VX_ZONE_ERROR:[ownNodeKernelInit:538] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
 13.875062s:  VX_ZONE_ERROR:[ownNodeKernelInit:539] Please be sure the target callbacks have been registered for this core
 13.875065s:  VX_ZONE_ERROR:[ownNodeKernelInit:540] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
 13.875070s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl ... failed !!!
 13.875074s:  VX_ZONE_ERROR:[vxVerifyGraph:2044] Node kernel init failed
 13.875077s:  VX_ZONE_ERROR:[vxVerifyGraph:2098] Graph verify failed
 13.875124s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:820] graph is not in a state required to be scheduled
 13.875181s:  VX_ZONE_ERROR:[vxProcessGraph:755] schedule graph failed
 13.875185s:  VX_ZONE_ERROR:[vxProcessGraph:760] wait graph failed
 
Completed_Model :     1, Name : deeplabv3plus_mobilenetv2_edgeqilite              , Total time :     4383.9, Offload Time :     1620.0 , DDR RW MBs : 0, Output File : post_proc_out_deeplabv3plus_mobilenetv2_edgeailite.onnx_airshow.jpg
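
Because the log is long and repetitive, a small script can condense it. The following is a minimal sketch, assuming the output was captured as text; it counts the "Illegal instruction" lines and the TIDL warning/error codes. The function name summarize_log is my own, and the regular expression is written to accept the ERROR line both with and without the opening bracket, since the log above prints it without one.

```python
import re
from collections import Counter

# Matches codes such as "WARNING: [TIDL_E_DATAFLOW_INFO_NULL]" and also the
# form without an opening bracket seen in the ERROR line of the log.
TIDL_CODE = re.compile(r"(WARNING|ERROR): \[?(TIDL_E_[A-Z_]+)\]")


def summarize_log(text):
    """Count crash lines and TIDL diagnostics in captured compilation output."""
    summary = Counter()
    for line in text.splitlines():
        if line.strip() == "Illegal instruction":
            summary["illegal_instruction"] += 1
        match = TIDL_CODE.search(line)
        if match:
            summary[f"{match.group(1)}:{match.group(2)}"] += 1
    return dict(summary)


if __name__ == "__main__":
    # Short excerpt in the same format as the log above.
    excerpt = (
        "Illegal instruction\n"
        "WARNING: [TIDL_E_DATAFLOW_INFO_NULL] ti_cnnperfsim.out fails to allocate memory in MSMC.\n"
        "ERROR: TIDL_E_QUANT_STATS_NOT_AVAILABLE] tidl_quant_stats_tool.out fails to collect dynamic range.\n"
    )
    print(summarize_log(excerpt))
```

Applied to the full log, this makes it easy to see that every calibration pass ended in "Illegal instruction" before the quantization statistics error was reported.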

Since this model is available in the TI Model Zoo, I expected it to be compatible with TIDL, but given the "Illegal instruction" output I am not sure that this is actually the case.
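
For context, "Illegal instruction" is the message printed when a process is killed by SIGILL, meaning the CPU hit an opcode it does not implement. One possibility, which is only an assumption and is not confirmed by anything in the log, is that the PC emulation tools use SIMD extensions that the host CPU lacks. Below is a minimal sketch for listing candidate flags from /proc/cpuinfo on a Linux x86 host; the set of extensions checked is a guess and the function names are my own.

```python
import os

# Extensions to look for. This list is a guess: nothing in the log states
# which instruction sets the TIDL PC tools were built with.
CANDIDATE_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "fma")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=CANDIDATE_FLAGS):
    """List the wanted flags that the CPU does not advertise."""
    present = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # present on Linux hosts only
    if os.path.exists(path):
        with open(path) as handle:
            print("missing:", missing_flags(handle.read()))
```

An empty "missing" list would not rule out an instruction-set mismatch, since the tools might depend on an extension outside this list, but a non-empty one would be a useful detail to report.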

Note that I can run inference with the model when using a pure ONNX framework, which rules out the hypothesis that the model itself is faulty:

$ python onnxrt_ep.py -d
...
Completed_Model :     1, Name : deeplabv3plus_mobilenetv2_edgeqilite              , Total time :      159.7, Offload Time :        0.0 , DDR RW MBs : 0, Output File : post_proc_out_deeplabv3plus_mobilenetv2_edgeailite.onnx_ADE_val_00001801.jpg

Hence, I would like to know why this conversion is failing and how to fix it.