SK-AM69: Accuracy drops on TIDL with a segmentation model (is the issue in model compilation?)

venk at

Part Number: SK-AM69
Other Parts Discussed in Thread: AM69A

I downloaded the model from https://github.com/TexasInstruments/edgeai-tensorlab/blob/main/edgeai-modelzoo/modelartifacts/AM69A/8bits/ss-8750_onnxrt_ade20k_hf-transformers_segformer_b0_finetuned_ade_512_512_simp_onnx.tar.gz.link

Then I ran inference on the SK-AM69A and got a segmentation fault:

root@am69-sk:/zxseg# python3 infer_test.py
libtidl_onnxrt_EP loaded 0x3ff12eb0
Final number of subgraphs created are : 1, - Offloaded Nodes - 396, Total Nodes - 396
Segmentation fault (core dumped)

So I compiled the model myself.

Method one:
/root/ti2/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py

    "ss-ort-nvidia-b0": create_model_config(
        task_type="segmentation",
        source=dict(
            model_url="",
            infer_shape=True,
        ),
        preprocess=dict(
            resize=512,
            crop=512,
            data_layout="NCHW",
            pad_color=0,
            resize_with_pad=False,
            reverse_channels=False,
        ),
        session=dict(
            session_name="onnxrt",
            model_path="/root/ti2/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx",
            # meta_arch_type=3,
            input_mean=[123.675, 116.28, 103.53],
            input_scale=[0.017125, 0.017507, 0.017429],
            input_optimization=True,
        ),
        postprocess=dict(with_argmax=True),
        extra_info=dict(num_images=numImages, num_classes=150),
    ),
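
For reference, here is a minimal sketch of what the `input_mean` and `input_scale` values in the config above imply, assuming the preprocessing computes `(pixel - mean) * scale` per channel on 0..255 RGB values. That formula and the helper names `normalize_pixel` and `config_matches_imagenet` are assumptions for illustration, not code taken from edgeai-tidl-tools. The check only confirms that the numbers correspond to the common ImageNet mean and standard deviation, which helps rule out a normalization mismatch between the compile and inference scripts.

```python
# Values copied from the session config above.
INPUT_MEAN = [123.675, 116.28, 103.53]
INPUT_SCALE = [0.017125, 0.017507, 0.017429]

# Assumption: the usual ImageNet statistics on a 0..1 scale.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) * scale per channel to one RGB pixel (0..255)."""
    return [(v - m) * s for v, m, s in zip(rgb, INPUT_MEAN, INPUT_SCALE)]


def config_matches_imagenet(tol=1e-3):
    """Check mean == 255 * imagenet_mean and scale == 1 / (255 * imagenet_std)."""
    for m, s, im, isd in zip(INPUT_MEAN, INPUT_SCALE, IMAGENET_MEAN, IMAGENET_STD):
        if abs(m - 255.0 * im) > tol:
            return False
        if abs(s * 255.0 * isd - 1.0) > tol:
            return False
    return True


if __name__ == "__main__":
    print(normalize_pixel([0, 0, 0]))        # darkest possible pixel
    print(normalize_pixel([255, 255, 255]))  # brightest possible pixel
    print(config_matches_imagenet())
```

If `infer_test.py` normalizes the image differently (for example dividing by 255 first, or using BGR order), the compiled model would see inputs outside the range it was calibrated on.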

Output:

root@b7f0ab54a02d:~/ti2/edgeai-tidl-tools/examples/osrt_python/ort# python3 onnxrt_ep.py -c -m ss-ort-nvidia-b0
Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['ss-ort-nvidia-b0']


Running_Model :  ss-ort-nvidia-b0  


Running shape inference on model /root/ti2/edgeai-tidl-tools/kenny/test_model/optimized_model.onnx 

========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              11_00_06_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              11_00_00_00             |
-------------------------------------------------------------------------------
|            Runtime Version           |                1.15.0                |
-------------------------------------------------------------------------------
|          Model Opset Version         |                  17                  |
-------------------------------------------------------------------------------

============================== [Parsing Started] ==============================

[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import]  WARNING: Resize layer - /decode_head/Resize_3 with scales > 4 is not optimal

------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
|          Core           |      No. of Nodes       |   Number of Subgraphs   |
-------------------------------------------------------------------------------
| C7x                     |                     382 |                       3 |
| CPU                     |                      12 |                       x |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   Node    |                        Node Name                        |                                    Reason                                     |
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Split     | _token_0                                                | Layer 295 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose                                               | Layer 296 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_1                                             | Layer 299 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_3                                       | Layer 303 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.0/attention/self/Transpose_2 | Layer 300 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Split     | _token_7                                                | Layer 331 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose_token_6                                       | Layer 332 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_1                                             | Layer 335 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_10                                      | Layer 339 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.1/attention/self/Transpose_2 | Layer 336 - op type Transpose, Unknown input dimension, not supported by TIDL |
-------------------------------------------------------------------------------------------------------------------------------------------------------
============================= [Parsing Completed] =============================

==================== [Optimization for subgraph_0 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                       12 |
| TIDL_TransposeLayer    |                        61 |                       69 |
| TIDL_ConstDataLayer    |                         0 |                      140 |
| TIDL_LayerNormLayer    |                        26 |                       26 |
| TIDL_ConvolutionLayer  |                        16 |                       12 |
| TIDL_InnerProductLayer |                        52 |                       56 |
| TIDL_EltWiseLayer      |                        82 |                      108 |
| TIDL_PoolingLayer      |                         2 |                        2 |
| TIDL_ResizeLayer       |                         3 |                        2 |
| TIDL_SoftMaxLayer      |                         6 |                        6 |
| TIDL_ErfLayer          |                         6 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 507

=================== [Optimization for subgraph_0 Completed] ===================

The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
 0.0s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 0.6s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 0.7s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 0.94s:  VX_ZONE_INFO: [ownAddTargetKernelInternal:189] registered kernel vx_tutorial_graph.phase_rgb on target DSP_C7-2
 0.940s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 0.968s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 0.990s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 0.1003s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 0.1022s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1 
 0.1043s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_2 
 0.1058s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_3 
 0.1071s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_4 
 0.1089s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_5 
 0.1104s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_6 
 0.1115s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_7 
 0.1129s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_8 
 0.1143s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2 
 0.1157s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_2 
 0.1169s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_3 
 0.1229s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_4 
 0.1244s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_5 
 0.1259s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_6 
 0.1276s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_7 
 0.1290s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_8 
 0.1309s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3 
 0.1324s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_2 
 0.1341s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_3 
 0.1354s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_4 
 0.1375s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_5 
 0.1390s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_6 
 0.1406s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_7 
 0.1420s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_8 
 0.1440s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4 
 0.1455s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_2 
 0.1466s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_3 
 0.1479s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_4 
 0.1491s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_5 
 0.1504s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_6 
 0.1518s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_7 
 0.1532s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_8 
 0.1547s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-0 
 0.1560s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_NF 
 0.1572s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_LDC1 
 0.1584s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC1 
 0.1595s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC2 
 0.1610s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_VISS1 
 0.1622s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE1 
 0.1637s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE2 
 0.1651s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE3 
 0.1663s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE4 
 0.1676s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE5 
 0.1691s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE6 
 0.1703s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE7 
 0.1716s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE8 
 0.1729s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE9 
 0.1741s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE10 
 0.1752s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE11 
 0.1765s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE12 
 0.1781s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY1 
 0.1794s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY2 
 0.1805s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX 
 0.1820s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX2 
 0.1833s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M1 
 0.1845s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M2 
 0.1857s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M3 
 0.1868s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M4 
 0.1880s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC1_FC 
 0.1897s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-1 
 0.1911s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_SDE 
 0.1922s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_DOF 
 0.1938s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-0 
 0.1953s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-1 
 0.1967s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-0 
 0.1984s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_NF 
 0.1995s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_LDC1 
 0.2006s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC1 
 0.2021s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC2 
 0.2032s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_VISS1 
 0.2044s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_FC 
 0.2061s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-1 
 0.2062s:  VX_ZONE_INFO: [tivxInit:152] Initialization Done !!!
 0.2065s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
============= [Quantization & Calibration for subgraph_0 Started] =============

==================== [Optimization for subgraph_1 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_ErfLayer          |                         1 |                        0 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                       12 |
| TIDL_ConvolutionLayer  |                         1 |                        1 |
| TIDL_TransposeLayer    |                         3 |                        4 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_EltWiseLayer      |                        11 |                       10 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 54

=================== [Optimization for subgraph_1 Completed] ===================

============= [Quantization & Calibration for subgraph_1 Started] =============

==================== [Optimization for subgraph_2 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_EltWiseLayer      |                        11 |                       10 |
| TIDL_ConstDataLayer    |                         0 |                       12 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_TransposeLayer    |                         4 |                        5 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_ConvolutionLayer  |                         3 |                        3 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_ResizeLayer       |                         1 |                        2 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConcatLayer       |                         1 |                        1 |
| TIDL_ReLULayer         |                         1 |                        0 |
| TIDL_ErfLayer          |                         1 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 64

=================== [Optimization for subgraph_2 Completed] ===================

============= [Quantization & Calibration for subgraph_2 Started] =============


-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




 
Completed_Model :     1, Name : ss-ort-nvidia-b0                                  , Total time :  117987.66, Offload Time :   34027.03 , DDR RW MBs : 0, Output Image File : py_out_ss-ort-nvidia-b0_ADE_val_00001801.jpg, Output Bin File : py_out_ss-ort-nvidia-b0_ADE_val_00001801.bin
 
 
MEM: Deinit ... !!!
MEM: Alloc's: 92 alloc's of 1110078843 bytes 
MEM: Free's : 92 free's  of 1110078843 bytes 
MEM: Open's : 0 allocs  of 0 bytes 
MEM: Deinit ... Done !!!

Then I ran inference again.
Accuracy is low on TIDL and CPU:

root@am69-sk:/zxseg# python3 infer_test.py 
libtidl_onnxrt_EP loaded 0x148ef060 
Final number of subgraphs created are : 3, - Offloaded Nodes - 382, Total Nodes - 394 
APP: Init ... !!!
 15821.757931 s: MEM: Init ... !!!
 15821.758271 s: MEM: Initialized DMA HEAP (fd=5) !!!
 15821.758576 s: MEM: Init ... Done !!!
 15821.758765 s: IPC: Init ... !!!
 15822.126204 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
 15822.292806 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
 15822.293302 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 15822.293456 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 15822.293587 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 15822.294372 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 15822.294830 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 15822.295133 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 15822.295435 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 15822.295605 s:  VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
 15822.295640 s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -44.924595 -2.3853767
Output dtype: float32
output [array([[[[ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -3.1805022,  -2.7829394,  -3.1805022, ...,  -3.1805022,
           -3.1805022,  -2.7829394],
         ...,
         [ -3.578065 ,  -3.578065 ,  -3.578065 , ...,  -3.1805022,
           -3.9756277,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.578065 ]],

        [[-13.517135 , -13.517135 , -13.517135 , ..., -15.504948 ,
          -15.107386 , -15.107386 ],
         [-13.914697 , -13.517135 , -13.914697 , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         [-14.709823 , -14.31226  , -14.31226  , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         ...,
         [-15.107386 , -15.504948 , -14.709823 , ..., -11.926883 ,
          -13.119572 , -13.119572 ],
         [-15.107386 , -15.504948 , -14.31226  , ..., -12.324446 ,
          -12.722009 , -13.119572 ],
         [-14.709823 , -15.504948 , -14.31226  , ..., -13.119572 ,
          -12.722009 , -13.119572 ]],

        [[-11.926883 , -11.529321 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.119572 ],
         [-12.324446 , -11.926883 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.914697 ],
         [-13.517135 , -13.119572 , -12.722009 , ..., -14.709823 ,
          -14.31226  , -13.517135 ],
         ...,
         [-15.504948 , -16.697636 , -15.504948 , ..., -11.529321 ,
          -12.324446 , -13.119572 ],
         [-15.504948 , -17.0952   , -14.31226  , ..., -11.926883 ,
          -11.529321 , -13.119572 ],
         [-15.107386 , -15.107386 , -14.31226  , ..., -12.722009 ,
          -11.529321 , -13.119572 ]],

        ...,

        [[-21.070827 , -20.275702 , -19.87814  , ..., -22.263515 ,
          -22.263515 , -22.263515 ],
         [-21.46839  , -21.070827 , -20.275702 , ..., -23.058641 ,
          -22.661077 , -23.058641 ],
         [-22.263515 , -21.46839  , -20.673264 , ..., -23.456203 ,
          -22.661077 , -23.058641 ],
         ...,
         [-25.444017 , -26.239143 , -25.046455 , ..., -19.083014 ,
          -21.46839  , -21.46839  ],
         [-25.84158  , -25.84158  , -23.456203 , ..., -20.673264 ,
          -21.46839  , -22.263515 ],
         [-25.444017 , -27.03427  , -24.648891 , ..., -21.46839  ,
          -21.46839  , -21.865952 ]],

        [[-19.083014 , -18.287888 , -18.287888 , ..., -20.275702 ,
          -19.87814  , -19.87814  ],
         [-19.87814  , -19.083014 , -19.083014 , ..., -20.673264 ,
          -20.275702 , -20.673264 ],
         [-21.070827 , -19.480576 , -19.87814  , ..., -20.673264 ,
          -20.275702 , -20.275702 ],
         ...,
         [-24.648891 , -25.84158  , -23.456203 , ..., -18.68545  ,
          -21.46839  , -21.865952 ],
         [-24.648891 , -25.444017 , -22.263515 , ..., -20.275702 ,
          -21.070827 , -22.661077 ],
         [-23.456203 , -25.046455 , -23.058641 , ..., -21.865952 ,
          -21.46839  , -22.263515 ]],

        [[ -7.15613  ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -7.553693 ,  -7.553693 ],
         [ -7.553693 ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -6.7585673,  -7.9512553],
         [ -8.348818 ,  -7.553693 ,  -7.15613  , ...,  -7.9512553,
           -7.15613  ,  -7.9512553],
         ...,
         [-20.275702 , -21.070827 , -19.083014 , ..., -15.107386 ,
          -17.0952   , -17.492762 ],
         [-20.275702 , -20.673264 , -18.68545  , ..., -15.902511 ,
          -16.697636 , -17.492762 ],
         [-19.083014 , -20.673264 , -19.480576 , ..., -16.697636 ,
          -17.0952   , -17.492762 ]]]], dtype=float32)]
Mask unique values: [ 0 18]
APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
 15825.212523 s: IPC: Deinit ... !!!
 15825.753588 s: IPC: DeInit ... Done !!!
 15825.753657 s: MEM: Deinit ... !!!
 15825.758545 s: DDR_SHARED_MEM: Alloc's: 35 alloc's of 132751508 bytes 
 15825.758604 s: DDR_SHARED_MEM: Free's : 35 free's  of 132751508 bytes 
 15825.758628 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes 
 15825.758666 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!
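
One thing visible in the TIDL output above: the printed logits appear to fall on a uniform grid with a step of roughly 0.3976 (for example -2.7829394, -3.1805022, -3.578065), which is consistent with a quantized output tensor. The sketch below checks that on a few values copied from the logs. The helper names `estimate_step` and `on_grid` are hypothetical, and the samples are only a handful of printed values, not the full tensors.

```python
def estimate_step(values, eps=1e-4):
    """Rough grid-step estimate: smallest gap between distinct sorted values."""
    uniq = sorted(set(values))
    gaps = [b - a for a, b in zip(uniq, uniq[1:]) if b - a > eps]
    return min(gaps)


def on_grid(values, step, tol=1e-2):
    """True if every value is close to an integer multiple of step."""
    return all(abs(v / step - round(v / step)) < tol for v in values)


# A few logits copied from the TIDL run above.
TIDL_SAMPLE = [
    -2.7829394, -3.1805022, -3.578065, -3.9756277,
    -13.517135, -13.914697, -14.31226, -14.709823,
    -15.107386, -15.504948,
]

# A few logits copied from the CPUExecutionProvider run in this post.
CPU_SAMPLE = [
    -9.10738182, -8.46180344, -8.89690208,
    -7.30843163, -7.19831467, -7.42499590,
]

if __name__ == "__main__":
    tidl_step = estimate_step(TIDL_SAMPLE)
    cpu_step = estimate_step(CPU_SAMPLE)
    print("TIDL step estimate:", tidl_step, "on grid:", on_grid(TIDL_SAMPLE, tidl_step))
    print("CPU step estimate:", cpu_step, "on grid:", on_grid(CPU_SAMPLE, cpu_step))
```

With a step that coarse, neighbouring classes whose logits differ by less than about 0.4 can collapse to the same value before argmax, which may contribute to the mask containing only classes 0 and 18.
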
root@am69-sk:/zxseg# python3 infer_test.py 
EP Error 'providers' and 'provider_options' should be the same length if both are given. when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Loaded model with providers: ['CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -37.647076 1.502726
Output dtype: float32
output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
          -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
         [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
          -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
         [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
          -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
         ...,
         [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
          -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
         [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
          -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
         [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
          -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],

        [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
           3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
         [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
           2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
         [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
          -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
         ...,
         [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
          -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
         [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
          -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
         [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
          -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],

        [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
          -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
         [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
          -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
         [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
          -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
         ...,
         [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
          -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
         [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
          -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
         [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
          -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],

        ...,

        [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
          -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
         [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
          -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
         [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
          -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
         ...,
         [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
          -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
         [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
          -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
         [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
          -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],

        [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
          -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
         [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
          -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
         [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
          -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
         ...,
         [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
          -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
         [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
          -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
         [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
          -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],

        [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
          -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
         [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
          -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
         [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
          -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
         ...,
         [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
          -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
         [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
          -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
         [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
          -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
      dtype=float32)]
Mask unique values: [  1   6  11  12  20  87 149]
root@am69-sk:/zxseg#
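
One way to put a number on the accuracy drop is to compare the argmax mask from this CPU run against the mask produced with the TIDL provider. The sketch below is only an illustration: `compare_masks` is a hypothetical helper, not part of any TI tool, and it assumes both masks are integer arrays of the same shape (for example the 128x128 argmax of the output printed above).

```python
import numpy as np

def compare_masks(ref_mask, test_mask, num_classes=150):
    """Return (pixel agreement, mean IoU over classes present in either mask)."""
    ref_mask = np.asarray(ref_mask)
    test_mask = np.asarray(test_mask)
    if ref_mask.shape != test_mask.shape:
        raise ValueError("masks must have the same shape")

    # Fraction of pixels where both masks predict the same class
    agreement = float((ref_mask == test_mask).mean())

    ious = []
    for c in range(num_classes):
        ref_c = ref_mask == c
        test_c = test_mask == c
        union = np.logical_or(ref_c, test_c).sum()
        if union == 0:
            continue  # class absent in both masks, skip it
        inter = np.logical_and(ref_c, test_c).sum()
        ious.append(inter / union)

    mean_iou = float(np.mean(ious)) if ious else 1.0
    return agreement, mean_iou
```

A low agreement between the CPU mask and the TIDL mask for the same image would point at compilation or quantization rather than at pre- or post-processing.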

INFER SCRIPT:

import onnxruntime as ort
import numpy as np
import cv2

model = "/zxseg/ss-ort-nvidia-b0/model/optimized_model.onnx"
image_path = "/zxseg/DAT/people-walking-through-business-district-in-the-city-at-sunset_bbxoqvaod_thumbnail-1080_09.png"
artifacts_dir = "/zxseg/ss-ort-nvidia-b0/artifacts"
# artifacts_dir = "/zxseg/artifacts2"

# Set up TIDL provider options
so = ort.SessionOptions()
runtime_options = {
    "artifacts_folder": artifacts_dir,
}
# providers = ['TIDLExecutionProvider', 'CPUExecutionProvider']
providers = ['CPUExecutionProvider']
provider_options = [runtime_options, {}]


ort_session = ort.InferenceSession(
    model,
    providers=providers,
    provider_options=provider_options,
    sess_options=so
)
print(f"Loaded model with providers: {ort_session.get_providers()}")

original_image = cv2.imread(image_path)
original_h, original_w = original_image.shape[:2]

image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (512, 512))
image = image.astype(np.float32)
mean = np.array([123.675, 116.28, 103.53], dtype=np.float32).reshape(3, 1, 1)
scale = np.array([0.017125, 0.017507, 0.017429], dtype=np.float32).reshape(3, 1, 1)
image = np.transpose(image, (2, 0, 1))  # (3, 512, 512)
image = (image - mean) * scale
image = np.expand_dims(image, axis=0)

inputs = {ort_session.get_inputs()[0].name: image}
outputs = ort_session.run(None, inputs)

print("Output shape:", outputs[0].shape)
print("Output min/max:", outputs[0].min(), outputs[0].max())
print("Output dtype:", outputs[0].dtype)
print("output",outputs)

# The output is [1, C, H, W]: drop the batch dimension, then argmax over the class axis
output = outputs[0][0]  # shape: [num_classes, H, W]
mask = np.argmax(output, axis=0)  # shape: [H, W]

print("Mask unique values:", np.unique(mask))

# Use mask for palette and overlay
num_classes = 150  # ADE20K has 150 classes

def get_palette(num_classes):
    palette = np.random.randint(0, 255, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = [0, 0, 0]  # background as black
    return palette

palette = get_palette(num_classes)

# Resize mask to original image size
mask_resized = cv2.resize(mask.astype(np.uint8), (original_w, original_h), interpolation=cv2.INTER_NEAREST)

# Map each class index to its color
output_color = palette[mask_resized]  # shape: (H, W, 3)

# Overlay the mask on the original image
overlay = cv2.addWeighted(original_image, 0.6, output_color, 0.4, 0)

cv2.imwrite("segmentation_overlay.png", overlay)
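
The `EP Error 'providers' and 'provider_options' should be the same length` line in the log above is consistent with this script passing one provider together with two option dicts, after which ONNX Runtime retried on the CPU only. A minimal sketch of keeping the two lists paired follows; `build_providers` is a hypothetical helper, and the only TIDL option it sets is the `artifacts_folder` key already used in the script.

```python
def build_providers(artifacts_dir, use_tidl=True):
    """Return (providers, provider_options) as two lists of equal length."""
    pairs = []
    if use_tidl:
        # Option dict for the TIDL provider; only the key from the script above is set
        pairs.append(("TIDLExecutionProvider", {"artifacts_folder": artifacts_dir}))
    # The CPU provider takes no options, but still needs its own (empty) dict
    pairs.append(("CPUExecutionProvider", {}))

    providers = [name for name, _ in pairs]
    provider_options = [opts for _, opts in pairs]
    return providers, provider_options


providers, provider_options = build_providers("/zxseg/ss-ort-nvidia-b0/artifacts")
print(providers)
```

Both lists can then be passed to `ort.InferenceSession(model, providers=providers, provider_options=provider_options, sess_options=so)`.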

method two :

Advanced (custom) way.

Still the same error.

I wrote this script using the YAML and configs in the model zoo.

import onnxruntime as ort
import cv2
import numpy as np
import os

# Segmentation model configuration based on config.yaml
model_path = '/root/ti2/edgeai-tidl-tools/ken_seg/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx'
calibration_images_path = '/root/ti2/edgeai-tidl-tools/ken_seg/DAT'  # ADE20k validation images
out_dir_path = '/root/ti2/edgeai-tidl-tools/ken_seg/artifacts2'


def preprocess(image_path):
    """Preprocess image for segmentation model based on config.yaml preprocessing settings"""
    img = cv2.imread(image_path)
    if img is None:
        raise RuntimeError(f'Failed to read image: {image_path}')
    img = cv2.resize(img, (512, 512))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32)
    # Apply normalization from config.yaml
    # input_mean: [123.675, 116.28, 103.53]
    # input_scale: [0.017125, 0.017507, 0.017429] (which is 1/std where std=[58.395, 57.12, 57.375])
    mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)
    scale = np.array([0.017125, 0.017507, 0.017429], dtype=np.float32)
    img = (img - mean) * scale
    img = img.astype(np.float32)
    img = np.transpose(img, (2, 0, 1))
    # Add batch dimension
    img = np.expand_dims(img, axis=0)
    return img


all_calib_images = [os.path.join(calibration_images_path, name) for name in os.listdir(calibration_images_path)
                    if name.lower().endswith(('.png', '.jpg', '.jpeg'))]
if len(all_calib_images) == 0:
    raise RuntimeError(f'No calibration images found in {calibration_images_path}')

# Limit calibration images to match config (calibration_frames: 12)
max_calib_frames = 12
all_calib_images = all_calib_images[:max_calib_frames]

tidl_tools_path = os.environ.get('TIDL_TOOLS_PATH')
if not tidl_tools_path:
    raise EnvironmentError('TIDL_TOOLS_PATH environment variable not set.')

# Create the artifacts folder and empty it if it already has content
os.makedirs(out_dir_path, exist_ok=True)
for root, dirs, files in os.walk(out_dir_path, topdown=False):
    for f in files:
        os.remove(os.path.join(root, f))
    for d in dirs:
        os.rmdir(os.path.join(root, d))

compile_options = {
    # Core settings from runtime_options in config.yaml
    'tidl_tools_path': tidl_tools_path,
    'artifacts_folder': out_dir_path,
    'platform': 'J7',
    'version': '10.1',
    'import': 'yes',
    'tensor_bits': 8,
    'accuracy_level': 1,
    'debug_level': 0,
    'inference_mode': 0,
    'advanced_options:calibration_frames': len(all_calib_images),
    'advanced_options:calibration_iterations': 12,
    'advanced_options:quantization_scale_type': 4,
    'advanced_options:activation_clipping': 1,
    'advanced_options:weight_clipping': 1,
    'advanced_options:bias_calibration': 1,
    'advanced_options:high_resolution_optimization': 0,
    'advanced_options:pre_batchnorm_fold': 1,
    'advanced_options:output_feature_16bit_names_list': '',
    'advanced_options:params_16bit_names_list': '',
    'advanced_options:add_data_convert_ops': 3,
    'ti_internal_nc_flag': 83886080,
    'advanced_options:max_num_subgraph_nodes': 2048,
}

so = ort.SessionOptions()
providers = ['TIDLCompilationProvider', 'CPUExecutionProvider']
session = ort.InferenceSession(
    model_path,
    providers=providers,
    provider_options=[compile_options, {}],
    sess_options=so,  # InferenceSession names this keyword sess_options
)

# Run every calibration image through the session to drive compilation
input_name = session.get_inputs()[0].name
for img_path in all_calib_images:
    img = preprocess(img_path)
    session.run(None, {input_name: img})

print("DONE: Segmentation model compiled successfully with TIDL Compilation Provider.")
print(f"Artifacts saved to: {out_dir_path}")
print(f"Model input shape: {session.get_inputs()[0].shape}")
print(f"Model output shape: {session.get_outputs()[0].shape}")
print(f"Used {len(all_calib_images)} calibration images")
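
Because `os.listdir` returns entries in arbitrary order, the calibration frames picked by the script above can differ from one machine or run to the next, which makes accuracy comparisons harder to reproduce. A minimal sketch of a reproducible selection, assuming a hypothetical helper `pick_calibration_images` and a fixed seed (neither comes from the TI tooling):

```python
import os
import random

def pick_calibration_images(folder, max_frames=12, seed=0):
    """Return up to max_frames image paths from folder, chosen reproducibly."""
    exts = ('.png', '.jpg', '.jpeg')
    # Sort first so the candidate list does not depend on filesystem order
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(exts))
    if len(names) > max_frames:
        # A seeded sample spreads the frames over the folder instead of taking a prefix
        rng = random.Random(seed)
        names = sorted(rng.sample(names, max_frames))
    return [os.path.join(folder, n) for n in names]
```

The returned list could replace `all_calib_images` in the compile script, keeping `calibration_frames` equal to its length.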

python3 compile_seg_b0.py
========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              11_00_06_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              11_00_00_00             |
-------------------------------------------------------------------------------
|            Runtime Version           |                1.15.0                |
-------------------------------------------------------------------------------
|          Model Opset Version         |                  17                  |
-------------------------------------------------------------------------------

============================== [Parsing Started] ==============================

[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import]  WARNING: Resize layer - /decode_head/Resize_3 with scales > 4 is not optimal

------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
|          Core           |      No. of Nodes       |   Number of Subgraphs   |
-------------------------------------------------------------------------------
| C7x                     |                     383 |                       3 |
| CPU                     |                      12 |                       x |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   Node    |                        Node Name                        |                                    Reason                                     |
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Split     | _token_0                                                | Layer 295 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose                                               | Layer 296 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_1                                             | Layer 299 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_3                                       | Layer 303 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.0/attention/self/Transpose_2 | Layer 300 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Split     | _token_7                                                | Layer 331 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose_token_6                                       | Layer 332 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_1                                             | Layer 335 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_10                                      | Layer 339 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.1/attention/self/Transpose_2 | Layer 336 - op type Transpose, Unknown input dimension, not supported by TIDL |
-------------------------------------------------------------------------------------------------------------------------------------------------------
============================= [Parsing Completed] =============================

==================== [Optimization for subgraph_0 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                       12 |
| TIDL_TransposeLayer    |                        61 |                       69 |
| TIDL_ConstDataLayer    |                         0 |                       67 |
| TIDL_LayerNormLayer    |                        26 |                       26 |
| TIDL_ConvolutionLayer  |                        16 |                       12 |
| TIDL_InnerProductLayer |                        52 |                       75 |
| TIDL_EltWiseLayer      |                        82 |                       16 |
| TIDL_PoolingLayer      |                         2 |                        2 |
| TIDL_ResizeLayer       |                         3 |                        2 |
| TIDL_SoftMaxLayer      |                         6 |                        6 |
| TIDL_ErfLayer          |                         6 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 361

=================== [Optimization for subgraph_0 Completed] ===================

The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
 0.0s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 0.14s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 0.38s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 0.139s:  VX_ZONE_INFO: [ownAddTargetKernelInternal:189] registered kernel vx_tutorial_graph.phase_rgb on target DSP_C7-2
 0.1023s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 0.1042s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 0.1067s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 0.1106s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 0.1124s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1 
 0.1139s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_2 
 0.1153s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_3 
 0.1166s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_4 
 0.1183s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_5 
 0.1198s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_6 
 0.1212s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_7 
 0.1225s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_8 
 0.1237s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2 
 0.1251s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_2 
 0.1262s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_3 
 0.1274s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_4 
 0.1288s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_5 
 0.1299s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_6 
 0.1312s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_7 
 0.1325s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_8 
 0.1341s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3 
 0.1356s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_2 
 0.1368s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_3 
 0.1379s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_4 
 0.1395s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_5 
 0.1407s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_6 
 0.1419s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_7 
 0.1432s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_8 
 0.1448s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4 
 0.1461s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_2 
 0.1472s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_3 
 0.1484s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_4 
 0.1496s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_5 
 0.1509s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_6 
 0.1521s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_7 
 0.1534s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_8 
 0.1550s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-0 
 0.1563s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_NF 
 0.1574s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_LDC1 
 0.1585s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC1 
 0.1596s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC2 
 0.1610s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_VISS1 
 0.1622s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE1 
 0.1635s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE2 
 0.1649s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE3 
 0.1661s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE4 
 0.1672s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE5 
 0.1686s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE6 
 0.1699s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE7 
 0.1712s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE8 
 0.1724s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE9 
 0.1737s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE10 
 0.1748s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE11 
 0.1759s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE12 
 0.1771s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY1 
 0.1784s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY2 
 0.1797s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX 
 0.1812s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX2 
 0.1829s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M1 
 0.1843s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M2 
 0.1857s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M3 
 0.1869s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M4 
 0.1880s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC1_FC 
 0.1896s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-1 
 0.1910s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_SDE 
 0.1922s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_DOF 
 0.1938s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-0 
 0.1951s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-1 
 0.1965s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-0 
 0.1977s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_NF 
 0.1989s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_LDC1 
 0.2001s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC1 
 0.2013s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC2 
 0.2026s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_VISS1 
 0.2039s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_FC 
 0.2053s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-1 
 0.2055s:  VX_ZONE_INFO: [tivxInit:152] Initialization Done !!!
 0.2058s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
============= [Quantization & Calibration for subgraph_0 Started] =============

==================== [Optimization for subgraph_1 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_ErfLayer          |                         1 |                        0 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                        4 |
| TIDL_ConvolutionLayer  |                         1 |                        1 |
| TIDL_TransposeLayer    |                         3 |                        4 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_EltWiseLayer      |                        11 |                        2 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 38

=================== [Optimization for subgraph_1 Completed] ===================

============= [Quantization & Calibration for subgraph_1 Started] =============

==================== [Optimization for subgraph_2 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_EltWiseLayer      |                        11 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                        5 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_TransposeLayer    |                         4 |                        5 |
| TIDL_InnerProductLayer |                         6 |                        7 |
| TIDL_ConvolutionLayer  |                         3 |                        3 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_ResizeLayer       |                         1 |                        2 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConcatLayer       |                         1 |                        1 |
| TIDL_ReLULayer         |                         1 |                        0 |
| TIDL_ErfLayer          |                         1 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 50

=================== [Optimization for subgraph_2 Completed] ===================

============= [Quantization & Calibration for subgraph_2 Started] =============



-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================



DONE: Segmentation model compiled successfully with TIDL Compilation Provider.
Artifacts saved to: /root/ti2/edgeai-tidl-tools/ken_seg/artifacts2
Model input shape: [1, 3, 512, 512]
Model output shape: [1, 150, 128, 128]
Used 12 calibration images
MEM: Deinit ... !!!
MEM: Alloc's: 92 alloc's of 1501308027 bytes 
MEM: Free's : 92 free's  of 1501308027 bytes 
MEM: Open's : 0 allocs  of 0 bytes 
MEM: Deinit ... Done !!!

same issue
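
One way to tell a quantization problem from a pipeline problem is to recompile with 16-bit tensors and compare the masks against the 8-bit artifacts. The sketch below only builds the options dictionaries. The key names are assumptions based on the edgeai-tidl-tools OSRT documentation and should be checked against the installed version; the `artifacts2_16bit` folder name is hypothetical.

```python
import os

# Assumption: option names as documented for edgeai-tidl-tools OSRT compilation;
# verify them against the tools version you have installed.
def make_compile_options(artifacts_folder, tensor_bits=8, calib_frames=12, calib_iters=12):
    """Return a provider-options dict for the TIDL compilation provider."""
    return {
        "tidl_tools_path": os.environ.get("TIDL_TOOLS_PATH", ""),
        "artifacts_folder": artifacts_folder,
        "tensor_bits": tensor_bits,
        "advanced_options:calibration_frames": calib_frames,
        "advanced_options:calibration_iterations": calib_iters,
    }

# 8-bit settings matching the log above (12 calibration images, 12 iterations).
opts_8bit = make_compile_options("/root/ti2/edgeai-tidl-tools/ken_seg/artifacts2")

# Hypothetical 16-bit build written to a separate folder for comparison.
opts_16bit = make_compile_options(
    "/root/ti2/edgeai-tidl-tools/ken_seg/artifacts2_16bit", tensor_bits=16
)
```

If the 16-bit artifacts give masks close to the CPU result, the 8-bit calibration is the likely cause; if not, preprocessing or the inference script deserves a closer look.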

root@am69-sk:/zxseg# python3 infer_test.py 
libtidl_onnxrt_EP loaded 0x3d72ca30 
Final number of subgraphs created are : 3, - Offloaded Nodes - 383, Total Nodes - 395 
APP: Init ... !!!
 18178.158420 s: MEM: Init ... !!!
 18178.158773 s: MEM: Initialized DMA HEAP (fd=5) !!!
 18178.159101 s: MEM: Init ... Done !!!
 18178.159313 s: IPC: Init ... !!!
 18178.392594 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
 18178.442924 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
 18178.443344 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 18178.443370 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 18178.443389 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 18178.444054 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 18178.444602 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 18178.445087 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 18178.445488 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 18178.445680 s:  VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
 18178.445721 s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -40.24002 -2.2006261
Output dtype: float32
output [array([[[[ -3.4581268,  -3.4581268,  -3.4581268, ...,  -2.8293765,
           -2.8293765,  -2.8293765],
         [ -3.4581268,  -3.4581268,  -3.4581268, ...,  -3.1437516,
           -2.8293765,  -2.8293765],
         [ -3.4581268,  -3.4581268,  -3.772502 , ...,  -3.1437516,
           -2.8293765,  -2.8293765],
         ...,
         [ -3.1437516,  -3.4581268,  -3.4581268, ...,  -2.8293765,
           -3.4581268,  -3.772502 ],
         [ -3.1437516,  -3.4581268,  -2.8293765, ...,  -3.1437516,
           -3.1437516,  -3.4581268],
         [ -3.1437516,  -3.1437516,  -3.1437516, ...,  -3.4581268,
           -3.1437516,  -3.4581268]],

        [[-11.946257 , -11.631881 , -11.631881 , ..., -14.775633 ,
          -14.146882 , -14.146882 ],
         [-12.5750065, -11.946257 , -12.260632 , ..., -15.090008 ,
          -14.461258 , -14.775633 ],
         [-12.889381 , -12.260632 , -12.5750065, ..., -15.404383 ,
          -14.775633 , -15.090008 ],
         ...,
         [-16.033133 , -16.661884 , -16.033133 , ..., -13.518132 ,
          -14.775633 , -15.090008 ],
         [-15.718758 , -16.347507 , -15.090008 , ..., -14.146882 ,
          -14.146882 , -15.090008 ],
         [-15.718758 , -16.033133 , -15.718758 , ..., -14.775633 ,
          -14.461258 , -15.090008 ]],

        [[-11.317506 , -11.003131 , -10.688755 , ..., -14.146882 ,
          -13.203756 , -12.260632 ],
         [-11.631881 , -11.317506 , -11.003131 , ..., -13.832507 ,
          -13.203756 , -12.889381 ],
         [-12.5750065, -12.260632 , -11.631881 , ..., -14.461258 ,
          -13.832507 , -13.203756 ],
         ...,
         [-14.461258 , -15.404383 , -15.090008 , ..., -11.003131 ,
          -11.946257 , -12.5750065],
         [-14.775633 , -16.033133 , -13.203756 , ..., -11.631881 ,
          -11.003131 , -12.889381 ],
         [-14.461258 , -14.461258 , -12.889381 , ..., -12.5750065,
          -11.317506 , -13.518132 ]],

        ...,

        [[-19.805635 , -18.86251  , -18.548134 , ..., -19.176886 ,
          -18.548134 , -18.86251  ],
         [-20.12001  , -19.49126  , -18.86251  , ..., -19.805635 ,
          -19.176886 , -20.12001  ],
         [-20.434385 , -19.805635 , -19.49126  , ..., -20.74876  ,
          -19.49126  , -20.434385 ],
         ...,
         [-24.206888 , -24.835638 , -23.892513 , ..., -16.347507 ,
          -19.176886 , -19.49126  ],
         [-23.578136 , -24.521263 , -22.006262 , ..., -17.60501  ,
          -18.548134 , -19.805635 ],
         [-23.263762 , -25.150013 , -23.263762 , ..., -19.176886 ,
          -18.548134 , -19.49126  ]],

        [[-20.12001  , -18.86251  , -19.176886 , ..., -21.37751  ,
          -20.74876  , -20.74876  ],
         [-20.74876  , -19.805635 , -19.805635 , ..., -21.691887 ,
          -20.74876  , -21.691887 ],
         [-22.006262 , -20.434385 , -20.74876  , ..., -22.320637 ,
          -21.063135 , -22.006262 ],
         ...,
         [-23.892513 , -25.464388 , -23.263762 , ..., -16.97626  ,
          -19.805635 , -20.434385 ],
         [-23.263762 , -24.206888 , -21.37751  , ..., -18.233759 ,
          -19.176886 , -20.434385 ],
         [-22.320637 , -24.206888 , -22.320637 , ..., -20.12001  ,
          -19.49126  , -21.063135 ]],

        [[-10.37438  ,  -9.431255 ,  -9.116879 , ..., -10.688755 ,
          -10.688755 , -10.688755 ],
         [-10.688755 ,  -9.74563  ,  -9.74563  , ..., -11.003131 ,
          -10.060005 , -11.317506 ],
         [-11.003131 , -10.060005 , -10.060005 , ..., -11.631881 ,
          -10.37438  , -11.631881 ],
         ...,
         [-22.949387 , -24.206888 , -22.320637 , ..., -15.404383 ,
          -17.919384 , -18.548134 ],
         [-22.949387 , -23.578136 , -21.063135 , ..., -16.347507 ,
          -17.290634 , -18.233759 ],
         [-22.320637 , -23.578136 , -22.006262 , ..., -17.919384 ,
          -17.290634 , -18.548134 ]]]], dtype=float32)]
Mask unique values: [ 0 38 95]
APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
 18181.244298 s: IPC: Deinit ... !!!
 18181.732321 s: IPC: DeInit ... Done !!!
 18181.732384 s: MEM: Deinit ... !!!
 18181.735611 s: DDR_SHARED_MEM: Alloc's: 35 alloc's of 132765800 bytes 
 18181.735656 s: DDR_SHARED_MEM: Free's : 35 free's  of 132765800 bytes 
 18181.735676 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes 
 18181.735702 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!
root@am69-sk:/zxseg# python3 infer_test.py 
EP Error 'providers' and 'provider_options' should be the same length if both are given. when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Loaded model with providers: ['CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -37.647076 1.502726
Output dtype: float32
output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
          -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
         [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
          -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
         [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
          -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
         ...,
         [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
          -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
         [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
          -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
         [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
          -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],

        [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
           3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
         [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
           2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
         [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
          -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
         ...,
         [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
          -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
         [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
          -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
         [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
          -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],

        [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
          -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
         [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
          -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
         [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
          -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
         ...,
         [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
          -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
         [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
          -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
         [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
          -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],

        ...,

        [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
          -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
         [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
          -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
         [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
          -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
         ...,
         [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
          -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
         [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
          -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
         [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
          -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],

        [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
          -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
         [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
          -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
         [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
          -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
         ...,
         [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
          -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
         [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
          -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
         [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
          -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],

        [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
          -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
         [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
          -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
         [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
          -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
         ...,
         [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
          -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
         [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
          -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
         [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
          -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
      dtype=float32)]
Mask unique values: [  1   6  11  12  20  87 149]
root@am69-sk:/zxseg#

please help,
regards Venkat
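
The "EP Error" in the CPU-only run above says that `providers` and `provider_options` had different lengths, so ONNX Runtime fell back to the CPU provider by itself. A minimal sketch of building the two lists together so they always match is shown below; `build_provider_args` is a hypothetical helper, and the `artifacts_folder` value is a placeholder.

```python
def build_provider_args(tidl_options=None):
    """Return (providers, provider_options) lists of equal length.

    tidl_options: dict of TIDL runtime options, or None to run on CPU only.
    """
    providers = []
    provider_options = []
    if tidl_options is not None:
        providers.append("TIDLExecutionProvider")
        provider_options.append(dict(tidl_options))
    # The CPU provider takes no options here, but still needs an (empty) entry.
    providers.append("CPUExecutionProvider")
    provider_options.append({})
    return providers, provider_options

cpu_providers, cpu_options = build_provider_args()
tidl_providers, tidl_opts = build_provider_args({"artifacts_folder": "artifacts2"})

# Usage (requires onnxruntime; model_path is a placeholder):
# import onnxruntime as ort
# sess = ort.InferenceSession(model_path, providers=cpu_providers,
#                             provider_options=cpu_options)
```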

2 months ago

0 Chris Tsongas 2 months ago

TI__Genius 15310 points

Hi Venkat,

I was able to compile the model and run it under emulation and on the device. I need more information about the accuracy drop: please share the expected and actual results so I can make a determination. Also, in general, we can only debug with the standard TIDL tools, not custom scripts. Occasionally we will give pointers if we see a glaring error, but we do not know enough about what you are trying to do to give accurate feedback.

I have included the compilation and inference files I used (emulation and device), along with the input data.

Compile under 11.00.06.00

./tidl_model_import.out import_segformer
========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
| TIDL Tools Version | 11_00_06_00 |
-------------------------------------------------------------------------------
| C7x Firmware Version | 11_00_00_00 |
-------------------------------------------------------------------------------

ONNX model (Proto) file : NPL/model/segformer_b0_finetuned_ade_512_512_simp.onnx
TIDL network file : out/tidl_net.bin
TIDL IO info file : out/tidl_io_buff
Current ONNX OpSet version : 17
============================ [Optimization started] ============================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
| Layer | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer | 0 | 16 |
| TIDL_SliceLayer | 2 | 6 |
| TIDL_TransposeLayer | 74 | 80 |
| TIDL_ReLULayer | 1 | 0 |
| TIDL_ConcatLayer | 1 | 1 |
| TIDL_LayerNormLayer | 30 | 30 |
| TIDL_EltWiseLayer | 104 | 128 |
| TIDL_ConvolutionLayer | 20 | 16 |
| TIDL_InnerProductLayer | 64 | 68 |
| TIDL_ErfLayer | 8 | 0 |
| TIDL_PoolingLayer | 2 | 2 |
| TIDL_ResizeLayer | 4 | 4 |
| TIDL_SoftMaxLayer | 8 | 8 |
| TIDL_ConstDataLayer | 0 | 164 |
| TIDL_SqueezeLayer | 6 | 0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 597

=========================== [Optimization completed] ===========================

-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================

------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================

------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================

Emulation Run:

root@e87020451ea4:/home/root/tools/AM69A/tidl_tools# ./PC_dsp_test_dl_algo.out s:infer_segformer

Processing config file #0 : infer_segformer
----------------------- TIDL Process with REF_ONLY FLOW------------------------

# 0 . .. T 15614.39 .... ..... ... .... .....

Device Run:

root@am69-sk:/opt/tidl_test# ./TI_DEVICE_armv8_test_dl_algo_host_rt.out s:infer_seg_dev

Processing config file #0 : infer_seg_dev
APP: Init ... !!!
9940.650785 s: MEM: Init ... !!!
9940.650842 s: MEM: Initialized DMA HEAP (fd=5) !!!
9940.650994 s: MEM: Init ... Done !!!
9940.651013 s: IPC: Init ... !!!
9940.684696 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
9940.692539 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
9940.692652 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
9940.692670 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
9940.692677 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
9940.693320 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0
9940.693469 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1
9940.693576 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2
9940.693681 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3
9940.693696 s: VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
9940.693707 s: VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO

# NETWORK_INIT_TIME = 1010.65 (in ms, c7x @1GHz)
----------------------- TIDL Process with TARGET DATA FLOW ------------------------

# NETWORK_EXECUTION_TIME = 36.34 (in ms, c7x @1GHz) with DDR_BANDWIDTH (Read + Write) = 112.39, 128.81, 241.20 (in Mega Bytes/frame) ... .... .....APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
9941.734243 s: IPC: Deinit ... !!!
9941.735204 s: IPC: DeInit ... Done !!!
9941.735235 s: MEM: Deinit ... !!!
9941.735248 s: DDR_SHARED_MEM: Alloc's: 7 alloc's of 10677976 bytes
9941.735260 s: DDR_SHARED_MEM: Free's : 7 free's of 10677976 bytes
9941.735270 s: DDR_SHARED_MEM: Open's : 0 allocs of 0 bytes
9941.735288 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!

https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/import_5F00_segformer
https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/infer_5F00_segformer
https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/infer_5F00_seg_5F00_dev

Regards,

Chris

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

Mask unique values on CPU: [ 1 6 11 12 20 87 149] (these are the classes found in the image; the model has 150 classes in total).
Mask unique values on TIDL: [ 0 18].
The compile ran without errors, but inference on TIDL does not produce the expected masks.

It runs on the device, but accuracy is low.
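
The list of unique classes hides how far apart the two runs really are. A small sketch of a fuller comparison between the CPU and TIDL logits is given below; it assumes both outputs have shape (1, C, H, W) as printed in the logs, and it uses synthetic arrays (smaller than the real 128x128 output) in place of real model data.

```python
import numpy as np

def compare_logits(ref, test):
    """Compare two (1, C, H, W) logit tensors via their argmax class masks."""
    ref_mask = np.argmax(ref[0], axis=0)
    test_mask = np.argmax(test[0], axis=0)
    return {
        "ref_classes": np.unique(ref_mask).tolist(),
        "test_classes": np.unique(test_mask).tolist(),
        "pixel_agreement": float(np.mean(ref_mask == test_mask)),
        "max_abs_diff": float(np.max(np.abs(ref - test))),
    }

# Synthetic stand-ins for the CPU and TIDL outputs (not real model data).
rng = np.random.default_rng(0)
cpu_out = rng.normal(size=(1, 150, 32, 32)).astype(np.float32)
tidl_out = cpu_out + rng.normal(scale=0.05, size=cpu_out.shape).astype(np.float32)

report = compare_logits(cpu_out, tidl_out)
print(report["pixel_agreement"], report["max_abs_diff"])
```

Posting the pixel agreement and the maximum absolute difference alongside the class lists gives a reviewer a number to reproduce.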

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

I followed what you suggested, but allowedNode.txt is not generated. How can I generate it?

root@61028b545efe:~/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools# ./tidl_model_import.out /root/ti3/edgeai-tidl-tools/kenny/import.cfg \
    --modelType 2 \
    --inputNetFile /root/ti3/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx \
    --outputNetFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_net.bin \
    --inputParamsFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff_template \
    --outputParamsFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff \
    --inDataNorm 1 \
    --inMean 123.675 116.28 103.53 \
    --inScale 0.017125 0.017507 0.017429 \
    --inData /root/ti3/edgeai-tidl-tools/kenny/dat/in_data_list.txt \
    --tidlStatsTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/PC_dsp_test_dl_algo.out \
    --perfSimTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/ti_cnnperfsim.out \
    --graphVizTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/tidl_graphVisualiser.out \
    --inHeight 512 --inWidth 512 --inNumChannels 3 \
    --numFrames 1
========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              11_00_06_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              11_00_00_00             |
-------------------------------------------------------------------------------

ONNX model (Proto) file      : /root/ti3/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx  
TIDL network file            : /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_net.bin  
TIDL IO info file            : /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff  
Current ONNX OpSet version   : 17  
============================ [Optimization started] ============================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                       16 |
| TIDL_SliceLayer        |                         2 |                        6 |
| TIDL_TransposeLayer    |                        74 |                       80 |
| TIDL_ReLULayer         |                         1 |                        0 |
| TIDL_ConcatLayer       |                         1 |                        1 |
| TIDL_LayerNormLayer    |                        30 |                       30 |
| TIDL_EltWiseLayer      |                       104 |                      128 |
| TIDL_ConvolutionLayer  |                        20 |                       16 |
| TIDL_InnerProductLayer |                        64 |                       68 |
| TIDL_ErfLayer          |                         8 |                        0 |
| TIDL_PoolingLayer      |                         2 |                        2 |
| TIDL_ResizeLayer       |                         4 |                        4 |
| TIDL_SoftMaxLayer      |                         8 |                        8 |
| TIDL_ConstDataLayer    |                         0 |                      164 |
| TIDL_SqueezeLayer      |                         6 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 597

=========================== [Optimization completed] ===========================


-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================

artifacts

/root/ti3/edgeai-tidl-tools/kenny/using_bash
├── import.cfg.perf_sim_config.txt
├── import.cfg.qunat_stats_config.txt
├── import.cfg_stats_tool_out.bin
├── tidl_io_buff1.bin
├── tidl_net
│   ├── bufinfolog_0.csv
│   ├── bufinfolog_0.txt
│   ├── perfSimInfo.bin
│   └── wlinfolog_0.txt
├── tidl_net.bin
├── tidl_net.bin.layer_info.txt
├── tidl_net.bin.svg
├── tidl_net.bin_netLog.txt
└── tidl_net.bin_paramDebug.csv

Regards,
venkat

0 Chris Tsongas 2 months ago in reply to venk at

TI__Genius 15310 points

Hi Venkat,

allowedNode.txt is an OSRT artifact, not a TIDLRT one. With TIDLRT, either all nodes run on the C7x/MMA or the compilation fails.

Regards,

Chris

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris,

I got it working in TIDLRT. Are there any example documents available that show how to build a reference pipeline using OpenCV (cv2)? The documentation mentions checking the vision_apps directory, but I’d like a clear reference to help me write my own pipeline. After segmentation, I need to perform additional post-processing on the SoC, so this is my Plan B.

Ultimately, I want to use this in OSRT.
For reference, here's what I'm seeing:

Mask unique values on CPU: [1, 6, 11, 12, 20, 87, 149] (these represent the expected classes; total = 150)
Mask unique values from TIDL: [0, 18]

As seen from the mask values on the CPU, the detection is accurate. However, it is not performing as well on TIDL.

Regards,
Venkat

0 Chris Tsongas 2 months ago in reply to venk at

TI__Genius 15310 points

Hi Venk at,

Usually TIDLRT is more difficult to set up, so OSRT should be an easy follow-on. Is that all of the incorrect output: one 7-element array (correct) and one 2-element array (incorrect)? Are you expecting these two arrays to match in size and content? So that I can test, are you using my input image or something else?

Here are some data flow docs:

https://software-dl.ti.com/jacinto7/esd/processor-sdk-linux-am69a/10_00_00/exports/edgeai-docs/common/edgeai_dataflows.html#multi-input-multi-inference

Regards,

Chris

0 Chris Tsongas 2 months ago in reply to Chris Tsongas

TI__Genius 15310 points

Hi Venk at,

The output you presented does not make any sense. The only output I see in your model is a tensor of 150x128x128.

output: 1492
output: NodeArg(name='1492', type='tensor(float)', shape=[1, 150, 128, 128])

What I need to debug this is your expected 1492 output tensor. I do not know what the two output vectors from a previous post are.
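
A possible way to capture the expected tensor for sharing is to dump the CPU output to disk, both as `.npy` (which keeps the shape) and as a raw float32 `.bin`. The sketch below is only an illustration: the file prefix `output_1492_cpu` is a made-up name, and a small synthetic array stands in for the real (1, 150, 128, 128) result of the CPU session.

```python
import os
import tempfile

import numpy as np

def save_reference(output, prefix):
    """Write a float32 tensor as both .npy (with shape) and raw .bin (data only)."""
    arr = np.ascontiguousarray(output, dtype=np.float32)
    np.save(prefix + ".npy", arr)
    arr.tofile(prefix + ".bin")
    return arr.shape

def load_raw(path, shape):
    """Read a raw float32 dump back into the given shape."""
    return np.fromfile(path, dtype=np.float32).reshape(shape)

# Synthetic stand-in for the CPU output of node 1492 (not real model data).
demo = np.linspace(-40.0, 2.0, 1 * 150 * 8 * 8, dtype=np.float32).reshape(1, 150, 8, 8)
prefix = os.path.join(tempfile.mkdtemp(), "output_1492_cpu")  # hypothetical name
shape = save_reference(demo, prefix)
restored = load_raw(prefix + ".bin", shape)
```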

Regards,

Chris

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris,

Thank you for taking the time to help with my issue.

I obtained the model from the TI Edge AI TensorLab Model Zoo. When I checked the model on Netron.app, I saw that the output node was 1492.

For testing, I used an image I found on Google by searching for "people in street."

The masks are correct when I use the CPUExecutionProvider. However, when I use the TIDLExecutionProvider, the masks are terrible. I suspect this is an issue with artifact generation, as I faced a similar problem with pose estimation models as well.

The output shape was 1x150x128x128, and with CPU execution, the masks look good.
The two output vectors from my previous posts are the class numbers; I shared them to convey that the accuracy is low. I will attach the full o/p in this post.

I read the documentation for tidlrt and wrote a script to convert out_binary to masks. While the masks were okay, they weren't great. As you mentioned, I would also prefer using onnxrt.
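
For reference, a conversion from a raw output buffer to a class mask might look like the sketch below. It is not the script used in this thread. It assumes the buffer holds exactly 150x128x128 scores in channel-first order with no padding, and that they are float32; the actual TIDLRT output layout and element type depend on the import and inference configuration, so the size check is there to catch a mismatch early.

```python
import numpy as np

def raw_to_mask(raw, num_classes=150, height=128, width=128, dtype=np.float32):
    """Interpret a raw output buffer as (C, H, W) scores and return the class mask."""
    scores = np.frombuffer(raw, dtype=dtype)
    expected = num_classes * height * width
    if scores.size != expected:
        # A different size usually means padding or another element type.
        raise ValueError(f"expected {expected} elements, got {scores.size}")
    return np.argmax(scores.reshape(num_classes, height, width), axis=0).astype(np.uint8)

# Synthetic buffer in which class 7 wins at every pixel (not real model data).
fake = np.zeros((150, 4, 4), dtype=np.float32)
fake[7] = 1.0
mask = raw_to_mask(fake.tobytes(), height=4, width=4)
print(np.unique(mask))
```

Because argmax does not change under a positive scale factor, the same function can be tried on quantized scores by passing the matching `dtype`.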

TIDL o/p

Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -44.924595 -2.3853767
Output dtype: float32
output [array([[[[ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -3.1805022,  -2.7829394,  -3.1805022, ...,  -3.1805022,
           -3.1805022,  -2.7829394],
         ...,
         [ -3.578065 ,  -3.578065 ,  -3.578065 , ...,  -3.1805022,
           -3.9756277,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.578065 ]],

        [[-13.517135 , -13.517135 , -13.517135 , ..., -15.504948 ,
          -15.107386 , -15.107386 ],
         [-13.914697 , -13.517135 , -13.914697 , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         [-14.709823 , -14.31226  , -14.31226  , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         ...,
         [-15.107386 , -15.504948 , -14.709823 , ..., -11.926883 ,
          -13.119572 , -13.119572 ],
         [-15.107386 , -15.504948 , -14.31226  , ..., -12.324446 ,
          -12.722009 , -13.119572 ],
         [-14.709823 , -15.504948 , -14.31226  , ..., -13.119572 ,
          -12.722009 , -13.119572 ]],

        [[-11.926883 , -11.529321 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.119572 ],
         [-12.324446 , -11.926883 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.914697 ],
         [-13.517135 , -13.119572 , -12.722009 , ..., -14.709823 ,
          -14.31226  , -13.517135 ],
         ...,
         [-15.504948 , -16.697636 , -15.504948 , ..., -11.529321 ,
          -12.324446 , -13.119572 ],
         [-15.504948 , -17.0952   , -14.31226  , ..., -11.926883 ,
          -11.529321 , -13.119572 ],
         [-15.107386 , -15.107386 , -14.31226  , ..., -12.722009 ,
          -11.529321 , -13.119572 ]],

        ...,

        [[-21.070827 , -20.275702 , -19.87814  , ..., -22.263515 ,
          -22.263515 , -22.263515 ],
         [-21.46839  , -21.070827 , -20.275702 , ..., -23.058641 ,
          -22.661077 , -23.058641 ],
         [-22.263515 , -21.46839  , -20.673264 , ..., -23.456203 ,
          -22.661077 , -23.058641 ],
         ...,
         [-25.444017 , -26.239143 , -25.046455 , ..., -19.083014 ,
          -21.46839  , -21.46839  ],
         [-25.84158  , -25.84158  , -23.456203 , ..., -20.673264 ,
          -21.46839  , -22.263515 ],
         [-25.444017 , -27.03427  , -24.648891 , ..., -21.46839  ,
          -21.46839  , -21.865952 ]],

        [[-19.083014 , -18.287888 , -18.287888 , ..., -20.275702 ,
          -19.87814  , -19.87814  ],
         [-19.87814  , -19.083014 , -19.083014 , ..., -20.673264 ,
          -20.275702 , -20.673264 ],
         [-21.070827 , -19.480576 , -19.87814  , ..., -20.673264 ,
          -20.275702 , -20.275702 ],
         ...,
         [-24.648891 , -25.84158  , -23.456203 , ..., -18.68545  ,
          -21.46839  , -21.865952 ],
         [-24.648891 , -25.444017 , -22.263515 , ..., -20.275702 ,
          -21.070827 , -22.661077 ],
         [-23.456203 , -25.046455 , -23.058641 , ..., -21.865952 ,
          -21.46839  , -22.263515 ]],

        [[ -7.15613  ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -7.553693 ,  -7.553693 ],
         [ -7.553693 ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -6.7585673,  -7.9512553],
         [ -8.348818 ,  -7.553693 ,  -7.15613  , ...,  -7.9512553,
           -7.15613  ,  -7.9512553],
         ...,
         [-20.275702 , -21.070827 , -19.083014 , ..., -15.107386 ,
          -17.0952   , -17.492762 ],
         [-20.275702 , -20.673264 , -18.68545  , ..., -15.902511 ,
          -16.697636 , -17.492762 ],
         [-19.083014 , -20.673264 , -19.480576 , ..., -16.697636 ,
          -17.0952   , -17.492762 ]]]], dtype=float32)]

CPU o/p::

Loaded model with providers: ['CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -37.647076 1.502726
Output dtype: float32
output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
          -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
         [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
          -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
         [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
          -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
         ...,
         [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
          -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
         [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
          -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
         [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
          -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],

        [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
           3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
         [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
           2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
         [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
          -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
         ...,
         [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
          -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
         [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
          -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
         [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
          -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],

        [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
          -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
         [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
          -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
         [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
          -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
         ...,
         [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
          -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
         [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
          -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
         [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
          -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],

        ...,

        [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
          -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
         [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
          -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
         [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
          -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
         ...,
         [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
          -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
         [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
          -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
         [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
          -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],

        [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
          -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
         [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
          -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
         [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
          -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
         ...,
         [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
          -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
         [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
          -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
         [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
          -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],

        [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
          -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
         [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
          -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
         [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
          -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
         ...,
         [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
          -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
         [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
          -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
         [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
          -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
      dtype=float32)]

Regards,

Venkat

---

0 Chris Tsongas 2 months ago in reply to venk at

TI__Genius 15310 points

Hi Venkat,

These are incomplete. I need the .npy or .bin files to make a decent comparison.

Regards,

Chris

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris

I currently don’t have access to the model and device as I’m out of the office. I’ll share them as soon as possible (Monday).

Thanks a lot

Regards,

Venkat

0 Chris Tsongas 2 months ago in reply to venk at

TI__Genius 15310 points

Hi Venkat,

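I did not mean to rush you; whenever you get to it is fine. If you have the script, just add the following after the session.run call:

outputs = session.run(outs, inp)
:
# convertDataType is assumed to be a helper already defined in your script
# that maps the ONNX output type string to a NumPy dtype.
for i in range(0, len(outputs)):
    arr = outputs[i].flatten().astype(convertDataType(session.get_outputs()[i].type))
    arr.tofile(outs[i].replace('/', '') + '.bin', sep="")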

That should generate a 1492.bin file. If running under ONNX it will be a bunch of float32 values.
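
For anyone who wants to try the dump step without a model loaded, here is a minimal self-contained sketch of the same idea. The helper names dump_outputs and load_output, the temporary directory, and the random stand-in array are illustrative assumptions, not part of the TI scripts. The shape is a reduced version of the (1, 150, 128, 128) output reported earlier in the thread.

```python
import os
import tempfile

import numpy as np


def dump_outputs(outputs, names, out_dir="."):
    """Write each output array to <name>.bin as raw bytes; return the paths."""
    paths = []
    for name, arr in zip(names, outputs):
        # Output names may contain '/', which cannot appear in a file name.
        path = os.path.join(out_dir, name.replace("/", "") + ".bin")
        np.ascontiguousarray(arr).tofile(path)
        paths.append(path)
    return paths


def load_output(path, dtype, shape):
    """Read a raw dump back. The file stores no dtype or shape, so pass both."""
    return np.fromfile(path, dtype=dtype).reshape(shape)


# Stand-in for the result of session.run(...), so the sketch runs anywhere.
rng = np.random.default_rng(0)
fake_outputs = [rng.standard_normal((1, 150, 16, 16)).astype(np.float32)]

out_dir = tempfile.mkdtemp()
paths = dump_outputs(fake_outputs, ["1492"], out_dir)
restored = load_output(paths[0], np.float32, (1, 150, 16, 16))
print(os.path.basename(paths[0]), restored.shape)
```

Because a raw .bin file carries no header, whoever reads it has to know the dtype and shape in advance. A .npy file written with np.save keeps both.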

Regards,

Chris

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

ti_e2e_share.zip

Hi Chris,

I've shared the zip for both CPU and TI ONNX inference.
The image used for inference is the same one you used—of a Dassault Rafale jet taking off from a carrier.

Regards,
Venkat

0 venk at 2 months ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris,

Are there any updates?

0 Chris Tsongas 2 months ago in reply to venk at

TI__Genius 15310 points

Hi Venkat,

I looked at your data and compared your ONNX output to my TIDL-generated 8-bit output, both normalized to 0-255. The data looks close. I will do a layer-by-layer MSE comparison to see whether it diverges somewhere. Right now, from a TIDL point of view, the model imports, runs, and produces quantized data close to the ONNX output, so there may be a problem with the model itself.

import numpy as np
import matplotlib.pyplot as plt

#tidl_1492 = np.fromfile('/shared/1492_ti_onnx.bin',dtype=np.float32)
tidl_1492 = np.fromfile('/shared/jet_tidl_out.bin',dtype=np.uint8)
print(len(tidl_1492))

onnx_1492 = np.fromfile('/shared/1492_cpu_onnx.bin',dtype=np.float32)
print(len(onnx_1492))

# Normalize TIDL data to 0-255
ntidl_1492 = (tidl_1492-tidl_1492.min())/(tidl_1492.max()-tidl_1492.min())
ntidl_1492_uint8 = (ntidl_1492 * 255).astype(np.uint8)

# Normalize ONNX data to 0-255
nonnx_1492 = (onnx_1492-onnx_1492.min())/(onnx_1492.max()-onnx_1492.min())
nonnx_1492_uint8 = (nonnx_1492 * 255).astype(np.uint8)

plt.plot(ntidl_1492_uint8, label='TIDL Int8')
plt.plot(nonnx_1492_uint8,label='ONNX Float')

plt.legend(loc='best')
plt.show()

I will have the layer-by-layer analysis complete in a couple of days.
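
As a rough sketch of what a numeric comparison could look like in addition to the plot, the snippet below computes an MSE on min-max normalized data and the fraction of pixels where both outputs pick the same class. The function names, the synthetic arrays, and the assumption that the class dimension is axis 1 (as in the (1, 150, 128, 128) output shape reported earlier) are illustrative; this is not the TIDL layer-trace tooling.

```python
import numpy as np


def normalize01(x):
    """Min-max scale to [0, 1] in float64; a constant array maps to zeros."""
    x = np.asarray(x, dtype=np.float64)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def mse(a, b):
    """Mean squared error between two arrays after min-max normalization."""
    return float(np.mean((normalize01(a) - normalize01(b)) ** 2))


def argmax_agreement(a, b, class_axis=1):
    """Fraction of pixels where both outputs select the same class."""
    same = np.argmax(a, axis=class_axis) == np.argmax(b, axis=class_axis)
    return float(np.mean(same))


# Synthetic stand-ins: a float reference and an 8-bit quantized copy of it.
rng = np.random.default_rng(0)
ref = rng.standard_normal((1, 150, 16, 16)).astype(np.float32)
lo, hi = ref.min(), ref.max()
quant = np.round((ref - lo) / (hi - lo) * 255).astype(np.uint8)

print("mse:", mse(ref, quant))
print("argmax agreement:", argmax_agreement(ref, quant))
```

Min-max normalization removes the scale and offset difference between the quantized and float data, but a single outlier changes the scaling of the whole tensor. For a segmentation model the argmax agreement is often the more telling number, since it reflects the final class map.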

Regards,

Chris

0 venk at 1 month ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris,

Are there any updates?

Please help.

Regards,
Venkat

0 Christina Kuruvilla 1 month ago in reply to venk at

TI__Expert 5960 points

Hi Venkat,

Chris is currently out of office for the next two weeks. We appreciate your patience during this time.

Warm regards,

Christina

0 Chris Tsongas 1 month ago in reply to Christina Kuruvilla

TI__Genius 15310 points

Hi Venkat,

I spent a couple of days on this since my return, and I am confident the discrepancy is in the SoftMax layer. The SoftMax layer has been addressed in the 11.01 TIDL release. Here is the output from 2 SoftMax layers.

C7x_1_infer_segformer_0565_0001_0001_00001_00008_00256x00256

C7x_1_infer_segformer_0530_0001_0001_00001_00008_00256x00256

The traces should look like:

In a good trace, the TIDL output lies right on top of the float output; that is not the case in the softmax layer. Additionally, I believe the model expects input values in the 0-1 range, and regular PNG/JPG images do not work well as input; however, that may be a separate issue. I will test with 11.01 tomorrow and send you an update.
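
As a generic illustration of the "quantized output on top of the float output" check, the sketch below compares a float softmax with a softmax computed from inputs rounded to an 8-bit grid. It is plain NumPy on synthetic data, not TIDL's softmax implementation; the fake_quantize helper, the (8, 256) shape, and the score range are assumptions made only for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax: subtract the max before exponentiating."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def fake_quantize(x, bits=8):
    """Round x onto a uniform grid spanning [min, max], then map back to float."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (2 ** bits - 1)
    return np.round((x - lo) / step) * step + lo


rng = np.random.default_rng(0)
scores = rng.standard_normal((8, 256)) * 4.0  # synthetic stand-in values

ref = softmax(scores)                   # float reference
approx = softmax(fake_quantize(scores)) # same op on 8-bit-rounded inputs

print("max abs diff:", np.abs(ref - approx).max())
```

If the two curves were plotted, a small maximum difference corresponds to one trace sitting on top of the other. A large difference at one layer, with small differences before it, points at that layer.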

Regards,

Chris

0 Chris Tsongas 1 month ago in reply to Chris Tsongas

TI__Genius 15310 points

Hi Venkat,

I found the same issue on 11.01.06.00 and filed a Jira ticket for it, TIDL-12438. We will discuss this in the CCB next Monday and provide a time estimate for a fix.

Regards,

Chris

0 venk at 26 days ago in reply to Chris Tsongas

Prodigy 55 points

Hi Chris,

thanks a lot

regards,
Venkat
