SK-AM69: Accuracy drops when running a segmentation model on TIDL (is the model compilation the problem?)

Part Number: SK-AM69
Other Parts Discussed in Thread: AM69A

I downloaded the precompiled model from https://github.com/TexasInstruments/edgeai-tensorlab/blob/main/edgeai-modelzoo/modelartifacts/AM69A/8bits/ss-8750_onnxrt_ade20k_hf-transformers_segformer_b0_finetuned_ade_512_512_simp_onnx.tar.gz.link

Then I ran inference on the SK-AM69 and got a segmentation fault:


root@am69-sk:/zxseg# python3 infer_test.py
libtidl_onnxrt_EP loaded 0x3ff12eb0
Final number of subgraphs created are : 1, - Offloaded Nodes - 396, Total Nodes - 396
Segmentation fault (core dumped)

So I compiled the model myself.

Method one: compile with
/root/ti2/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py
using this model configuration:

    "ss-ort-nvidia-b0": create_model_config(
        task_type="segmentation",
        source=dict(
            model_url="",
            infer_shape=True,
        ),
        preprocess=dict(
            resize=512,
            crop=512,
            data_layout="NCHW",
            pad_color=0,
            resize_with_pad=False,
            reverse_channels=False,
        ),
        session=dict(
            session_name="onnxrt",
            model_path="/root/ti2/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx",
            # meta_arch_type=3,
            input_mean=[123.675, 116.28, 103.53],
            input_scale=[0.017125, 0.017507, 0.017429],
            input_optimization=True,
        ),
        postprocess=dict(with_argmax=True),
        extra_info=dict(num_images=numImages, num_classes=150),
    ),
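For reference, the input_mean and input_scale values in the configuration above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the usual convention that each channel is normalized as (pixel - mean) * scale, and normalize_pixel is a hypothetical helper, not a function from edgeai-tidl-tools. Under that assumption the scales work out to roughly 1/58.395, 1/57.12 and 1/57.375, which are the commonly used ImageNet standard deviations on a 0-255 range.

```python
# Values copied from the model configuration above.
INPUT_MEAN = [123.675, 116.28, 103.53]
INPUT_SCALE = [0.017125, 0.017507, 0.017429]


def normalize_pixel(rgb, mean=INPUT_MEAN, scale=INPUT_SCALE):
    """Hypothetical helper: return (value - mean) * scale per channel.

    Assumes the (x - mean) * scale convention; check the preprocessing
    code of the tools version in use before relying on it.
    """
    return [(v - m) * s for v, m, s in zip(rgb, mean, scale)]


# Standard deviation implied by each scale factor (std = 1 / scale).
implied_std = [1.0 / s for s in INPUT_SCALE]

if __name__ == "__main__":
    print("white pixel ->", normalize_pixel([255, 255, 255]))
    print("implied std ->", implied_std)
```

If the preprocessing in infer_test.py on the board does not apply the same mean and scale as the compile-time configuration, the two would see differently scaled inputs, so that is one thing worth comparing.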


Compilation output:

root@b7f0ab54a02d:~/ti2/edgeai-tidl-tools/examples/osrt_python/ort# python3 onnxrt_ep.py -c -m ss-ort-nvidia-b0
Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['ss-ort-nvidia-b0']


Running_Model :  ss-ort-nvidia-b0  


Running shape inference on model /root/ti2/edgeai-tidl-tools/kenny/test_model/optimized_model.onnx 

========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              11_00_06_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              11_00_00_00             |
-------------------------------------------------------------------------------
|            Runtime Version           |                1.15.0                |
-------------------------------------------------------------------------------
|          Model Opset Version         |                  17                  |
-------------------------------------------------------------------------------

============================== [Parsing Started] ==============================

[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import]  WARNING: Resize layer - /decode_head/Resize_3 with scales > 4 is not optimal

------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
|          Core           |      No. of Nodes       |   Number of Subgraphs   |
-------------------------------------------------------------------------------
| C7x                     |                     382 |                       3 |
| CPU                     |                      12 |                       x |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   Node    |                        Node Name                        |                                    Reason                                     |
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Split     | _token_0                                                | Layer 295 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose                                               | Layer 296 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_1                                             | Layer 299 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_3                                       | Layer 303 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.0/attention/self/Transpose_2 | Layer 300 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Split     | _token_7                                                | Layer 331 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose_token_6                                       | Layer 332 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_1                                             | Layer 335 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_10                                      | Layer 339 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.1/attention/self/Transpose_2 | Layer 336 - op type Transpose, Unknown input dimension, not supported by TIDL |
-------------------------------------------------------------------------------------------------------------------------------------------------------
============================= [Parsing Completed] =============================

==================== [Optimization for subgraph_0 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                       12 |
| TIDL_TransposeLayer    |                        61 |                       69 |
| TIDL_ConstDataLayer    |                         0 |                      140 |
| TIDL_LayerNormLayer    |                        26 |                       26 |
| TIDL_ConvolutionLayer  |                        16 |                       12 |
| TIDL_InnerProductLayer |                        52 |                       56 |
| TIDL_EltWiseLayer      |                        82 |                      108 |
| TIDL_PoolingLayer      |                         2 |                        2 |
| TIDL_ResizeLayer       |                         3 |                        2 |
| TIDL_SoftMaxLayer      |                         6 |                        6 |
| TIDL_ErfLayer          |                         6 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 507

=================== [Optimization for subgraph_0 Completed] ===================

The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
 0.0s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 0.6s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 0.7s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 0.94s:  VX_ZONE_INFO: [ownAddTargetKernelInternal:189] registered kernel vx_tutorial_graph.phase_rgb on target DSP_C7-2
 0.940s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 0.968s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 0.990s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 0.1003s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 0.1022s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1 
 0.1043s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_2 
 0.1058s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_3 
 0.1071s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_4 
 0.1089s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_5 
 0.1104s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_6 
 0.1115s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_7 
 0.1129s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_8 
 0.1143s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2 
 0.1157s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_2 
 0.1169s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_3 
 0.1229s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_4 
 0.1244s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_5 
 0.1259s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_6 
 0.1276s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_7 
 0.1290s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_8 
 0.1309s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3 
 0.1324s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_2 
 0.1341s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_3 
 0.1354s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_4 
 0.1375s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_5 
 0.1390s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_6 
 0.1406s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_7 
 0.1420s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_8 
 0.1440s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4 
 0.1455s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_2 
 0.1466s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_3 
 0.1479s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_4 
 0.1491s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_5 
 0.1504s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_6 
 0.1518s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_7 
 0.1532s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_8 
 0.1547s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-0 
 0.1560s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_NF 
 0.1572s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_LDC1 
 0.1584s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC1 
 0.1595s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC2 
 0.1610s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_VISS1 
 0.1622s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE1 
 0.1637s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE2 
 0.1651s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE3 
 0.1663s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE4 
 0.1676s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE5 
 0.1691s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE6 
 0.1703s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE7 
 0.1716s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE8 
 0.1729s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE9 
 0.1741s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE10 
 0.1752s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE11 
 0.1765s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE12 
 0.1781s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY1 
 0.1794s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY2 
 0.1805s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX 
 0.1820s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX2 
 0.1833s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M1 
 0.1845s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M2 
 0.1857s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M3 
 0.1868s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M4 
 0.1880s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC1_FC 
 0.1897s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-1 
 0.1911s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_SDE 
 0.1922s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_DOF 
 0.1938s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-0 
 0.1953s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-1 
 0.1967s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-0 
 0.1984s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_NF 
 0.1995s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_LDC1 
 0.2006s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC1 
 0.2021s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC2 
 0.2032s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_VISS1 
 0.2044s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_FC 
 0.2061s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-1 
 0.2062s:  VX_ZONE_INFO: [tivxInit:152] Initialization Done !!!
 0.2065s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
============= [Quantization & Calibration for subgraph_0 Started] =============

==================== [Optimization for subgraph_1 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_ErfLayer          |                         1 |                        0 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                       12 |
| TIDL_ConvolutionLayer  |                         1 |                        1 |
| TIDL_TransposeLayer    |                         3 |                        4 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_EltWiseLayer      |                        11 |                       10 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 54

=================== [Optimization for subgraph_1 Completed] ===================

============= [Quantization & Calibration for subgraph_1 Started] =============

==================== [Optimization for subgraph_2 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_EltWiseLayer      |                        11 |                       10 |
| TIDL_ConstDataLayer    |                         0 |                       12 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_TransposeLayer    |                         4 |                        5 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_ConvolutionLayer  |                         3 |                        3 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_ResizeLayer       |                         1 |                        2 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConcatLayer       |                         1 |                        1 |
| TIDL_ReLULayer         |                         1 |                        0 |
| TIDL_ErfLayer          |                         1 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 64

=================== [Optimization for subgraph_2 Completed] ===================

============= [Quantization & Calibration for subgraph_2 Started] =============


-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [1 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [2 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [3 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [4 / 5]: ------------------
[=============================================================================] 100 %

------------------ Fixed-point Calibration Iteration [5 / 5]: ------------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




 
Completed_Model :     1, Name : ss-ort-nvidia-b0                                  , Total time :  117987.66, Offload Time :   34027.03 , DDR RW MBs : 0, Output Image File : py_out_ss-ort-nvidia-b0_ADE_val_00001801.jpg, Output Bin File : py_out_ss-ort-nvidia-b0_ADE_val_00001801.bin
 
 
MEM: Deinit ... !!!
MEM: Alloc's: 92 alloc's of 1110078843 bytes 
MEM: Free's : 92 free's  of 1110078843 bytes 
MEM: Open's : 0 allocs  of 0 bytes 
MEM: Deinit ... Done !!!

I then ran inference on the board with the newly compiled artifacts.
Accuracy is low with both the TIDL and the CPU execution provider.
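To put a number on the gap between the two runs logged below, the raw outputs could be compared directly. The following is a minimal sketch, assuming both runs are fed the same preprocessed input and that each returns a float32 array of shape (1, 150, 128, 128) as printed in the logs; compare_logits is a hypothetical helper, not part of infer_test.py.

```python
import numpy as np


def compare_logits(tidl_out, cpu_out):
    """Compare two (1, C, H, W) logit arrays through their argmax masks."""
    tidl_mask = np.argmax(tidl_out, axis=1)  # per-pixel class index, (1, H, W)
    cpu_mask = np.argmax(cpu_out, axis=1)
    return {
        # Fraction of pixels where both runs predict the same class.
        "pixel_agreement": float(np.mean(tidl_mask == cpu_mask)),
        # Classes that actually appear in each mask.
        "tidl_classes": np.unique(tidl_mask).tolist(),
        "cpu_classes": np.unique(cpu_mask).tolist(),
        # Largest absolute difference between the raw logits.
        "max_abs_diff": float(np.max(np.abs(tidl_out - cpu_out))),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for the two session outputs (not real data).
    rng = np.random.default_rng(0)
    cpu = rng.normal(size=(1, 150, 128, 128)).astype(np.float32)
    tidl = cpu + rng.normal(scale=0.1, size=cpu.shape).astype(np.float32)
    print(compare_logits(tidl, cpu))
```

In practice the two arguments would be the first element of each session's output list. A low pixel_agreement together with a short tidl_classes list would point at the offloaded subgraphs rather than at the pre- or post-processing.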

root@am69-sk:/zxseg# python3 infer_test.py 
libtidl_onnxrt_EP loaded 0x148ef060 
Final number of subgraphs created are : 3, - Offloaded Nodes - 382, Total Nodes - 394 
APP: Init ... !!!
 15821.757931 s: MEM: Init ... !!!
 15821.758271 s: MEM: Initialized DMA HEAP (fd=5) !!!
 15821.758576 s: MEM: Init ... Done !!!
 15821.758765 s: IPC: Init ... !!!
 15822.126204 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
 15822.292806 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
 15822.293302 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 15822.293456 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 15822.293587 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 15822.294372 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 15822.294830 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 15822.295133 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 15822.295435 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 15822.295605 s:  VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
 15822.295640 s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -44.924595 -2.3853767
Output dtype: float32
output [array([[[[ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
           -2.7829394,  -2.7829394],
         [ -3.1805022,  -2.7829394,  -3.1805022, ...,  -3.1805022,
           -3.1805022,  -2.7829394],
         ...,
         [ -3.578065 ,  -3.578065 ,  -3.578065 , ...,  -3.1805022,
           -3.9756277,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.9756277],
         [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
           -3.578065 ,  -3.578065 ]],

        [[-13.517135 , -13.517135 , -13.517135 , ..., -15.504948 ,
          -15.107386 , -15.107386 ],
         [-13.914697 , -13.517135 , -13.914697 , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         [-14.709823 , -14.31226  , -14.31226  , ..., -15.902511 ,
          -15.504948 , -15.504948 ],
         ...,
         [-15.107386 , -15.504948 , -14.709823 , ..., -11.926883 ,
          -13.119572 , -13.119572 ],
         [-15.107386 , -15.504948 , -14.31226  , ..., -12.324446 ,
          -12.722009 , -13.119572 ],
         [-14.709823 , -15.504948 , -14.31226  , ..., -13.119572 ,
          -12.722009 , -13.119572 ]],

        [[-11.926883 , -11.529321 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.119572 ],
         [-12.324446 , -11.926883 , -11.529321 , ..., -14.709823 ,
          -13.914697 , -13.914697 ],
         [-13.517135 , -13.119572 , -12.722009 , ..., -14.709823 ,
          -14.31226  , -13.517135 ],
         ...,
         [-15.504948 , -16.697636 , -15.504948 , ..., -11.529321 ,
          -12.324446 , -13.119572 ],
         [-15.504948 , -17.0952   , -14.31226  , ..., -11.926883 ,
          -11.529321 , -13.119572 ],
         [-15.107386 , -15.107386 , -14.31226  , ..., -12.722009 ,
          -11.529321 , -13.119572 ]],

        ...,

        [[-21.070827 , -20.275702 , -19.87814  , ..., -22.263515 ,
          -22.263515 , -22.263515 ],
         [-21.46839  , -21.070827 , -20.275702 , ..., -23.058641 ,
          -22.661077 , -23.058641 ],
         [-22.263515 , -21.46839  , -20.673264 , ..., -23.456203 ,
          -22.661077 , -23.058641 ],
         ...,
         [-25.444017 , -26.239143 , -25.046455 , ..., -19.083014 ,
          -21.46839  , -21.46839  ],
         [-25.84158  , -25.84158  , -23.456203 , ..., -20.673264 ,
          -21.46839  , -22.263515 ],
         [-25.444017 , -27.03427  , -24.648891 , ..., -21.46839  ,
          -21.46839  , -21.865952 ]],

        [[-19.083014 , -18.287888 , -18.287888 , ..., -20.275702 ,
          -19.87814  , -19.87814  ],
         [-19.87814  , -19.083014 , -19.083014 , ..., -20.673264 ,
          -20.275702 , -20.673264 ],
         [-21.070827 , -19.480576 , -19.87814  , ..., -20.673264 ,
          -20.275702 , -20.275702 ],
         ...,
         [-24.648891 , -25.84158  , -23.456203 , ..., -18.68545  ,
          -21.46839  , -21.865952 ],
         [-24.648891 , -25.444017 , -22.263515 , ..., -20.275702 ,
          -21.070827 , -22.661077 ],
         [-23.456203 , -25.046455 , -23.058641 , ..., -21.865952 ,
          -21.46839  , -22.263515 ]],

        [[ -7.15613  ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -7.553693 ,  -7.553693 ],
         [ -7.553693 ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
           -6.7585673,  -7.9512553],
         [ -8.348818 ,  -7.553693 ,  -7.15613  , ...,  -7.9512553,
           -7.15613  ,  -7.9512553],
         ...,
         [-20.275702 , -21.070827 , -19.083014 , ..., -15.107386 ,
          -17.0952   , -17.492762 ],
         [-20.275702 , -20.673264 , -18.68545  , ..., -15.902511 ,
          -16.697636 , -17.492762 ],
         [-19.083014 , -20.673264 , -19.480576 , ..., -16.697636 ,
          -17.0952   , -17.492762 ]]]], dtype=float32)]
Mask unique values: [ 0 18]
APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
 15825.212523 s: IPC: Deinit ... !!!
 15825.753588 s: IPC: DeInit ... Done !!!
 15825.753657 s: MEM: Deinit ... !!!
 15825.758545 s: DDR_SHARED_MEM: Alloc's: 35 alloc's of 132751508 bytes 
 15825.758604 s: DDR_SHARED_MEM: Free's : 35 free's  of 132751508 bytes 
 15825.758628 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes 
 15825.758666 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!
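
The TIDL logits printed above appear to sit on a uniform grid: neighbouring values differ by about 0.3976, and the reported min/max (-44.924595 and -2.3853767) are close to 113 and 6 multiples of that step. That would be consistent with an 8-bit quantized output tensor, although it is only an inference from the printed numbers. The sketch below shows one way to estimate such a step from a handful of values; estimate_step is a hypothetical helper.

```python
def estimate_step(values, tol=1e-3):
    """Estimate a uniform grid step as the smallest gap between distinct values.

    Gaps below `tol` are treated as float noise and ignored.
    """
    uniq = sorted(set(values))
    gaps = [b - a for a, b in zip(uniq, uniq[1:]) if b - a > tol]
    return min(gaps) if gaps else None


# A few logits copied from the TIDL output above.
tidl_samples = [-2.7829394, -3.1805022, -3.578065, -3.9756277,
                -13.517135, -13.914697]

step = estimate_step(tidl_samples)
# If the values lie on a grid, each one divided by the step is near an integer.
levels = [v / step for v in tidl_samples]

if __name__ == "__main__":
    print("estimated step:", step)
    print("grid levels:", [round(l, 3) for l in levels])
```

With a step this coarse, classes whose logits are closer together than about 0.4 cannot be told apart after quantization, which may be part of why the mask collapses to so few classes.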
root@am69-sk:/zxseg# python3 infer_test.py 
EP Error 'providers' and 'provider_options' should be the same length if both are given. when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Loaded model with providers: ['CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -37.647076 1.502726
Output dtype: float32
output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
          -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
         [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
          -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
         [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
          -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
         ...,
         [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
          -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
         [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
          -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
         [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
          -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],

        [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
           3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
         [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
           2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
         [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
          -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
         ...,
         [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
          -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
         [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
          -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
         [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
          -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],

        [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
          -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
         [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
          -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
         [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
          -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
         ...,
         [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
          -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
         [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
          -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
         [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
          -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],

        ...,

        [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
          -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
         [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
          -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
         [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
          -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
         ...,
         [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
          -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
         [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
          -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
         [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
          -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],

        [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
          -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
         [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
          -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
         [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
          -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
         ...,
         [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
          -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
         [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
          -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
         [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
          -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],

        [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
          -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
         [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
          -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
         [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
          -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
         ...,
         [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
          -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
         [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
          -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
         [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
          -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
      dtype=float32)]
Mask unique values: [  1   6  11  12  20  87 149]
root@am69-sk:/zxseg# 
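To put a number on the accuracy drop, the argmax mask from this CPUExecutionProvider run can be saved and compared pixel by pixel with the mask from a TIDL run on the same image. A minimal sketch, assuming both masks are integer arrays of the same shape; `mask_agreement` and `class_histogram` are hypothetical helper names, not part of any TI tool, and the toy arrays stand in for real masks:

```python
import numpy as np

def mask_agreement(ref_mask, test_mask):
    """Fraction of pixels where two class-index masks agree."""
    ref_mask = np.asarray(ref_mask)
    test_mask = np.asarray(test_mask)
    if ref_mask.shape != test_mask.shape:
        raise ValueError("masks must have the same shape")
    return float(np.mean(ref_mask == test_mask))

def class_histogram(mask, num_classes=150):
    """Pixel count per class index (ADE20K uses 150 classes)."""
    return np.bincount(np.asarray(mask).ravel(), minlength=num_classes)

# Toy 2x2 masks standing in for the CPU (reference) and TIDL masks
ref = np.array([[1, 1], [6, 11]])
test = np.array([[1, 6], [6, 11]])

agreement = mask_agreement(ref, test)
hist = class_histogram(ref, num_classes=12)
print("pixel agreement:", agreement)
print("reference class counts:", hist)
```

A low agreement together with very different class histograms would point at the compiled artifacts rather than at the pre/post-processing, since both runs share the same script.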


INFER SCRIPT:

import onnxruntime as ort
import numpy as np
import cv2

model = "/zxseg/ss-ort-nvidia-b0/model/optimized_model.onnx"
image_path = "/zxseg/DAT/people-walking-through-business-district-in-the-city-at-sunset_bbxoqvaod_thumbnail-1080_09.png"
artifacts_dir = "/zxseg/ss-ort-nvidia-b0/artifacts"
# artifacts_dir = "/zxseg/artifacts2"

# Set up TIDL provider options
so = ort.SessionOptions()
runtime_options = {
    "artifacts_folder": artifacts_dir,
}
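# providers = ['TIDLExecutionProvider', 'CPUExecutionProvider']
providers = ['CPUExecutionProvider']
# NOTE: one provider but two option dicts here; onnxruntime reports the
# "should be the same length" EP Error shown in the log and retries on CPU.
provider_options = [runtime_options, {}]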


ort_session = ort.InferenceSession(
    model,
    providers=providers,
    provider_options=provider_options,
    sess_options=so
)
print(f"Loaded model with providers: {ort_session.get_providers()}")

original_image = cv2.imread(image_path)
original_h, original_w = original_image.shape[:2]

image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (512, 512))
image = image.astype(np.float32)
mean = np.array([123.675, 116.28, 103.53], dtype=np.float32).reshape(3, 1, 1)
scale = np.array([0.017125, 0.017507, 0.017429], dtype=np.float32).reshape(3, 1, 1)
image = np.transpose(image, (2, 0, 1))  # (3, 512, 512)
image = (image - mean) * scale
image = np.expand_dims(image, axis=0)

inputs = {ort_session.get_inputs()[0].name: image}
outputs = ort_session.run(None, inputs)

print("Output shape:", outputs[0].shape)
print("Output min/max:", outputs[0].min(), outputs[0].max())
print("Output dtype:", outputs[0].dtype)
print("output",outputs)

# Output is [1, num_classes, H, W]: drop the batch dimension, then argmax over the class axis
output = outputs[0][0]  # shape: [num_classes, H, W]
mask = np.argmax(output, axis=0)  # shape: [H, W]

print("Mask unique values:", np.unique(mask))

# Use mask for palette and overlay
num_classes = 150  # ADE20K has 150 classes

def get_palette(num_classes):
    palette = np.random.randint(0, 255, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = [0, 0, 0]  # background as black
    return palette

palette = get_palette(num_classes)

# Resize mask to original image size
mask_resized = cv2.resize(mask.astype(np.uint8), (original_w, original_h), interpolation=cv2.INTER_NEAREST)

# Map each class index to its color
output_color = palette[mask_resized]  # shape: (H, W, 3)

# Overlay the mask on the original image
overlay = cv2.addWeighted(original_image, 0.6, output_color, 0.4, 0)

cv2.imwrite("segmentation_overlay.png", overlay)
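The "EP Error 'providers' and 'provider_options' should be the same length" message in the log comes from passing one provider name with two option dictionaries. One way to keep the two lists in step when switching between CPU and TIDL is to build them from pairs. This is only a sketch: `build_providers` is a hypothetical helper, and the TIDL option dictionary carries just the `artifacts_folder` key used in the script above (the real runtime may need more keys).

```python
def build_providers(use_tidl, artifacts_dir):
    """Return (providers, provider_options) lists of equal length."""
    pairs = []
    if use_tidl:
        # Assumption: only the artifacts_folder option from the script above
        pairs.append(("TIDLExecutionProvider", {"artifacts_folder": artifacts_dir}))
    # CPU is always last so unsupported nodes can fall back to it
    pairs.append(("CPUExecutionProvider", {}))
    providers = [name for name, _ in pairs]
    provider_options = [opts for _, opts in pairs]
    return providers, provider_options

cpu_providers, cpu_options = build_providers(False, "/zxseg/ss-ort-nvidia-b0/artifacts")
tidl_providers, tidl_options = build_providers(True, "/zxseg/ss-ort-nvidia-b0/artifacts")
print(cpu_providers, cpu_options)
print(tidl_providers, tidl_options)
```

The returned lists can be passed straight to `ort.InferenceSession(..., providers=..., provider_options=...)`.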

method two : 

Advanced way, i.e. a custom compile script.

Still the same error.

I wrote it using the YAML and configs in the model zoo.

import onnxruntime as ort
import cv2
import numpy as np
import os

# Segmentation model configuration based on config.yaml
model_path = '/root/ti2/edgeai-tidl-tools/ken_seg/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx'
calibration_images_path = '/root/ti2/edgeai-tidl-tools/ken_seg/DAT' # ADE20k validation images
out_dir_path = '/root/ti2/edgeai-tidl-tools/ken_seg/artifacts2'

def preprocess(image_path):
    """Preprocess image for segmentation model based on config.yaml preprocessing settings"""
    img = cv2.imread(image_path)
    if img is None:
        raise RuntimeError(f'Failed to read image: {image_path}')

    img = cv2.resize(img, (512, 512))

    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    img = img.astype(np.float32)
    # Apply normalization from config.yaml
    # input_mean: [123.675, 116.28, 103.53]
    # input_scale: [0.017125, 0.017507, 0.017429] (which is 1/std where std=[58.395, 57.12, 57.375])
    mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)
    scale = np.array([0.017125, 0.017507, 0.017429], dtype=np.float32)
    img = (img - mean) * scale
    img = img.astype(np.float32)
    img = np.transpose(img, (2, 0, 1))
    # Add batch dimension
    img = np.expand_dims(img, axis=0)
    return img

all_calib_images = [os.path.join(calibration_images_path, name) for name in os.listdir(calibration_images_path)
                    if name.lower().endswith(('.png', '.jpg', '.jpeg'))]
if len(all_calib_images) == 0:
    raise RuntimeError(f'No calibration images found in {calibration_images_path}')

# Limit calibration images to match config (calibration_frames: 12)
max_calib_frames = 12
all_calib_images = all_calib_images[:max_calib_frames]

tidl_tools_path = os.environ.get('TIDL_TOOLS_PATH')
if not tidl_tools_path:
    raise EnvironmentError('TIDL_TOOLS_PATH environment variable not set.')

os.makedirs(out_dir_path, exist_ok=True)
# Empty the artifacts folder before compiling
for root, dirs, files in os.walk(out_dir_path, topdown=False):
    for f in files:
        os.remove(os.path.join(root, f))
    for d in dirs:
        os.rmdir(os.path.join(root, d))

compile_options = {
    # Core settings from runtime_options in config.yaml
    'tidl_tools_path': tidl_tools_path,
    'artifacts_folder': out_dir_path,
    'platform': 'J7',
    'version': '10.1',
    'import': 'yes',
    'tensor_bits': 8,
    'accuracy_level': 1,
    'debug_level': 0,
    'inference_mode': 0,
    'advanced_options:calibration_frames': len(all_calib_images),
    'advanced_options:calibration_iterations': 12,
    'advanced_options:quantization_scale_type': 4,
    'advanced_options:activation_clipping': 1,
    'advanced_options:weight_clipping': 1,
    'advanced_options:bias_calibration': 1,
    'advanced_options:high_resolution_optimization': 0,
    'advanced_options:pre_batchnorm_fold': 1,

    'advanced_options:output_feature_16bit_names_list': '',
    'advanced_options:params_16bit_names_list': '',

    'advanced_options:add_data_convert_ops': 3,

    'ti_internal_nc_flag': 83886080,

    'advanced_options:max_num_subgraph_nodes': 2048,
}

so = ort.SessionOptions()
providers = ['TIDLCompilationProvider', 'CPUExecutionProvider']

session = ort.InferenceSession(
    model_path,
    providers=providers,
    provider_options=[compile_options, {}],
    sess_options=so,  # the keyword is sess_options, not session_options
)

input_name = session.get_inputs()[0].name

# Each run feeds one calibration frame to the compilation provider
for img_path in all_calib_images:
    img = preprocess(img_path)
    session.run(None, {input_name: img})

print("DONE: Segmentation model compiled successfully with TIDL Compilation Provider.")
print(f"Artifacts saved to: {out_dir_path}")
print(f"Model input shape: {session.get_inputs()[0].shape}")
print(f"Model output shape: {session.get_outputs()[0].shape}")
print(f"Used {len(all_calib_images)} calibration images")
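The comment in `preprocess` says `input_scale` is 1/std with std = [58.395, 57.12, 57.375]. A quick arithmetic check of that relation, using only the numbers from the script (the 1e-6 tolerance is my own choice, since the scales are rounded to six decimals):

```python
# Values copied from the compile script above
std = [58.395, 57.12, 57.375]
input_scale = [0.017125, 0.017507, 0.017429]

# Reciprocal of each std, rounded to the six decimals used in the config
reciprocals = [round(1.0 / s, 6) for s in std]
max_abs_diff = max(abs(1.0 / s - k) for s, k in zip(std, input_scale))

print("1/std      :", reciprocals)
print("input_scale:", input_scale)
print("largest difference:", max_abs_diff)
```

So `(img - mean) * scale` in the script matches the usual `(img - mean) / std` normalization up to rounding, which makes a wrong scale an unlikely cause of the accuracy drop.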

python3 compile_seg_b0.py
========================= [Model Compilation Started] =========================

Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning

============================== [Version Summary] ==============================

-------------------------------------------------------------------------------
|          TIDL Tools Version          |              11_00_06_00             |
-------------------------------------------------------------------------------
|         C7x Firmware Version         |              11_00_00_00             |
-------------------------------------------------------------------------------
|            Runtime Version           |                1.15.0                |
-------------------------------------------------------------------------------
|          Model Opset Version         |                  17                  |
-------------------------------------------------------------------------------

============================== [Parsing Started] ==============================

[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import]  WARNING: Resize layer - /decode_head/Resize_3 with scales > 4 is not optimal

------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
|          Core           |      No. of Nodes       |   Number of Subgraphs   |
-------------------------------------------------------------------------------
| C7x                     |                     383 |                       3 |
| CPU                     |                      12 |                       x |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   Node    |                        Node Name                        |                                    Reason                                     |
-------------------------------------------------------------------------------------------------------------------------------------------------------
| Split     | _token_0                                                | Layer 295 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose                                               | Layer 296 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_1                                             | Layer 299 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_3                                       | Layer 303 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.0/attention/self/Transpose_2 | Layer 300 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Split     | _token_7                                                | Layer 331 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose_token_6                                       | Layer 332 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_1                                             | Layer 335 - op type Squeeze, Unknown input dimension, not supported by TIDL   |
| Transpose | Transpose_token_10                                      | Layer 339 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 1_squeeze_0                                             | Subgraph does not have any compute node                                       |
| Transpose | /segformer/encoder/block.3.1/attention/self/Transpose_2 | Layer 336 - op type Transpose, Unknown input dimension, not supported by TIDL |
-------------------------------------------------------------------------------------------------------------------------------------------------------
============================= [Parsing Completed] =============================
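The table above lists 12 nodes that stay on the CPU and split the graph into 3 C7x subgraphs. To summarise such a table by op type and by reason, the rows can be parsed with a few lines of Python. This is only a sketch: `parse_fallback_rows` is a hypothetical helper, and the sample text holds three rows copied from the table rather than the full log.

```python
from collections import Counter

# Three rows copied from the table above (not the full log)
LOG = """\
|   Node    |     Node Name     |                                    Reason                                     |
| Split     | _token_0          | Layer 295 - op type Split, Unknown input dimension, not supported by TIDL     |
| Transpose | Transpose_token_3 | Layer 303 - op type Transpose, Unknown input dimension, not supported by TIDL |
| Squeeze   | 0_squeeze_0       | Subgraph does not have any compute node                                       |
"""

def parse_fallback_rows(text):
    """Return (op_type, node_name, reason) tuples from the pipe-separated table."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip separator lines, the header row, and anything that is not 3 columns
        if len(cells) != 3 or cells[0] in ("", "Node"):
            continue
        rows.append(tuple(cells))
    return rows

rows = parse_fallback_rows(LOG)
by_op = Counter(op for op, _, _ in rows)
unknown_dim = sum("Unknown input dimension" in reason for _, _, reason in rows)

print("rows:", len(rows))
print("by op type:", dict(by_op))
print("rows with unknown input dimension:", unknown_dim)
```

Run over the whole table, this makes it easy to see that most fallbacks share the "Unknown input dimension" reason.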

==================== [Optimization for subgraph_0 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                       12 |
| TIDL_TransposeLayer    |                        61 |                       69 |
| TIDL_ConstDataLayer    |                         0 |                       67 |
| TIDL_LayerNormLayer    |                        26 |                       26 |
| TIDL_ConvolutionLayer  |                        16 |                       12 |
| TIDL_InnerProductLayer |                        52 |                       75 |
| TIDL_EltWiseLayer      |                        82 |                       16 |
| TIDL_PoolingLayer      |                         2 |                        2 |
| TIDL_ResizeLayer       |                         3 |                        2 |
| TIDL_SoftMaxLayer      |                         6 |                        6 |
| TIDL_ErfLayer          |                         6 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 361

=================== [Optimization for subgraph_0 Completed] ===================

The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
 0.0s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 0.14s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 0.38s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 0.139s:  VX_ZONE_INFO: [ownAddTargetKernelInternal:189] registered kernel vx_tutorial_graph.phase_rgb on target DSP_C7-2
 0.1023s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 0.1042s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 0.1067s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 0.1106s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 0.1124s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1 
 0.1139s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_2 
 0.1153s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_3 
 0.1166s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_4 
 0.1183s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_5 
 0.1198s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_6 
 0.1212s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_7 
 0.1225s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-1_PRI_8 
 0.1237s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2 
 0.1251s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_2 
 0.1262s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_3 
 0.1274s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_4 
 0.1288s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_5 
 0.1299s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_6 
 0.1312s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_7 
 0.1325s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-2_PRI_8 
 0.1341s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3 
 0.1356s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_2 
 0.1368s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_3 
 0.1379s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_4 
 0.1395s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_5 
 0.1407s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_6 
 0.1419s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_7 
 0.1432s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-3_PRI_8 
 0.1448s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4 
 0.1461s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_2 
 0.1472s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_3 
 0.1484s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_4 
 0.1496s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_5 
 0.1509s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_6 
 0.1521s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_7 
 0.1534s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSP_C7-4_PRI_8 
 0.1550s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-0 
 0.1563s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_NF 
 0.1574s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_LDC1 
 0.1585s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC1 
 0.1596s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_MSC2 
 0.1610s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC_VISS1 
 0.1622s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE1 
 0.1635s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE2 
 0.1649s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE3 
 0.1661s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE4 
 0.1672s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE5 
 0.1686s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE6 
 0.1699s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE7 
 0.1712s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE8 
 0.1724s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE9 
 0.1737s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE10 
 0.1748s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE11 
 0.1759s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CAPTURE12 
 0.1771s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY1 
 0.1784s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DISPLAY2 
 0.1797s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX 
 0.1812s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target CSITX2 
 0.1829s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M1 
 0.1843s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M2 
 0.1857s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M3 
 0.1869s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DSS_M2M4 
 0.1880s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC1_FC 
 0.1896s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU2-1 
 0.1910s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_SDE 
 0.1922s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target DMPAC_DOF 
 0.1938s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-0 
 0.1951s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU3-1 
 0.1965s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-0 
 0.1977s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_NF 
 0.1989s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_LDC1 
 0.2001s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC1 
 0.2013s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_MSC2 
 0.2026s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_VISS1 
 0.2039s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target VPAC2_FC 
 0.2053s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MCU4-1 
 0.2055s:  VX_ZONE_INFO: [tivxInit:152] Initialization Done !!!
 0.2058s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
============= [Quantization & Calibration for subgraph_0 Started] =============

==================== [Optimization for subgraph_1 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_ErfLayer          |                         1 |                        0 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                        4 |
| TIDL_ConvolutionLayer  |                         1 |                        1 |
| TIDL_TransposeLayer    |                         3 |                        4 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_InnerProductLayer |                         6 |                        6 |
| TIDL_EltWiseLayer      |                        11 |                        2 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 38

=================== [Optimization for subgraph_1 Completed] ===================

============= [Quantization & Calibration for subgraph_1 Started] =============

==================== [Optimization for subgraph_2 Started] ====================

----------------------------- Optimization Summary -----------------------------
---------------------------------------------------------------------------------
|          Layer         | Nodes before optimization | Nodes after optimization |
---------------------------------------------------------------------------------
| TIDL_BatchNormLayer    |                         0 |                        2 |
| TIDL_EltWiseLayer      |                        11 |                        2 |
| TIDL_ConstDataLayer    |                         0 |                        5 |
| TIDL_SqueezeLayer      |                         1 |                        0 |
| TIDL_TransposeLayer    |                         4 |                        5 |
| TIDL_InnerProductLayer |                         6 |                        7 |
| TIDL_ConvolutionLayer  |                         3 |                        3 |
| TIDL_SoftMaxLayer      |                         1 |                        1 |
| TIDL_ResizeLayer       |                         1 |                        2 |
| TIDL_LayerNormLayer    |                         2 |                        2 |
| TIDL_ConcatLayer       |                         1 |                        1 |
| TIDL_ReLULayer         |                         1 |                        0 |
| TIDL_ErfLayer          |                         1 |                        0 |
---------------------------------------------------------------------------------

Total nodes in subgraph: 50

=================== [Optimization for subgraph_2 Completed] ===================

============= [Quantization & Calibration for subgraph_2 Started] =============



-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================




-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [1 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [2 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [3 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [4 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [5 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [6 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [7 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [8 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [9 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [10 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [11 / 12]: -----------------
[=============================================================================] 100 %

----------------- Fixed-point Calibration Iteration [12 / 12]: -----------------
[=============================================================================] 100 %

==================== [Quantization & Calibration Completed] ====================

========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

Rerunning network compiler...
========================== [Memory Planning Started] ==========================


------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation

========================= [Memory Planning Completed] =========================

======================== Subgraph Compiled Successfully ========================



DONE: Segmentation model compiled successfully with TIDL Compilation Provider.
Artifacts saved to: /root/ti2/edgeai-tidl-tools/ken_seg/artifacts2
Model input shape: [1, 3, 512, 512]
Model output shape: [1, 150, 128, 128]
Used 12 calibration images
MEM: Deinit ... !!!
MEM: Alloc's: 92 alloc's of 1501308027 bytes 
MEM: Free's : 92 free's  of 1501308027 bytes 
MEM: Open's : 0 allocs  of 0 bytes 
MEM: Deinit ... Done !!!


same issue 

root@am69-sk:/zxseg# python3 infer_test.py 
libtidl_onnxrt_EP loaded 0x3d72ca30 
Final number of subgraphs created are : 3, - Offloaded Nodes - 383, Total Nodes - 395 
APP: Init ... !!!
 18178.158420 s: MEM: Init ... !!!
 18178.158773 s: MEM: Initialized DMA HEAP (fd=5) !!!
 18178.159101 s: MEM: Init ... Done !!!
 18178.159313 s: IPC: Init ... !!!
 18178.392594 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
 18178.442924 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
 18178.443344 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
 18178.443370 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
 18178.443389 s:  VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
 18178.444054 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0 
 18178.444602 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1 
 18178.445087 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2 
 18178.445488 s:  VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3 
 18178.445680 s:  VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
 18178.445721 s:  VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO
Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -40.24002 -2.2006261
Output dtype: float32
output [array([[[[ -3.4581268,  -3.4581268,  -3.4581268, ...,  -2.8293765,
           -2.8293765,  -2.8293765],
         [ -3.4581268,  -3.4581268,  -3.4581268, ...,  -3.1437516,
           -2.8293765,  -2.8293765],
         [ -3.4581268,  -3.4581268,  -3.772502 , ...,  -3.1437516,
           -2.8293765,  -2.8293765],
         ...,
         [ -3.1437516,  -3.4581268,  -3.4581268, ...,  -2.8293765,
           -3.4581268,  -3.772502 ],
         [ -3.1437516,  -3.4581268,  -2.8293765, ...,  -3.1437516,
           -3.1437516,  -3.4581268],
         [ -3.1437516,  -3.1437516,  -3.1437516, ...,  -3.4581268,
           -3.1437516,  -3.4581268]],

        [[-11.946257 , -11.631881 , -11.631881 , ..., -14.775633 ,
          -14.146882 , -14.146882 ],
         [-12.5750065, -11.946257 , -12.260632 , ..., -15.090008 ,
          -14.461258 , -14.775633 ],
         [-12.889381 , -12.260632 , -12.5750065, ..., -15.404383 ,
          -14.775633 , -15.090008 ],
         ...,
         [-16.033133 , -16.661884 , -16.033133 , ..., -13.518132 ,
          -14.775633 , -15.090008 ],
         [-15.718758 , -16.347507 , -15.090008 , ..., -14.146882 ,
          -14.146882 , -15.090008 ],
         [-15.718758 , -16.033133 , -15.718758 , ..., -14.775633 ,
          -14.461258 , -15.090008 ]],

        [[-11.317506 , -11.003131 , -10.688755 , ..., -14.146882 ,
          -13.203756 , -12.260632 ],
         [-11.631881 , -11.317506 , -11.003131 , ..., -13.832507 ,
          -13.203756 , -12.889381 ],
         [-12.5750065, -12.260632 , -11.631881 , ..., -14.461258 ,
          -13.832507 , -13.203756 ],
         ...,
         [-14.461258 , -15.404383 , -15.090008 , ..., -11.003131 ,
          -11.946257 , -12.5750065],
         [-14.775633 , -16.033133 , -13.203756 , ..., -11.631881 ,
          -11.003131 , -12.889381 ],
         [-14.461258 , -14.461258 , -12.889381 , ..., -12.5750065,
          -11.317506 , -13.518132 ]],

        ...,

        [[-19.805635 , -18.86251  , -18.548134 , ..., -19.176886 ,
          -18.548134 , -18.86251  ],
         [-20.12001  , -19.49126  , -18.86251  , ..., -19.805635 ,
          -19.176886 , -20.12001  ],
         [-20.434385 , -19.805635 , -19.49126  , ..., -20.74876  ,
          -19.49126  , -20.434385 ],
         ...,
         [-24.206888 , -24.835638 , -23.892513 , ..., -16.347507 ,
          -19.176886 , -19.49126  ],
         [-23.578136 , -24.521263 , -22.006262 , ..., -17.60501  ,
          -18.548134 , -19.805635 ],
         [-23.263762 , -25.150013 , -23.263762 , ..., -19.176886 ,
          -18.548134 , -19.49126  ]],

        [[-20.12001  , -18.86251  , -19.176886 , ..., -21.37751  ,
          -20.74876  , -20.74876  ],
         [-20.74876  , -19.805635 , -19.805635 , ..., -21.691887 ,
          -20.74876  , -21.691887 ],
         [-22.006262 , -20.434385 , -20.74876  , ..., -22.320637 ,
          -21.063135 , -22.006262 ],
         ...,
         [-23.892513 , -25.464388 , -23.263762 , ..., -16.97626  ,
          -19.805635 , -20.434385 ],
         [-23.263762 , -24.206888 , -21.37751  , ..., -18.233759 ,
          -19.176886 , -20.434385 ],
         [-22.320637 , -24.206888 , -22.320637 , ..., -20.12001  ,
          -19.49126  , -21.063135 ]],

        [[-10.37438  ,  -9.431255 ,  -9.116879 , ..., -10.688755 ,
          -10.688755 , -10.688755 ],
         [-10.688755 ,  -9.74563  ,  -9.74563  , ..., -11.003131 ,
          -10.060005 , -11.317506 ],
         [-11.003131 , -10.060005 , -10.060005 , ..., -11.631881 ,
          -10.37438  , -11.631881 ],
         ...,
         [-22.949387 , -24.206888 , -22.320637 , ..., -15.404383 ,
          -17.919384 , -18.548134 ],
         [-22.949387 , -23.578136 , -21.063135 , ..., -16.347507 ,
          -17.290634 , -18.233759 ],
         [-22.320637 , -23.578136 , -22.006262 , ..., -17.919384 ,
          -17.290634 , -18.548134 ]]]], dtype=float32)]
Mask unique values: [ 0 38 95]
APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
 18181.244298 s: IPC: Deinit ... !!!
 18181.732321 s: IPC: DeInit ... Done !!!
 18181.732384 s: MEM: Deinit ... !!!
 18181.735611 s: DDR_SHARED_MEM: Alloc's: 35 alloc's of 132765800 bytes 
 18181.735656 s: DDR_SHARED_MEM: Free's : 35 free's  of 132765800 bytes 
 18181.735676 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes 
 18181.735702 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!
root@am69-sk:/zxseg# python3 infer_test.py 
EP Error 'providers' and 'provider_options' should be the same length if both are given. when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Loaded model with providers: ['CPUExecutionProvider']
Output shape: (1, 150, 128, 128)
Output min/max: -37.647076 1.502726
Output dtype: float32
output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
          -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
         [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
          -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
         [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
          -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
         ...,
         [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
          -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
         [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
          -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
         [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
          -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],

        [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
           3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
         [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
           2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
         [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
          -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
         ...,
         [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
          -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
         [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
          -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
         [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
          -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],

        [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
          -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
         [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
          -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
         [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
          -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
         ...,
         [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
          -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
         [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
          -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
         [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
          -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],

        ...,

        [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
          -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
         [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
          -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
         [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
          -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
         ...,
         [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
          -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
         [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
          -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
         [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
          -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],

        [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
          -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
         [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
          -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
         [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
          -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
         ...,
         [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
          -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
         [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
          -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
         [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
          -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],

        [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
          -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
         [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
          -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
         [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
          -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
         ...,
         [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
          -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
         [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
          -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
         [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
          -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
      dtype=float32)]
Mask unique values: [  1   6  11  12  20  87 149]
root@am69-sk:/zxseg# 



please help,
regards Venkat
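
The second run in the log above falls back to the CPU because ONNX Runtime was given `providers` and `provider_options` lists of different lengths. A minimal sketch of one way to keep them aligned is shown here. The helper name `build_session_args` and the `artifacts_folder` key and path are illustrative assumptions, not taken from the original `infer_test.py`.

```python
def build_session_args(tidl_options=None):
    """Return (providers, provider_options) with exactly one options dict per provider."""
    providers = []
    provider_options = []
    if tidl_options is not None:
        # Offload to TIDL first; the option keys are whatever your artifacts require.
        providers.append("TIDLExecutionProvider")
        provider_options.append(dict(tidl_options))
    # The CPU provider always gets its own (empty) options dict.
    providers.append("CPUExecutionProvider")
    provider_options.append({})
    return providers, provider_options


# Hypothetical option key and path, for illustration only.
providers, provider_options = build_session_args({"artifacts_folder": "/path/to/artifacts"})
cpu_only, cpu_only_options = build_session_args()

# Usage with onnxruntime (not executed here):
#   import onnxruntime as ort
#   sess = ort.InferenceSession(model_path, providers=providers,
#                               provider_options=provider_options)
```

Passing one options entry per provider, even an empty one, avoids the length-mismatch error for both the TIDL and the CPU-only run.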

  • Hi Venkat,

    I was able to compile the model and run it under emulation and on the device. I will need some more information on the decreasing-accuracy part: I need to see the expected and actual results before I can make a determination. Also, in general, we cannot debug custom scripts; we can only support the standard TIDL tools. Occasionally we will give pointers if we see a glaring error, but we do not know enough about what you are trying to do to give accurate feedback.

    I have included the compilation and inference files I used (emulation and device), along with the input data.

    Compile under 11.00.06.00

    ./tidl_model_import.out import_segformer
    ========================= [Model Compilation Started] =========================

    Model compilation will perform the following stages:
    1. Parsing
    2. Graph Optimization
    3. Quantization & Calibration
    4. Memory Planning

    ============================== [Version Summary] ==============================

    -------------------------------------------------------------------------------
    | TIDL Tools Version | 11_00_06_00 |
    -------------------------------------------------------------------------------
    | C7x Firmware Version | 11_00_00_00 |
    -------------------------------------------------------------------------------

    ONNX model (Proto) file : NPL/model/segformer_b0_finetuned_ade_512_512_simp.onnx
    TIDL network file : out/tidl_net.bin
    TIDL IO info file : out/tidl_io_buff
    Current ONNX OpSet version : 17
    ============================ [Optimization started] ============================

    ----------------------------- Optimization Summary -----------------------------
    ---------------------------------------------------------------------------------
    | Layer | Nodes before optimization | Nodes after optimization |
    ---------------------------------------------------------------------------------
    | TIDL_BatchNormLayer | 0 | 16 |
    | TIDL_SliceLayer | 2 | 6 |
    | TIDL_TransposeLayer | 74 | 80 |
    | TIDL_ReLULayer | 1 | 0 |
    | TIDL_ConcatLayer | 1 | 1 |
    | TIDL_LayerNormLayer | 30 | 30 |
    | TIDL_EltWiseLayer | 104 | 128 |
    | TIDL_ConvolutionLayer | 20 | 16 |
    | TIDL_InnerProductLayer | 64 | 68 |
    | TIDL_ErfLayer | 8 | 0 |
    | TIDL_PoolingLayer | 2 | 2 |
    | TIDL_ResizeLayer | 4 | 4 |
    | TIDL_SoftMaxLayer | 8 | 8 |
    | TIDL_ConstDataLayer | 0 | 164 |
    | TIDL_SqueezeLayer | 6 | 0 |
    ---------------------------------------------------------------------------------

    Total nodes in subgraph: 597

    =========================== [Optimization completed] ===========================


    -------- Running Calibration in Float Mode to Collect Tensor Statistics --------
    [=============================================================================] 100 %

    ------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
    [=============================================================================] 100 %

    ==================== [Quantization & Calibration Completed] ====================

    ========================== [Memory Planning Started] ==========================


    ------------------------- Network Compiler Traces ------------------------------
    Successful Memory Allocation
    Successful Workload Creation

    ========================= [Memory Planning Completed] =========================

    Rerunning network compiler...
    ========================== [Memory Planning Started] ==========================


    ------------------------- Network Compiler Traces ------------------------------
    Successful Memory Allocation
    Successful Workload Creation

    ========================= [Memory Planning Completed] =========================

    ======================== Subgraph Compiled Successfully ========================

    Emulation Run:

    root@e87020451ea4:/home/root/tools/AM69A/tidl_tools# ./PC_dsp_test_dl_algo.out s:infer_segformer

    Processing config file #0 : infer_segformer
    ----------------------- TIDL Process with REF_ONLY FLOW------------------------

    # 0 . .. T 15614.39 .... ..... ... .... .....

    Device Run:

    root@am69-sk:/opt/tidl_test# ./TI_DEVICE_armv8_test_dl_algo_host_rt.out s:infer_seg_dev

    Processing config file #0 : infer_seg_dev
    APP: Init ... !!!
    9940.650785 s: MEM: Init ... !!!
    9940.650842 s: MEM: Initialized DMA HEAP (fd=5) !!!
    9940.650994 s: MEM: Init ... Done !!!
    9940.651013 s: IPC: Init ... !!!
    9940.684696 s: IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
    9940.692539 s: GTC Frequency = 200 MHz
    APP: Init ... Done !!!
    9940.692652 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_ERROR
    9940.692670 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_WARNING
    9940.692677 s: VX_ZONE_INFO: Globally Enabled VX_ZONE_INFO
    9940.693320 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-0
    9940.693469 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-1
    9940.693576 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-2
    9940.693681 s: VX_ZONE_INFO: [tivxPlatformCreateTargetId:169] Added target MPU-3
    9940.693696 s: VX_ZONE_INFO: [tivxInitLocal:202] Initialization Done !!!
    9940.693707 s: VX_ZONE_INFO: Globally Disabled VX_ZONE_INFO

    # NETWORK_INIT_TIME = 1010.65 (in ms, c7x @1GHz)
    ----------------------- TIDL Process with TARGET DATA FLOW ------------------------

    # NETWORK_EXECUTION_TIME = 36.34 (in ms, c7x @1GHz) with DDR_BANDWIDTH (Read + Write) = 112.39, 128.81, 241.20 (in Mega Bytes/frame) ... .... .....APP: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... Done !!!
    9941.734243 s: IPC: Deinit ... !!!
    9941.735204 s: IPC: DeInit ... Done !!!
    9941.735235 s: MEM: Deinit ... !!!
    9941.735248 s: DDR_SHARED_MEM: Alloc's: 7 alloc's of 10677976 bytes
    9941.735260 s: DDR_SHARED_MEM: Free's : 7 free's of 10677976 bytes
    9941.735270 s: DDR_SHARED_MEM: Open's : 0 allocs of 0 bytes
    9941.735288 s: MEM: Deinit ... Done !!!
    APP: Deinit ... Done !!!

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/import_5F00_segformer
    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/infer_5F00_segformer
    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/infer_5F00_seg_5F00_dev

    Regards,

    Chris

  • Mask unique values on CPU: [  1   6  11  12  20  87 149] (these are the classes found in the image, out of 150 total).
    Mask unique values on TIDL: [ 0 18].
    The compile ran without errors, but the masks are not detected correctly during inference.

    It runs on the device, but accuracy is low.
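
A list of unique class IDs alone is hard to act on. A minimal sketch of a fuller comparison between the CPU and TIDL outputs follows, assuming both are float32 logits of shape (1, 150, 128, 128) as printed in this thread. The function name `compare_logits` is hypothetical, and random data stands in for the real tensors.

```python
import numpy as np


def compare_logits(ref, test):
    """Compare two (1, C, H, W) logit tensors by their argmax class masks."""
    ref_mask = ref[0].argmax(axis=0)    # (H, W) class index per pixel
    test_mask = test[0].argmax(axis=0)
    return {
        "ref_classes": np.unique(ref_mask).tolist(),
        "test_classes": np.unique(test_mask).tolist(),
        "pixel_agreement": float((ref_mask == test_mask).mean()),
        "max_abs_diff": float(np.abs(ref - test).max()),
    }


# Stand-in data; replace with the CPUExecutionProvider and TIDL outputs.
rng = np.random.default_rng(0)
ref = rng.normal(size=(1, 150, 128, 128)).astype(np.float32)
report = compare_logits(ref, ref.copy())
print(report["pixel_agreement"], report["max_abs_diff"])
```

The pixel agreement and maximum absolute difference give a single number to track across recompiles, which is easier to share than the full tensor printout.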

  • I followed what you suggested, but allowedNode.txt is not generated. How can I generate it?

    root@61028b545efe:~/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools# ./tidl_model_import.out /root/ti3/edgeai-tidl-tools/kenny/import.cfg \
        --modelType 2 \
        --inputNetFile /root/ti3/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx \
        --outputNetFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_net.bin \
        --inputParamsFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff_template \
        --outputParamsFile /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff \
        --inDataNorm 1 --inMean 123.675 116.28 103.53 --inScale 0.017125 0.017507 0.017429 \
        --inData /root/ti3/edgeai-tidl-tools/kenny/dat/in_data_list.txt \
        --tidlStatsTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/PC_dsp_test_dl_algo.out \
        --perfSimTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/ti_cnnperfsim.out \
        --graphVizTool /root/ti3/edgeai-tidl-tools/tools/AM69A/tidl_tools/tidl_graphVisualiser.out \
        --inHeight 512 --inWidth 512 --inNumChannels 3 --numFrames 1
    ========================= [Model Compilation Started] =========================
    
    Model compilation will perform the following stages:
    1. Parsing
    2. Graph Optimization
    3. Quantization & Calibration
    4. Memory Planning
    
    ============================== [Version Summary] ==============================
    
    -------------------------------------------------------------------------------
    |          TIDL Tools Version          |              11_00_06_00             |
    -------------------------------------------------------------------------------
    |         C7x Firmware Version         |              11_00_00_00             |
    -------------------------------------------------------------------------------
    
    ONNX model (Proto) file      : /root/ti3/edgeai-tidl-tools/kenny/test_model/segformer_b0_finetuned_ade_512_512_simp.onnx  
    TIDL network file            : /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_net.bin  
    TIDL IO info file            : /root/ti3/edgeai-tidl-tools/kenny/using_bash/tidl_io_buff  
    Current ONNX OpSet version   : 17  
    ============================ [Optimization started] ============================
    
    ----------------------------- Optimization Summary -----------------------------
    ---------------------------------------------------------------------------------
    |          Layer         | Nodes before optimization | Nodes after optimization |
    ---------------------------------------------------------------------------------
    | TIDL_BatchNormLayer    |                         0 |                       16 |
    | TIDL_SliceLayer        |                         2 |                        6 |
    | TIDL_TransposeLayer    |                        74 |                       80 |
    | TIDL_ReLULayer         |                         1 |                        0 |
    | TIDL_ConcatLayer       |                         1 |                        1 |
    | TIDL_LayerNormLayer    |                        30 |                       30 |
    | TIDL_EltWiseLayer      |                       104 |                      128 |
    | TIDL_ConvolutionLayer  |                        20 |                       16 |
    | TIDL_InnerProductLayer |                        64 |                       68 |
    | TIDL_ErfLayer          |                         8 |                        0 |
    | TIDL_PoolingLayer      |                         2 |                        2 |
    | TIDL_ResizeLayer       |                         4 |                        4 |
    | TIDL_SoftMaxLayer      |                         8 |                        8 |
    | TIDL_ConstDataLayer    |                         0 |                      164 |
    | TIDL_SqueezeLayer      |                         6 |                        0 |
    ---------------------------------------------------------------------------------
    
    Total nodes in subgraph: 597
    
    =========================== [Optimization completed] ===========================
    
    
    -------- Running Calibration in Float Mode to Collect Tensor Statistics --------
    [=============================================================================] 100 %
    
    ------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
    [=============================================================================] 100 %
    
    ==================== [Quantization & Calibration Completed] ====================
    
    ========================== [Memory Planning Started] ==========================
    
    
    ------------------------- Network Compiler Traces ------------------------------
    Successful Memory Allocation
    Successful Workload Creation
    
    ========================= [Memory Planning Completed] =========================
    
    Rerunning network compiler...
    ========================== [Memory Planning Started] ==========================
    
    
    ------------------------- Network Compiler Traces ------------------------------
    Successful Memory Allocation
    Successful Workload Creation
    
    ========================= [Memory Planning Completed] =========================
    
    ======================== Subgraph Compiled Successfully ========================

    Artifacts:

    /root/ti3/edgeai-tidl-tools/kenny/using_bash
    ├── import.cfg.perf_sim_config.txt
    ├── import.cfg.qunat_stats_config.txt
    ├── import.cfg_stats_tool_out.bin
    ├── tidl_io_buff1.bin
    ├── tidl_net
    │   ├── bufinfolog_0.csv
    │   ├── bufinfolog_0.txt
    │   ├── perfSimInfo.bin
    │   └── wlinfolog_0.txt
    ├── tidl_net.bin
    ├── tidl_net.bin.layer_info.txt
    ├── tidl_net.bin.svg
    ├── tidl_net.bin_netLog.txt
    └── tidl_net.bin_paramDebug.csv




    Regards,
    venkat
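
When CPU and TIDL results differ, one thing worth ruling out is a preprocessing mismatch. The sketch below applies the `--inMean` / `--inScale` values from the command above, assuming the convention is `(pixel - mean) * scale` per channel on RGB input. The function name `normalize_nchw` is hypothetical. Whether the inference script should feed raw or already-normalized data depends on how the artifacts were compiled, so treat this as a check to run rather than a fix.

```python
import numpy as np

# Values copied from the import command in this thread.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([0.017125, 0.017507, 0.017429], dtype=np.float32)


def normalize_nchw(image_hwc_rgb):
    """Apply (pixel - mean) * scale per channel and return a (1, 3, H, W) tensor."""
    x = (image_hwc_rgb.astype(np.float32) - MEAN) * SCALE
    return np.transpose(x, (2, 0, 1))[np.newaxis, ...]


# Stand-in image: a uniform gray 512x512 RGB frame.
image = np.full((512, 512, 3), 128, dtype=np.uint8)
tensor = normalize_nchw(image)
print(tensor.shape, 1.0 / SCALE)
```

The reciprocals of the scale values are close to the usual ImageNet standard deviations (about 58.4, 57.1, 57.4), so the same normalization should be applied exactly once on both the CPU and TIDL paths.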

  • Hi Venkat,

    allowedNode.txt is an OSRT artifact, not a TIDLRT one. With TIDLRT, either all nodes run on the C7x/MMA or the compilation fails.

    Regards,

    Chris

  • Hi Chris,

    I got it working in TIDLRT. Are there any example documents available that show how to build a reference pipeline using OpenCV (cv2)? The documentation mentions checking the vision_apps directory, but I’d like a clear reference to help me write my own pipeline. After segmentation, I need to perform additional post-processing on the SoC, so this is my Plan B.

    Ultimately, I want to use this in OSRT.
    For reference, here's what I'm seeing:

    • Mask unique values on CPU: [1, 6, 11, 12, 20, 87, 149] (these represent the expected classes; total = 150)

    • Mask unique values from TIDL: [0, 18]

    As the CPU mask values show, detection is accurate there. However, it does not perform as well on TIDL.

    Regards,
    Venkat

  • Hi Venkat,

    Usually TIDLRT is more difficult to set up, so OSRT should be an easy follow-on. Is that all of the incorrect output: one 7-element array (correct) and one 2-element array (incorrect)? Are you expecting these two arrays to match in size and content? So that I can test, are you using my input image or something else?

    Here are some data flow docs:

    https://software-dl.ti.com/jacinto7/esd/processor-sdk-linux-am69a/10_00_00/exports/edgeai-docs/common/edgeai_dataflows.html#multi-input-multi-inference

    Regards,

    Chris

  • Hi Venkat,

    The output you presented does not make any sense.  The only output I see in your model is a tensor of 150x128x128.  

    output: 1492
    output: NodeArg(name='1492', type='tensor(float)', shape=[1, 150, 128, 128])

    What I need to debug this is your expected 1492 output tensor.  I do not know what the two output vectors from a previous post are. 

    Regards,

    Chris
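
A minimal sketch of one way to capture the expected output tensor for sharing follows. It assumes the reference comes from a CPU run and is written as raw little-endian float32. The helper names and the file name `expected_1492.bin` are hypothetical, and random data stands in for the real tensor.

```python
import os
import tempfile

import numpy as np


def save_reference(tensor, path):
    """Write a tensor as raw float32 and return its shape for later reloading."""
    arr = np.ascontiguousarray(tensor, dtype=np.float32)
    arr.tofile(path)
    return arr.shape


def load_reference(path, shape):
    """Read a raw float32 file back into the given shape."""
    return np.fromfile(path, dtype=np.float32).reshape(shape)


# Stand-in for the CPUExecutionProvider output of node 1492.
expected = np.random.default_rng(1).normal(size=(1, 150, 128, 128)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "expected_1492.bin")  # hypothetical file name
    shape = save_reference(expected, path)
    restored = load_reference(path, shape)
    size_bytes = os.path.getsize(path)
```

A raw file carries no shape information, so the shape (here 1x150x128x128) has to be communicated alongside it.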

  • Hi Chris,

    Thank you for taking the time to help with my issue.

    I obtained the model from the TI Edge AI TensorLab Model Zoo. When I checked the model on Netron.app, I saw that the output node was 1492.

    For testing, I used an image I found on Google by searching for "people in street."

    The masks are correct when I use the CPUExecutionProvider. However, when I use the TIDLExecutionProvider, the masks are terrible. I suspect this is an issue with artifact generation, as I faced a similar problem with pose estimation models as well.

    • The output shape was 1x150x128x128, and with CPU execution, the masks look good.
    • The two output vectors from my previous posts are the class numbers; I shared them to convey that the accuracy is low. I will attach the full output in this post.

    I read the documentation for tidlrt and wrote a script to convert out_binary to masks. While the masks were okay, they weren't great. As you mentioned, I would also prefer using onnxrt.


    TIDL o/p

    Loaded model with providers: ['TIDLExecutionProvider', 'CPUExecutionProvider']
    Output shape: (1, 150, 128, 128)
    Output min/max: -44.924595 -2.3853767
    Output dtype: float32
    output [array([[[[ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
               -2.7829394,  -2.7829394],
             [ -2.7829394,  -2.7829394,  -2.7829394, ...,  -3.1805022,
               -2.7829394,  -2.7829394],
             [ -3.1805022,  -2.7829394,  -3.1805022, ...,  -3.1805022,
               -3.1805022,  -2.7829394],
             ...,
             [ -3.578065 ,  -3.578065 ,  -3.578065 , ...,  -3.1805022,
               -3.9756277,  -3.9756277],
             [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
               -3.578065 ,  -3.9756277],
             [ -3.578065 ,  -3.578065 ,  -3.1805022, ...,  -3.578065 ,
               -3.578065 ,  -3.578065 ]],
    
            [[-13.517135 , -13.517135 , -13.517135 , ..., -15.504948 ,
              -15.107386 , -15.107386 ],
             [-13.914697 , -13.517135 , -13.914697 , ..., -15.902511 ,
              -15.504948 , -15.504948 ],
             [-14.709823 , -14.31226  , -14.31226  , ..., -15.902511 ,
              -15.504948 , -15.504948 ],
             ...,
             [-15.107386 , -15.504948 , -14.709823 , ..., -11.926883 ,
              -13.119572 , -13.119572 ],
             [-15.107386 , -15.504948 , -14.31226  , ..., -12.324446 ,
              -12.722009 , -13.119572 ],
             [-14.709823 , -15.504948 , -14.31226  , ..., -13.119572 ,
              -12.722009 , -13.119572 ]],
    
            [[-11.926883 , -11.529321 , -11.529321 , ..., -14.709823 ,
              -13.914697 , -13.119572 ],
             [-12.324446 , -11.926883 , -11.529321 , ..., -14.709823 ,
              -13.914697 , -13.914697 ],
             [-13.517135 , -13.119572 , -12.722009 , ..., -14.709823 ,
              -14.31226  , -13.517135 ],
             ...,
             [-15.504948 , -16.697636 , -15.504948 , ..., -11.529321 ,
              -12.324446 , -13.119572 ],
             [-15.504948 , -17.0952   , -14.31226  , ..., -11.926883 ,
              -11.529321 , -13.119572 ],
             [-15.107386 , -15.107386 , -14.31226  , ..., -12.722009 ,
              -11.529321 , -13.119572 ]],
    
            ...,
    
            [[-21.070827 , -20.275702 , -19.87814  , ..., -22.263515 ,
              -22.263515 , -22.263515 ],
             [-21.46839  , -21.070827 , -20.275702 , ..., -23.058641 ,
              -22.661077 , -23.058641 ],
             [-22.263515 , -21.46839  , -20.673264 , ..., -23.456203 ,
              -22.661077 , -23.058641 ],
             ...,
             [-25.444017 , -26.239143 , -25.046455 , ..., -19.083014 ,
              -21.46839  , -21.46839  ],
             [-25.84158  , -25.84158  , -23.456203 , ..., -20.673264 ,
              -21.46839  , -22.263515 ],
             [-25.444017 , -27.03427  , -24.648891 , ..., -21.46839  ,
              -21.46839  , -21.865952 ]],
    
            [[-19.083014 , -18.287888 , -18.287888 , ..., -20.275702 ,
              -19.87814  , -19.87814  ],
             [-19.87814  , -19.083014 , -19.083014 , ..., -20.673264 ,
              -20.275702 , -20.673264 ],
             [-21.070827 , -19.480576 , -19.87814  , ..., -20.673264 ,
              -20.275702 , -20.275702 ],
             ...,
             [-24.648891 , -25.84158  , -23.456203 , ..., -18.68545  ,
              -21.46839  , -21.865952 ],
             [-24.648891 , -25.444017 , -22.263515 , ..., -20.275702 ,
              -21.070827 , -22.661077 ],
             [-23.456203 , -25.046455 , -23.058641 , ..., -21.865952 ,
              -21.46839  , -22.263515 ]],
    
            [[ -7.15613  ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
               -7.553693 ,  -7.553693 ],
             [ -7.553693 ,  -6.7585673,  -6.7585673, ...,  -7.9512553,
               -6.7585673,  -7.9512553],
             [ -8.348818 ,  -7.553693 ,  -7.15613  , ...,  -7.9512553,
               -7.15613  ,  -7.9512553],
             ...,
             [-20.275702 , -21.070827 , -19.083014 , ..., -15.107386 ,
              -17.0952   , -17.492762 ],
             [-20.275702 , -20.673264 , -18.68545  , ..., -15.902511 ,
              -16.697636 , -17.492762 ],
             [-19.083014 , -20.673264 , -19.480576 , ..., -16.697636 ,
              -17.0952   , -17.492762 ]]]], dtype=float32)]
    




    CPU output:

    Loaded model with providers: ['CPUExecutionProvider']
    Output shape: (1, 150, 128, 128)
    Output min/max: -37.647076 1.502726
    Output dtype: float32
    output [array([[[[-9.10738182e+00, -8.46180344e+00, -8.89690208e+00, ...,
              -7.30843163e+00, -7.19831467e+00, -7.42499590e+00],
             [-9.50431728e+00, -8.79891968e+00, -9.27894783e+00, ...,
              -7.61665154e+00, -7.40585041e+00, -7.66443872e+00],
             [-1.01199865e+01, -9.30077934e+00, -9.55318069e+00, ...,
              -7.91833210e+00, -7.49162006e+00, -7.73509026e+00],
             ...,
             [-8.89202118e+00, -9.60954189e+00, -8.67526245e+00, ...,
              -8.45482254e+00, -9.26228619e+00, -9.42125702e+00],
             [-8.94773388e+00, -9.37217522e+00, -7.73367977e+00, ...,
              -8.67119598e+00, -8.89135551e+00, -9.23087692e+00],
             [-8.78436470e+00, -8.72591019e+00, -7.97214413e+00, ...,
              -9.18196774e+00, -8.78701019e+00, -9.19165039e+00]],
    
            [[-1.05580688e+00, -5.19240439e-01, -6.46204770e-01, ...,
               3.97472233e-02,  1.96441859e-02, -2.23712787e-01],
             [-1.28475881e+00, -6.92354441e-01, -9.00263131e-01, ...,
               2.34207064e-02,  8.41307044e-02, -1.85798183e-01],
             [-1.68210590e+00, -9.01914299e-01, -1.04362941e+00, ...,
              -1.41012087e-01,  7.51651824e-02, -1.63039312e-01],
             ...,
             [-8.75086594e+00, -9.43372631e+00, -8.31778622e+00, ...,
              -1.16131220e+01, -1.23025236e+01, -1.22214193e+01],
             [-8.66691113e+00, -9.01147556e+00, -7.58560610e+00, ...,
              -1.17375622e+01, -1.17695818e+01, -1.21904783e+01],
             [-8.39127922e+00, -8.39451599e+00, -7.91367579e+00, ...,
              -1.19990225e+01, -1.17306757e+01, -1.22769880e+01]],
    
            [[-9.35542679e+00, -8.22606564e+00, -9.77154541e+00, ...,
              -4.36007261e+00, -3.64588404e+00, -2.74775696e+00],
             [-9.54254723e+00, -8.22373581e+00, -9.83346272e+00, ...,
              -4.21152925e+00, -3.36076975e+00, -2.49715137e+00],
             [-1.02955265e+01, -8.80494118e+00, -9.67026711e+00, ...,
              -4.68067694e+00, -3.28478169e+00, -2.06064844e+00],
             ...,
             [-1.15484982e+01, -1.39187107e+01, -1.15395317e+01, ...,
              -1.06908541e+01, -1.08596764e+01, -1.14195423e+01],
             [-1.20424957e+01, -1.34465389e+01, -8.34186172e+00, ...,
              -1.16886873e+01, -1.00103464e+01, -1.12006750e+01],
             [-1.16304359e+01, -1.15399914e+01, -7.32819128e+00, ...,
              -1.30932884e+01, -9.65425110e+00, -1.13869896e+01]],
    
            ...,
    
            [[-1.65767632e+01, -1.48137388e+01, -1.53054876e+01, ...,
              -1.22598991e+01, -1.19845648e+01, -1.21397972e+01],
             [-1.73164673e+01, -1.56152582e+01, -1.56489859e+01, ...,
              -1.25846987e+01, -1.23050089e+01, -1.25115051e+01],
             [-1.88177929e+01, -1.66239719e+01, -1.62385197e+01, ...,
              -1.31708536e+01, -1.24973049e+01, -1.26569748e+01],
             ...,
             [-1.65434132e+01, -1.66482334e+01, -1.49522409e+01, ...,
              -1.73858204e+01, -1.91236134e+01, -1.93649387e+01],
             [-1.65106354e+01, -1.62999249e+01, -1.37105570e+01, ...,
              -1.81827908e+01, -1.84118557e+01, -1.91355877e+01],
             [-1.60666618e+01, -1.56589861e+01, -1.40776854e+01, ...,
              -1.88848476e+01, -1.80791683e+01, -1.91343880e+01]],
    
            [[-1.40990629e+01, -1.18888369e+01, -1.17627220e+01, ...,
              -9.66371632e+00, -9.24732304e+00, -9.20552731e+00],
             [-1.45914030e+01, -1.25023279e+01, -1.20748682e+01, ...,
              -9.73106861e+00, -9.32184887e+00, -9.38448811e+00],
             [-1.58103552e+01, -1.32339678e+01, -1.24932852e+01, ...,
              -9.88481331e+00, -9.29177380e+00, -9.32881260e+00],
             ...,
             [-1.67702923e+01, -1.75025196e+01, -1.59494038e+01, ...,
              -1.84159298e+01, -1.95513268e+01, -1.96526814e+01],
             [-1.68778381e+01, -1.70773335e+01, -1.45592756e+01, ...,
              -1.91143475e+01, -1.89065056e+01, -1.97410011e+01],
             [-1.64891090e+01, -1.61917324e+01, -1.50849390e+01, ...,
              -2.03280964e+01, -1.88051624e+01, -2.02607059e+01]],
    
            [[-1.16451569e+01, -1.01056519e+01, -1.04362154e+01, ...,
              -7.99716377e+00, -8.05412006e+00, -7.95117903e+00],
             [-1.17157431e+01, -1.03467522e+01, -1.02508469e+01, ...,
              -7.81100845e+00, -7.57984543e+00, -8.00262260e+00],
             [-1.24734840e+01, -1.07574902e+01, -1.04642143e+01, ...,
              -8.11416531e+00, -7.45732689e+00, -7.94904995e+00],
             ...,
             [-9.75324059e+00, -1.07886009e+01, -9.89416981e+00, ...,
              -1.33662148e+01, -1.47563696e+01, -1.48689194e+01],
             [-9.78142452e+00, -1.02218876e+01, -8.56538582e+00, ...,
              -1.35902987e+01, -1.42914600e+01, -1.45997658e+01],
             [-9.69767570e+00, -9.56430054e+00, -8.45411301e+00, ...,
              -1.51341333e+01, -1.42068319e+01, -1.50633011e+01]]]],
          dtype=float32)]
    

    Regards,  

    Venkat

    ---

  • Hi Venkat,

    These are incomplete.  I need the .npy or .bin files to make a decent comparison.

    Regards,

    Chris

  • Hi Chris 

    I currently don’t have access to the model and device as I’m out of the office. I’ll share them as soon as possible (Monday).

    Thanks a lot 

    Regards,

    Venkat

  • Hi Venkat,

    I did not mean to rush you; whenever you can is fine. If you have the script, just add the following after the session.run call:

    outputs = session.run(outs, inp)
    :
    # Write each output tensor to <output name>.bin as raw values (no header)
    for i in range(len(outputs)):
        arr = outputs[i].flatten().astype(convertDataType(session.get_outputs()[i].type))
        arr.tofile(outs[i].replace('/', '') + '.bin', sep="")

    That should generate a 1492.bin file.  If running under ONNX it will be a bunch of float32 values.
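
    For reference, a raw dump like this can be read back roughly as follows. This is only a sketch: the helper name load_raw_output is hypothetical, the shape (1, 150, 128, 128) is taken from the CPU log earlier in the thread, and the demo writes synthetic data to a temporary file rather than using a real model output.

```python
import os
import tempfile

import numpy as np

# Assumption: output shape as printed in the CPU log in this thread.
OUT_SHAPE = (1, 150, 128, 128)


def load_raw_output(path, shape=OUT_SHAPE, dtype=np.float32):
    """Read a headerless .bin dump and restore the tensor shape.

    A raw dump stores no shape or dtype, so both must be supplied by the caller.
    """
    data = np.fromfile(path, dtype=dtype)
    expected = int(np.prod(shape))
    if data.size != expected:
        raise ValueError(f"{path}: got {data.size} values, expected {expected}")
    return data.reshape(shape)


def demo_roundtrip():
    """Write synthetic float32 data the same way as the dump loop, then reload it."""
    rng = np.random.default_rng(0)
    fake = rng.standard_normal(OUT_SHAPE).astype(np.float32)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "1492.bin")
        fake.flatten().tofile(path, sep="")
        restored = load_raw_output(path)
    return fake, restored


if __name__ == "__main__":
    fake, restored = demo_roundtrip()
    print(restored.shape, np.array_equal(fake, restored))
```

    The size check is there because reading a file with the wrong dtype (for example uint8 data read as float32) changes the element count, so a mismatch shows up immediately instead of producing a silently wrong tensor.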

    Regards,

    Chris

  • ti_e2e_share.zip

    Hi Chris,

    I've shared the zip for both CPU and TI ONNX inference.
    The image used for inference is the same one you used: a Dassault Rafale jet taking off from a carrier.

    Regards,
    Venkat

  • Hi Chris,

    Any updates?

  • Hi  Venkat,

    I looked at your data and compared your ONNX output to my TIDL-generated 8-bit output, with both normalized to 0-255. The data looks close. I will do a layer-by-layer MSE comparison to see whether it diverges somewhere. Right now, from a TIDL point of view, the model imports, runs, and produces quantized data close to the ONNX output, so there may be a problem with the model itself.

    import numpy as np
    import matplotlib.pyplot as plt


    #tidl_1492 = np.fromfile('/shared/1492_ti_onnx.bin',dtype=np.float32)
    tidl_1492 = np.fromfile('/shared/jet_tidl_out.bin',dtype=np.uint8)
    print(len(tidl_1492))

    onnx_1492 = np.fromfile('/shared/1492_cpu_onnx.bin',dtype=np.float32)
    print(len(onnx_1492))

    # Normalize TIDL data to 0-255
    ntidl_1492 = (tidl_1492-tidl_1492.min())/(tidl_1492.max()-tidl_1492.min())
    ntidl_1492_uint8 = (ntidl_1492 * 255).astype(np.uint8)


    # Normalize ONNX data to 0-255
    nonnx_1492 = (onnx_1492-onnx_1492.min())/(onnx_1492.max()-onnx_1492.min())
    nonnx_1492_uint8 = (nonnx_1492 * 255).astype(np.uint8)


    plt.plot(ntidl_1492_uint8, label='TIDL Int8')
    plt.plot(nonnx_1492_uint8,label='ONNX Float')


    plt.legend(loc='best')
    plt.show()

    I will have the layer by layer analysis complete in a couple of days.
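
    As a rough numeric complement to the plot, the two outputs could also be compared with an MSE and a per-pixel argmax check. This is a sketch under assumptions, not the TIDL tooling: the function names are hypothetical, the arrays are assumed to be already reshaped to (1, classes, H, W), and the demo uses small synthetic data with an illustrative quantization step.

```python
import numpy as np


def minmax_to_uint8(x):
    """Min-max scale an array to 0-255, mirroring the normalization in the script above."""
    x = np.asarray(x, dtype=np.float64)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros(x.shape, dtype=np.uint8)
    return ((x - x.min()) / span * 255).astype(np.uint8)


def mse(a, b):
    """Mean squared error, computed in float64 so uint8 inputs cannot wrap around."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.mean(diff ** 2))


def argmax_agreement(ref_logits, test_logits, class_axis=1):
    """Fraction of pixels where both outputs pick the same class."""
    ref_cls = np.asarray(ref_logits).argmax(axis=class_axis)
    test_cls = np.asarray(test_logits).argmax(axis=class_axis)
    return float(np.mean(ref_cls == test_cls))


if __name__ == "__main__":
    # Synthetic stand-in for a float reference and a coarsely quantized copy.
    rng = np.random.default_rng(0)
    ref = rng.standard_normal((1, 5, 8, 8)).astype(np.float32)
    step = 0.4  # illustrative quantization step, not a measured value
    test = (np.round(ref / step) * step).astype(np.float32)

    print("MSE (0-255 normalized):", mse(minmax_to_uint8(ref), minmax_to_uint8(test)))
    print("argmax agreement:", argmax_agreement(ref, test))
```

    Min-max normalizing each array independently can hide a scale or offset difference between the two outputs, so the argmax agreement may be closer to what the segmentation accuracy actually reflects.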

    Regards,

    Chris