Part Number: TDA4VH-Q1
Tool/software:
Hi, TI support
I want to use the GridSample layer with edgeai-tidl-tools tag 10_00_08_00.
Could you provide a simple GridSample ONNX model example? I can then design my model based on it.
Best Regards
Henry
This thread has been locked.
Hi Henry2333,
Attached is a sample GridSample model, along with the PyTorch code I used to generate it (indentation restored below). I have validated that this model compiles and runs with TIDL 10.01, but this is standard ONNX and PyTorch usage and has nothing TIDL-specific in it.
https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/gridsample.onnx
import torch
import torch.nn.functional as F

class ModelWithGridSample(torch.nn.Module):
    def __init__(self, input_size):
        super(ModelWithGridSample, self).__init__()
        # Define any layers needed before or after grid_sample
        self.conv = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, x, grid):
        # x: input tensor (N, C, H_in, W_in)
        x = self.conv(x)
        output = F.grid_sample(x, grid, mode='bilinear', align_corners=True)
        return output

input_size = 1
model = ModelWithGridSample(input_size)
model.eval()

dummy_input = torch.randn(1, 1, 36, 48)
# grid: flow-field tensor (N, H_out, W_out, 2)
grid = torch.randn(1, 36, 48, 2)
my_tuple = (dummy_input, grid)

torch.onnx.export(model,              # model being run
                  my_tuple,           # model input (or a tuple for multiple inputs)
                  "gridsample.onnx",  # where to save the model (file or file-like object)
                  export_params=True,         # store the trained parameter weights inside the model file
                  opset_version=16,           # the ONNX opset version to export to
                  do_constant_folding=False,  # whether to run constant folding for optimization
                  input_names=['reshape_0', 'unsqueeze_0'],  # the model's input names
                  output_names=['myoutput'])                 # the model's output names
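One detail worth keeping in mind when designing a model around GridSample: the grid holds normalized coordinates in [-1, 1], not pixel indices, so the random grid in the export script above samples arbitrary locations. A minimal sketch of this convention, using torch.nn.functional.affine_grid with an identity transform to build an identity grid (plain PyTorch, nothing TIDL-specific):

```python
import torch
import torch.nn.functional as F

# affine_grid with an identity affine matrix produces a (N, H_out, W_out, 2)
# grid of normalized coordinates spanning [-1, 1].
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, list(x.shape), align_corners=True)

# Sampling with the identity grid reproduces the input exactly
# (align_corners must match between affine_grid and grid_sample).
out = F.grid_sample(x, grid, mode='bilinear', align_corners=True)
assert torch.allclose(out, x, atol=1e-6)
```

With align_corners=False the same identity theta still maps the image onto itself, but the corner pixels are treated differently, so the two calls must use the same setting or the round trip is no longer exact.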
Regards,
Chris
root@6530a5cb9657:/home/root/examples/osrt_python/advanced_examples/unit_tests_validation/ort# python3 ./onnxrt_ep.py -c -m gridsample
['gridsample']
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['gridsample']
Running_Model : gridsample
Running shape inference on model ../unit_test_models/gridsample.onnx
tidl_tools_path = /home/root/tidl_tools/
artifacts_folder = ../model-artifacts//gridsample/artifacts
tidl_tensor_bits = 8
debug_level = 4
num_tidl_subgraphs = 16
num_tidl_subgraph_max_node = 0
enable_rt_multi_subgraph_support = 0
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 1
tidl_calibration_options:bias_calibration_iterations = 1
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
ONNX QDQ Enabled = 0
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 3
output_feature_16bit_names_list =
m_params_16bit_names_list =
m_single_core_layers_names_list =
Inference mode = 0
Number of cores = 1
reserved_compile_constraints_flag = 83886080
partial_init_during_compile = 0
packetize_mode = 0
ti_internal_reserved_1 =
========================= [Model Compilation Started] =========================
Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning
============================== [Version Summary] ==============================
-------------------------------------------------------------------------------
| TIDL Tools Version | 10_01_00_01 |
-------------------------------------------------------------------------------
| C7x Firmware Version | 10_01_00_01 |
-------------------------------------------------------------------------------
| Runtime Version | 1.15.0 |
-------------------------------------------------------------------------------
| Model Opset Version | 16 |
-------------------------------------------------------------------------------
NOTE: The runtime version here specifies ONNXRT_VERSION+TIDL_VERSION
Ex: 1.14.0+1000XXXX -> ONNXRT 1.14.0 and a TIDL_VERSION 10.00.XX.XX
============================== [Parsing Started] ==============================
[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - Conv, Node name - /conv/Conv -- [tidl_onnxRtImport_core.cpp, 587]
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - GridSample, Node name - /GridSample -- [tidl_onnxRtImport_core.cpp, 587]
------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
| Core | No. of Nodes | Number of Subgraphs |
-------------------------------------------------------------------------------
| C7x | 2 | 1 |
| CPU | 0 | x |
-------------------------------------------------------------------------------
Running Runtimes GraphViz - /home/root/tidl_tools//tidl_graphVisualiser_runtimes.out ../model-artifacts//gridsample/artifacts/allowedNode.txt ../model-artifacts//gridsample/artifacts/tempDir/graphvizInfo.txt ../model-artifacts//gridsample/artifacts/tempDir/runtimes_visualization.svg
============================= [Parsing Completed] =============================
TIDL_createStateImportFunc Started:
Compute on node : TIDLExecutionProvider_TIDL_0_0
0, Conv, 3, 1, reshape_0, /conv/Conv_output_0
1, GridSample, 2, 1, /conv/Conv_output_0, myoutput
Input tensor name - reshape_0
Input tensor name - unsqueeze_0
Output tensor name - myoutput
In TIDL_onnxRtImportInit subgraph_name=subgraph_0
Layer 0, subgraph id subgraph_0, name=myoutput
Layer 1, subgraph id subgraph_0, name=reshape_0
Layer 2, subgraph id subgraph_0, name=unsqueeze_0
==================== [Optimization for subgraph_0 Started] ====================
In TIDL_runtimesOptimizeNet: LayerIndex = 5, dataIndex = 4
----------------------------- Optimization Summary -----------------------------
--------------------------------------------------------------------------------
| Layer | Nodes before optimization | Nodes after optimization |
--------------------------------------------------------------------------------
| TIDL_ConvolutionLayer | 1 | 1 |
| TIDL_GridSampleLayer | 1 | 1 |
--------------------------------------------------------------------------------
=================== [Optimization for subgraph_0 Completed] ===================
In TIDL_runtimesPostProcessNet
************ in TIDL_subgraphRtCreate ************
The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.6s: VX_ZONE_ERROR:Enabled
0.10s: VX_ZONE_WARNING:Enabled
0.2185s: VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
============= [Quantization & Calibration for subgraph_0 Started] =============
******* In TIDL_subgraphRtInvoke ********
0 1.00000 -3.04614 3.17097 6
2 1.00000 -3.04614 3.17097 6
1 1.00000 -3.11686 3.80166 6
3 1.00000 -3.11686 3.80166 6
4 1.00000 -2.08905 1.92511 6
5 1.00000 -1.39632 0.96554 6
6 1.00000 -1.39632 0.96554 6
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,Multic7xContextCopyCycles,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 12.000000 803.000000 144.000000
******* TIDL_subgraphRtInvoke done ********
In TIDL_runtimesPostProcessNet
-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %
------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
[=============================================================================] 100 %
==================== [Quantization & Calibration Completed] ====================
========================== [Memory Planning Started] ==========================
------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation
========================= [Memory Planning Completed] =========================
======================== Subgraph Compiled Successfully ========================
Completed model - gridsample.onnx
Name : gridsample , Total time : 1587.98, Offload Time : 0.80 , DDR RW MBs : 0
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 28 alloc's of 115520969 bytes
MEM: Free's : 28 free's of 115520969 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!
root@6530a5cb9657:/home/root/examples/osrt_python/advanced_examples/unit_tests_validation/ort# python3 ./onnxrt_ep.py -m gridsample
['gridsample']
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['gridsample']
Running_Model : gridsample
libtidl_onnxrt_EP loaded 0x5b4854dce2e0
artifacts_folder = ../model-artifacts//gridsample/artifacts
debug_level = 4
target_priority = 0
max_pre_empt_delay = 340282346638528859811704183484516925440.000000
Final number of subgraphs created are : 1, - Offloaded Nodes - 2, Total Nodes - 2
In TIDL_createStateInfer
Compute on node : TIDLExecutionProvider_TIDL_0_0
************ in TIDL_subgraphRtCreate ************
The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.10s: VX_ZONE_ERROR:Enabled
0.13s: VX_ZONE_WARNING:Enabled
0.2744s: VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
******* In TIDL_subgraphRtInvoke ********
0 1.00000 -3.04614 3.17097 6
2 32.00000 -3.03125 3.15625 1
4 49.71195 -2.09205 1.91101 1
1 1.00000 -3.11686 3.80166 6
3 32768.00000 -1.00000 0.99997 3
5 49.71195 -1.46846 1.10637 1
6 1.00000 -1.46846 1.10637 6
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,Multic7xContextCopyCycles,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 14.000000 24000.000000 104.000000
******* TIDL_subgraphRtInvoke done ********
Completed model - gridsample.onnx
Name : gridsample , Total time : 24.07, Offload Time : 24.00 , DDR RW MBs : 0
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 28 alloc's of 75485531 bytes
MEM: Free's : 28 free's of 75485531 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!