Part Number: TDA4VH-Q1
Tool/software:
Hi, TI support
I want to use the GridSample layer with edgeai-tidl-tools tag 10_00_08_00.
Could you provide a simple GridSample ONNX model example? I can then design my model based on it.
Best Regards
Henry
This thread has been locked.
Hi Henry2333,
Attached is a sample GridSample model, along with the PyTorch code I used to generate it (indentation restored below). I have validated that this model compiles and runs with TIDL 10.01, but this is standard ONNX and PyTorch usage and has nothing TIDL-specific in it.
https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/gridsample.onnx
import torch
import torch.nn.functional as F

class ModelWithGridSample(torch.nn.Module):
    def __init__(self, input_size):
        super(ModelWithGridSample, self).__init__()
        # Define any layers needed before or after grid_sample
        self.conv = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, x, grid):
        # x: input tensor (N, C, H_in, W_in)
        x = self.conv(x)
        output = F.grid_sample(x, grid, mode='bilinear', align_corners=True)
        return output

input_size = 1
model = ModelWithGridSample(input_size)
model.eval()

dummy_input = torch.randn(1, 1, 36, 48)
# grid: flow-field tensor (N, H_out, W_out, 2)
grid = torch.randn(1, 36, 48, 2)
my_tuple = (dummy_input, grid)

torch.onnx.export(model,              # model being run
                  my_tuple,           # model input (or a tuple for multiple inputs)
                  "gridsample.onnx",  # where to save the model (file or file-like object)
                  export_params=True,         # store the trained parameter weights inside the model file
                  opset_version=16,           # the ONNX opset version to export to
                  do_constant_folding=False,  # whether to run constant folding for optimization
                  input_names=['reshape_0', 'unsqueeze_0'],  # the model's input names
                  output_names=['myoutput'])                 # the model's output names
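One detail worth keeping in mind when designing a model around GridSample: the grid holds normalized coordinates in [-1, 1], not pixel indices, so the random grid in the export script above samples arbitrary locations. A minimal sketch of this convention, using torch.nn.functional.affine_grid with an identity transform to build an identity grid (plain PyTorch, nothing TIDL-specific):

```python
import torch
import torch.nn.functional as F

# affine_grid with an identity affine matrix produces a (N, H_out, W_out, 2)
# grid of normalized coordinates spanning [-1, 1].
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, list(x.shape), align_corners=True)

# Sampling with the identity grid reproduces the input exactly
# (align_corners must match between affine_grid and grid_sample).
out = F.grid_sample(x, grid, mode='bilinear', align_corners=True)
assert torch.allclose(out, x, atol=1e-6)
```

With align_corners=False the same identity theta still maps the image onto itself, but the corner pixels are treated differently, so the two calls must use the same setting or the round trip is no longer exact.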
Regards,
Chris
root@6530a5cb9657:/home/root/examples/osrt_python/advanced_examples/unit_tests_validation/ort# python3 ./onnxrt_ep.py -c -m gridsample
['gridsample']
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['gridsample']
Running_Model : gridsample
Running shape inference on model ../unit_test_models/gridsample.onnx
tidl_tools_path = /home/root/tidl_tools/
artifacts_folder = ../model-artifacts//gridsample/artifacts
tidl_tensor_bits = 8
debug_level = 4
num_tidl_subgraphs = 16
num_tidl_subgraph_max_node = 0
enable_rt_multi_subgraph_support = 0
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 1
tidl_calibration_options:bias_calibration_iterations = 1
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
ONNX QDQ Enabled = 0
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 3
output_feature_16bit_names_list =
m_params_16bit_names_list =
m_single_core_layers_names_list =
Inference mode = 0
Number of cores = 1
reserved_compile_constraints_flag = 83886080
partial_init_during_compile = 0
packetize_mode = 0
ti_internal_reserved_1 =
========================= [Model Compilation Started] =========================
Model compilation will perform the following stages:
1. Parsing
2. Graph Optimization
3. Quantization & Calibration
4. Memory Planning
============================== [Version Summary] ==============================
-------------------------------------------------------------------------------
| TIDL Tools Version | 10_01_00_01 |
-------------------------------------------------------------------------------
| C7x Firmware Version | 10_01_00_01 |
-------------------------------------------------------------------------------
| Runtime Version | 1.15.0 |
-------------------------------------------------------------------------------
| Model Opset Version | 16 |
-------------------------------------------------------------------------------
NOTE: The runtime version here specifies ONNXRT_VERSION+TIDL_VERSION
Ex: 1.14.0+1000XXXX -> ONNXRT 1.14.0 and a TIDL_VERSION 10.00.XX.XX
============================== [Parsing Started] ==============================
[TIDL Import] [PARSER] WARNING: Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - Conv, Node name - /conv/Conv -- [tidl_onnxRtImport_core.cpp, 587]
[TIDL Import] [PARSER] SUPPORTED: Layers type supported by TIDL --- layer type - GridSample, Node name - /GridSample -- [tidl_onnxRtImport_core.cpp, 587]
------------------------- Subgraph Information Summary -------------------------
-------------------------------------------------------------------------------
| Core | No. of Nodes | Number of Subgraphs |
-------------------------------------------------------------------------------
| C7x | 2 | 1 |
| CPU | 0 | x |
-------------------------------------------------------------------------------
Running Runtimes GraphViz - /home/root/tidl_tools//tidl_graphVisualiser_runtimes.out ../model-artifacts//gridsample/artifacts/allowedNode.txt ../model-artifacts//gridsample/artifacts/tempDir/graphvizInfo.txt ../model-artifacts//gridsample/artifacts/tempDir/runtimes_visualization.svg
============================= [Parsing Completed] =============================
TIDL_createStateImportFunc Started:
Compute on node : TIDLExecutionProvider_TIDL_0_0
0, Conv, 3, 1, reshape_0, /conv/Conv_output_0
1, GridSample, 2, 1, /conv/Conv_output_0, myoutput
Input tensor name - reshape_0
Input tensor name - unsqueeze_0
Output tensor name - myoutput
In TIDL_onnxRtImportInit subgraph_name=subgraph_0
Layer 0, subgraph id subgraph_0, name=myoutput
Layer 1, subgraph id subgraph_0, name=reshape_0
Layer 2, subgraph id subgraph_0, name=unsqueeze_0
==================== [Optimization for subgraph_0 Started] ====================
In TIDL_runtimesOptimizeNet: LayerIndex = 5, dataIndex = 4
----------------------------- Optimization Summary -----------------------------
--------------------------------------------------------------------------------
| Layer | Nodes before optimization | Nodes after optimization |
--------------------------------------------------------------------------------
| TIDL_ConvolutionLayer | 1 | 1 |
| TIDL_GridSampleLayer | 1 | 1 |
--------------------------------------------------------------------------------
=================== [Optimization for subgraph_0 Completed] ===================
In TIDL_runtimesPostProcessNet
************ in TIDL_subgraphRtCreate ************
The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.6s: VX_ZONE_ERROR:Enabled
0.10s: VX_ZONE_WARNING:Enabled
0.2185s: VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
============= [Quantization & Calibration for subgraph_0 Started] =============
******* In TIDL_subgraphRtInvoke ********
0 1.00000 -3.04614 3.17097 6
2 1.00000 -3.04614 3.17097 6
1 1.00000 -3.11686 3.80166 6
3 1.00000 -3.11686 3.80166 6
4 1.00000 -2.08905 1.92511 6
5 1.00000 -1.39632 0.96554 6
6 1.00000 -1.39632 0.96554 6
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,Multic7xContextCopyCycles,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 12.000000 803.000000 144.000000
******* TIDL_subgraphRtInvoke done ********
In TIDL_runtimesPostProcessNet
-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
[=============================================================================] 100 %
------------------ Fixed-point Calibration Iteration [1 / 1]: ------------------
[=============================================================================] 100 %
==================== [Quantization & Calibration Completed] ====================
========================== [Memory Planning Started] ==========================
------------------------- Network Compiler Traces ------------------------------
Successful Memory Allocation
Successful Workload Creation
========================= [Memory Planning Completed] =========================
======================== Subgraph Compiled Successfully ========================
Completed model - gridsample.onnx
Name : gridsample , Total time : 1587.98, Offload Time : 0.80 , DDR RW MBs : 0
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 28 alloc's of 115520969 bytes
MEM: Free's : 28 free's of 115520969 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!
root@6530a5cb9657:/home/root/examples/osrt_python/advanced_examples/unit_tests_validation/ort# python3 ./onnxrt_ep.py -m gridsample
['gridsample']
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['gridsample']
Running_Model : gridsample
libtidl_onnxrt_EP loaded 0x5b4854dce2e0
artifacts_folder = ../model-artifacts//gridsample/artifacts
debug_level = 4
target_priority = 0
max_pre_empt_delay = 340282346638528859811704183484516925440.000000
Final number of subgraphs created are : 1, - Offloaded Nodes - 2, Total Nodes - 2
In TIDL_createStateInfer
Compute on node : TIDLExecutionProvider_TIDL_0_0
************ in TIDL_subgraphRtCreate ************
The soft limit is 10240
The hard limit is 10240
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.10s: VX_ZONE_ERROR:Enabled
0.13s: VX_ZONE_WARNING:Enabled
0.2744s: VX_ZONE_INIT:[tivxInit:190] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
******* In TIDL_subgraphRtInvoke ********
0 1.00000 -3.04614 3.17097 6
2 32.00000 -3.03125 3.15625 1
4 49.71195 -2.09205 1.91101 1
1 1.00000 -3.11686 3.80166 6
3 32768.00000 -1.00000 0.99997 3
5 49.71195 -1.46846 1.10637 1
6 1.00000 -1.46846 1.10637 6
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,Multic7xContextCopyCycles,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 14.000000 24000.000000 104.000000
******* TIDL_subgraphRtInvoke done ********
Completed model - gridsample.onnx
Name : gridsample , Total time : 24.07, Offload Time : 24.00 , DDR RW MBs : 0
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 28 alloc's of 75485531 bytes
MEM: Free's : 28 free's of 75485531 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!