Other Parts Discussed in Thread: TDA4VH, AM69A
Hi TI Support,
I'm working on deploying a feature extractor to the TDA4VH using TIDL with C7x DSP acceleration. The model runs correctly on CPU using:
providers = ["CPUExecutionProvider"]
However, when switching to:
providers = ["TIDLCompilationProvider", "CPUExecutionProvider"]
I encounter either TIDL compilation errors, or output that is unusable: all zeros or NaNs, depending on the setup.
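For context, here is a minimal sketch of how I build the compilation session. The option names follow the examples in edgeai-tidl-tools; the paths and values below are placeholders for my actual setup:

```python
import os

# TIDL compile-time provider options; option names follow the
# edgeai-tidl-tools examples (paths below are placeholders).
compile_options = {
    "tidl_tools_path": os.environ.get("TIDL_TOOLS_PATH", "/opt/tidl_tools"),
    "artifacts_folder": "./model-artifacts",
    "tensor_bits": 8,
    "accuracy_level": 1,
    "advanced_options:calibration_frames": 100,
    "advanced_options:calibration_iterations": 5,
    "debug_level": 1,
}

# TIDL first, CPU as fallback; one options dict per provider.
providers = ["TIDLCompilationProvider", "CPUExecutionProvider"]
provider_options = [compile_options, {}]
# sess = onnxruntime.InferenceSession(model_path, providers=providers,
#                                     provider_options=provider_options)
```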
Working Configuration
- Model: ONNX (opset 18), running with FP32 on CPU
- Output: accurate
Failing Configurations (with DSP)
I've tested the following compilation and quantization setups using compile.py (based on onnxrt_ep.py from edgeai-tidl-tools):

1. QDQ quantized model (via quantize_static from ONNX Runtime)
- Fails at compile time
- TIDL import error: "missing inputs", topological sort failure (see below)

2. Quantization during TIDL compilation
- Segfaults during TIDL compilation (session.run() with inference_mode:0)
- With a warning:
Conv node failure: Name '/custom_backbone/block1_2/conv1/Conv'
Status Message: Input channels C is not equal to kernel channels * group.
C: 1 kernel channels: 3 group:
- After this warning, the process segfaults immediately
- I fixed the Conv parameters (kernel_size=11, stride=4, padding=5), after which the model compiles. However, the output range is now enormous (-3.5e+33 to +2.6e+33), suggesting a quantization or numeric overflow issue.
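For reference, the channel mismatch in the warning matches the standard ONNX Conv constraint: the weight tensor has shape (M, C/group, kH, kW), so the input channel count must equal kernel_channels * group. A quick plain-Python sanity check with the values from the warning, plus the output-size arithmetic for the fixed Conv parameters:

```python
def conv_channels_ok(input_channels, kernel_channels, group):
    """ONNX Conv constraint: weight shape is (M, C/group, kH, kW),
    so input channels must equal kernel_channels * group."""
    return input_channels == kernel_channels * group

# Values from the TIDL warning: C=1 input channel fed into a
# kernel expecting 3 channels per group.
print(conv_channels_ok(1, 3, 1))   # mismatch -> False
print(conv_channels_ok(3, 3, 1))   # e.g. RGB input, group=1 -> True

def conv_out_size(in_size, kernel=11, stride=4, pad=5):
    """Spatial output size for the fixed parameters
    (kernel_size=11, stride=4, padding=5)."""
    return (in_size + 2 * pad - kernel) // stride + 1

print(conv_out_size(224))  # 224x224 input -> 56
```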
QDQ Import Failure Details
When attempting to compile the QDQ-format model (option 1), I get this TIDL import error:
[TIDL Import] [PARSER] ERROR: Layer 53, custom_backbone.block1_dd.conv_block.conv1.weight_DequantizeLinear_Output/duplicated:custom_backbone.block1_dd.conv_block.conv1.weight_DequantizeLinear_Output/duplicated is missing inputs in the network and cannot be topologically sorted. Missing inputs are:
-- [tidl_import_common.cpp, 4378]
Input 0: custom_backbone.block1_dd.conv_block.conv1.weight_DequantizeLinear_Output/duplicated, dataId=116
[TIDL Import] ERROR: - Failed in function: tidl_optimizeNet -- [tidl_import_core.cpp, 2602]
[TIDL Import] ERROR: Network Optimization failed - Failed in function: TIDL_runtimesOptimizeNet -- [tidl_runtimes_import_common.cpp, 1287]
It appears TIDL cannot resolve a duplicated or reused DequantizeLinear node, which seems to be inserted during ONNX QDQ export.
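To narrow this down, I checked whether any DequantizeLinear output feeds more than one consumer (the pattern the "/duplicated" suffix suggests). Below is a simplified, self-contained stand-in for walking model.graph.node with the onnx package; nodes are represented as plain (op_type, inputs, outputs) tuples, and a real script would iterate the loaded model's NodeProto objects instead:

```python
from collections import defaultdict

# Simplified stand-in for onnx model.graph.node: each node is
# (op_type, inputs, outputs).
nodes = [
    ("DequantizeLinear", ["w_q", "w_scale", "w_zp"], ["w_dq"]),
    ("Conv", ["x", "w_dq"], ["y1"]),   # first consumer of w_dq
    ("Conv", ["x2", "w_dq"], ["y2"]),  # second consumer: shared weight
]

# Collect every DequantizeLinear output, then count its consumers.
dq_outputs = {out for op, _, outs in nodes
              if op == "DequantizeLinear" for out in outs}
consumers = defaultdict(list)
for op, inputs, _ in nodes:
    for name in inputs:
        if name in dq_outputs:
            consumers[name].append(op)

shared = {name: ops for name, ops in consumers.items() if len(ops) > 1}
print(shared)  # {'w_dq': ['Conv', 'Conv']}
```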
Environment
- Platform: TDA4VH
- Python: 3.10.12
- ONNX opset: 18
- Quantization: onnxruntime.quantization.quantize_static
- TIDL tools commit: 1b75e86e79cfddb8f6e181014e6343e89765883d
- Compilation: based on edgeai-tidl-tools/onnxrt_ep.py
- Calibration: 100 test images in both options (quantize_static and TIDL quantization)
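For the quantize_static path, calibration is driven by a data reader whose get_next() method returns one input-feed dict per call and None when exhausted. A minimal duck-typed sketch of the reader I use, with plain nested lists standing in for my preprocessed images:

```python
class SimpleCalibrationReader:
    """Duck-typed calibration reader for quantize_static: get_next()
    yields one {input_name: data} dict per call, then None when
    exhausted. Plain lists stand in for preprocessed image arrays."""

    def __init__(self, samples, input_name="input"):
        self.input_name = input_name
        self.iterator = iter(samples)

    def get_next(self):
        batch = next(self.iterator, None)
        if batch is None:
            return None
        return {self.input_name: batch}

# Two dummy "images" here; the real reader feeds 100 preprocessed frames.
reader = SimpleCalibrationReader([[0.0, 0.1], [0.2, 0.3]])
count = 0
while reader.get_next() is not None:
    count += 1
print(count)  # 2
```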
Key Questions
- Are there known TIDL quantization or import issues with QDQ models (especially with reused weights or shared DequantizeLinear nodes)?
- What causes the import error above, and how can I structure QDQ graphs to avoid it?
- Does this kernel + stride combination (kernel_size=11, stride=4) require special handling during calibration?
- Is there a known cause for such an enormous output range after quantization?
Available for Debugging
I can provide the following artifacts if helpful:
- ONNX models
- Layer execution summary (tempDir)
- Calibration data + output examples
Any help would be greatly appreciated. Please let me know what additional files or logs you’d like to see.
Thanks in advance!
Victoria