TDA4VM: Segfault when running automated mixed precision algorithm

Mladen Knezic

Part Number: TDA4VM

Hi all,

I am experimenting with the TIDL automated mixed precision algorithm on the TDA4VM board. I use EdgeAI TIDL Tools version 10_01_04_00 and the TI-provided cl-ort-resnet18-v1 model with the mixed_precision_factor parameter set to 1.3 (tolerating a 30% increase in latency compared to 8-bit inference). However, the algorithm fails during the quantization process with the following error:

-------- Running Calibration in Float Mode to Collect Tensor Statistics --------
Segmentation fault (core dumped)
[TIDL Import] ERROR: Failed to run calibration pass, system command returned error: 35584 -- [tidl_import_core.cpp, 678]
[TIDL Import] ERROR: Failed to run Calibration - Failed in function: tidlRunQuantStatsTool -- [tidl_import_core.cpp, 1746]
[TIDL Import] [QUANTIZATION] ERROR: - Failed in function: TIDL_quantStatsFixedOrFloat -- [tidl_import_quantize.cpp, 3969]
[TIDL Import] ERROR: - Failed in function: TIDL_executeAutomatedMixedPrecision -- [tidl_import_core.cpp, 4037]
[TIDL Import] ERROR: - Failed in function: TIDL_import_backend -- [tidl_import_core.cpp, 4419]
[TIDL Import] ERROR: - Failed in function: TIDL_runtimesPostProcessNet -- [tidl_runtimes_import_common.cpp, 1414]
[TIDL Import] [PARSER] ERROR: - Failed in function: TIDL_subgraphImport -- [tidl_onnxRtImport_EP.cpp, 1737]
[TIDL Import] [PARSER] ERROR: - Failed in function: TIDL_computeInvokeFunc -- [tidl_onnxRtImport_EP.cpp, 2511]

I traced the issue to the point where the PC_dsp_test_dl_algo.out tool is invoked to collect the statistics for model quantization. This tool hits a segmentation fault after calling the function getLayerIdToExecute() in c7x-mma-tidl/ti_dl/algo/src/tidl_alg.c. The function returns a layerId that is out of bounds, so invalid data is accessed afterwards. I suspect that the network binary file used by the tool somehow gets corrupted, but I am not sure where or why. The tricky part is that the behavior is random: sometimes the algorithm runs without any issues. I tried other (customer-specific) models and observed the same behavior, so I do not think it is an issue with the model.
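
A side note on the error code in the log: 35584 equals 139 × 256, which matches the Linux wait-status encoding of a shell exiting with code 139 (128 + 11, i.e. SIGSEGV in the child process). The sketch below decodes such a value with the standard POSIX macros. It assumes the number printed by the import tool is the raw return value of system() on Linux; the struct and function names are hypothetical and not part of TIDL.

```cpp
#include <cassert>
#include <csignal>
#include <sys/wait.h>

// Hypothetical helper type, for illustration only.
struct DecodedStatus {
    bool exitedNormally;  // true if the shell itself returned an exit code
    int  exitCode;        // shell exit code, or -1 if not applicable
    int  impliedSignal;   // signal number, or 0 if none could be inferred
};

// Decode a raw status as returned by system()/waitpid() on Linux.
DecodedStatus decodeSystemStatus(int status) {
    DecodedStatus d{false, -1, 0};
    if (WIFEXITED(status)) {
        d.exitedNormally = true;
        d.exitCode = WEXITSTATUS(status);
        // Shell convention: exit code 128 + N means the command died from signal N.
        if (d.exitCode > 128) {
            d.impliedSignal = d.exitCode - 128;
        }
    } else if (WIFSIGNALED(status)) {
        d.impliedSignal = WTERMSIG(status);
    }
    return d;
}
```

Under that assumption, decodeSystemStatus(35584) reports a normal shell exit with code 139 and an implied signal of 11, consistent with the "Segmentation fault (core dumped)" line printed just before the error.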

I am using the Docker environment as recommended in the EdgeAI TIDL Tools documentation, so I suppose that potential environment issues can be ruled out.

Has anyone observed similar behavior on their end, and do you know how to address it?

Note: I also tried to run the model quantization on a more recent EdgeAI TIDL Tools version (11_01_06_00), but the issue is the same.

Best regards,

Mladen

30 days ago

+1 Vaibhav Kumar 26 days ago

TI__Intellectual 1220 points

Hi, I have debugged the issue and made a change. Can you try with the following tools for 10_01_04_00:
3162.tidl_tools.tar.gz

Regards

Vaibhav

0 Mladen Knezic 25 days ago in reply to Vaibhav Kumar

Prodigy 60 points

Vaibhav Kumar,

Thanks for addressing the issue and coming back with the fix.

We also managed to debug the issue and find a fix. On our end, we memset the sTIDL_Network_t structure in the function TIDL_executeAutomatedMixedPrecision(), located in c7x-mma-tidl/ti_dl/utils/tidlModelImport/tidl_import_core.cpp. We also fixed the part that releases the previously allocated resources (even though this does not seem to affect the crash). I attach the diff with all the changes for your reference. Can you confirm that your fix contains the same change?

I'll also try with your binaries and come back with the results.

Regards,

Mladen

0 Vaibhav Kumar 25 days ago in reply to Mladen Knezic

TI__Intellectual 1220 points

Yes, same as yours: I also zero-initialized the sTIDL_Network_t structure in the same function. I don't see any diff attached.

Regards

0 Mladen Knezic 25 days ago in reply to Vaibhav Kumar

Prodigy 60 points

Sorry, I forgot to attach it. Here it is.

tidl_import_tools_patch.txt

diff -ruN a/c7x-mma-tidl/ti_dl/utils/tidlModelImport/tidl_import_core.cpp b/c7x-mma-tidl/ti_dl/utils/tidlModelImport/tidl_import_core.cpp
--- a/c7x-mma-tidl/ti_dl/utils/tidlModelImport/tidl_import_core.cpp	2024-12-12 17:18:42.000000000 +0100
+++ b/c7x-mma-tidl/ti_dl/utils/tidlModelImport/tidl_import_core.cpp	2026-04-24 08:28:53.986258000 +0200
@@ -3945,6 +3945,7 @@
 
   strcpy(inConfigFilename, TIDL_augmentCharArrayWithSuffix(inConfigFileNameOrig, "_float").c_str());
   sTIDL_Network_t * tidlNetStructureFloat = new sTIDL_Network_t;
+  memset(tidlNetStructureFloat, 0, sizeof(sTIDL_Network_t));
   TIDL_updateConfigParameters(&gParams,-1,-1,-1,-1,-1,gParams.numFramesBiasCalibration/4);
   gParams.writeTraceLevel = 3;
   TIDL_IMPORT_CHECK_AND_RETURN(TIDL_quantStatsFixedOrFloat(&orgTIDLNetStructure,
@@ -4330,8 +4331,13 @@
 
       /* Execute the algorithm */
       TIDL_IMPORT_CHECK_AND_RETURN(TIDL_executeAutomatedMixedPrecision(layerIndex, orgTIDLNetStructureOrig, &configParamsOrig), "");
+      
+      TIDL_freeModelParams(orgTIDLNetStructureOrig, orgTIDLNetStructure.numLayers);
 
-      delete orgTIDLNetStructureOrig;
+      if ( orgTIDLNetStructureOrig != NULL )
+      {
+        delete orgTIDLNetStructureOrig;
+      }
     }
     /* Needs review on when exactly we want to abort if this function fails */
     TIDL_importBitDepthProtoTxt(&orgTIDLNetStructure, &gParams);
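
The first hunk of the patch matters because `new T` without parentheses leaves the members of a plain-data struct with indeterminate values, which would fit the random, run-to-run nature of the crash. The following is a minimal sketch of two ways to get a zeroed structure plus a defensive bounds check. DemoNetwork and all function names are hypothetical stand-ins; this is not the real sTIDL_Network_t layout or any TIDL API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a large plain-data network descriptor.
struct DemoNetwork {
    std::int32_t numLayers;
    std::int32_t layerExecOrder[8];
};

// memset is only well-defined on trivially copyable types.
static_assert(std::is_trivially_copyable<DemoNetwork>::value,
              "DemoNetwork must be trivially copyable to be cleared with memset");

// Option 1: allocate, then clear explicitly (the approach used in the patch).
std::unique_ptr<DemoNetwork> makeWithMemset() {
    std::unique_ptr<DemoNetwork> net(new DemoNetwork);  // members indeterminate here
    std::memset(net.get(), 0, sizeof(DemoNetwork));     // now all bytes are zero
    return net;
}

// Option 2: value-initialization; the trailing () zero-initializes every member.
std::unique_ptr<DemoNetwork> makeValueInitialized() {
    return std::unique_ptr<DemoNetwork>(new DemoNetwork());
}

// Defensive check before a layer id is used as an index.
bool isValidLayerId(const DemoNetwork& net, std::int32_t layerId) {
    return layerId >= 0 && layerId < net.numLayers;
}

// Indexed access that asserts on an invalid id in debug builds instead of
// silently reading past the array.
std::int32_t layerAt(const DemoNetwork& net, std::int32_t layerId) {
    assert(isValidLayerId(net, layerId));
    return net.layerExecOrder[layerId];
}
```

Both options yield the same zeroed state for a struct like this one. On the second hunk: `delete` on a null pointer is a no-op in C++, so the added NULL check is harmless but not strictly required.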

0 Mladen Knezic 25 days ago in reply to Vaibhav Kumar

Prodigy 60 points

I also tried your pre-built files and I confirm it fixes the issue.

0 Vaibhav Kumar 25 days ago in reply to Mladen Knezic

TI__Intellectual 1220 points

Changes look fine to me. If you have further issues, you may create a new ticket.
