
AM62A7: the size of the tensor output by onnx was inconsistent with the size output by ti simulation.

Part Number: AM62A7


When I was simulating the model with onnxrt_ep.py, the results were good when I used -d -m: the output tensor size was (21, 6). However, when I used -d, the model output was (300, 6), with no NMS filtering, even though I enabled the export_nms option when exporting the model. I have attached the tensors from these two runs, saved in OCD2_608pc-d-m.txt and OCD2_608pc-m.txt respectively. I have also attached images of the ONNX, TI-simulation, and board-side TIOVX results: py_out_oCD2_608pc-d-m.jpg, py_out_oCD2_608pc-m.jpg, and board.jpg. The ONNXRuntime result is correct, while the simulation and board-side results are not. Can you help analyze the reason for this? Thank you very much.

best wishes

zhuangyihao

Attachment: 2577.1205 experiment.zip

  • Hello Zhuang,

    Thank you for moving this into a separate thread! It will be easier to track and find for future developers with similar queries.

    Yes, I can see that you are getting different outputs for -d (run on CPU with no offload/processing from TI's software) and no argument (which will use TIDL's execution provider). 

    TIDL uses 8-bit quantization by default. There is some inherent accuracy loss here, which we try to mitigate as much as possible. It is normal to see small variations from your floating-point model, and the task now is to reduce these variations on your model. As a first step, I'd recommend setting tensor_bits in common_utils.py to 16 to enable 16-bit quantization; the accuracy loss there is typically negligible. From there, we can use the 'output_feature_16bit_names_list' setting to run select layers in 16-bit mode and everything else in 8-bit. In this way, we balance performance against accuracy loss.
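
    As a rough illustration of where tensor_bits goes (a minimal sketch of the standard edgeai-tidl-tools ONNX compilation flow, not your exact script; the paths and model name are placeholders and other required compile options are omitted):

        # Minimal sketch: passing tensor_bits to the TIDL compilation provider.
        # Paths and model name are placeholders; other compile options omitted.
        import onnxruntime as ort

        compile_options = {
            "tidl_tools_path": "/path/to/tidl_tools",          # placeholder
            "artifacts_folder": "./model-artifacts/my_model",  # placeholder
            "tensor_bits": 16,                                  # 8 (default) or 16
        }

        session = ort.InferenceSession(
            "my_model.onnx",                                    # placeholder model
            providers=["TIDLCompilationProvider", "CPUExecutionProvider"],
            provider_options=[compile_options, {}],
            sess_options=ort.SessionOptions(),
        )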

    The 300 vs. 21 boxes sounds normal to me. When we compile an object detection model, there is a setting in the meta architecture for how many boxes to keep. This setting can also be controlled through the 'delegate_options' passed to the ONNXRuntime TIDLExecutionProvider during compilation. The additional boxes can simply be ignored during postprocessing by applying a confidence threshold, as in the sketch below.
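
    For example, a minimal post-processing filter could look like the following (this assumes the (N, 6) output keeps the confidence score in column 4; adjust the index to your model's actual output layout):

        # Drop low-confidence boxes from the fixed-size (N, 6) detection output.
        # Assumes column 4 holds the confidence score -- adjust if needed.
        import numpy as np

        def filter_detections(dets, score_col=4, thresh=0.3):
            """Keep only rows whose confidence exceeds `thresh`."""
            dets = np.asarray(dets).reshape(-1, 6)
            return dets[dets[:, score_col] > thresh]

        # e.g. kept = filter_detections(tidl_output)  # 300 boxes -> ~21 boxes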

    BR,
    Reese

  • Hello Reese, I followed your suggestion and set tensor_bits to 16, which improved the detection accuracy of the model. However, the overall inference time increased significantly, from 35 ms to 105 ms, which is unacceptable. So, as you suggested, I set only some layers to 16-bit: tensor_bits was set back to 8, and the outputs of the three Conv layers before the post-processing node were filled in for outputting _feature_16bit_names_list and params_16bit_names_list, but the quantization accuracy dropped again. It seems that outputting _feature_16bit_names_list and params_16bit_names_list have no effect. Am I setting them correctly?

    BR

    zhuangyihao

  • Hello,

    Thank you for trying this -- it is the expected result (better accuracy, lower performance). 

    As you suggested, I set only some layers to 16-bit: tensor_bits was set back to 8, and the outputs of the three Conv layers before the post-processing node were filled in for outputting _feature_16bit_names_list and params_16bit_names_list, but the quantization accuracy dropped again. It seems that outputting _feature_16bit_names_list and params_16bit_names_list have no effect. Am I setting them correctly?

    Is the accuracy now identical to what it was before, with only 8-bit and no 16-bit layers? If so, the setting was probably not applied correctly.

    Can you share the values you used for these two settings (output_feature_16bit_names_list and params_16bit_names_list)?

    • If you used "outputting _feature_16bit_names_list" instead of "output_feature_16bit_names_list" (the second one being correct), then the setting would not have been applied.
    • The value of "output_feature_16bit_names_list" is a single string containing a comma-separated list of the names of the intermediate tensors that you wish to be 16-bit (see the sketch after this list).
      • For instance, a Conv layer named "A" may have an input named "X" and an output named "Y". Your output_feature_16bit_names_list would include "Y".
    • Similarly, "params_16bit_names_list" should use the layer's output name (from the line above, "Y"); the parameters / constant weights supplied to that layer will then be 16-bit as well.
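
    As a rough sketch (assuming the edgeai-tidl-tools 'advanced_options:' key naming; the tensor names below are placeholders -- use the exact output-tensor names of your three Conv layers from the ONNX graph), a mixed 8/16-bit configuration added to the same compile/delegate options dictionary could look like:

        # Mixed 8/16-bit compile options -- a sketch with placeholder tensor names.
        mixed_precision_options = {
            "tensor_bits": 8,  # 8-bit everywhere except the layers listed below
            # Comma-separated OUTPUT tensor names of layers to run in 16 bit:
            "advanced_options:output_feature_16bit_names_list": "conv_out_A,conv_out_B,conv_out_C",
            # Same output names here also make those layers' weights 16 bit:
            "advanced_options:params_16bit_names_list": "conv_out_A,conv_out_B,conv_out_C",
        }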

    If you are not sure whether the setting was applied correctly, open the subgraph SVG files within artifacts/tempDir in a browser. If you hover over a node, it should show more information.

    BR,
    Reese