TDA4VM: Model converted from a QAT ONNX model gives different results than the same model run with XNN

Part Number: TDA4VM

We are using a QAT ONNX model (a sample ResNet-18) for development on TDA4; part of the ONNX graph is shown below.

[ONNX graph screenshot attached]

When trying to convert the above model to a TIDL model, I found two problems.

First, MaxPool and GlobalAvgPool still use PTQ to collect their tensor ranges. That is reasonable for GlobalAvgPool, because no Clip layer is added after the GlobalAvgPool layer by the XNN quantized converter (xnn.quantize.QuantTrainModule). However, a Clip layer is added after MaxPool, yet I see that the Clip is transformed into a BatchNorm layer and the Pooling layer still has to re-collect its tensor range via PTQ. This is hard to understand.

Second, I run the model using TIDL inference, and I can only get integer-valued results even when I set writeOut=2. Here are the results from the original PyTorch model, the XNN model, and TIDL inference:

Pytorch: tensor([[-5.1596, -0.8653, -3.6556, 14.2688, -8.3482, 10.9232, 2.4652, -5.4396, -2.9282, -3.8955]], grad_fn=<AddmmBackward>) 

XNN Model: tensor([[-5.5000, -1.0000, -3.5000, 15.0000, -8.5000, 11.0000, 3.0000, -5.5000, -3.5000, -4.0000]], grad_fn=<CloneBackward>) 

TIDL Inference: [ -5.0 -1.0 -4.0 14.0 -8.0 12.0 2.0 -5.0 -3.0 -4.0]

I tried to locate this error and found that in the last TIDL layer the output tensorScale is set to 1, but for the range -64 to 64 the tensorScale should be 2. So I think this scale is the reason only integer values are returned. How can I fix this problem?
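For context, the relation between a clip range and a signed 8-bit tensorScale can be sketched as follows (a minimal sketch assuming symmetric signed 8-bit quantization, where the scale maps the clip range onto the int8 range and the float output is the integer value divided by the scale; the function name is hypothetical):

```python
# Sketch: expected tensorScale for a given clip range, assuming symmetric
# signed 8-bit quantization (scale = 128 / max abs value of the range).
def expected_tensor_scale(clip_min, clip_max, bits=8):
    max_abs = max(abs(clip_min), abs(clip_max))
    return (2 ** (bits - 1)) / max_abs

scale = expected_tensor_scale(-64.0, 64.0)
print(scale)  # -> 2.0

# With scale 2, dequantized outputs have 0.5 granularity (e.g. 11/2 = 5.5);
# with tensorScale mistakenly left at 1, every dequantized value is a whole
# number, which matches the integer-only TIDL output above.
print(11 / scale)  # -> 5.5
```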

Meanwhile, how can I upload the ONNX model and the TIDL SVG file as attachments?

  • Regarding the QAT question:

    I think TIDL is not merging the MaxPool layer and the subsequent Clip layer, which causes the additional layer you noticed. One can argue that the MaxPool layer does not need anything special for QAT, since its input is already quantized.

    Can you try removing MaxPool2d from this list and trying QAT once again: https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    Does this answer your question?


    About your upload question:

    You can zip it and upload (drag and drop) the zip file. 

  • This answer clearly addresses the MaxPool problem. But I am still confused about why no clip operation is added after GlobalAvgPool and other ops. I can see that only these modules (torch.nn.ReLU, torch.nn.ReLU6, torch.nn.Hardtanh, layers.QAct, layers.PAct2, layers.AddBlock, layers.CatBlock, layers.MultBlock, torch.nn.MaxPool2d, torch.nn.AvgPool2d) get a Clip layer added.

    Meanwhile, can you help me check the second question, about the TIDL inference result? Thanks.

    Finally, I uploaded the zip successfully:

    resent18.zip

  • You can control where the clip operation is inserted by modifying the list I mentioned: https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    For example, you can add your global average pool class name to that list and then do the QAT training and export.


    Can you try that and see if it solves your problem? If not, please share the modified ONNX and SVG files, and I will discuss with the TIDL expert how to solve the mismatch.
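A minimal sketch of that edit, based on the module list quoted earlier in this thread (the variable name `clip_insert_modules` is hypothetical; in quant_graph_module.py the list has a different name, and the `layers.*` entries from edgeai-torchvision are omitted here so the snippet is self-contained):

```python
import torch

# Hypothetical stand-in for the list at quant_graph_module.py#L51; the real
# list also contains layers.QAct, layers.PAct2, layers.AddBlock,
# layers.CatBlock and layers.MultBlock from edgeai-torchvision.
clip_insert_modules = (
    torch.nn.ReLU, torch.nn.ReLU6, torch.nn.Hardtanh,
    torch.nn.MaxPool2d, torch.nn.AvgPool2d,
    torch.nn.AdaptiveAvgPool2d,  # added so global average pooling gets a Clip
)

# Modules whose type appears in this tuple get a clip/range observer during
# QAT, so their learned range is exported to ONNX as a Clip node.
pool = torch.nn.AdaptiveAvgPool2d(1)
print(isinstance(pool, clip_insert_modules))  # -> True
```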

  • I will try adding a clip op after avgpool and check the SVG file.

    However, I think the TIDL inference accuracy is a totally different problem, namely the tensorScale problem. So can you help me check why the tensorScale of the last layer is always 1 when importing the QAT model into TIDL?

    Or maybe you can provide a QAT-to-TIDL model that returns float results in TIDL inference?

  • After adding the clip operator after avgpool, if the issue still persists, we would like to check it on our side to see what is going wrong. Please provide a sample ONNX model (a classification model, if possible) that shows this issue.

  • I uploaded a zip with the ONNX model and SVG file inside; please check:

    0284.resent18.zip

  • The clip operator conveys the range to TIDL. There is still no clip after GlobalAveragePool. That means the module type you are using was not added to this list when you did QAT:

    https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    If you are using torch.nn.AdaptiveAvgPool2d or something else, you can add it to that list and do QAT.

  • I think "there is no clip after GlobalAveragePool" is fine, because PTQ will process it and add a clip during PTQ.

    My question is about the last node, Gemm. The output tensorScale of Gemm is 1, but the clip range is -64 to 64, so I think the scale should be 2. And I think the last Gemm node is handled by QAT, because XNN adds a clip after Gemm. The accuracy problem should be caused by this part, not by the missing clip after GlobalAveragePool.

  • >>> I think "there is no clip after GlobalAveragePool" is fine because PTQ will process it and add a clip during PTQ.

    It depends. If QAT is not quantizing it and TIDL is quantizing it, there can be accuracy loss. There can be - I am not saying there will be.

    How did you conclude that the range is -64 to 64? Did you look at the feature maps for all the calibration images and make sure that in no scenario, not even for a small fraction of features, the range exceeds 64? TIDL calculates it using a histogram over the given set of calibration images, so it is unlikely to be wrong.

  • Because in XNN a clip is added after Gemm; in the ONNX file the clip range is -64 to 64, and in the SVG I can also see the range -64 to 64. This range was collected during the QAT process with XNN.

  • >>> If QAT is not quantizing it and TIDL is quantizing it, there can be accuracy loss

    And I am not worried about accuracy loss; my issue is that the TIDL result is integer-only: "TIDL Inference: [ -5.0 -1.0 -4.0 14.0 -8.0 12.0 2.0 -5.0 -3.0 -4.0]".

  • We have some specific range handling code for Global Average Pooling and inner product layers.

    We increase the range of these two layers by 25% to avoid saturation. This expansion needs to be skipped for QAT models.

    We will file an internal Bug to track and fix this.

    By the way, the import tool part of the code is available as source, so you can update and rebuild the import tool to fix this locally for now.

    Current code

    if (net->TIDLPCLayers[i].actParams.actType == TIDL_RelU6)
    {
        curMin = net->TIDLPCLayers[i].outData[0].minTensorValue = 0;
        curMax = net->TIDLPCLayers[i].outData[0].maxTensorValue = 6.0f;
    }
    if ((net->TIDLPCLayers[i].layerType == TIDL_PoolingLayer) &&
        (net->TIDLPCLayers[i].layerParams.poolParams.poolingType == TIDL_AveragePooling))
    {
        if ((net->TIDLPCLayers[i].layerParams.poolParams.kernelW == 0) &&
            (net->TIDLPCLayers[i].layerParams.poolParams.kernelH == 0))
        {
            curMax = curMax * 1.25;
            outMin = outMin * 1.25;
        }
    }
    if (net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer)
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }
     
    Fix:

    if ((net->TIDLPCLayers[i].layerType == TIDL_PoolingLayer) &&
        (net->TIDLPCLayers[i].layerParams.poolParams.poolingType == TIDL_AveragePooling) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        if ((net->TIDLPCLayers[i].layerParams.poolParams.kernelW == 0) &&
            (net->TIDLPCLayers[i].layerParams.poolParams.kernelH == 0))
        {
            curMax = curMax * 1.25;
            outMin = outMin * 1.25;
        }
    }
    if ((net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }
    if (net->TIDLPCLayers[i].actParams.actType == TIDL_RelU6)
    {
        curMin = net->TIDLPCLayers[i].outData[0].minTensorValue = 0;
        curMax = net->TIDLPCLayers[i].outData[0].maxTensorValue = 6.0f;
    }
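For what it's worth, the observed drop of tensorScale from 2 to 1 is consistent with this 25% expansion if the import tool snaps the scale down to a power of two (a sketch of the arithmetic; the power-of-two assumption is mine and may not match the exact rounding inside the tool):

```python
import math

def power_of_two_scale(max_abs, bits=8):
    # Largest power-of-two scale that keeps max_abs representable in the
    # signed integer range (an assumed model of the scale selection).
    raw = (2 ** (bits - 1)) / max_abs
    return 2.0 ** math.floor(math.log2(raw))

# QAT clip range from the ONNX model: [-64, 64]
print(power_of_two_scale(64.0))        # -> 2.0 (the expected tensorScale)

# After the unconditional 25% expansion above: [-80, 80]
print(power_of_two_scale(64.0 * 1.25)) # -> 1.0 (the observed tensorScale)
```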

  • ONNX_with_globalavgpool_clip.zip

    I added the global avg pool module to the list at https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    and got the new ONNX file (in the zip).
     

    and the result is still in integer format:

    Load resnet18_tidl_import.txt_stats_tool_out.bin With <class 'numpy.float32'>
    [[ -5. -1. -3. 15. -8. 11. 2. -5. -4. -4.]]
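As a side note, the stats-tool bin can be inspected directly to check the output granularity (a sketch: the file written below stands in for resnet18_tidl_import.txt_stats_tool_out.bin, filled with the integer-valued TIDL results quoted above):

```python
import numpy as np

# Stand-in for the real stats-tool output file, filled with the
# integer-valued TIDL results quoted above.
vals = np.array([-5., -1., -3., 15., -8., 11., 2., -5., -4., -4.],
                dtype=np.float32)
vals.tofile("stats_tool_out_example.bin")

# Read it back as float32, the same way the log above does.
out = np.fromfile("stats_tool_out_example.bin", dtype=np.float32)
print(out.reshape(1, -1))

# Every value being a whole number is consistent with tensorScale == 1;
# a correct tensorScale of 2 would allow 0.5 steps (e.g. -5.5).
print(all(float(v).is_integer() for v in out))  # -> True
```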

  • if ((net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }

    This part seems to be the key reason, and I will try to recompile the importer to test it.

  • I tested it on our model, and the output is OK now.


    Load qat_export/resnet18_tidl_import.txt_stats_tool_out.bin With <class 'numpy.float32'>
    [[ -5.5 -1. -3.5 14.5 -8. 11. 2.5 -5.5 -4. -3.5]]

    The output tensorScale is 2 now. Thanks.