TDA4VM: Model converted from a QAT ONNX model gives different results than the same model run with XNN

Part Number: TDA4VM

We are using a QAT ONNX model (a sample ResNet-18) for development on TDA4; part of the ONNX graph is shown below.

[ONNX graph screenshot attached]

When trying to convert the above model to a TIDL model, I found two problems.

First, MaxPool and GlobalAvgPool still use PTQ to collect their tensor ranges. That is reasonable for GlobalAvgPool, because no Clip layer is added after the GlobalAvgPool layer by the XNN quantized converter (xnn.quantize.QuantTrainModule). However, a Clip layer is added after MaxPool, yet I see that the Clip is transformed into a BatchNorm layer and the Pooling layer still has to re-collect its tensor range via PTQ. This is hard to understand.

Second, I run the model using TIDL inference, and I can only get integer-valued results even when I set writeOut=2. Here are the results from the original PyTorch model, the XNN model, and TIDL inference:

Pytorch: tensor([[-5.1596, -0.8653, -3.6556, 14.2688, -8.3482, 10.9232, 2.4652, -5.4396, -2.9282, -3.8955]], grad_fn=<AddmmBackward>) 

XNN Model: tensor([[-5.5000, -1.0000, -3.5000, 15.0000, -8.5000, 11.0000, 3.0000, -5.5000, -3.5000, -4.0000]], grad_fn=<CloneBackward>) 

TIDL Inference: [ -5.0 -1.0 -4.0 14.0 -8.0 12.0 2.0 -5.0 -3.0 -4.0]

I tried to locate this error and found that in the last TIDL layer the output tensorScale is set to 1, but for the range -64 to 64 the tensorScale should be 2. So I think this scale is the reason only integer values are returned. How can I fix this problem?
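For context, the relation between a clip range and a signed 8-bit tensorScale can be sketched as follows (a minimal sketch assuming symmetric signed 8-bit quantization, where the scale maps the clip range onto the int8 range and the float output is the integer value divided by the scale; the function name is hypothetical):

```python
# Sketch: expected tensorScale for a given clip range, assuming symmetric
# signed 8-bit quantization (scale = 128 / max abs value of the range).
def expected_tensor_scale(clip_min, clip_max, bits=8):
    max_abs = max(abs(clip_min), abs(clip_max))
    return (2 ** (bits - 1)) / max_abs

scale = expected_tensor_scale(-64.0, 64.0)
print(scale)  # -> 2.0

# With scale 2, dequantized outputs have 0.5 granularity (e.g. 11/2 = 5.5);
# with tensorScale mistakenly left at 1, every dequantized value is a whole
# number, which matches the integer-only TIDL output above.
print(11 / scale)  # -> 5.5
```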

Meanwhile, how can I upload the ONNX model and the TIDL SVG file as attachments?

  • Regarding the QAT question:

    I think TIDL is not merging the MaxPool layer and the subsequent Clip layer, which causes the additional layer you noticed. One can argue that the MaxPool layer does not need anything special for QAT, since its input is already quantized.

    Can you try removing MaxPool2d from this list and trying QAT once again: https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    Does this answer your question?


    About your upload question:

    You can zip it and upload (drag and drop) the zip file. 

  • This answer clearly addresses the MaxPool problem. But I am still confused about why no clip operation is added after GlobalAvgPool and other ops. I can see that only these modules (torch.nn.ReLU, torch.nn.ReLU6, torch.nn.Hardtanh, layers.QAct, layers.PAct2, layers.AddBlock, layers.CatBlock, layers.MultBlock, torch.nn.MaxPool2d, torch.nn.AvgPool2d) get a Clip layer added.

    Meanwhile, can you help me check the second question, about the TIDL inference result? Thanks.

    Finally, I uploaded the zip successfully:

    resent18.zip

  • You can control where the clip operation is inserted by modifying the list I mentioned: https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    For example, you can add your global average pool class name to that list and then do the QAT training and export.


    Can you try that and see if it solves your problem? If not, please share the modified ONNX and SVG files, and I will discuss with the TIDL expert how to solve the mismatch.
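A minimal sketch of that edit, based on the module list quoted earlier in this thread (the variable name `clip_insert_modules` is hypothetical; in quant_graph_module.py the list has a different name, and the `layers.*` entries from edgeai-torchvision are omitted here so the snippet is self-contained):

```python
import torch

# Hypothetical stand-in for the list at quant_graph_module.py#L51; the real
# list also contains layers.QAct, layers.PAct2, layers.AddBlock,
# layers.CatBlock and layers.MultBlock from edgeai-torchvision.
clip_insert_modules = (
    torch.nn.ReLU, torch.nn.ReLU6, torch.nn.Hardtanh,
    torch.nn.MaxPool2d, torch.nn.AvgPool2d,
    torch.nn.AdaptiveAvgPool2d,  # added so global average pooling gets a Clip
)

# Modules whose type appears in this tuple get a clip/range observer during
# QAT, so their learned range is exported to ONNX as a Clip node.
pool = torch.nn.AdaptiveAvgPool2d(1)
print(isinstance(pool, clip_insert_modules))  # -> True
```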

  • I will try adding a clip op after avgpool and check the SVG file.

    However, I think the TIDL inference accuracy is a totally different problem, namely the tensorScale problem. So can you help me check why the tensorScale of the last layer is always 1 when importing the QAT model into TIDL?

    Or maybe you can provide a QAT-to-TIDL model that returns float results in TIDL inference?

  • After adding the clip operator after avgpool, if the issue still persists, we would like to check it on our side to see what is going wrong. Please provide a sample ONNX model (a classification model, if possible) that shows this issue.

  • I uploaded a zip with the ONNX model and SVG file inside; please check:

    0284.resent18.zip

  • The clip operator conveys the range to TIDL. There is still no clip after GlobalAveragePool. That means the module type you are using was not added to this list when you did QAT:

    https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    If you are using torch.nn.AdaptiveAvgPool2d or something else, you can add it to that list and do QAT.

  • I think "there is no clip after GlobalAveragePool" is fine, because PTQ will process it and add a clip during PTQ.

    My question is about the last node, Gemm. The output tensorScale of Gemm is 1, but the clip range is -64 to 64, so I think the scale should be 2. And I think the last Gemm node is handled by QAT, because XNN adds a clip after Gemm. The accuracy problem should be caused by this part, not by the missing clip after GlobalAveragePool.

  • >>> I think "there is no clip after GlobalAveragePool" is fine because PTQ will process it and add a clip during PTQ.

    It depends. If QAT is not quantizing it and TIDL is quantizing it, there can be accuracy loss. There can be - I am not saying there will be.

    How did you conclude that the range is -64 to 64? Did you look at the feature maps for all the calibration images and make sure that in no scenario, not even for a small fraction of features, the range exceeds 64? TIDL calculates it using a histogram over the given set of calibration images, so it is unlikely to be wrong.

  • Because in XNN a clip is added after Gemm; in the ONNX file the clip range is -64 to 64, and in the SVG I can also see the range -64 to 64. This range was collected during the QAT process with XNN.

  • >>> If QAT is not quantizing it and TIDL is quantizing it, there can be accuracy loss

    And I am not worried about accuracy loss; my issue is that the TIDL result is integer-only: "TIDL Inference: [ -5.0 -1.0 -4.0 14.0 -8.0 12.0 2.0 -5.0 -3.0 -4.0]".

  • We have some specific range handling code for Global Average Pooling and inner product layers.

    We increase the range of these two layers by 25% to avoid saturation. This expansion needs to be skipped for QAT models.

    We will file an internal Bug to track and fix this.

    By the way, the import tool part of the code is available as source, so you can update and rebuild the import tool to fix this locally for now.

    Current code

    if (net->TIDLPCLayers[i].actParams.actType == TIDL_RelU6)
    {
        curMin = net->TIDLPCLayers[i].outData[0].minTensorValue = 0;
        curMax = net->TIDLPCLayers[i].outData[0].maxTensorValue = 6.0f;
    }
    if ((net->TIDLPCLayers[i].layerType == TIDL_PoolingLayer) &&
        (net->TIDLPCLayers[i].layerParams.poolParams.poolingType == TIDL_AveragePooling))
    {
        if ((net->TIDLPCLayers[i].layerParams.poolParams.kernelW == 0) &&
            (net->TIDLPCLayers[i].layerParams.poolParams.kernelH == 0))
        {
            curMax = curMax * 1.25;
            outMin = outMin * 1.25;
        }
    }
    if (net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer)
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }
     
    Fix:

    if ((net->TIDLPCLayers[i].layerType == TIDL_PoolingLayer) &&
        (net->TIDLPCLayers[i].layerParams.poolParams.poolingType == TIDL_AveragePooling) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        if ((net->TIDLPCLayers[i].layerParams.poolParams.kernelW == 0) &&
            (net->TIDLPCLayers[i].layerParams.poolParams.kernelH == 0))
        {
            curMax = curMax * 1.25;
            outMin = outMin * 1.25;
        }
    }
    if ((net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }
    if (net->TIDLPCLayers[i].actParams.actType == TIDL_RelU6)
    {
        curMin = net->TIDLPCLayers[i].outData[0].minTensorValue = 0;
        curMax = net->TIDLPCLayers[i].outData[0].maxTensorValue = 6.0f;
    }
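For what it's worth, the observed drop of tensorScale from 2 to 1 is consistent with this 25% expansion if the import tool snaps the scale down to a power of two (a sketch of the arithmetic; the power-of-two assumption is mine and may not match the exact rounding inside the tool):

```python
import math

def power_of_two_scale(max_abs, bits=8):
    # Largest power-of-two scale that keeps max_abs representable in the
    # signed integer range (an assumed model of the scale selection).
    raw = (2 ** (bits - 1)) / max_abs
    return 2.0 ** math.floor(math.log2(raw))

# QAT clip range from the ONNX model: [-64, 64]
print(power_of_two_scale(64.0))        # -> 2.0 (the expected tensorScale)

# After the unconditional 25% expansion above: [-80, 80]
print(power_of_two_scale(64.0 * 1.25)) # -> 1.0 (the observed tensorScale)
```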

  • ONNX_with_globalavgpool_clip.zip

    I added the global avg pool module to the list at https://github.com/TexasInstruments/edgeai-torchvision/blob/master/torchvision/edgeailite/xnn/quantize/quant_graph_module.py#L51

    and got the new ONNX file (in the zip).
     

    and the result is still in integer format:

    Load resnet18_tidl_import.txt_stats_tool_out.bin With <class 'numpy.float32'>
    [[ -5. -1. -3. 15. -8. 11. 2. -5. -4. -4.]]
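As a side note, the stats-tool bin can be inspected directly to check the output granularity (a sketch: the file written below stands in for resnet18_tidl_import.txt_stats_tool_out.bin, filled with the integer-valued TIDL results quoted above):

```python
import numpy as np

# Stand-in for the real stats-tool output file, filled with the
# integer-valued TIDL results quoted above.
vals = np.array([-5., -1., -3., 15., -8., 11., 2., -5., -4., -4.],
                dtype=np.float32)
vals.tofile("stats_tool_out_example.bin")

# Read it back as float32, the same way the log above does.
out = np.fromfile("stats_tool_out_example.bin", dtype=np.float32)
print(out.reshape(1, -1))

# Every value being a whole number is consistent with tensorScale == 1;
# a correct tensorScale of 2 would allow 0.5 steps (e.g. -5.5).
print(all(float(v).is_integer() for v in out))  # -> True
```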

  • if ((net->TIDLPCLayers[i].layerType == TIDL_InnerProductLayer) &&
        (net->TIDLPCLayers[i].actParams.actType != TIDL_Clip))
    {
        curMax = curMax * 1.25;
        outMin = outMin * 1.25;
    }

    This part seems to be the key reason, and I will try to recompile the importer to test it.

  • I tested it on our model, and the output is OK now.


    Load qat_export/resnet18_tidl_import.txt_stats_tool_out.bin With <class 'numpy.float32'>
    [[ -5.5 -1. -3.5 14.5 -8. 11. 2.5 -5.5 -4. -3.5]]

    The output tensorScale is 2 now. Thanks.