Is this understanding of outQ and quantization outputs correct?

Hi,

I went through point #10 of the FAQ in TIDeepLearningLibrary_UserGuide.pdf as well as the E2E thread e2e.ti.com/.../642684

As per my understanding of the two, the output in stats_tool_out.bin consists of unsigned 8-bit Q8 values that need to be adjusted by a scaling factor.

The scaling factor can be found from the "Out Q" value printed by the host emulation app. Consider this example:

Processing Frame Number : 0

 Layer    1 : Max PASS :    16320 :    81600 Out Q :    45384 ,    16384, TIDL_ConvolutionLayer, PASSED  #MMACs =    74.76,    73.72,    95.53, Sparsity : -27.78,   1.39

 Layer    2 : Max PASS :       51 :   163037 Out Q : 25419489 ,       51, Failing at    0,    0,    0,    0 ref,out = 255,0
TIDL_ConvolutionLayer, FAILED!!!!!!  #MMACs =   199.36,   193.17,   198.84, Sparsity :   0.26,   3.10

 Layer    3 : Max FAIL :        0 :      256 Out Q : 1023047544 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   190.86,   197.07, Sparsity :   1.15,   4.26

 Layer    4 : Max FAIL :        0 :      256 Out Q :        1 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   192.13,   197.85, Sparsity :   0.76,   3.63

 Layer    5 : Max FAIL :        0 :      256 Out Q :      884 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   191.87,   196.01, Sparsity :   1.68,   3.76

 Layer    6 : Max FAIL :        0 :      256 Out Q :   813523 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   191.93,   196.07, Sparsity :   1.65,   3.73

 Layer    7 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.09,     0.00,     0.09, Sparsity :   0.00, 100.00

 Layer    8 : Max FAIL :        0 :      256 Out Q : 1875186466 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   797.44,   786.47,   796.49, Sparsity :   0.12,   1.38

Layer    9 : Max FAIL :        0 :      256 Out Q : 1192752606 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   797.44,   785.82,   796.15, Sparsity :   0.16,   1.46

 Layer   10 : Max FAIL : 188577562 : 2136502786 Out Q :        1 ,      206, Failing at    0,    0,    0,    0 ref,out = 127,-128
TIDL_ConvolutionLayer, FAILED!!!!!!  #MMACs =    36.77,    35.31,    37.35, Sparsity :  -1.58,   3.97

End of config list found !

In this case, for the last layer, the scaling factor is 1 in Q8 format, which is 1/256 in floating point. So to convert the output values from quantized to floating-point format, I need to divide each of them by (1/256). Is this understanding correct?
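
For example, here is a minimal sketch of the conversion I am attempting (assuming the raw outputs are unsigned 8-bit and OutQ is the Q8 scale factor, as above):

    import numpy as np

    OUT_Q = 1                    # "Out Q" reported for the last layer (Q8 format)
    scale = OUT_Q / 256.0        # i.e. 1/256 in floating point

    fixed = np.fromfile("stats_tool_out.bin", dtype=np.uint8)
    flt = fixed.astype(np.float32) / scale   # dividing by 1/256 == multiplying by 256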

I think there is some mistake in my understanding somewhere, as the floating-point values I'm getting do not make sense at all. Please help me find and correct it.

Thanks,

Bhargav

  • Your understanding is right. You can also refer to the detection output layer in the 01.01 release, where we are converting fixed-point numbers into floating point and using them in this function (for better understanding).

    Refer to section "3.8 Matching TIDL inference result" in the user's guide for further debugging.
  • Thanks for the update, Kumar.

    Can you please let me know where I will find the "detection output layer in the 01.01 release where we are converting fixed-point numbers into floating point" inside ti_components/algorithms/REL.TIDL.01.01.00.00?
  • This is an object-only release. You need to refer to the source release for the "detection output layer" code.
  • Hi Kumar,

    I forgot to mention that I've converted a TensorFlow model and not a Caffe model. Does this change anything? I can see in the import tool that inQuantFactor changes depending on whether it is a Caffe or TF model:

    if(gParams.modelType == 0)
    {
        /* Caffe model: default input quantization factor */
        if(gParams.inQuantFactor == -1)
        {
            gParams.inQuantFactor = 255;
        }
        caffe_import(&gParams);
    }
    else if (gParams.modelType == 1)
    {
        /* TensorFlow model: default input quantization factor */
        if(gParams.inQuantFactor == -1)
        {
            gParams.inQuantFactor = 128*255;
        }
        tf_import(&gParams);
    }


    Does this affect output calculation as well?
  • Yes, it does affect the calculation.
    For the TensorFlow models that we validated, the input range is normalized to -1.0 to 1.0 (the actual input to TIDL is -128 to 127 in fixed point). So the scale factor is 128.

    If this is not true for your model, you need to set the right scale factor that you used during training.
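
    A minimal sketch of this input quantization, assuming a NumPy array x of normalized floats (the 416x416x3 shape is just the input size from the logs in this thread):

        import numpy as np

        # example input, normalized to the validated -1.0 .. 1.0 range
        x = np.random.uniform(-1.0, 1.0, size=(416, 416, 3)).astype(np.float32)

        # scale by 128 and clamp to the signed 8-bit range
        x_fixed = np.clip(np.round(x * 128.0), -128, 127).astype(np.int8)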
  • Can you please share the source release for 01.01 through this forum? I do not have access to a CDDS account.
  • The source release is only available via CDDS.
  • Hi Kumar,

    I trimmed down the above network to a single convolution + batch norm + ReLU layer to compare the outputs between TensorFlow and TIDL. They are not matching. Please find the relevant files here: https://drive.google.com/open?id=1SNDxAekQZs6JSbt4mWoxcEzoi4bbdxeP

    Here's the console spew during model conversion:

    Num of Layer Detected : 2
    0, TIDL_DataLayer 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 416 , 416 , 0 ,
    1, TIDL_ConvolutionLayer 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 416 , 416 , 1 , 16 , 416 , 416 , 74760192 ,
    Total Giga Macs : 0.0748
    1 file(s) copied.

    Processing config file .\tempDir\qunat_stats_config.txt !
    0, TIDL_DataLayer , 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 416 , 416 ,
    1, TIDL_ConvolutionLayer , 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 416 , 416 , 1 , 16 , 416 , 416 ,
    2, TIDL_DataLayer , 0, 1 , -1 , 1 , x , x , x , x , x , x , x , 0 , 1 , 16 , 416 , 416 , 0 , 0 , 0 , 0 ,
    Layer ID ,inBlkWidth ,inBlkHeight ,inBlkPitch ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs ,numOutChs ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
    1 40 34 40 32 32 32 3 16 3 1 8 1 3 13 13 1360 1024 1

    Processing Frame Number : 0

    Layer 1 : Max PASS : 0 : 182635 Out Q : 4075 , 183351, TIDL_ConvolutionLayer, PASSED #MMACs = 74.76, 74.07, 97.60, Sparsity : -30.56, 0.93
    End of config list found !


    As you can see from tf_outputs.bin and tidl_outputs.bin, the outputs do not match.
    As per Section 3.8, I am getting in touch with you about this.

  • Hi Bhargav,

    tf_outputs.bin is in floating point and tidl_outputs.bin is in fixed point, so please convert tidl_outputs.bin to floating point by using OutQ (which is in Q8 format). Please divide each value in tidl_outputs.bin by (OutQ/256) and then compare with tf_outputs.bin.
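
    For example, a minimal sketch of this conversion, assuming the fixed-point outputs are signed 8-bit (use np.uint8 instead if the layer output is unsigned):

        import numpy as np

        OUT_Q = 4075                 # the "Out Q" value reported for the layer above
        scale = OUT_Q / 256.0        # OutQ is in Q8 format

        tidl_fixed = np.fromfile("tidl_outputs.bin", dtype=np.int8)
        tidl_float = tidl_fixed.astype(np.float32) / scale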

    If it still does not match then first try to match the inputs to convolution in both.

    Thanks,
    Praveen
  • Hi Praveen,

    Sorry if this wasn't clear. Yes, I converted the TIDL outputs to float with OutQ/256 and also transposed the axes (0,1,2 -> 1,2,0) to make them identical to the TF axis order. The outputs still do not match.
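
    For reference, this is the reordering I am doing (a sketch; the 16 x 416 x 416 shape and the OutQ value are from my run below):

        import numpy as np

        OUT_Q = 628                  # "Out Q" from the host emulation run
        tidl = np.fromfile("tidl_outputs.bin", dtype=np.int8).astype(np.float32) / (OUT_Q / 256.0)

        # TIDL writes channel-first (C, H, W); TF outputs are channel-last (H, W, C)
        tidl_hwc = np.transpose(tidl.reshape(16, 416, 416), (1, 2, 0))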

    To make the test simpler, I removed batch norm as well. So now I have a single conv + ReLU layer on both sides, and the outputs still do not match. Attaching the new relevant files here: drive.google.com/open

    One more thing: I added inElementType=0 as well (as the inputs are unsigned), but the outputs don't match either. (I included the outputs for both cases in the above link.)

    This time the TIDL outputs are converted float values (divided by OutQ/256 = 628/256). As you can see, the outputs are nowhere close.

    I also checked the inputs to the convolution: both the trace_dump_0_416x416.y and the actual inputs to refConv2DProcess() are proper (I checked by single-stepping through the code in Visual Studio).

    It would be a great help if you can look into what's going on here.

    Thanks,
    Bhargav
  • Hi Bhargav,
    Are you using a channel_first or channel_last configuration?
    We have only validated models with "channel_last".

    What is the range of the input data in TensorFlow? Is it -1 to 1, or -128 to 127?
  • Hi Kumar,

    TF inputs are in NHWC format. Is that the supported one?

    The input data to TensorFlow is from 0 to 1 (pixel values from 0 to 255, normalized by dividing by 255).

    Thanks,
    Bhargav
  • Yes, NHWC is supported.
    Please set "inQuantFactor = 65280" in the import config file.
    The default value is used for the -1 to 1 range.
    Also make sure "inElementType = 0".
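    (For reference on the arithmetic: the TF default in the import code above is 128 * 255 = 32640, presumably corresponding to the -1 to 1 range, while 65280 = 256 * 255 is double that, matching the halved 0 to 1 range.)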
  • I tried with inQuantFactor = 65280 and inElementType = 0. I also confirmed that these values are taking effect in the code by placing prints.

    The only difference seems to be that OutQ is slightly higher: it is 632 now, whereas it was 618 earlier. The outputs are similar to before and don't match the TF outputs.
  • Looks like initializing the biases in tf.slim.conv2d is causing the outputs not to match.

    If I remove the bias initialization, the outputs match with inQuantFactor = 65280 and inElementType = 0!!

  • Hi Kumar,

    I tried with Keras as well (using the example and commands from the above location) and see the same problem:

    If I initialize biases with non-zero values, the outputs don't match (using bias_initializer in Keras or biases_initializer in tf.slim).
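
    For reference, a minimal sketch of the kind of layer I am testing (the filter count and initializer value here are illustrative, not the exact ones from my model):

        import tensorflow as tf

        # single conv + ReLU with a non-zero bias initializer -- the case whose
        # TIDL output does not match TF in my tests
        layer = tf.keras.layers.Conv2D(
            filters=16, kernel_size=3, padding="same", activation="relu",
            bias_initializer=tf.keras.initializers.Constant(0.1))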

  • Hi,

    Did you try the example model that Kumar shared in the previous post? In that model, the bias is initialized with non-zero values.

    Thanks,
    Praveen