Is this understanding of outQ and quantization outputs correct?

Hi,

I went through point #10 of the FAQ in TIDeepLearningLibrary_UserGuide.pdf as well as the E2E thread e2e.ti.com/.../642684

As per my understanding of the two, the output in stats_tool_out.bin consists of unsigned 8-bit Q8 values that need to be adjusted by a scaling factor.

The scaling factor can be found from the "Out Q" value printed by the host emulation app. Consider this example:

Processing Frame Number : 0

 Layer    1 : Max PASS :    16320 :    81600 Out Q :    45384 ,    16384, TIDL_ConvolutionLayer, PASSED  #MMACs =    74.76,    73.72,    95.53, Sparsity : -27.78,   1.39

 Layer    2 : Max PASS :       51 :   163037 Out Q : 25419489 ,       51, Failing at    0,    0,    0,    0 ref,out = 255,0
TIDL_ConvolutionLayer, FAILED!!!!!!  #MMACs =   199.36,   193.17,   198.84, Sparsity :   0.26,   3.10

 Layer    3 : Max FAIL :        0 :      256 Out Q : 1023047544 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   190.86,   197.07, Sparsity :   1.15,   4.26

 Layer    4 : Max FAIL :        0 :      256 Out Q :        1 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   192.13,   197.85, Sparsity :   0.76,   3.63

 Layer    5 : Max FAIL :        0 :      256 Out Q :      884 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   191.87,   196.01, Sparsity :   1.68,   3.76

 Layer    6 : Max FAIL :        0 :      256 Out Q :   813523 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   199.36,   191.93,   196.07, Sparsity :   1.65,   3.73

 Layer    7 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.09,     0.00,     0.09, Sparsity :   0.00, 100.00

 Layer    8 : Max FAIL :        0 :      256 Out Q : 1875186466 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   797.44,   786.47,   796.49, Sparsity :   0.12,   1.38

Layer    9 : Max FAIL :        0 :      256 Out Q : 1192752606 ,       51, TIDL_ConvolutionLayer, PASSED  #MMACs =   797.44,   785.82,   796.15, Sparsity :   0.16,   1.46

 Layer   10 : Max FAIL : 188577562 : 2136502786 Out Q :        1 ,      206, Failing at    0,    0,    0,    0 ref,out = 127,-128
TIDL_ConvolutionLayer, FAILED!!!!!!  #MMACs =    36.77,    35.31,    37.35, Sparsity :  -1.58,   3.97

End of config list found !

In this case, for the last layer, the scaling factor is 1 in Q8 format, which is 1/256 in floating point. So to convert the output values from quantized to floating-point format, I need to divide each of them by (1/256). Is this understanding correct?
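
For example, here is a minimal sketch of the conversion I am attempting (assuming the raw outputs are unsigned 8-bit and OutQ is the Q8 scale factor, as above):

    import numpy as np

    OUT_Q = 1                    # "Out Q" reported for the last layer (Q8 format)
    scale = OUT_Q / 256.0        # i.e. 1/256 in floating point

    fixed = np.fromfile("stats_tool_out.bin", dtype=np.uint8)
    flt = fixed.astype(np.float32) / scale   # dividing by 1/256 == multiplying by 256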

I think there is some mistake in my understanding somewhere, as the floating-point values I'm getting do not make sense at all. Please help me find and correct it.

Thanks,

Bhargav

  • Your understanding is right. You can also refer to the detection output layer in the 01.01 release, where we are converting fixed-point numbers into floating point and using them in this function (for better understanding).

    Refer to section "3.8 Matching TIDL inference result" in the user's guide for further debugging.
  • Thanks for the update, Kumar.

    Can you please let me know where I will find the "detection output layer in the 01.01 release where we are converting fixed-point numbers into floating point" inside ti_components/algorithms/REL.TIDL.01.01.00.00?
  • This is an object-only release. You need to refer to the source release for the "detection output layer" code.
  • Hi Kumar,

    I forgot to mention that I've converted a TensorFlow model and not a Caffe model. Does this change anything? I can see in the import tool that inQuantFactor changes depending on whether it is a Caffe or TF model:

    if(gParams.modelType == 0)
    {
        /* Caffe model: default input quantization factor */
        if(gParams.inQuantFactor == -1)
        {
            gParams.inQuantFactor = 255;
        }
        caffe_import(&gParams);
    }
    else if (gParams.modelType == 1)
    {
        /* TensorFlow model: default input quantization factor */
        if(gParams.inQuantFactor == -1)
        {
            gParams.inQuantFactor = 128*255;
        }
        tf_import(&gParams);
    }


    Does this affect output calculation as well?
  • Yes, it does affect the calculation.
    For the TensorFlow models that we validated, the input range is normalized to -1.0 to 1.0 (the actual input to TIDL is -128 to 127 in fixed point). So the scale factor is 128.

    If this is not true for your model, you need to set the right scale factor that you used during training.
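
    A minimal sketch of this input quantization, assuming a NumPy array x of normalized floats (the 416x416x3 shape is just the input size from the logs in this thread):

        import numpy as np

        # example input, normalized to the validated -1.0 .. 1.0 range
        x = np.random.uniform(-1.0, 1.0, size=(416, 416, 3)).astype(np.float32)

        # scale by 128 and clamp to the signed 8-bit range
        x_fixed = np.clip(np.round(x * 128.0), -128, 127).astype(np.int8)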
  • Can you please share the source release for 01.01 through this forum? I do not have access to a CDDS account.
  • The source release is only available via CDDS.
  • Hi Kumar,

    I trimmed down the above network to a single convolution + batch norm + ReLU layer to compare the outputs between TensorFlow and TIDL. They are not matching. Please find the relevant files here: https://drive.google.com/open?id=1SNDxAekQZs6JSbt4mWoxcEzoi4bbdxeP

    Here's the console spew during model conversion:

    Num of Layer Detected : 2
    0, TIDL_DataLayer 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 416 , 416 , 0 ,
    1, TIDL_ConvolutionLayer 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 416 , 416 , 1 , 16 , 416 , 416 , 74760192 ,
    Total Giga Macs : 0.0748
    1 file(s) copied.

    Processing config file .\tempDir\qunat_stats_config.txt !
    0, TIDL_DataLayer , 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 416 , 416 ,
    1, TIDL_ConvolutionLayer , 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 416 , 416 , 1 , 16 , 416 , 416 ,
    2, TIDL_DataLayer , 0, 1 , -1 , 1 , x , x , x , x , x , x , x , 0 , 1 , 16 , 416 , 416 , 0 , 0 , 0 , 0 ,
    Layer ID ,inBlkWidth ,inBlkHeight ,inBlkPitch ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs ,numOutChs ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
    1 40 34 40 32 32 32 3 16 3 1 8 1 3 13 13 1360 1024 1

    Processing Frame Number : 0

    Layer 1 : Max PASS : 0 : 182635 Out Q : 4075 , 183351, TIDL_ConvolutionLayer, PASSED #MMACs = 74.76, 74.07, 97.60, Sparsity : -30.56, 0.93
    End of config list found !


    As you can see from tf_outputs.bin and tidl_outputs.bin, the outputs do not match.
    As per Section 3.8, I am getting in touch with you about this.

  • Hi Bhargav,

    tf_outputs.bin is in floating point and tidl_outputs.bin is in fixed point, so please convert tidl_outputs.bin to floating point by using OutQ (which is in Q8 format). Please divide each value in tidl_outputs.bin by (OutQ/256) and then compare with tf_outputs.bin.
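
    For example, a minimal sketch of this conversion, assuming the fixed-point outputs are signed 8-bit (use np.uint8 instead if the layer output is unsigned):

        import numpy as np

        OUT_Q = 4075                 # the "Out Q" value reported for the layer above
        scale = OUT_Q / 256.0        # OutQ is in Q8 format

        tidl_fixed = np.fromfile("tidl_outputs.bin", dtype=np.int8)
        tidl_float = tidl_fixed.astype(np.float32) / scale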

    If it still does not match then first try to match the inputs to convolution in both.

    Thanks,
    Praveen
  • Hi Praveen,

    Sorry if this wasn't clear. Yes, I converted the TIDL outputs to float with OutQ/256 and also transposed the axes (0,1,2 -> 1,2,0) to make them identical to the TF axis order. The outputs still do not match.
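
    For reference, this is the reordering I am doing (a sketch; the 16 x 416 x 416 shape and the OutQ value are from my run below):

        import numpy as np

        OUT_Q = 628                  # "Out Q" from the host emulation run
        tidl = np.fromfile("tidl_outputs.bin", dtype=np.int8).astype(np.float32) / (OUT_Q / 256.0)

        # TIDL writes channel-first (C, H, W); TF outputs are channel-last (H, W, C)
        tidl_hwc = np.transpose(tidl.reshape(16, 416, 416), (1, 2, 0))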

    To make the test simpler, I removed batch norm as well. So now I have a single conv + ReLU layer on both sides, and the outputs still do not match. Attaching the new relevant files here: drive.google.com/open

    One more thing: I added inElementType=0 as well (as the inputs are unsigned), but the outputs don't match either. (I included the outputs for both cases in the above link.)

    This time the TIDL outputs are converted float values (divided by OutQ/256 = 628/256). As you can see, the outputs are nowhere close.

    I also checked the inputs to the convolution: both the trace_dump_0_416x416.y and the actual inputs to refConv2DProcess() are proper (I checked by single-stepping through the code in Visual Studio).

    It would be a great help if you can look into what's going on here.

    Thanks,
    Bhargav
  • Hi Bhargav,
    Are you using a channel_first or channel_last configuration?
    We have only validated models with "channel_last".

    What is the range of the input data in TensorFlow? Is it -1 to 1, or -128 to 127?
  • Hi Kumar,

    TF inputs are in NHWC format. Is that the supported one?

    The input data to TensorFlow is from 0 to 1 (pixel values from 0 to 255, normalized by dividing by 255).

    Thanks,
    Bhargav
  • Yes, NHWC is supported.
    Please set "inQuantFactor = 65280" in the import config file.
    The default value is used for the -1 to 1 range.
    Also make sure "inElementType = 0".
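    (For reference on the arithmetic: the TF default in the import code above is 128 * 255 = 32640, presumably corresponding to the -1 to 1 range, while 65280 = 256 * 255 is double that, matching the halved 0 to 1 range.)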
  • I tried with inQuantFactor = 65280 and inElementType = 0. I also confirmed that these values are taking effect in the code by placing prints.

    The only difference seems to be that OutQ is slightly higher: it is 632 now, whereas it was 618 earlier. The outputs are similar to before and don't match the TF outputs.
  • Looks like initializing the biases in tf.slim.conv2d is causing the outputs not to match.

    If I remove the bias initialization, the outputs match with inQuantFactor = 65280 and inElementType = 0!!

  • Hi Kumar,

    I tried with Keras as well (using the example and commands from the above location) and see the same problem:

    If I initialize biases with non-zero values, the outputs don't match (using bias_initializer in Keras or biases_initializer in tf.slim).
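
    For reference, a minimal sketch of the kind of layer I am testing (the filter count and initializer value here are illustrative, not the exact ones from my model):

        import tensorflow as tf

        # single conv + ReLU with a non-zero bias initializer -- the case whose
        # TIDL output does not match TF in my tests
        layer = tf.keras.layers.Conv2D(
            filters=16, kernel_size=3, padding="same", activation="relu",
            bias_initializer=tf.keras.initializers.Constant(0.1))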

  • Hi,

    Did you try the example model that Kumar shared in the previous post? In that model, the bias is initialized with non-zero values.

    Thanks,
    Praveen