The 16-bit model's TIDL Giga MACs figure is 11.1551, and inference takes 200 ms. I tried reducing the size of the input image, which brought the TIDL Giga MACs down to 10.2614. Is the 8-bit model faster? Do you have any suggestions for improving the inference speed? The import config is as follows:
modelType = 2
numParamBits = 8
numFeatureBits = 8
inElementType = 0
calibrationOption = 7
quantizationStyle = 3
biasCalibrationIterations = 2
inputNetFile = "../../test/testvecs/models/public/onnx/hyposelr18_infered.onnx"
outputNetFile = "../../test/testvecs/config/tidl_models/onnx/tidl_net_hylr18_8.bin"
outputParamsFile = "../../test/testvecs/config/tidl_models/onnx/tidl_io_hylr18_8_"
inFileFormat = 2
inDataNorm = 1
inMean = 0.0 0.0 0.0
inScale = 0.003921 0.003921 0.003921
resizeWidth = 256
resizeHeight = 256
inDataFormat = 1
inNumChannels = 3
inWidth = 256
inHeight = 256
inData = "../../test/testvecs/config/hrpose.txt"
inDataNamesList = x:0
outDataNamesList = "Identity:0, Identity_1:0"
postProcType = 0
tidl_net_hylr18_8.bin_paramDebug.csv
Num of Layer Detected : 83
------------------------------------------------------------------------------------------------------------------------
Num|TIDL Layer Name |Out Data Name |Group |#Ins |#Outs |Inbuf Ids |Outbuf Id |In NCHW |Out NCHW |MACS |
------------------------------------------------------------------------------------------------------------------------
0|TIDL_DataLayer |x:0_original | 0| -1| 1| x x x x x x x x | 0 | 0 0 0 0 0 0 | 1 1 1 3 256 256 | 0 |
1|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_4/sub:0 | 0| -1| 1| x x x x x x x x | 1 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
2|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_4/mul:0 | 0| -1| 1| x x x x x x x x | 2 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
3|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_2/sub:0 | 0| -1| 1| x x x x x x x x | 3 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
4|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_2/mul:0 | 0| -1| 1| x x x x x x x x | 4 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
5|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_3/sub:0 | 0| -1| 1| x x x x x x x x | 5 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
6|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_3/mul:0 | 0| -1| 1| x x x x x x x x | 6 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
7|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_1/sub:0 | 0| -1| 1| x x x x x x x x | 7 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
8|TIDL_ConstDataLayer |StatefulPartitionedCall/batchnorm_1/mul:0 | 0| -1| 1| x x x x x x x x | 8 | 0 0 0 0 0 0 | 1 1 1 64 64 64 | 0 |
9|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu:0 | 0| 1| 1| 0 x x x x x x x | 9 | 1 1 1 3 256 256 | 1 1 1 64 128 128 | 154140672 |
10|TIDL_PoolingLayer |StatefulPartitionedCall/maxpool_1:0 | 0| 1| 1| 9 x x x x x x x | 10 | 1 1 1 64 128 128 | 1 1 1 64 64 64 | 2359296 |
11|TIDL_ConvolutionLayer |StatefulPartitionedCall/block_2_1_conv_1:0 | 0| 1| 1| 10 x x x x x x x | 11 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 150994944 |
12|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_1/mul_2:0 | 0| 2| 1| 11 8 x x x x x x | 12 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
13|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_1:0 | 0| 2| 1| 12 7 x x x x x x | 13 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
14|TIDL_ConvolutionLayer |StatefulPartitionedCall/block_2_1_conv_2:0 | 0| 1| 1| 13 x x x x x x x | 14 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 150994944 |
15|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_2/mul_2:0 | 0| 2| 1| 14 4 x x x x x x | 15 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
16|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_2/Add_1:0 | 0| 2| 1| 15 3 x x x x x x | 16 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
17|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_2:0 | 0| 2| 1| 16 10 x x x x x x | 17 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
18|TIDL_ConvolutionLayer |StatefulPartitionedCall/block_2_2_conv_1:0 | 0| 1| 1| 17 x x x x x x x | 18 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 150994944 |
19|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_3/mul_2:0 | 0| 2| 1| 18 6 x x x x x x | 19 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
20|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_3:0 | 0| 2| 1| 19 5 x x x x x x | 20 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
21|TIDL_ConvolutionLayer |StatefulPartitionedCall/block_2_2_conv_2:0 | 0| 1| 1| 20 x x x x x x x | 21 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 150994944 |
22|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_4/mul_2:0 | 0| 2| 1| 21 2 x x x x x x | 22 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
23|TIDL_EltWiseLayer |StatefulPartitionedCall/batchnorm_4/Add_1:0 | 0| 2| 1| 22 1 x x x x x x | 23 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
24|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_4:0 | 0| 2| 1| 23 17 x x x x x x | 24 | 1 1 1 64 64 64 | 1 1 1 64 64 64 | 262144 |
25|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_5:0 | 0| 1| 1| 24 x x x x x x x | 25 | 1 1 1 64 64 64 | 1 1 1 128 32 32 | 75497472 |
26|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_7/Add_1:0 | 0| 1| 1| 24 x x x x x x x | 26 | 1 1 1 64 64 64 | 1 1 1 128 32 32 | 8388608 |
27|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_6/Add_1:0 | 0| 1| 1| 25 x x x x x x x | 27 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
28|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_6:0 | 0| 2| 1| 27 26 x x x x x x | 28 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
29|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_7:0 | 0| 1| 1| 28 x x x x x x x | 29 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
30|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_9/Add_1:0 | 0| 1| 1| 29 x x x x x x x | 30 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
31|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_8:0 | 0| 2| 1| 30 28 x x x x x x | 31 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
32|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_9:0 | 0| 1| 1| 31 x x x x x x x | 32 | 1 1 1 128 32 32 | 1 1 1 256 32 32 | 301989888 |
33|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_12/Add_1:0 | 0| 1| 1| 31 x x x x x x x | 33 | 1 1 1 128 32 32 | 1 1 1 256 32 32 | 33554432 |
34|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_11/Add_1:0 | 0| 1| 1| 32 x x x x x x x | 34 | 1 1 1 256 32 32 | 1 1 1 256 32 32 | 603979776 |
35|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_10:0 | 0| 2| 1| 34 33 x x x x x x | 35 | 1 1 1 256 32 32 | 1 1 1 256 32 32 | 262144 |
36|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_11:0 | 0| 1| 1| 35 x x x x x x x | 36 | 1 1 1 256 32 32 | 1 1 1 256 32 32 | 603979776 |
37|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_14/Add_1:0 | 0| 1| 1| 36 x x x x x x x | 37 | 1 1 1 256 32 32 | 1 1 1 256 32 32 | 603979776 |
38|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_12:0 | 0| 2| 1| 37 35 x x x x x x | 38 | 1 1 1 256 32 32 | 1 1 1 256 32 32 | 262144 |
39|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_13:0 | 0| 1| 1| 38 x x x x x x x | 39 | 1 1 1 256 32 32 | 1 1 1 512 32 32 | 1207959552 |
40|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_17/Add_1:0 | 0| 1| 1| 38 x x x x x x x | 40 | 1 1 1 256 32 32 | 1 1 1 512 32 32 | 134217728 |
41|TIDL_ConvolutionLayer |StatefulPartitionedCall/batchnorm_16/Add_1:0 | 0| 1| 1| 39 x x x x x x x | 41 | 1 1 1 512 32 32 | 1 1 1 512 32 32 | 2415919104 |
42|TIDL_EltWiseLayer |StatefulPartitionedCall/Relu_14:0 | 0| 2| 1| 41 40 x x x x x x | 42 | 1 1 1 512 32 32 | 1 1 1 512 32 32 | 524288 |
43|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_15:0 | 0| 1| 1| 42 x x x x x x x | 43 | 1 1 1 512 32 32 | 1 1 1 128 32 32 | 67108864 |
44|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_16:0 | 0| 1| 1| 43 x x x x x x x | 44 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
45|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_17:0 | 0| 1| 1| 44 x x x x x x x | 45 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
46|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_18:0 | 0| 1| 1| 45 x x x x x x x | 46 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
47|TIDL_EltWiseLayer |StatefulPartitionedCall/add_7:0 | 0| 2| 1| 43 46 x x x x x x | 47 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
48|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_19:0 | 0| 1| 1| 47 x x x x x x x | 48 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
49|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_20:0 | 0| 1| 1| 48 x x x x x x x | 49 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
50|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_21:0 | 0| 1| 1| 49 x x x x x x x | 50 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
51|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_22:0 | 0| 1| 1| 50 x x x x x x x | 51 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
52|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_23:0 | 0| 1| 1| 51 x x x x x x x | 52 | 1 1 1 128 32 32 | 1 1 1 512 32 32 | 67108864 |
53|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_24:0 | 0| 1| 1| 51 x x x x x x x | 53 | 1 1 1 128 32 32 | 1 1 1 512 32 32 | 67108864 |
54|TIDL_ConvolutionLayer |StatefulPartitionedCall/bias_add_9:0 | 0| 1| 1| 52 x x x x x x x | 54 | 1 1 1 512 32 32 | 1 1 1 19 32 32 | 9961472 |
55|TIDL_ConvolutionLayer |StatefulPartitionedCall/bias_add_11:0 | 0| 1| 1| 53 x x x x x x x | 55 | 1 1 1 512 32 32 | 1 1 1 38 32 32 | 19922944 |
56|TIDL_ConcatLayer |StatefulPartitionedCall/concat:0 | 0| 3| 1| 48 54 55 x x x x x | 56 | 1 1 1 128 32 32 | 1 1 1 185 32 32 | 189440 |
57|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_25:0 | 0| 1| 1| 56 x x x x x x x | 57 | 1 1 1 185 32 32 | 1 1 1 128 32 32 | 24248320 |
58|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_26:0 | 0| 1| 1| 57 x x x x x x x | 58 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
59|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_27:0 | 0| 1| 1| 58 x x x x x x x | 59 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
60|TIDL_EltWiseLayer |StatefulPartitionedCall/add_8:0 | 0| 2| 1| 57 59 x x x x x x | 60 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
61|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_28:0 | 0| 1| 1| 60 x x x x x x x | 61 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 16777216 |
62|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_29:0 | 0| 1| 1| 61 x x x x x x x | 62 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
63|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_30:0 | 0| 1| 1| 62 x x x x x x x | 63 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
64|TIDL_EltWiseLayer |StatefulPartitionedCall/add_9:0 | 0| 2| 1| 61 63 x x x x x x | 64 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
65|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_31:0 | 0| 1| 1| 64 x x x x x x x | 65 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 16777216 |
66|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_32:0 | 0| 1| 1| 65 x x x x x x x | 66 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
67|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_33:0 | 0| 1| 1| 66 x x x x x x x | 67 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
68|TIDL_EltWiseLayer |StatefulPartitionedCall/add_10:0 | 0| 2| 1| 65 67 x x x x x x | 68 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
69|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_34:0 | 0| 1| 1| 68 x x x x x x x | 69 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 16777216 |
70|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_35:0 | 0| 1| 1| 69 x x x x x x x | 70 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
71|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_36:0 | 0| 1| 1| 70 x x x x x x x | 71 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
72|TIDL_EltWiseLayer |StatefulPartitionedCall/add_11:0 | 0| 2| 1| 69 71 x x x x x x | 72 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
73|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_37:0 | 0| 1| 1| 72 x x x x x x x | 73 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 16777216 |
74|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_38:0 | 0| 1| 1| 73 x x x x x x x | 74 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
75|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_39:0 | 0| 1| 1| 74 x x x x x x x | 75 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 150994944 |
76|TIDL_EltWiseLayer |StatefulPartitionedCall/add_12:0 | 0| 2| 1| 73 75 x x x x x x | 76 | 1 1 1 128 32 32 | 1 1 1 128 32 32 | 131072 |
77|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_40:0 | 0| 1| 1| 76 x x x x x x x | 77 | 1 1 1 128 32 32 | 1 1 1 512 32 32 | 67108864 |
78|TIDL_ConvolutionLayer |StatefulPartitionedCall/Relu_41:0 | 0| 1| 1| 76 x x x x x x x | 78 | 1 1 1 128 32 32 | 1 1 1 512 32 32 | 67108864 |
79|TIDL_ConvolutionLayer |Identity:0 | 0| 1| 1| 77 x x x x x x x | 79 | 1 1 1 512 32 32 | 1 1 1 19 32 32 | 9961472 |
80|TIDL_ConvolutionLayer |Identity_1:0 | 0| 1| 1| 78 x x x x x x x | 80 | 1 1 1 512 32 32 | 1 1 1 38 32 32 | 19922944 |
81|TIDL_DataLayer |Identity:0 | 0| 1| -1| 79 x x x x x x x | 0 | 1 1 1 19 32 32 | 0 0 0 0 0 0 | 0 |
82|TIDL_DataLayer |Identity_1:0 | 0| 1| -1| 80 x x x x x x x | 0 | 1 1 1 38 32 32 | 0 0 0 0 0 0 | 0 |
------------------------------------------------------------------------------------------------------------------------
Total Giga Macs : 10.2614
------------------------------------------------------------------------------------------------------------------------
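
To see where the 10.2614 Giga MACs go, here is a minimal sketch that recomputes the MACS column for a few convolution layers and prints each layer's share of the total. It uses the usual convolution estimate, output channels x output height x output width x input channels x kernel area. The kernel sizes (7x7 for layer 9, 3x3 or 1x1 for the others) are my assumption: the table does not list them, but these values reproduce the reported MACS exactly. The helper name `conv_macs` is only illustrative and is not part of TIDL.

```python
# Sketch: recompute conv-layer MACs from the shapes in the table above.
# Kernel sizes are NOT in the table; they are assumed values that happen
# to reproduce the reported MACS column.

TOTAL_MACS = 10.2614e9  # "Total Giga Macs" reported by the import tool


def conv_macs(c_in, c_out, h_out, w_out, kernel):
    """MACs of a dense (non-grouped) convolution with a square kernel."""
    return c_out * h_out * w_out * c_in * kernel * kernel


# (layer number, c_in, c_out, h_out, w_out, assumed kernel, MACS from table)
layers = [
    (9, 3, 64, 128, 128, 7, 154140672),
    (34, 256, 256, 32, 32, 3, 603979776),
    (39, 256, 512, 32, 32, 3, 1207959552),
    (40, 256, 512, 32, 32, 1, 134217728),
    (41, 512, 512, 32, 32, 3, 2415919104),
]

for num, c_in, c_out, h, w, k, reported in layers:
    macs = conv_macs(c_in, c_out, h, w, k)
    share = 100.0 * macs / TOTAL_MACS
    status = "matches table" if macs == reported else "differs from table"
    print(f"layer {num:2d}: {macs:>11d} MACs ({share:4.1f}% of total), {status}")
```

Under these assumptions, the 256- and 512-channel convolutions that run at 32x32 resolution (layers 32 to 41) account for a large part of the total, so they are the first place to look when trying to reduce the MAC count.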