PROCESSOR-SDK-TDAX: ssdJacintoNetV2 ported to the EVM board displays only 1 frame, and the DSP runs for a long time?

Part Number: PROCESSOR-SDK-TDAX

Hi,

I have trained ssdJacintoNetV2 on 27 classes and converted the model to bin files using the import utility.

I also modified deploy.prototxt, setting "keep_top_k: 20, confidence_threshold: 0.15" at the end, but this TIDL OD use case only runs for 1 frame.

I also tested with the openvx_TIDL use case (config listed below), and the DSP still runs for a long time. Why?

# Default - 0
randParams         = 0 

# 0: Caffe, 1: TensorFlow, Default - 0
modelType          = 0 

# 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
quantizationStyle  = 1 

# quantRoundAdd/100 will be added while rounding to integer, Default - 50
quantRoundAdd      = 25

numParamBits       = 8
# 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
inElementType      = 0 

inputNetFile       = "deploy.prototxt"
inputParamsFile    = "voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel"
outputNetFile      = "tidl_net_jdetNet_ssd.bin"
outputParamsFile   = "tidl_param_jdetNet_ssd.bin"

rawSampleInData = 1
preProcType   = 4
sampleInData = "trace_dump_0_768x320.y"
tidlStatsTool = "eve_test_dl_algo.out.exe"
layersGroupId = 0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	2	0
conv2dKernelType = 0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1
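As a quick self-check on the config above, the two per-layer lists can be validated before importing. This is my own sketch, not part of the TIDL tools; it assumes the network has 45 layers (as suggested by the "read layer 0 ... read layer 44 param" lines in the import log) and that exactly one layer, the DetectionOutput layer, is placed in group 2 so it runs on the DSP while group 1 runs on EVE.

```python
# The lists from the import config, written as Python literals.
layers_group_id    = [0] + [1] * 42 + [2, 0]   # 45 entries
conv2d_kernel_type = [0] * 16 + [1] * 29       # 45 entries

def check_per_layer_lists(groups, kernel_types, num_layers):
    """Each imported layer needs exactly one entry in each list."""
    if len(groups) != num_layers or len(kernel_types) != num_layers:
        raise ValueError("per-layer list length != layer count")
    if groups.count(2) != 1:
        raise ValueError("expected exactly one group-2 (DSP) layer")
    return True

print(check_per_layer_lists(layers_group_id, conv2d_kernel_type, 45))
```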


[IPU1-0]     18.804611 s:  Enter Choice:
[IPU1-0]     18.804764 s:  vx_tutorial_tidl: Tutorial Started !!!
[IPU1-0]     18.821692 s:  Reading config file sd:test_data//tivx/tidl/tidl_infer.cfg ...
[IPU1-0]     18.867809 s:  Reading network file sd:test_data//./tivx/tidl/tidl_net_jdetNet_ssd.bin
[IPU1-0]  ...
[IPU1-0]     19.644878 s:  Reading network params file sd:test_data//./tivx/tidl/tidl_param_jdetNet_ssd.bin
[IPU1-0]  ...
[IPU1-0]     23.586561 s:
[IPU1-0]     23.586683 s: Thread #1: Create graph ...
[IPU1-0]     23.587659 s: Thread #1: Create input and output tensors for node 1 ...
[IPU1-0]     23.588635 s: Thread #1: Create node 1 ...
[IPU1-0]     23.605075 s: Thread #1: Create output tensors for node 2 ...
[IPU1-0]     23.605532 s: Thread #1: Create node 2 ...
[IPU1-0]     23.621576 s:
[IPU1-0]     23.621759 s: Thread #1: Verify graph ...
[IPU1-0]     26.613707 s:
[IPU1-0]     26.614103 s: Thread #1: Start graph ...
[IPU1-0]     26.614256 s:
[IPU1-0]     26.614378 s: Thread #1: Wait for graph ...
[IPU1-0]     26.924723 s:
[IPU1-0]     26.924876 s: Thread #1: Results
[IPU1-0]     26.924998 s: ---------------------
[IPU1-0]     26.925211 s:
[IPU1-0]     26.925333 s: ObjId|label|score| xmin| ymin| xmax| ymax|
[IPU1-0]     26.925486 s: ------------------------------------------
[IPU1-0]     26.925730 s:     0|   14| 1.00| 0.64| 0.33| 0.80| 0.90|
[IPU1-0]     26.926005 s:     1|   20| 1.00| 0.44| 0.46| 0.65| 0.90|
[IPU1-0]     26.926218 s:     2|   20| 1.00| 0.05| 0.35| 0.20| 0.68|
[IPU1-0]     26.926432 s:     3|   11| 1.00| 0.73| 0.42| 0.85| 0.87|
[IPU1-0]     26.926676 s:     4|    2| 1.00| 0.55| 0.16| 0.78| 0.91|
[IPU1-0]     26.926920 s:     5|    2| 1.00|-0.25| 0.03| 0.18| 0.59|
[IPU1-0]     26.927133 s:     6|   12| 1.00| 0.12| 0.26| 0.27| 0.65|
[IPU1-0]     26.927347 s:     7|   16| 1.00|-0.01| 0.14| 0.26| 0.59|
[IPU1-0]     26.927560 s:     8|   16| 1.00| 0.35| 0.18| 0.57| 0.61|
[IPU1-0]     26.927804 s:     9|   21| 1.00| 0.84| 0.61| 0.92| 0.96|
[IPU1-0]     26.928018 s:    10|    2| 1.00| 0.85| 0.56| 0.90| 0.93|
[IPU1-0]     26.928262 s:    11|   21| 1.00| 0.27| 0.46| 0.33| 0.83|
[IPU1-0]     26.928475 s:    12|   21| 1.00| 0.51| 0.36| 0.58| 0.72|
[IPU1-0]     26.928689 s:    13|   15| 1.00| 0.16| 0.25| 0.25| 0.70|
[IPU1-0]     26.929116 s:    14|   21| 1.00| 0.01| 0.07| 0.12| 0.50|
[IPU1-0]     26.929360 s:    15|   21| 1.00| 0.84|-0.06| 0.91| 0.41|
[IPU1-0]     26.929573 s:    16|   21| 1.00| 0.59|-0.08| 0.66| 0.39|
[IPU1-0]     26.929817 s:    17|   15| 1.00| 0.85| 0.59| 0.95| 0.83|
[IPU1-0]     26.930061 s:    18|   26| 1.00| 0.10| 0.58| 0.19| 0.74|
[IPU1-0]     26.930305 s:    19|   26| 1.00| 0.76| 0.45| 0.97| 0.66|
[IPU1-0]     26.930366 s:
[IPU1-0]     26.930458 s: Number of detected objects: 20
[IPU1-0]     26.930519 s:
[IPU1-0]     26.930610 s:
[IPU1-0]     26.930793 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 138.622000 ms
[IPU1-0]     26.931068 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 171.431000 ms
[IPU1-0]     26.931312 s: ---- Thread #1: Total Graph Execution time: 310.458000 ms
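For reference, the throughput implied by the node times in the log above can be estimated. This is a back-of-the-envelope sketch; whether the EVE and DSP nodes actually overlap across frames depends on how the use case pipelines the graph.

```python
# Per-node execution times from the log above, in milliseconds.
eve_ms   = 138.622   # Node 1 (EVE-1)
dsp_ms   = 171.431   # Node 2 (DSP-1)
total_ms = 310.458   # total graph time, roughly eve_ms + dsp_ms

fps_serial    = 1000.0 / total_ms             # nodes run back-to-back, ~3.2 fps
fps_pipelined = 1000.0 / max(eve_ms, dsp_ms)  # nodes overlap across frames, ~5.8 fps

print(round(fps_serial, 2), round(fps_pipelined, 2))
```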

Below is my import tool log. What is the meaning of the text shown in red?

randParams = 0
modelType = 0
quantizationStyle = 1
quantRoundAdd = 25
numParamBits = 8
preProcType = 4
inElementType = 0
numFrames = -1
rawSampleInData = 1
numSampleInData = 1
foldBnInConv2D = 1
inWidth = -1
inHeight = -1
inNumChannels = -1
sampleInData = trace_dump_0_768x320.y
tidlStatsTool = eve_test_dl_algo.out.exe
inputNetFile = deploy.prototxt
inputParamsFile = voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel
outputNetFile = tidl_net_jdetNet_ssd.bin
outputParamsFile = tidl_param_jdetNet_ssd.bin
conv2dKernelType = 0
layersGroupId = 0
Caffe Network File : deploy.prototxt
Caffe Model File   : voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel
TIDL Network File  : tidl_net_jdetNet_ssd.bin
TIDL Model File    : tidl_param_jdetNet_ssd.bin
Name of the Network : ssdJacintoNetV2_deploy
Num Inputs :               1
Kernel Size not matching 65536 !!Setting RAND Kernel Params for Layer ctx_output1
Kernel Size not matching 21504 !!Setting RAND Kernel Params for Layer ctx_output1/relu_mbox_conf
Bias Size not matching!!Setting RAND BIAS Params for Layer ctx_output1/relu_mbox_conf
Kernel Size not matching 32256 !!Setting RAND Kernel Params for Layer ctx_output2/relu_mbox_conf
Bias Size not matching!!Setting RAND BIAS Params for Layer ctx_output2/relu_mbox_conf
Kernel Size not matching 32256 !!Setting RAND Kernel Params for Layer ctx_output3/relu_mbox_conf
Bias Size not matching!!Setting RAND BIAS Params for Layer ctx_output3/relu_mbox_conf
Kernel Size not matching 32256 !!Setting RAND Kernel Params for Layer ctx_output4/relu_mbox_conf
Bias Size not matching!!Setting RAND BIAS Params for Layer ctx_output4/relu_mbox_conf
Kernel Size not matching 21504 !!Setting RAND Kernel Params for Layer ctx_output5/relu_mbox_conf
Bias Size not matching!!Setting RAND BIAS Params for Layer ctx_output5/relu_mbox_conf
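One way to read the "Kernel Size not matching" numbers is my own factorization below, assuming the standard SSD head shape of a 1x1 convolution with 256 input channels and num_priors x num_classes output channels. Note that "Setting RAND Kernel Params" means the importer substitutes random weights for those layers, so any detections produced through them would be meaningless.

```python
# Factor the mismatching conf-head kernel sizes reported above.
in_ch = 256   # input channels of the ctx_output*/relu_mbox_conf heads
for found in (21504, 32256):
    out_ch = found // in_ch   # 84 and 126 output channels in the caffemodel
    priors = out_ch // 21     # 4 and 6 priors, if trained with 21 classes
    print(found, out_ch, priors)
# 84 = 4 priors x 21 classes and 126 = 6 priors x 21 classes (VOC: 20 + bg),
# while the layer table below shows the conf heads of the edited
# deploy.prototxt with 112 and 168 output channels, i.e. 4 or 6 priors
# x 28 classes (27 + background). If this reading is right, the voc0712
# caffemodel heads no longer match the 27-class prototxt.
```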
 Num of Layer Detected :  50
  0, TIDL_DataLayer                , data                                      0,  -1 ,  1 ,   x ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  0 ,       0 ,       0 ,       0 ,       0 ,       1 ,       3 ,     320 ,     768 ,         0 ,
  1, TIDL_BatchNormLayer           , data/bias                                 1,   1 ,  1 ,   0 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  1 ,       1 ,       3 ,     320 ,     768 ,       1 ,       3 ,     320 ,     768 ,    737280 ,
  2, TIDL_ConvolutionLayer         , conv1a                                    1,   1 ,  1 ,   1 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  2 ,       1 ,       3 ,     320 ,     768 ,       1 ,      32 ,     160 ,     384 , 147456000 ,
  3, TIDL_ConvolutionLayer         , conv1b                                    1,   1 ,  1 ,   2 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  3 ,       1 ,      32 ,     160 ,     384 ,       1 ,      32 ,      80 ,     192 , 141557760 ,
  4, TIDL_ConvolutionLayer         , res2a_branch2a                            1,   1 ,  1 ,   3 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  4 ,       1 ,      32 ,      80 ,     192 ,       1 ,      64 ,      80 ,     192 , 283115520 ,
  5, TIDL_ConvolutionLayer         , res2a_branch2b                            1,   1 ,  1 ,   4 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  5 ,       1 ,      64 ,      80 ,     192 ,       1 ,      64 ,      40 ,      96 , 141557760 ,
  6, TIDL_ConvolutionLayer         , res3a_branch2a                            1,   1 ,  1 ,   5 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  6 ,       1 ,      64 ,      40 ,      96 ,       1 ,     128 ,      40 ,      96 , 283115520 ,
  7, TIDL_ConvolutionLayer         , res3a_branch2b                            1,   1 ,  1 ,   6 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  7 ,       1 ,     128 ,      40 ,      96 ,       1 ,     128 ,      40 ,      96 , 141557760 ,
  8, TIDL_PoolingLayer             , pool3                                     1,   1 ,  1 ,   7 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  8 ,       1 ,     128 ,      40 ,      96 ,       1 ,     128 ,      20 ,      48 ,    491520 ,
  9, TIDL_ConvolutionLayer         , res4a_branch2a                            1,   1 ,  1 ,   8 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  9 ,       1 ,     128 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 , 283115520 ,
 10, TIDL_ConvolutionLayer         , res4a_branch2b                            1,   1 ,  1 ,   9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 10 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      10 ,      24 , 141557760 ,
 11, TIDL_ConvolutionLayer         , res5a_branch2a                            1,   1 ,  1 ,  10 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 11 ,       1 ,     256 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 283115520 ,
 12, TIDL_ConvolutionLayer         , res5a_branch2b                            1,   1 ,  1 ,  11 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 12 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 141557760 ,
 13, TIDL_PoolingLayer             , pool6                                     1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 13 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,       5 ,      12 ,    122880 ,
 14, TIDL_PoolingLayer             , pool7                                     1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 14 ,       1 ,     512 ,       5 ,      12 ,       1 ,     512 ,       3 ,       6 ,     36864 ,
 15, TIDL_PoolingLayer             , pool8                                     1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 15 ,       1 ,     512 ,       3 ,       6 ,       1 ,     512 ,       2 ,       3 ,     12288 ,
 16, TIDL_ConvolutionLayer         , ctx_output1                               1,   1 ,  1 ,   7 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 16 ,       1 ,     128 ,      40 ,      96 ,       1 ,     256 ,      40 ,      96 , 125829120 ,
 17, TIDL_ConvolutionLayer         , ctx_output2                               1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 17 ,       1 ,     512 ,      10 ,      24 ,       1 ,     256 ,      10 ,      24 ,  31457280 ,
 18, TIDL_ConvolutionLayer         , ctx_output3                               1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 18 ,       1 ,     512 ,       5 ,      12 ,       1 ,     256 ,       5 ,      12 ,   7864320 ,
 19, TIDL_ConvolutionLayer         , ctx_output4                               1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 19 ,       1 ,     512 ,       3 ,       6 ,       1 ,     256 ,       3 ,       6 ,   2359296 ,
 20, TIDL_ConvolutionLayer         , ctx_output5                               1,   1 ,  1 ,  15 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 20 ,       1 ,     512 ,       2 ,       3 ,       1 ,     256 ,       2 ,       3 ,    786432 ,
 21, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_loc                 1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 21 ,       1 ,     256 ,      40 ,      96 ,       1 ,      16 ,      40 ,      96 ,  15728640 ,
 22, TIDL_FlattenLayer             , ctx_output1/relu_mbox_loc_perm            1,   1 ,  1 ,  21 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 22 ,       1 ,      16 ,      40 ,      96 ,       1 ,       1 ,       1 ,   61440 ,         1 ,
 23, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_conf                1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 23 ,       1 ,     256 ,      40 ,      96 ,       1 ,     112 ,      40 ,      96 , 110100480 ,
 24, TIDL_FlattenLayer             , ctx_output1/relu_mbox_conf_perm           1,   1 ,  1 ,  23 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 24 ,       1 ,     112 ,      40 ,      96 ,       1 ,       1 ,       1 ,  430080 ,         1 ,
 26, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_loc                 1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 26 ,       1 ,     256 ,      10 ,      24 ,       1 ,      24 ,      10 ,      24 ,   1474560 ,
 27, TIDL_FlattenLayer             , ctx_output2/relu_mbox_loc_perm            1,   1 ,  1 ,  26 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 27 ,       1 ,      24 ,      10 ,      24 ,       1 ,       1 ,       1 ,    5760 ,         1 ,
 28, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_conf                1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 28 ,       1 ,     256 ,      10 ,      24 ,       1 ,     168 ,      10 ,      24 ,  10321920 ,
 29, TIDL_FlattenLayer             , ctx_output2/relu_mbox_conf_perm           1,   1 ,  1 ,  28 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 29 ,       1 ,     168 ,      10 ,      24 ,       1 ,       1 ,       1 ,   40320 ,         1 ,
 31, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_loc                 1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 31 ,       1 ,     256 ,       5 ,      12 ,       1 ,      24 ,       5 ,      12 ,    368640 ,
 32, TIDL_FlattenLayer             , ctx_output3/relu_mbox_loc_perm            1,   1 ,  1 ,  31 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 32 ,       1 ,      24 ,       5 ,      12 ,       1 ,       1 ,       1 ,    1440 ,         1 ,
 33, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_conf                1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 33 ,       1 ,     256 ,       5 ,      12 ,       1 ,     168 ,       5 ,      12 ,   2580480 ,
 34, TIDL_FlattenLayer             , ctx_output3/relu_mbox_conf_perm           1,   1 ,  1 ,  33 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 34 ,       1 ,     168 ,       5 ,      12 ,       1 ,       1 ,       1 ,   10080 ,         1 ,
 36, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_loc                 1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 36 ,       1 ,     256 ,       3 ,       6 ,       1 ,      24 ,       3 ,       6 ,    110592 ,
 37, TIDL_FlattenLayer             , ctx_output4/relu_mbox_loc_perm            1,   1 ,  1 ,  36 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 37 ,       1 ,      24 ,       3 ,       6 ,       1 ,       1 ,       1 ,     432 ,         1 ,
 38, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_conf                1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 38 ,       1 ,     256 ,       3 ,       6 ,       1 ,     168 ,       3 ,       6 ,    774144 ,
 39, TIDL_FlattenLayer             , ctx_output4/relu_mbox_conf_perm           1,   1 ,  1 ,  38 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 39 ,       1 ,     168 ,       3 ,       6 ,       1 ,       1 ,       1 ,    3024 ,         1 ,
 41, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_loc                 1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 41 ,       1 ,     256 ,       2 ,       3 ,       1 ,      16 ,       2 ,       3 ,     24576 ,
 42, TIDL_FlattenLayer             , ctx_output5/relu_mbox_loc_perm            1,   1 ,  1 ,  41 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 42 ,       1 ,      16 ,       2 ,       3 ,       1 ,       1 ,       1 ,      96 ,         1 ,
 43, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_conf                1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 43 ,       1 ,     256 ,       2 ,       3 ,       1 ,     112 ,       2 ,       3 ,    172032 ,
 44, TIDL_FlattenLayer             , ctx_output5/relu_mbox_conf_perm           1,   1 ,  1 ,  43 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 44 ,       1 ,     112 ,       2 ,       3 ,       1 ,       1 ,       1 ,     672 ,         1 ,
 46, TIDL_ConcatLayer              , mbox_loc                                  1,   5 ,  1 ,  22 , 27 , 32 , 37 , 42 ,  x ,  x ,  x , 46 ,       1 ,       1 ,       1 ,   61440 ,       1 ,       1 ,       1 ,   69168 ,         1 ,
 47, TIDL_ConcatLayer              , mbox_conf                                 1,   5 ,  1 ,  24 , 29 , 34 , 39 , 44 ,  x ,  x ,  x , 47 ,       1 ,       1 ,       1 ,  430080 ,       1 ,       1 ,       1 ,  484176 ,         1 ,
 49, TIDL_DetectionOutputLayer     , detection_out                             2,   2 ,  1 ,  46 , 47 ,  x ,  x ,  x ,  x ,  x ,  x , 49 ,       1 ,       1 ,       1 ,   69168 ,       1 ,       1 ,       1 ,     560 ,         1 ,
Total Giga Macs : 2.2991
        1 file(s) copied.

Processing config file .\tempDir\qunat_stats_config.txt !
noZeroCoeffsPercentage = 100
updateNetWithStats = 1
rawImage = 1
randInput = 0
writeInput = 0
writeOutput = 1
compareRef = 0
numFrames = 1
preProcType = 4
netBinFile = .\tempDir\temp_net.bin
outputNetBinFile = tidl_net_jdetNet_ssd.bin
paramsBinFile = tidl_param_jdetNet_ssd.bin
inData = trace_dump_0_768x320.y
outData = .\tempDir\stats_tool_out.bin
traceDumpBaseName = .\tempDir\trace_dump_
testCaseName =
testCaseDesc =
performanceTestcase = 0
layersGroupId = 1
writeQ = 0
readQ = 0
runFullNet = 1
read layer 0 param
read layer 1 param
read layer 2 param
read layer 3 param
read layer 4 param
read layer 5 param
read layer 6 param
read layer 7 param
read layer 8 param
read layer 9 param
read layer 10 param
read layer 11 param
read layer 12 param
read layer 13 param
read layer 14 param
read layer 15 param
read layer 16 param
read layer 17 param
read layer 18 param
read layer 19 param
read layer 20 param
read layer 21 param
read layer 22 param
read layer 23 param
read layer 24 param
read layer 25 param
read layer 26 param
read layer 27 param
read layer 28 param
read layer 29 param
read layer 30 param
read layer 31 param
read layer 32 param
read layer 33 param
read layer 34 param
read layer 35 param
read layer 36 param
read layer 37 param
read layer 38 param
read layer 39 param
read layer 40 param
read layer 41 param
read layer 42 param
read layer 43 param
read layer 44 param

weightsElementSize = 1
slopeElementSize   = 1
biasElementSize    = 2
dataElementSize    = 1
interElementSize   = 4
quantizationStyle  = 1
strideOffsetMethod = 0
reserved           = 0

Layer ID    ,inBlkWidth  ,inBlkHeight ,inBlkPitch  ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs    ,numOutChs   ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs  ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
      2           72           72           72           32           32           32            3           32            3            1            8            1            3           12            5         5184         1024            1
      3           40           34           40           32           32           32            8            8            8            4            8            1            2           12            5         1360         1024            1
      4           40           22           40           32           20           32           32           64           32            8            8            1            4            6            4          880          640            1
      5           40           22           40           32           20           32           16           16           16            8            8            1            2            6            4          880          640            1
      6           40           22           40           32           20           32           64          128           64            8            8            1            8            3            2          880          640            1
      7           40           22           40           32           20           32           32           32           32            8            8            1            4            3            2          880          640            1
      9           56           22           56           48           20           48          128          256          128            7            8            1           19            1            1         1232          960            1
     10           56           22           56           48           20           48           64           64           64            7            8            1           10            1            1         1232          960            1
     11           40           12           40           32           10           32          256          512          256            8            8            1           32            1            1          480          320            1
     12           40           12           40           32           10           32          128          128          128            8            8            1           16            1            1          480          320            1
     16           96            4           96           96            4           96          128          256          128           32            8            1            4            1           10          384          384            1
     17           24           10           24           24           10           24          512          256          512           32           32            1           16            1            1          240          240            1
     18           12            5           12           12            5           12          512          256          512           32           32            1           16            1            1           60           60            1
     19            6            3            6            6            3            6          512          256          512           32           32            1           16            1            1           18           18            1
     20            3            2            3            3            2            3          512          256          512           32           32            1           16            1            1            6            6            1
     21           96            4           96           96            4           96          256           16          256           32            8            1            8            1           10          384          384            1
     23           96            4           96           96            4           96          256          112          256           32            8            1            8            1           10          384          384            1
     25           24           10           24           24           10           24          256           24          256           32           24            1            8            1            1          240          240            1
     27           24           10           24           24           10           24          256          192          256           32           32            1            8            1            1          240          240            1
     29           12            5           12           12            5           12          256           24          256           32           24            1            8            1            1           60           60            1
     31           12            5           12           12            5           12          256          192          256           32           32            1            8            1            1           60           60            1
     33            6            3            6            6            3            6          256           24          256           32           24            1            8            1            1           18           18            1
     35            6            3            6            6            3            6          256          192          256           32           32            1            8            1            1           18           18            1
     37            3            2            3            3            2            3          256           16          256           32           16            1            8            1            1            6            6            1
     39            3            2            3            3            2            3          256          128          256           32           32            1            8            1            1            6            6            1

Processing Frame Number : 0

Not belongs to this group!
inPtrs 0xccbad0
 Layer    1 : Out Q :      254 , TIDL_BatchNormLayer  , PASSED  #MMACs =     0.74,     0.74, Sparsity :   0.00
 Layer    2 : Out Q :     6011 , TIDL_ConvolutionLayer, PASSED  #MMACs =   147.46,    92.65, Sparsity :  37.17
 Layer    3 : Out Q :     6157 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    53.33, Sparsity :  62.33
 Layer    4 : Out Q :    11692 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    83.44, Sparsity :  70.53
 Layer    5 : Out Q :    10495 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    66.11, Sparsity :  53.30
 Layer    6 : Out Q :    13681 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    91.59, Sparsity :  67.65
 Layer    7 : Out Q :    16771 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    57.32, Sparsity :  59.51
 Layer    8 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.12,     0.12, Sparsity :   0.00
 Layer    9 : Out Q :    18587 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    96.27, Sparsity :  66.00
 Layer   10 : Out Q :    12886 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    52.28, Sparsity :  63.07
 Layer   11 : Out Q :    20462 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    76.31, Sparsity :  73.04
 Layer   12 : Out Q :     5854 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    31.40, Sparsity :  77.82
 Layer   13 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.03,     0.03, Sparsity :   0.00
 Layer   14 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
 Layer   15 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   16 : Out Q :     2609 , TIDL_ConvolutionLayer, PASSED  #MMACs =   125.83,   125.83, Sparsity :   0.00
 Layer   17 : Out Q :    11558 , TIDL_ConvolutionLayer, PASSED  #MMACs =    31.46,    31.46, Sparsity :   0.00
 Layer   18 : Out Q :     7859 , TIDL_ConvolutionLayer, PASSED  #MMACs =     7.86,     7.86, Sparsity :   0.00
 Layer   19 : Out Q :     9041 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.36,     2.36, Sparsity :   0.00
 Layer   20 : Out Q :     7197 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.79,     0.79, Sparsity :   0.00
 Layer   21 : Out Q :     1077 , TIDL_ConvolutionLayer, PASSED  #MMACs =    15.73,    15.73, Sparsity :   0.00
 Layer   22 :TIDL_FlattenLayer, PASSED  #MMACs =     0.06,     0.06, Sparsity :   0.00
 Layer   23 : Out Q :       77 , TIDL_ConvolutionLayer, PASSED  #MMACs =   110.10,   110.10, Sparsity :   0.00
 Layer   24 :TIDL_FlattenLayer, PASSED  #MMACs =     0.43,     0.43, Sparsity :   0.00
 Layer   25 : Out Q :     7738 , TIDL_ConvolutionLayer, PASSED  #MMACs =     1.47,     1.47, Sparsity :   0.00
 Layer   26 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
 Layer   27 : Out Q :      173 , TIDL_ConvolutionLayer, PASSED  #MMACs =    11.80,    11.80, Sparsity :   0.00
 Layer   28 :TIDL_FlattenLayer, PASSED  #MMACs =     0.04,     0.04, Sparsity :   0.00
 Layer   29 : Out Q :     5752 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.37,     0.37, Sparsity :   0.00
 Layer   30 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   31 : Out Q :      157 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.95,     2.95, Sparsity :   0.00
 Layer   32 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
 Layer   33 : Out Q :     3139 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
 Layer   34 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   35 : Out Q :      173 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.88,     0.88, Sparsity :   0.00
 Layer   36 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   37 : Out Q :     2333 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
 Layer   38 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   39 : Out Q :      156 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.20,     0.20, Sparsity :   0.00
 Layer   40 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
 Layer   41 : Out Q :     1081 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
 Layer   42 : Out Q :       76 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
 Layer   43 :
Target: number label value xmin  ymin  xmax  ymax
Target:  0.00 25.00  1.00  0.66  0.56  0.71  0.88
Target:  1.00  4.00  1.00  0.63  0.19  0.81  0.88
Target:  2.00 13.00  1.00  0.33  0.46  0.48  0.97
Target:  3.00 13.00  1.00  0.64  0.33  0.80  0.90
Target:  4.00 13.00  1.00  0.39  0.64  0.64  1.01
Target:  5.00 20.00  1.00  0.05  0.35  0.20  0.68
Target:  6.00 13.00  1.00  0.73  0.42  0.85  0.87
Target:  7.00 13.00  1.00  0.57  0.32  0.75  0.89
Target:  8.00 12.00  1.00  0.12  0.26  0.27  0.65
Target:  9.00  6.00  1.00  0.30  0.21  0.47  0.68
Target: 10.00  6.00  1.00  0.07  0.27  0.19  0.70
Target: 11.00 21.00  1.00  0.84  0.61  0.92  0.96
Target: 12.00  2.00  1.00  0.85  0.56  0.90  0.93
Target: 13.00 21.00  1.00  0.59  0.45  0.65  0.90
Target: 14.00 21.00  1.00  0.27  0.46  0.33  0.83
Target: 15.00 21.00  1.00  0.83  0.31  0.91  0.78
Target: 16.00 21.00  1.00  0.51  0.36  0.58  0.72
Target: 17.00 21.00  1.00  0.50  0.06  0.59  0.57
Target: 18.00 21.00  1.00  0.42  0.06  0.51  0.60
Target: 19.00 21.00  1.00  0.04  0.08  0.16  0.52
 #MMACs =     0.00,     0.00, Sparsity :   0.00
Not belongs to this group!
End of config list found !


  • Hi,

    You need to set the "layersGroupId" and "conv2dKernelType" parameters properly in the import config file to get optimal performance.

    Please refer to FAQ 21 and 22 in the TIDL user guide (TIDeepLearningLibrary_UserGuide.pdf) on how to set these parameters.

    Thanks,

    Praveen

  • Hi, Praveen:

    I have configured everything strictly according to the documentation (TIDL user guide).

    Here is my import configuration:

    # Default - 0
    randParams         = 0

    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType          = 0

    # 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
    quantizationStyle  = 1

    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd      = 25

    numParamBits       = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType      = 0

    inputNetFile       = "deploy.prototxt"
    inputParamsFile    = "voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel"
    outputNetFile      = "tidl_net_jdetNet_ssd.bin"
    outputParamsFile   = "tidl_param_jdetNet_ssd.bin"

    rawSampleInData = 1
    preProcType   = 4
    sampleInData = "trace_dump_0_768x320.y"
    tidlStatsTool = "eve_test_dl_algo.out.exe"
    layersGroupId = 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0
    conv2dKernelType = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

    Nothing has changed:

    [IPU1-0]     17.859696 s:  vx_tutorial_tidl: Tutorial Started !!!
    [IPU1-0]     17.876898 s:  Reading config file sd:test_data//tivx/tidl/tidl_infer.cfg ...
    [IPU1-0]     17.923229 s:  Reading network file sd:test_data//./tivx/tidl/tidl_net_jdetNet_ssd.bin
    [IPU1-0]  ...
    [IPU1-0]     18.722534 s:  Reading network params file sd:test_data//./tivx/tidl/tidl_param_jdetNet_ssd.bin
    [IPU1-0]  ...
    [IPU1-0]     22.747026 s:
    [IPU1-0]     22.747148 s: Thread #1: Create graph ...
    [IPU1-0]     22.748124 s: Thread #1: Create input and output tensors for node 1 ...
    [IPU1-0]     22.749100 s: Thread #1: Create node 1 ...
    [IPU1-0]     22.765418 s: Thread #1: Create output tensors for node 2 ...
    [IPU1-0]     22.766119 s: Thread #1: Create node 2 ...
    [IPU1-0]     22.782346 s:
    [IPU1-0]     22.782468 s: Thread #1: Verify graph ...
    [IPU1-0]     25.789361 s:
    [IPU1-0]     25.789483 s: Thread #1: Start graph ...
    [IPU1-0]     25.789635 s:
    [IPU1-0]     25.789757 s: Thread #1: Wait for graph ...
    [IPU1-0]     26.100347 s:
    [IPU1-0]     26.100439 s: Thread #1: Results
    [IPU1-0]     26.100561 s: ---------------------
    [IPU1-0]     26.100774 s:
    [IPU1-0]     26.101110 s: ObjId|label|score| xmin| ymin| xmax| ymax|
    [IPU1-0]     26.101293 s: ------------------------------------------
    [IPU1-0]     26.101537 s:     0|   14| 1.00| 0.64| 0.33| 0.80| 0.90|
    [IPU1-0]     26.101781 s:     1|   20| 1.00| 0.44| 0.46| 0.65| 0.90|
    [IPU1-0]     26.102025 s:     2|   20| 1.00| 0.05| 0.35| 0.20| 0.68|
    [IPU1-0]     26.102269 s:     3|   11| 1.00| 0.73| 0.42| 0.85| 0.87|
    [IPU1-0]     26.102482 s:     4|    2| 1.00| 0.55| 0.16| 0.78| 0.91|
    [IPU1-0]     26.102696 s:     5|    2| 1.00|-0.25| 0.03| 0.18| 0.59|
    [IPU1-0]     26.102940 s:     6|   12| 1.00| 0.12| 0.26| 0.27| 0.65|
    [IPU1-0]     26.103153 s:     7|   16| 1.00|-0.01| 0.14| 0.26| 0.59|
    [IPU1-0]     26.103397 s:     8|   16| 1.00| 0.35| 0.18| 0.57| 0.61|
    [IPU1-0]     26.103611 s:     9|   21| 1.00| 0.84| 0.61| 0.92| 0.96|
    [IPU1-0]     26.103824 s:    10|    2| 1.00| 0.85| 0.56| 0.90| 0.93|
    [IPU1-0]     26.104068 s:    11|   21| 1.00| 0.27| 0.46| 0.33| 0.83|
    [IPU1-0]     26.104312 s:    12|   21| 1.00| 0.51| 0.36| 0.58| 0.72|
    [IPU1-0]     26.104526 s:    13|   15| 1.00| 0.16| 0.25| 0.25| 0.70|
    [IPU1-0]     26.104739 s:    14|   21| 1.00| 0.01| 0.07| 0.12| 0.50|
    [IPU1-0]     26.104983 s:    15|   21| 1.00| 0.84|-0.06| 0.91| 0.41|
    [IPU1-0]     26.105197 s:    16|   21| 1.00| 0.59|-0.08| 0.66| 0.39|
    [IPU1-0]     26.105441 s:    17|   15| 1.00| 0.85| 0.59| 0.95| 0.83|
    [IPU1-0]     26.105654 s:    18|   26| 1.00| 0.10| 0.58| 0.19| 0.74|
    [IPU1-0]     26.105868 s:    19|   26| 1.00| 0.76| 0.45| 0.97| 0.66|
    [IPU1-0]     26.106112 s:
    [IPU1-0]     26.106234 s: Number of detected objects: 20
    [IPU1-0]     26.106295 s:
    [IPU1-0]     26.106356 s:
    [IPU1-0]     26.106569 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 138.745000 ms
    [IPU1-0]     26.106813 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 171.417000 ms
    [IPU1-0]     26.107088 s: ---- Thread #1: Total Graph Execution time: 310.673000 ms
    [IPU1-0]     26.107210 s:
    [IPU1-0]     26.107393 s: Execution time of all the threads running in parallel: 310.966000 ms
    [IPU1-0]     26.115415 s:
    [IPU1-0]     26.115567 s:  vx_tutorial_tidl: Tutorial Done !!!

    2. The last-layer output files from both the import and infer tools (trace_dump_49_560x1.y_float.txt) are all like this. Why?

    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -inf
    inf
    -nan(ind)
    -inf
    inf
    inf
    -inf
    inf
    inf
    inf
    -inf
    -inf
    inf
    inf
    -inf
    inf
    inf
    inf
    -inf
    inf
    inf
    inf
    -nan(ind)
    -nan(ind)
    -inf
    inf
    -nan(ind)
    -nan(ind)
    -inf
    inf
    -nan(ind)
    -inf
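
    To quantify how bad the dump is, I count the NaN/Inf entries with a small script (a minimal sketch; `count_bad_values` is my own helper, and the "(ind)" normalization assumes the MSVC-style "-nan(ind)" tokens shown above):

    ```python
    import math

    def count_bad_values(lines):
        """Count NaN, Inf, and finite entries in a float trace dump."""
        nan = inf = finite = 0
        for raw in lines:
            # Normalize MSVC-style tokens such as '-nan(ind)' to '-nan'
            tok = raw.strip().replace("(ind)", "")
            if not tok:
                continue
            try:
                v = float(tok)
            except ValueError:
                continue  # skip anything that is not a number
            if math.isnan(v):
                nan += 1
            elif math.isinf(v):
                inf += 1
            else:
                finite += 1
        return nan, inf, finite

    # Usage: scan the dump produced by the import/infer tools
    # with open("trace_dump_49_560x1.y_float.txt") as f:
    #     print(count_bad_values(f))
    ```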

  • Hi, Praveen:

    I did a test.

    I used the same import config file to convert both my trained Caffe model and the voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel from caffe-jacinto-models/trained/object_detection/voc0712.

    My model's result (num_classes = 28):

    [IPU1-0]     26.106569 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 138.745000 ms
    [IPU1-0]     26.106813 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 171.417000 ms
    [IPU1-0]     26.107088 s: ---- Thread #1: Total Graph Execution time: 310.673000 ms

    The voc0712 result (num_classes = 21):

    [IPU1-0]     24.803422 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 121.974000 ms
    [IPU1-0]     24.803666 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 10.697000 ms
    [IPU1-0]     24.803940 s: ---- Thread #1: Total Graph Execution time: 133.063000 ms

    Is something wrong with my training?

  • Hi,

    It could be that the last layer, which runs on the DSP, takes more time to execute with your model because it has 27 classes; I think the original object_detection model was trained for around 6 classes. To confirm, remove the last layer in the Caffe prototxt, import again, and re-run. If the execution time comes down, you will know it is the last layer that is causing the higher execution time.

    regards,

    Victor

  • Hi , Victor:

    The object detection network I use is ssdJacintoNetV2, and its last layer is the DetectionOutput layer. According to TI's documentation, this layer must run on the DSP, and I have verified that my model runs on the DSP much longer than the normal 21-class VOC0712 model.

    I think it is because of the poor recognition rate of my trained model. Here are some logs from my training: the loss is very high and the mAP is very low.
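
    Since the DetectionOutput layer's DSP run time grows with the number of candidate boxes that pass confidence_threshold, a poorly trained model that emits many low-confidence boxes could make this layer slow. For reference, the relevant block in my deploy.prototxt looks roughly like this (keep_top_k and confidence_threshold are the values I mentioned; the remaining fields are illustrative standard Caffe SSD parameters, not copied from my file):

    ```
    layer {
      name: "detection_out"
      type: "DetectionOutput"
      detection_output_param {
        num_classes: 28            # background + 27 classes
        share_location: true
        nms_param {
          nms_threshold: 0.45
          top_k: 400               # boxes entering NMS; bounds the DSP workload
        }
        keep_top_k: 20
        confidence_threshold: 0.15 # raising this prunes low-confidence boxes earlier
      }
    }
    ```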

    I1129 15:26:39.556344 21146 solver.cpp:283] [MultiGPU] Tests completed in 183.588s
    I1129 15:26:40.249297 21146 solver.cpp:352] Iteration 118000 (0.544697 iter/s, 183.588s/100 iter), 628.8/639.5ep, loss = 4.6682
    I1129 15:26:40.249347 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.66864 (* 1 = 4.66864 loss)
    I1129 15:26:40.249366 21146 sgd_solver.cpp:172] Iteration 118000, lr = 7.71602e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:28:39.954125 21146 solver.cpp:352] Iteration 118100 (0.83542 iter/s, 119.7s/100 iter), 629.3/639.5ep, loss = 4.61505
    I1129 15:28:39.954372 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.41359 (* 1 = 4.41359 loss)
    I1129 15:28:39.954386 21146 sgd_solver.cpp:172] Iteration 118100, lr = 6.28475e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:30:31.769484 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:30:45.070871 21146 solver.cpp:352] Iteration 118200 (0.799283 iter/s, 125.112s/100 iter), 629.9/639.5ep, loss = 4.78497
    I1129 15:30:45.071470 21146 solver.cpp:376]     Train net output #0: mbox_loss = 5.40193 (* 1 = 5.40193 loss)
    I1129 15:30:45.071489 21146 sgd_solver.cpp:172] Iteration 118200, lr = 5.06248e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:32:51.986060 21146 solver.cpp:352] Iteration 118300 (0.787957 iter/s, 126.91s/100 iter), 630.4/639.5ep, loss = 4.61206
    I1129 15:32:51.986279 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.48147 (* 1 = 4.48147 loss)
    I1129 15:32:51.986292 21146 sgd_solver.cpp:172] Iteration 118300, lr = 4.02781e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:34:31.422524 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:34:57.130116 21146 solver.cpp:352] Iteration 118400 (0.799109 iter/s, 125.139s/100 iter), 630.9/639.5ep, loss = 4.77914
    I1129 15:34:57.130172 21146 solver.cpp:376]     Train net output #0: mbox_loss = 5.37836 (* 1 = 5.37836 loss)
    I1129 15:34:57.130185 21146 sgd_solver.cpp:172] Iteration 118400, lr = 3.16048e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:37:03.126978 21146 solver.cpp:352] Iteration 118500 (0.7937 iter/s, 125.992s/100 iter), 631.5/639.5ep, loss = 4.79248
    I1129 15:37:03.128094 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.79322 (* 1 = 4.79322 loss)
    I1129 15:37:03.129670 21146 sgd_solver.cpp:172] Iteration 118500, lr = 2.4414e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:38:26.783601 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:39:08.512965 21146 solver.cpp:352] Iteration 118600 (0.797567 iter/s, 125.381s/100 iter), 632/639.5ep, loss = 4.57885
    I1129 15:39:08.513186 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.56001 (* 1 = 4.56001 loss)
    I1129 15:39:08.513204 21146 sgd_solver.cpp:172] Iteration 118600, lr = 1.85262e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:41:12.488221 21146 solver.cpp:352] Iteration 118700 (0.806651 iter/s, 123.969s/100 iter), 632.5/639.5ep, loss = 4.71439
    I1129 15:41:12.488400 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.88418 (* 1 = 4.88418 loss)
    I1129 15:41:12.488420 21146 sgd_solver.cpp:172] Iteration 118700, lr = 1.37736e-11, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:42:25.945415 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:43:23.064604 21146 solver.cpp:352] Iteration 118800 (0.765872 iter/s, 130.57s/100 iter), 633.1/639.5ep, loss = 4.65215
    I1129 15:43:23.065228 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.49348 (* 1 = 4.49348 loss)
    I1129 15:43:23.065248 21146 sgd_solver.cpp:172] Iteration 118800, lr = 9.99996e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:45:29.084137 21146 solver.cpp:352] Iteration 118900 (0.793564 iter/s, 126.014s/100 iter), 633.6/639.5ep, loss = 4.47607
    I1129 15:45:29.084333 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.68292 (* 1 = 4.68292 loss)
    I1129 15:45:29.084347 21146 sgd_solver.cpp:172] Iteration 118900, lr = 7.06064e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:46:21.534723 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:47:35.754905 21146 solver.cpp:352] Iteration 119000 (0.789483 iter/s, 126.665s/100 iter), 634.1/639.5ep, loss = 4.76812
    I1129 15:47:35.755177 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.87465 (* 1 = 4.87465 loss)
    I1129 15:47:35.755195 21146 sgd_solver.cpp:172] Iteration 119000, lr = 4.82251e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:49:39.172040 21146 solver.cpp:352] Iteration 119100 (0.810295 iter/s, 123.412s/100 iter), 634.7/639.5ep, loss = 4.65164
    I1129 15:49:39.172271 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.70018 (* 1 = 4.70018 loss)
    I1129 15:49:39.172288 21146 sgd_solver.cpp:172] Iteration 119100, lr = 3.16405e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:50:18.669898 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:51:42.933648 21146 solver.cpp:352] Iteration 119200 (0.808039 iter/s, 123.756s/100 iter), 635.2/639.5ep, loss = 4.6568
    I1129 15:51:42.933894 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.60532 (* 1 = 4.60532 loss)
    I1129 15:51:42.933908 21146 sgd_solver.cpp:172] Iteration 119200, lr = 1.9753e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:53:47.310586 21146 solver.cpp:352] Iteration 119300 (0.804041 iter/s, 124.372s/100 iter), 635.7/639.5ep, loss = 4.63884
    I1129 15:53:47.311041 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.90643 (* 1 = 4.90643 loss)
    I1129 15:53:47.311125 21146 sgd_solver.cpp:172] Iteration 119300, lr = 1.15789e-12, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:54:16.721161 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 15:55:54.929405 21146 solver.cpp:352] Iteration 119400 (0.783615 iter/s, 127.614s/100 iter), 636.3/639.5ep, loss = 4.75873
    I1129 15:55:54.929741 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.94623 (* 1 = 4.94623 loss)
    I1129 15:55:54.929759 21146 sgd_solver.cpp:172] Iteration 119400, lr = 6.24998e-13, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:57:56.920142 21146 solver.cpp:352] Iteration 119500 (0.819767 iter/s, 121.986s/100 iter), 636.8/639.5ep, loss = 4.6531
    I1129 15:57:56.920322 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.29456 (* 1 = 4.29456 loss)
    I1129 15:57:56.920343 21146 sgd_solver.cpp:172] Iteration 119500, lr = 3.01407e-13, m = 0.9, wd = 0.0005, gs = 1
    I1129 15:58:03.032177 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 16:00:08.557597 21146 solver.cpp:352] Iteration 119600 (0.759692 iter/s, 131.632s/100 iter), 637.3/639.5ep, loss = 4.74823
    I1129 16:00:08.557796 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.68489 (* 1 = 4.68489 loss)
    I1129 16:00:08.557811 21146 sgd_solver.cpp:172] Iteration 119600, lr = 1.23456e-13, m = 0.9, wd = 0.0005, gs = 1
    I1129 16:02:02.819576 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 16:02:13.384778 21146 solver.cpp:352] Iteration 119700 (0.801139 iter/s, 124.822s/100 iter), 637.9/639.5ep, loss = 4.59903
    I1129 16:02:13.384896 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.44052 (* 1 = 4.44052 loss)
    I1129 16:02:13.384927 21146 sgd_solver.cpp:172] Iteration 119700, lr = 3.90624e-14, m = 0.9, wd = 0.0005, gs = 1
    I1129 16:04:20.210744 21146 solver.cpp:352] Iteration 119800 (0.788513 iter/s, 126.821s/100 iter), 638.4/639.5ep, loss = 4.68713
    I1129 16:04:20.210966 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.71188 (* 1 = 4.71188 loss)
    I1129 16:04:20.210980 21146 sgd_solver.cpp:172] Iteration 119800, lr = 7.71602e-15, m = 0.9, wd = 0.0005, gs = 1
    I1129 16:06:04.340409 21089 data_reader.cpp:320] Restarting data pre-fetching
    I1129 16:06:27.648304 21146 solver.cpp:352] Iteration 119900 (0.784729 iter/s, 127.433s/100 iter), 638.9/639.5ep, loss = 4.7759
    I1129 16:06:27.648358 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.87287 (* 1 = 4.87287 loss)
    I1129 16:06:27.648373 21146 sgd_solver.cpp:172] Iteration 119900, lr = 4.82251e-16, m = 0.9, wd = 0.0005, gs = 1
    I1129 16:08:30.773595 21146 solver.cpp:352] Iteration 119999 (0.80409 iter/s, 123.12s/99 iter), 639.5/639.5ep, loss = 4.80602
    I1129 16:08:30.773825 21146 solver.cpp:376]     Train net output #0: mbox_loss = 4.67276 (* 1 = 4.67276 loss)
    I1129 16:08:30.773844 21146 solver.cpp:905] Snapshotting to binary proto file training/ti-custom-cfg1/JDetNet/20191127_21-01_ds_PSP_dsFac_32_hdDS8_1/initial/ti-custom-cfg1_ssdJacintoNetV2_iter_120000.caffemodel
    I1129 16:08:30.805562 21146 sgd_solver.cpp:398] Snapshotting solver state to binary proto file training/ti-custom-cfg1/JDetNet/20191127_21-01_ds_PSP_dsFac_32_hdDS8_1/initial/ti-custom-cfg1_ssdJacintoNetV2_iter_120000.solverstate
    I1129 16:08:31.286188 21146 solver.cpp:501] Iteration 120000, loss = 4.80152
    I1129 16:08:31.286237 21146 solver.cpp:635] Iteration 120000, Testing net (#0)
    I1129 16:09:23.452903 21147 solver.cpp:747] class AP 1: 0.381347
    I1129 16:09:23.454003 21147 solver.cpp:747] class AP 2: 0.630648
    I1129 16:09:23.466156 21147 solver.cpp:747] class AP 3: 0.709659
    I1129 16:09:23.466188 21147 solver.cpp:747] class AP 4: 0
    I1129 16:09:23.466275 21147 solver.cpp:747] class AP 5: 0.377345
    I1129 16:09:23.466521 21147 solver.cpp:747] class AP 6: 0.386261
    I1129 16:09:23.466544 21147 solver.cpp:747] class AP 7: 0.181818
    I1129 16:09:23.466604 21147 solver.cpp:747] class AP 8: 0.432078
    I1129 16:09:23.467139 21147 solver.cpp:747] class AP 9: 0.437407
    W1129 16:09:23.467187 21147 solver.cpp:731] Missing true_pos for label: 10
    I1129 16:09:23.467473 21147 solver.cpp:747] class AP 11: 0.560271
    I1129 16:09:23.469290 21147 solver.cpp:747] class AP 12: 0.524232
    I1129 16:09:23.539141 21147 solver.cpp:747] class AP 13: 0.219965
    I1129 16:09:23.539696 21147 solver.cpp:747] class AP 14: 0.386813
    I1129 16:09:23.539714 21147 solver.cpp:747] class AP 15: 0.121212
    W1129 16:09:23.539732 21147 solver.cpp:731] Missing true_pos for label: 16
    I1129 16:09:23.539887 21147 solver.cpp:747] class AP 17: 0.524951
    I1129 16:09:23.539917 21147 solver.cpp:747] class AP 18: 0.101843
    I1129 16:09:23.539973 21147 solver.cpp:747] class AP 19: 0.312211
    I1129 16:09:23.540062 21147 solver.cpp:747] class AP 20: 0.348366
    I1129 16:09:23.540089 21147 solver.cpp:747] class AP 21: 0.292424
    I1129 16:09:23.540127 21147 solver.cpp:747] class AP 22: 0.589608
    W1129 16:09:23.540133 21147 solver.cpp:731] Missing true_pos for label: 23
    W1129 16:09:23.540163 21147 solver.cpp:731] Missing true_pos for label: 24
    I1129 16:09:23.540182 21147 solver.cpp:747] class AP 25: 0
    I1129 16:09:23.540191 21147 solver.cpp:747] class AP 26: 0.0909091
    W1129 16:09:23.540197 21147 solver.cpp:731] Missing true_pos for label: 27
    I1129 16:09:23.540210 21147 solver.cpp:753] Test net output mAP #0: detection_eval = 0.281828
    I1129 16:09:25.107331 21149 solver.cpp:747] class AP 1: 0.36167
    I1129 16:09:25.108279 21149 solver.cpp:747] class AP 2: 0.619743
    I1129 16:09:25.121924 21149 solver.cpp:747] class AP 3: 0.710708
    I1129 16:09:25.121961 21149 solver.cpp:747] class AP 4: 0
    I1129 16:09:25.122045 21149 solver.cpp:747] class AP 5: 0.351564
    I1129 16:09:25.122313 21149 solver.cpp:747] class AP 6: 0.387215
    I1129 16:09:25.122339 21149 solver.cpp:747] class AP 7: 0.272727
    I1129 16:09:25.122416 21149 solver.cpp:747] class AP 8: 0.472996
    I1129 16:09:25.122936 21149 solver.cpp:747] class AP 9: 0.429806
    W1129 16:09:25.122947 21149 solver.cpp:731] Missing true_pos for label: 10
    I1129 16:09:25.123345 21149 solver.cpp:747] class AP 11: 0.611098
    I1129 16:09:25.125095 21149 solver.cpp:747] class AP 12: 0.536305
    I1129 16:09:25.199580 21149 solver.cpp:747] class AP 13: 0.210039
    I1129 16:09:25.200213 21149 solver.cpp:747] class AP 14: 0.405853
    I1129 16:09:25.200245 21149 solver.cpp:747] class AP 15: 0.0844156
    W1129 16:09:25.200251 21149 solver.cpp:731] Missing true_pos for label: 16
    I1129 16:09:25.200403 21149 solver.cpp:747] class AP 17: 0.555228
    I1129 16:09:25.200433 21149 solver.cpp:747] class AP 18: 0.153977
    I1129 16:09:25.200489 21149 solver.cpp:747] class AP 19: 0.170937
    I1129 16:09:25.200584 21149 solver.cpp:747] class AP 20: 0.426276
    I1129 16:09:25.200613 21149 solver.cpp:747] class AP 21: 0.454762
    I1129 16:09:25.200640 21149 solver.cpp:747] class AP 22: 0.426132
    W1129 16:09:25.200649 21149 solver.cpp:731] Missing true_pos for label: 23
    W1129 16:09:25.200664 21149 solver.cpp:731] Missing true_pos for label: 24
    I1129 16:09:25.200680 21149 solver.cpp:747] class AP 25: 0.181818
    I1129 16:09:25.200696 21149 solver.cpp:747] class AP 26: 0.0909091
    W1129 16:09:25.200702 21149 solver.cpp:731] Missing true_pos for label: 27
    I1129 16:09:25.200716 21149 solver.cpp:753] Test net output mAP #0: detection_eval = 0.293118
    I1129 16:09:25.520231 21148 solver.cpp:747] class AP 1: 0.382491
    I1129 16:09:25.520834 21148 solver.cpp:747] class AP 2: 0.636459
    I1129 16:09:25.530647 21148 solver.cpp:747] class AP 3: 0.706751
    W1129 16:09:25.530660 21148 solver.cpp:731] Missing true_pos for label: 4
    I1129 16:09:25.530786 21148 solver.cpp:747] class AP 5: 0.452379
    I1129 16:09:25.530957 21148 solver.cpp:747] class AP 6: 0.394808
    I1129 16:09:25.531002 21148 solver.cpp:747] class AP 7: 0.245455
    I1129 16:09:25.531040 21148 solver.cpp:747] class AP 8: 0.402182
    I1129 16:09:25.531417 21148 solver.cpp:747] class AP 9: 0.425494
    W1129 16:09:25.531425 21148 solver.cpp:731] Missing true_pos for label: 10
    I1129 16:09:25.531558 21148 solver.cpp:747] class AP 11: 0.59094
    I1129 16:09:25.532706 21148 solver.cpp:747] class AP 12: 0.546023
    I1129 16:09:25.580085 21148 solver.cpp:747] class AP 13: 0.242165
    I1129 16:09:25.580623 21148 solver.cpp:747] class AP 14: 0.393377
    I1129 16:09:25.580653 21148 solver.cpp:747] class AP 15: 0.131313
    W1129 16:09:25.580658 21148 solver.cpp:731] Missing true_pos for label: 16
    I1129 16:09:25.580790 21148 solver.cpp:747] class AP 17: 0.484417
    I1129 16:09:25.580813 21148 solver.cpp:747] class AP 18: 0.115252
    I1129 16:09:25.580860 21148 solver.cpp:747] class AP 19: 0.157315
    I1129 16:09:25.580930 21148 solver.cpp:747] class AP 20: 0.426454
    I1129 16:09:25.580956 21148 solver.cpp:747] class AP 21: 0.409479
    I1129 16:09:25.580981 21148 solver.cpp:747] class AP 22: 0.346734
    W1129 16:09:25.580987 21148 solver.cpp:731] Missing true_pos for label: 23
    W1129 16:09:25.581017 21148 solver.cpp:731] Missing true_pos for label: 24
    I1129 16:09:25.581029 21148 solver.cpp:747] class AP 25: 0.257576
    W1129 16:09:25.581034 21148 solver.cpp:731] Missing true_pos for label: 26
    W1129 16:09:25.581043 21148 solver.cpp:731] Missing true_pos for label: 27
    I1129 16:09:25.581073 21148 solver.cpp:753] Test net output mAP #0: detection_eval = 0.286928
    I1129 16:09:28.583726 21142 data_reader.cpp:320] Restarting data pre-fetching
    I1129 16:09:29.957794 21146 solver.cpp:747] class AP 1: 0.397464
    I1129 16:09:29.958667 21146 solver.cpp:747] class AP 2: 0.634234
    I1129 16:09:29.970098 21146 solver.cpp:747] class AP 3: 0.717162
    I1129 16:09:29.970118 21146 solver.cpp:747] class AP 4: 0
    I1129 16:09:29.970191 21146 solver.cpp:747] class AP 5: 0.413864
    I1129 16:09:29.970414 21146 solver.cpp:747] class AP 6: 0.389498
    I1129 16:09:29.970432 21146 solver.cpp:747] class AP 7: 0.181818
    I1129 16:09:29.970496 21146 solver.cpp:747] class AP 8: 0.409622
    I1129 16:09:29.970965 21146 solver.cpp:747] class AP 9: 0.455078
    W1129 16:09:29.970976 21146 solver.cpp:731] Missing true_pos for label: 10
    I1129 16:09:29.971295 21146 solver.cpp:747] class AP 11: 0.582943
    I1129 16:09:29.972836 21146 solver.cpp:747] class AP 12: 0.517631
    I1129 16:09:30.036936 21146 solver.cpp:747] class AP 13: 0.226908
    I1129 16:09:30.037420 21146 solver.cpp:747] class AP 14: 0.395536
    I1129 16:09:30.037436 21146 solver.cpp:747] class AP 15: 0.010101
    W1129 16:09:30.037441 21146 solver.cpp:731] Missing true_pos for label: 16
    I1129 16:09:30.037559 21146 solver.cpp:747] class AP 17: 0.525359
    I1129 16:09:30.037585 21146 solver.cpp:747] class AP 18: 0.0685477
    I1129 16:09:30.037633 21146 solver.cpp:747] class AP 19: 0.23786
    I1129 16:09:30.037719 21146 solver.cpp:747] class AP 20: 0.435295
    I1129 16:09:30.037742 21146 solver.cpp:747] class AP 21: 0.3342
    I1129 16:09:30.037772 21146 solver.cpp:747] class AP 22: 0.380002
    W1129 16:09:30.037777 21146 solver.cpp:731] Missing true_pos for label: 23
    I1129 16:09:30.037793 21146 solver.cpp:747] class AP 24: 0
    I1129 16:09:30.037806 21146 solver.cpp:747] class AP 25: 0.121212
    I1129 16:09:30.037814 21146 solver.cpp:747] class AP 26: 0
    I1129 16:09:30.037820 21146 solver.cpp:747] class AP 27: 0
    I1129 16:09:30.037832 21146 solver.cpp:753] Test net output mAP #0: detection_eval = 0.275346
    I1129 16:09:30.038211 21031 parallel.cpp:67] Root Solver performance on device 0: 0.7738 * 32 = 24.76 img/sec (120000 itr in 1.551e+05 sec)
    I1129 16:09:30.038249 21031 parallel.cpp:72]      Solver performance on device 1: 0.7738 * 32 = 24.76 img/sec (120000 itr in 1.551e+05 sec)
    I1129 16:09:30.038262 21031 parallel.cpp:72]      Solver performance on device 2: 0.7738 * 32 = 24.76 img/sec (120000 itr in 1.551e+05 sec)
    I1129 16:09:30.038273 21031 parallel.cpp:72]      Solver performance on device 3: 0.7738 * 32 = 24.76 img/sec (120000 itr in 1.551e+05 sec)
    I1129 16:09:30.038278 21031 parallel.cpp:75] Overall multi-GPU performance: 99.0437 img/sec
    I1129 16:09:33.562302 21031 caffe.cpp:271] Optimization Done in 43h 7m 52s

  • Hi,

    Yes, it could be because of the poor recognition rate of your model. Please improve your model and then try again.

    Thanks,

    Praveen

  • Hi,

    I have improved my model; the output mAP now reaches 0.705838. But the DSP still runs a long time!

    There is also no "Kernel Size not matching 65536 !!Setting RAND Kernel Params for Layer ctx_output1" error any more.

    I1211 10:37:33.404964  8344 solver.cpp:352] Iteration 66100 (0.825696 iter/s, 121.11s/100 iter), 88.1/159.9ep, loss = 3.53407
    I1211 10:37:33.405225  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.93163 (* 1 = 3.93163 loss)
    I1211 10:37:33.405259  8344 sgd_solver.cpp:172] Iteration 66100, lr = 4.07033e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:39:37.823043  8344 solver.cpp:352] Iteration 66200 (0.803772 iter/s, 124.413s/100 iter), 88.2/159.9ep, loss = 3.7464
    I1211 10:39:37.823238  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.84559 (* 1 = 3.84559 loss)
    I1211 10:39:37.823251  8344 sgd_solver.cpp:172] Iteration 66200, lr = 4.04021e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:41:46.432399  8344 solver.cpp:352] Iteration 66300 (0.777577 iter/s, 128.605s/100 iter), 88.3/159.9ep, loss = 3.69812
    I1211 10:41:46.432647  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.22606 (* 1 = 4.22606 loss)
    I1211 10:41:46.432660  8344 sgd_solver.cpp:172] Iteration 66300, lr = 4.01026e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:43:54.920900  8344 solver.cpp:352] Iteration 66400 (0.778308 iter/s, 128.484s/100 iter), 88.5/159.9ep, loss = 3.77567
    I1211 10:43:54.921126  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.52246 (* 1 = 3.52246 loss)
    I1211 10:43:54.921144  8344 sgd_solver.cpp:172] Iteration 66400, lr = 3.98047e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:46:02.121598  8344 solver.cpp:352] Iteration 66500 (0.786191 iter/s, 127.196s/100 iter), 88.6/159.9ep, loss = 3.58005
    I1211 10:46:02.121822  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.27991 (* 1 = 3.27991 loss)
    I1211 10:46:02.121839  8344 sgd_solver.cpp:172] Iteration 66500, lr = 3.95085e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:48:12.601910  8344 solver.cpp:352] Iteration 66600 (0.766432 iter/s, 130.475s/100 iter), 88.7/159.9ep, loss = 3.63979
    I1211 10:48:12.602138  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.77363 (* 1 = 4.77363 loss)
    I1211 10:48:12.602154  8344 sgd_solver.cpp:172] Iteration 66600, lr = 3.92139e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:50:18.932494  8344 solver.cpp:352] Iteration 66700 (0.791606 iter/s, 126.325s/100 iter), 88.9/159.9ep, loss = 3.80512
    I1211 10:50:18.932755  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.06938 (* 1 = 4.06938 loss)
    I1211 10:50:18.932785  8344 sgd_solver.cpp:172] Iteration 66700, lr = 3.8921e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:52:14.792783  8392 data_reader.cpp:320] Restarting data pre-fetching
    I1211 10:52:17.909117  8344 solver.cpp:352] Iteration 66800 (0.840535 iter/s, 118.972s/100 iter), 89/159.9ep, loss = 3.63556
    I1211 10:52:17.909184  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.51179 (* 1 = 3.51179 loss)
    I1211 10:52:17.909214  8344 sgd_solver.cpp:172] Iteration 66800, lr = 3.86297e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:54:23.116819  8344 solver.cpp:352] Iteration 66900 (0.798705 iter/s, 125.203s/100 iter), 89.1/159.9ep, loss = 3.82072
    I1211 10:54:23.117004  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.37434 (* 1 = 3.37434 loss)
    I1211 10:54:23.117022  8344 sgd_solver.cpp:172] Iteration 66900, lr = 3.83401e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:56:28.875239  8344 solver.cpp:352] Iteration 67000 (0.795207 iter/s, 125.753s/100 iter), 89.3/159.9ep, loss = 3.61223
    I1211 10:56:28.875525  8344 solver.cpp:376]     Train net output #0: mbox_loss = 2.36557 (* 1 = 2.36557 loss)
    I1211 10:56:28.875541  8344 sgd_solver.cpp:172] Iteration 67000, lr = 3.80521e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 10:58:37.363348  8344 solver.cpp:352] Iteration 67100 (0.778313 iter/s, 128.483s/100 iter), 89.4/159.9ep, loss = 3.4078
    I1211 10:58:37.363605  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.66195 (* 1 = 3.66195 loss)
    I1211 10:58:37.363623  8344 sgd_solver.cpp:172] Iteration 67100, lr = 3.77657e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:00:44.127131  8344 solver.cpp:352] Iteration 67200 (0.788899 iter/s, 126.759s/100 iter), 89.5/159.9ep, loss = 3.76676
    I1211 11:00:44.127331  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.32008 (* 1 = 4.32008 loss)
    I1211 11:00:44.127346  8344 sgd_solver.cpp:172] Iteration 67200, lr = 3.7481e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:02:46.281440  8344 solver.cpp:352] Iteration 67300 (0.818668 iter/s, 122.15s/100 iter), 89.7/159.9ep, loss = 3.8021
    I1211 11:02:46.281684  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.71104 (* 1 = 3.71104 loss)
    I1211 11:02:46.281698  8344 sgd_solver.cpp:172] Iteration 67300, lr = 3.71978e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:04:48.982823  8344 solver.cpp:352] Iteration 67400 (0.815018 iter/s, 122.697s/100 iter), 89.8/159.9ep, loss = 3.84141
    I1211 11:04:48.983158  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.84792 (* 1 = 4.84792 loss)
    I1211 11:04:48.983189  8344 sgd_solver.cpp:172] Iteration 67400, lr = 3.69163e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:07:02.976258  8344 solver.cpp:352] Iteration 67500 (0.746334 iter/s, 133.988s/100 iter), 89.9/159.9ep, loss = 3.62158
    I1211 11:07:02.976516  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.0299 (* 1 = 3.0299 loss)
    I1211 11:07:02.976534  8344 sgd_solver.cpp:172] Iteration 67500, lr = 3.66364e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:08:00.196483  8392 data_reader.cpp:320] Restarting data pre-fetching
    I1211 11:09:01.596217  8344 solver.cpp:352] Iteration 67600 (0.84306 iter/s, 118.615s/100 iter), 90.1/159.9ep, loss = 3.79511
    I1211 11:09:01.596388  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.16419 (* 1 = 3.16419 loss)
    I1211 11:09:01.596402  8344 sgd_solver.cpp:172] Iteration 67600, lr = 3.6358e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:10:58.724314  8344 solver.cpp:352] Iteration 67700 (0.853798 iter/s, 117.124s/100 iter), 90.2/159.9ep, loss = 3.84323
    I1211 11:10:58.724568  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.3396 (* 1 = 3.3396 loss)
    I1211 11:10:58.724596  8344 sgd_solver.cpp:172] Iteration 67700, lr = 3.60813e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:12:56.148473  8344 solver.cpp:352] Iteration 67800 (0.851646 iter/s, 117.42s/100 iter), 90.3/159.9ep, loss = 3.73468
    I1211 11:12:56.148681  8344 solver.cpp:376]     Train net output #0: mbox_loss = 4.53421 (* 1 = 4.53421 loss)
    I1211 11:12:56.148695  8344 sgd_solver.cpp:172] Iteration 67800, lr = 3.58061e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:14:56.689833  8344 solver.cpp:352] Iteration 67900 (0.829622 iter/s, 120.537s/100 iter), 90.5/159.9ep, loss = 3.86499
    I1211 11:14:56.690120  8344 solver.cpp:376]     Train net output #0: mbox_loss = 3.459 (* 1 = 3.459 loss)
    I1211 11:14:56.690150  8344 sgd_solver.cpp:172] Iteration 67900, lr = 3.55325e-05, m = 0.9, wd = 1e-05, gs = 1
    I1211 11:16:56.353896  8344 solver.cpp:905] Snapshotting to binary proto file training/voc0712/JDetNet/20191207_16-28_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_68000.caffemodel
    I1211 11:16:56.382869  8344 sgd_solver.cpp:398] Snapshotting solver state to binary proto file training/voc0712/JDetNet/20191207_16-28_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_68000.solverstate
    I1211 11:16:56.393767  8344 solver.cpp:635] Iteration 68000, Testing net (#0)
    I1211 11:20:52.639706  8450 data_reader.cpp:320] Restarting data pre-fetching
    I1211 11:20:57.121232  8344 solver.cpp:747] class AP 1: 0.725286
    I1211 11:20:57.127454  8344 solver.cpp:747] class AP 2: 0.822513
    I1211 11:20:57.189450  8344 solver.cpp:747] class AP 3: 0.835595
    I1211 11:20:57.189630  8344 solver.cpp:747] class AP 4: 0.546982
    I1211 11:20:57.190145  8344 solver.cpp:747] class AP 5: 0.770749
    I1211 11:20:57.193840  8344 solver.cpp:747] class AP 6: 0.692767
    I1211 11:20:57.194069  8344 solver.cpp:747] class AP 7: 0.689982
    I1211 11:20:57.194425  8344 solver.cpp:747] class AP 8: 0.769123
    I1211 11:20:57.201663  8344 solver.cpp:747] class AP 9: 0.732153
    I1211 11:20:57.201771  8344 solver.cpp:747] class AP 10: 0.463603
    I1211 11:20:57.202775  8344 solver.cpp:747] class AP 11: 0.803503
    I1211 11:20:57.225550  8344 solver.cpp:747] class AP 12: 0.740474
    I1211 11:20:57.424868  8344 solver.cpp:747] class AP 13: 0.474934
    I1211 11:20:57.432471  8344 solver.cpp:747] class AP 14: 0.646229
    I1211 11:20:57.432581  8344 solver.cpp:747] class AP 15: 0.854343
    I1211 11:20:57.432617  8344 solver.cpp:747] class AP 16: 0.412076
    I1211 11:20:57.433063  8344 solver.cpp:747] class AP 17: 0.887588
    I1211 11:20:57.433281  8344 solver.cpp:747] class AP 18: 0.723274
    I1211 11:20:57.433910  8344 solver.cpp:747] class AP 19: 0.649638
    I1211 11:20:57.438482  8344 solver.cpp:747] class AP 20: 0.804004
    I1211 11:20:57.438675  8344 solver.cpp:747] class AP 21: 0.822653
    I1211 11:20:57.440227  8344 solver.cpp:747] class AP 22: 0.797649
    I1211 11:20:57.440418  8344 solver.cpp:747] class AP 23: 0.610381
    I1211 11:20:57.440910  8344 solver.cpp:747] class AP 24: 0.56701
    I1211 11:20:57.440979  8344 solver.cpp:747] class AP 25: 0.901549
    I1211 11:20:57.441049  8344 solver.cpp:747] class AP 26: 0.595326
    I1211 11:20:57.441198  8344 solver.cpp:747] class AP 27: 0.718235
    I1211 11:20:57.441224  8344 solver.cpp:753] Test net output mAP #0: detection_eval = 0.705838

     

    [IPU1-0]     19.633898 s:  vx_tutorial_tidl: Tutorial Started !!!
    [IPU1-0]     19.651223 s:  Reading config file sd:test_data//tivx/tidl/tidl_infer.cfg ...
    [IPU1-0]     19.697828 s:  Reading network file sd:test_data//./tivx/tidl/tidl_net_jdetNet_ssd.bin
    [IPU1-0]  ...
    [IPU1-0]     20.537516 s:  Reading network params file sd:test_data//./tivx/tidl/tidl_param_jdetNet_ssd.bin
    [IPU1-0]  ...
    [IPU1-0]     24.927591 s:
    [IPU1-0]     24.927713 s: Thread #1: Create graph ...
    [IPU1-0]     24.928719 s: Thread #1: Create input and output tensors for node 1 ...
    [IPU1-0]     24.929695 s: Thread #1: Create node 1 ...
    [IPU1-0]     24.946166 s: Thread #1: Create output tensors for node 2 ...
    [IPU1-0]     24.946623 s: Thread #1: Create node 2 ...
    [IPU1-0]     24.962666 s:
    [IPU1-0]     24.962819 s: Thread #1: Verify graph ...
    [IPU1-0]     28.184682 s:
    [IPU1-0]     28.184804 s: Thread #1: Start graph ...
    [IPU1-0]     28.184956 s:
    [IPU1-0]     28.185323 s: Thread #1: Wait for graph ...
    [IPU1-0]     28.492374 s:
    [IPU1-0]     28.492466 s: Thread #1: Results
    [IPU1-0]     28.492588 s: ---------------------
    [IPU1-0]     28.492801 s:
    [IPU1-0]     28.492923 s: ObjId|label|score| xmin| ymin| xmax| ymax|
    [IPU1-0]     28.493137 s: ------------------------------------------
    [IPU1-0]     28.493411 s:     0|   13| 1.00| 0.43| 0.56| 0.45| 0.68|
    [IPU1-0]     28.493625 s:     1|   14| 1.00| 0.79| 0.70| 0.85| 0.79|
    [IPU1-0]     28.493869 s:     2|    2| 1.00| 0.12| 0.69| 0.18| 0.77|
    [IPU1-0]     28.494113 s:     3|   12| 1.00| 0.46| 0.60| 0.51| 0.68|
    [IPU1-0]     28.494326 s:     4|   12| 1.00| 0.78| 0.52| 0.84| 0.59|
    [IPU1-0]     28.494540 s:     5|   13| 1.00| 0.12| 0.90| 0.16| 1.01|
    [IPU1-0]     28.494784 s:     6|   12| 1.00| 0.46| 0.76| 0.50| 0.93|
    [IPU1-0]     28.495211 s:     7|   16| 1.00| 0.92|-0.01| 0.96| 0.10|
    [IPU1-0]     28.495455 s:     8|   12| 1.00| 0.77| 0.50| 0.85| 0.65|
    [IPU1-0]     28.495668 s:     9|   20| 1.00| 0.72| 0.16| 0.81| 0.33|
    [IPU1-0]     28.495882 s:    10|   17| 1.00| 0.46| 0.78| 0.50| 0.91|
    [IPU1-0]     28.496156 s:    11|    1| 1.00| 0.79| 0.51| 0.83| 0.62|
    [IPU1-0]     28.496370 s:    12|    4| 1.00| 0.92| 0.08| 0.97| 0.22|
    [IPU1-0]     28.496583 s:    13|    1| 1.00| 0.90| 0.90| 0.91| 0.93|
    [IPU1-0]     28.496797 s:    14|    2| 1.00| 0.90| 0.75| 0.91| 0.78|
    [IPU1-0]     28.497071 s:    15|   22| 1.00| 0.21| 0.74| 0.22| 0.79|
    [IPU1-0]     28.497285 s:    16|    6| 1.00| 0.23| 0.69| 0.25| 0.73|
    [IPU1-0]     28.497529 s:    17|    1| 1.00| 0.57| 0.60| 0.58| 0.71|
    [IPU1-0]     28.497742 s:    18|    1| 1.00| 0.23| 0.65| 0.25| 0.68|
    [IPU1-0]     28.497956 s:    19|    1| 1.00| 0.21| 0.64| 0.22| 0.68|
    [IPU1-0]     28.498047 s:
    [IPU1-0]     28.498169 s: Number of detected objects: 20
    [IPU1-0]     28.498230 s:
    [IPU1-0]     28.498291 s:
    [IPU1-0]     28.498474 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 190.762000 ms
    [IPU1-0]     28.498718 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 116.203000 ms
    [IPU1-0]     28.498993 s: ---- Thread #1: Total Graph Execution time: 307.395000 ms

  • The outputs of my import tool and infer tool are the same!

    All the tests use the same input image, so why are the results from my model on the EVM different?

     Layer   43 :
    Target: number label value xmin  ymin  xmax  ymax
    Target:  0.00  1.00  0.91  0.02  0.57  0.12  0.76
    Target:  1.00  3.00  0.81  0.02  0.57  0.17  0.75
    Target:  2.00  3.00  0.35  0.33  0.56  0.36  0.61
    Target:  3.00  3.00  0.26  0.87  0.55  1.00  0.84
    Target:  4.00  6.00  0.20  0.87  0.55  1.00  0.84
    Target:  5.00 13.00  0.59  0.82  0.54  0.89  0.85
    Target:  6.00 14.00  0.53  0.55  0.56  0.58  0.71
    Target:  7.00 14.00  0.20  0.30  0.56  0.33  0.68
     #MMACs =     0.00,     0.00, Sparsity :   0.00
    Not belongs to this group!
    End of config list found !

  • >> The outputs of my import tool and infer tool are the same!

    Can you elaborate more on this? 

    Thanks,

    Praveen

  • This is my import tool's result:

    PS C:\Users\lh\Desktop\lh\SSD_JacintoNetV2\de_27k> .\tidl_model_import.out.exe .\tidl_import_JDetNet.txt
    randParams = 0
    modelType = 0
    quantizationStyle = 1
    quantRoundAdd = 25
    numParamBits = 8
    preProcType = 4
    inElementType = 0
    numFrames = -1
    rawSampleInData = 1
    numSampleInData = 1
    foldBnInConv2D = 1
    inWidth = -1
    inHeight = -1
    inNumChannels = -1
    sampleInData = trace_dump_0_768x320.y
    tidlStatsTool = eve_test_dl_algo.out.exe
    inputNetFile = deploy.prototxt
    inputParamsFile = voc0712_ssdJacintoNetV2_iter_68000.caffemodel
    outputNetFile = tidl_net_jdetNet_ssd.bin
    outputParamsFile = tidl_param_jdetNet_ssd.bin
    conv2dKernelType = 0
    layersGroupId = 0
    Caffe Network File : deploy.prototxt
    Caffe Model File   : voc0712_ssdJacintoNetV2_iter_68000.caffemodel
    TIDL Network File  : tidl_net_jdetNet_ssd.bin
    TIDL Model File    : tidl_param_jdetNet_ssd.bin
    Name of the Network : ssdJacintoNetV2_deploy
    Num Inputs :               1
     Num of Layer Detected :  50
      0, TIDL_DataLayer                , data                                      0,  -1 ,  1 ,   x ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  0 ,       0 ,       0 ,       0 ,       0 ,       1 ,       3 ,     320 ,     768 ,         0 ,
      1, TIDL_BatchNormLayer           , data/bias                                 1,   1 ,  1 ,   0 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  1 ,       1 ,       3 ,     320 ,     768 ,       1 ,       3 ,     320 ,     768 ,    737280 ,
      2, TIDL_ConvolutionLayer         , conv1a                                    1,   1 ,  1 ,   1 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  2 ,       1 ,       3 ,     320 ,     768 ,       1 ,      32 ,     160 ,     384 , 147456000 ,
      3, TIDL_ConvolutionLayer         , conv1b                                    1,   1 ,  1 ,   2 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  3 ,       1 ,      32 ,     160 ,     384 ,       1 ,      32 ,      80 ,     192 , 141557760 ,
      4, TIDL_ConvolutionLayer         , res2a_branch2a                            1,   1 ,  1 ,   3 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  4 ,       1 ,      32 ,      80 ,     192 ,       1 ,      64 ,      80 ,     192 , 283115520 ,
      5, TIDL_ConvolutionLayer         , res2a_branch2b                            1,   1 ,  1 ,   4 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  5 ,       1 ,      64 ,      80 ,     192 ,       1 ,      64 ,      40 ,      96 , 141557760 ,
      6, TIDL_ConvolutionLayer         , res3a_branch2a                            1,   1 ,  1 ,   5 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  6 ,       1 ,      64 ,      40 ,      96 ,       1 ,     128 ,      40 ,      96 , 283115520 ,
      7, TIDL_ConvolutionLayer         , res3a_branch2b                            1,   1 ,  1 ,   6 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  7 ,       1 ,     128 ,      40 ,      96 ,       1 ,     128 ,      20 ,      48 , 141557760 ,
      8, TIDL_ConvolutionLayer         , res4a_branch2a                            1,   1 ,  1 ,   7 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  8 ,       1 ,     128 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 , 283115520 ,
      9, TIDL_ConvolutionLayer         , res4a_branch2b                            1,   1 ,  1 ,   8 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  9 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 , 141557760 ,
     10, TIDL_PoolingLayer             , pool4                                     1,   1 ,  1 ,   9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 10 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      10 ,      24 ,    245760 ,
     11, TIDL_ConvolutionLayer         , res5a_branch2a                            1,   1 ,  1 ,  10 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 11 ,       1 ,     256 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 283115520 ,
     12, TIDL_ConvolutionLayer         , res5a_branch2b                            1,   1 ,  1 ,  11 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 12 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 141557760 ,
     13, TIDL_PoolingLayer             , pool6                                     1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 13 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,       5 ,      12 ,    122880 ,
     14, TIDL_PoolingLayer             , pool7                                     1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 14 ,       1 ,     512 ,       5 ,      12 ,       1 ,     512 ,       3 ,       6 ,     36864 ,
     15, TIDL_PoolingLayer             , pool8                                     1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 15 ,       1 ,     512 ,       3 ,       6 ,       1 ,     512 ,       2 ,       3 ,     12288 ,
     16, TIDL_ConvolutionLayer         , ctx_output1                               1,   1 ,  1 ,   9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 16 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 ,  62914560 ,
     17, TIDL_ConvolutionLayer         , ctx_output2                               1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 17 ,       1 ,     512 ,      10 ,      24 ,       1 ,     256 ,      10 ,      24 ,  31457280 ,
     18, TIDL_ConvolutionLayer         , ctx_output3                               1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 18 ,       1 ,     512 ,       5 ,      12 ,       1 ,     256 ,       5 ,      12 ,   7864320 ,
     19, TIDL_ConvolutionLayer         , ctx_output4                               1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 19 ,       1 ,     512 ,       3 ,       6 ,       1 ,     256 ,       3 ,       6 ,   2359296 ,
     20, TIDL_ConvolutionLayer         , ctx_output5                               1,   1 ,  1 ,  15 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 20 ,       1 ,     512 ,       2 ,       3 ,       1 ,     256 ,       2 ,       3 ,    786432 ,
     21, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_loc                 1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 21 ,       1 ,     256 ,      20 ,      48 ,       1 ,      16 ,      20 ,      48 ,   3932160 ,
     22, TIDL_FlattenLayer             , ctx_output1/relu_mbox_loc_perm            1,   1 ,  1 ,  21 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 22 ,       1 ,      16 ,      20 ,      48 ,       1 ,       1 ,       1 ,   15360 ,         1 ,
     23, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_conf                1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 23 ,       1 ,     256 ,      20 ,      48 ,       1 ,     112 ,      20 ,      48 ,  27525120 ,
     24, TIDL_FlattenLayer             , ctx_output1/relu_mbox_conf_perm           1,   1 ,  1 ,  23 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 24 ,       1 ,     112 ,      20 ,      48 ,       1 ,       1 ,       1 ,  107520 ,         1 ,
     26, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_loc                 1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 26 ,       1 ,     256 ,      10 ,      24 ,       1 ,      24 ,      10 ,      24 ,   1474560 ,
     27, TIDL_FlattenLayer             , ctx_output2/relu_mbox_loc_perm            1,   1 ,  1 ,  26 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 27 ,       1 ,      24 ,      10 ,      24 ,       1 ,       1 ,       1 ,    5760 ,         1 ,
     28, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_conf                1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 28 ,       1 ,     256 ,      10 ,      24 ,       1 ,     168 ,      10 ,      24 ,  10321920 ,
     29, TIDL_FlattenLayer             , ctx_output2/relu_mbox_conf_perm           1,   1 ,  1 ,  28 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 29 ,       1 ,     168 ,      10 ,      24 ,       1 ,       1 ,       1 ,   40320 ,         1 ,
     31, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_loc                 1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 31 ,       1 ,     256 ,       5 ,      12 ,       1 ,      24 ,       5 ,      12 ,    368640 ,
     32, TIDL_FlattenLayer             , ctx_output3/relu_mbox_loc_perm            1,   1 ,  1 ,  31 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 32 ,       1 ,      24 ,       5 ,      12 ,       1 ,       1 ,       1 ,    1440 ,         1 ,
     33, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_conf                1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 33 ,       1 ,     256 ,       5 ,      12 ,       1 ,     168 ,       5 ,      12 ,   2580480 ,
     34, TIDL_FlattenLayer             , ctx_output3/relu_mbox_conf_perm           1,   1 ,  1 ,  33 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 34 ,       1 ,     168 ,       5 ,      12 ,       1 ,       1 ,       1 ,   10080 ,         1 ,
     36, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_loc                 1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 36 ,       1 ,     256 ,       3 ,       6 ,       1 ,      24 ,       3 ,       6 ,    110592 ,
     37, TIDL_FlattenLayer             , ctx_output4/relu_mbox_loc_perm            1,   1 ,  1 ,  36 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 37 ,       1 ,      24 ,       3 ,       6 ,       1 ,       1 ,       1 ,     432 ,         1 ,
     38, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_conf                1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 38 ,       1 ,     256 ,       3 ,       6 ,       1 ,     168 ,       3 ,       6 ,    774144 ,
     39, TIDL_FlattenLayer             , ctx_output4/relu_mbox_conf_perm           1,   1 ,  1 ,  38 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 39 ,       1 ,     168 ,       3 ,       6 ,       1 ,       1 ,       1 ,    3024 ,         1 ,
     41, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_loc                 1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 41 ,       1 ,     256 ,       2 ,       3 ,       1 ,      16 ,       2 ,       3 ,     24576 ,
     42, TIDL_FlattenLayer             , ctx_output5/relu_mbox_loc_perm            1,   1 ,  1 ,  41 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 42 ,       1 ,      16 ,       2 ,       3 ,       1 ,       1 ,       1 ,      96 ,         1 ,
     43, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_conf                1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 43 ,       1 ,     256 ,       2 ,       3 ,       1 ,     112 ,       2 ,       3 ,    172032 ,
     44, TIDL_FlattenLayer             , ctx_output5/relu_mbox_conf_perm           1,   1 ,  1 ,  43 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 44 ,       1 ,     112 ,       2 ,       3 ,       1 ,       1 ,       1 ,     672 ,         1 ,
     46, TIDL_ConcatLayer              , mbox_loc                                  1,   5 ,  1 ,  22 , 27 , 32 , 37 , 42 ,  x ,  x ,  x , 46 ,       1 ,       1 ,       1 ,   15360 ,       1 ,       1 ,       1 ,   23088 ,         1 ,
     47, TIDL_ConcatLayer              , mbox_conf                                 1,   5 ,  1 ,  24 , 29 , 34 , 39 , 44 ,  x ,  x ,  x , 47 ,       1 ,       1 ,       1 ,  107520 ,       1 ,       1 ,       1 ,  161616 ,         1 ,
     49, TIDL_DetectionOutputLayer     , detection_out                             2,   2 ,  1 ,  46 , 47 ,  x ,  x ,  x ,  x ,  x ,  x , 49 ,       1 ,       1 ,       1 ,   23088 ,       1 ,       1 ,       1 ,     560 ,         1 ,
    Total Giga Macs : 2.1415
    1 file(s) copied.

    Processing config file .\tempDir\qunat_stats_config.txt !
    noZeroCoeffsPercentage = 100
    updateNetWithStats = 1
    rawImage = 1
    randInput = 0
    writeInput = 0
    writeOutput = 1
    compareRef = 0
    numFrames = 1
    preProcType = 4
    netBinFile = .\tempDir\temp_net.bin
    outputNetBinFile = tidl_net_jdetNet_ssd.bin
    paramsBinFile = tidl_param_jdetNet_ssd.bin
    inData = trace_dump_0_768x320.y
    outData = .\tempDir\stats_tool_out.bin
    traceDumpBaseName = .\tempDir\trace_dump_
    testCaseName =
    testCaseDesc =
    performanceTestcase = 0
    layersGroupId = 1
    writeQ = 0
    readQ = 0
    runFullNet = 1
    read layer 0 param
    read layer 1 param
    read layer 2 param
    read layer 3 param
    read layer 4 param
    read layer 5 param
    read layer 6 param
    read layer 7 param
    read layer 8 param
    read layer 9 param
    read layer 10 param
    read layer 11 param
    read layer 12 param
    read layer 13 param
    read layer 14 param
    read layer 15 param
    read layer 16 param
    read layer 17 param
    read layer 18 param
    read layer 19 param
    read layer 20 param
    read layer 21 param
    read layer 22 param
    read layer 23 param
    read layer 24 param
    read layer 25 param
    read layer 26 param
    read layer 27 param
    read layer 28 param
    read layer 29 param
    read layer 30 param
    read layer 31 param
    read layer 32 param
    read layer 33 param
    read layer 34 param
    read layer 35 param
    read layer 36 param
    read layer 37 param
    read layer 38 param
    read layer 39 param
    read layer 40 param
    read layer 41 param
    read layer 42 param
    read layer 43 param
    read layer 44 param

    weightsElementSize = 1
    slopeElementSize   = 1
    biasElementSize    = 2
    dataElementSize    = 1
    interElementSize   = 4
    quantizationStyle  = 1
    strideOffsetMethod = 0
    reserved           = 0

    Layer ID    ,inBlkWidth  ,inBlkHeight ,inBlkPitch  ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs    ,numOutChs   ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs  ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
          2           72           72           72           32           32           32            3           32            3            1            8            1            3           12            5         5184         1024            1
          3           40           34           40           32           32           32            8            8            8            4            8            1            2           12            5         1360         1024            1
          4           40           22           40           32           20           32           32           64           32            8            8            1            4            6            4          880          640            1
          5           40           22           40           32           20           32           16           16           16            8            8            1            2            6            4          880          640            1
          6           40           22           40           32           20           32           64          128           64            8            8            1            8            3            2          880          640            1
          7           40           22           40           32           20           32           32           32           32            8            8            1            4            3            2          880          640            1
          8           56           22           56           48           20           48          128          256          128            7            8            1           19            1            1         1232          960            1
          9           56           22           56           48           20           48           64           64           64            7            8            1           10            1            1         1232          960            1
         11           40           12           40           32           10           32          256          512          256            8            8            1           32            1            1          480          320            1
         12           40           12           40           32           10           32          128          128          128            8            8            1           16            1            1          480          320            1
         16           48            4           48           48            4           48          256          256          256           32            8            1            8            1            5          192          192            1
         17           24           10           24           24           10           24          512          256          512           32           32            1           16            1            1          240          240            1
         18           12            5           12           12            5           12          512          256          512           32           32            1           16            1            1           60           60            1
         19            6            3            6            6            3            6          512          256          512           32           32            1           16            1            1           18           18            1
         20            3            2            3            3            2            3          512          256          512           32           32            1           16            1            1            6            6            1
         21           48            4           48           48            4           48          256           16          256           32            8            1            8            1            5          192          192            1
         23           48            4           48           48            4           48          256          112          256           32            8            1            8            1            5          192          192            1
         25           24           10           24           24           10           24          256           24          256           32           24            1            8            1            1          240          240            1
         27           24           10           24           24           10           24          256          192          256           32           32            1            8            1            1          240          240            1
         29           12            5           12           12            5           12          256           24          256           32           24            1            8            1            1           60           60            1
         31           12            5           12           12            5           12          256          192          256           32           32            1            8            1            1           60           60            1
         33            6            3            6            6            3            6          256           24          256           32           24            1            8            1            1           18           18            1
         35            6            3            6            6            3            6          256          192          256           32           32            1            8            1            1           18           18            1
         37            3            2            3            3            2            3          256           16          256           32           16            1            8            1            1            6            6            1
         39            3            2            3            3            2            3          256          128          256           32           32            1            8            1            1            6            6            1

    Processing Frame Number : 0

    Not belongs to this group!
    inPtrs 0x593ad0
     Layer    1 : Out Q :      254 , TIDL_BatchNormLayer  , PASSED  #MMACs =     0.74,     0.74, Sparsity :   0.00
     Layer    2 : Out Q :     6749 , TIDL_ConvolutionLayer, PASSED  #MMACs =   147.46,    87.74, Sparsity :  40.50
     Layer    3 : Out Q :     7904 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    53.82, Sparsity :  61.98
     Layer    4 : Out Q :    11927 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    95.11, Sparsity :  66.41
     Layer    5 : Out Q :    14430 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    67.46, Sparsity :  52.34
     Layer    6 : Out Q :    17579 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    99.79, Sparsity :  64.75
     Layer    7 : Out Q :    17542 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    56.52, Sparsity :  60.07
     Layer    8 : Out Q :    20111 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    90.46, Sparsity :  68.05
     Layer    9 : Out Q :    17480 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    51.04, Sparsity :  63.94
     Layer   10 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.06,     0.06, Sparsity :   0.00
     Layer   11 : Out Q :    23594 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    69.78, Sparsity :  75.35
     Layer   12 : Out Q :     8375 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    31.31, Sparsity :  77.88
     Layer   13 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.03,     0.03, Sparsity :   0.00
     Layer   14 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   15 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   16 : Out Q :    11730 , TIDL_ConvolutionLayer, PASSED  #MMACs =    62.91,    62.91, Sparsity :   0.00
     Layer   17 : Out Q :     8750 , TIDL_ConvolutionLayer, PASSED  #MMACs =    31.46,    31.46, Sparsity :   0.00
     Layer   18 : Out Q :     8418 , TIDL_ConvolutionLayer, PASSED  #MMACs =     7.86,     7.86, Sparsity :   0.00
     Layer   19 : Out Q :    24666 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.36,     2.36, Sparsity :   0.00
     Layer   20 : Out Q :    28673 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.79,     0.79, Sparsity :   0.00
     Layer   21 : Out Q :     4724 , TIDL_ConvolutionLayer, PASSED  #MMACs =     3.93,     3.93, Sparsity :   0.00
     Layer   22 :TIDL_FlattenLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   23 : Out Q :     2637 , TIDL_ConvolutionLayer, PASSED  #MMACs =    27.53,    27.53, Sparsity :   0.00
     Layer   24 :TIDL_FlattenLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
     Layer   25 : Out Q :     6462 , TIDL_ConvolutionLayer, PASSED  #MMACs =     1.47,     1.47, Sparsity :   0.00
     Layer   26 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   27 : Out Q :     2463 , TIDL_ConvolutionLayer, PASSED  #MMACs =    11.80,    11.80, Sparsity :   0.00
     Layer   28 :TIDL_FlattenLayer, PASSED  #MMACs =     0.04,     0.04, Sparsity :   0.00
     Layer   29 : Out Q :     9325 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.37,     0.37, Sparsity :   0.00
     Layer   30 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   31 : Out Q :     1843 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.95,     2.95, Sparsity :   0.00
     Layer   32 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   33 : Out Q :    12733 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
     Layer   34 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   35 : Out Q :     3238 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.88,     0.88, Sparsity :   0.00
     Layer   36 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   37 : Out Q :    13180 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   38 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   39 : Out Q :     3513 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.20,     0.20, Sparsity :   0.00
     Layer   40 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   41 : Out Q :     4743 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
     Layer   42 : Out Q :     1850 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
     Layer   43 :
    Target: number label value xmin  ymin  xmax  ymax
    Target:  0.00  1.00  0.91  0.02  0.57  0.12  0.76
    Target:  1.00  3.00  0.81  0.02  0.57  0.17  0.75
    Target:  2.00  3.00  0.35  0.33  0.56  0.36  0.61
    Target:  3.00  3.00  0.26  0.87  0.55  1.00  0.84
    Target:  4.00  6.00  0.20  0.87  0.55  1.00  0.84
    Target:  5.00 13.00  0.59  0.82  0.54  0.89  0.85
    Target:  6.00 14.00  0.53  0.55  0.56  0.58  0.71
    Target:  7.00 14.00  0.20  0.30  0.56  0.33  0.68
     #MMACs =     0.00,     0.00, Sparsity :   0.00
    Not belongs to this group!
    End of config list found !

    And this is the infer tool's result, first run on EVE and then on DSP:

    PS C:\Users\lh\Desktop\lh\SSD_JacintoNetV2\de_27k\infer> .\eve-release-infer.exe .\config_list.txt

    Processing config file .\config_tidl.txt !
    noZeroCoeffsPercentage = 100
    updateNetWithStats = 0
    rawImage = 1
    randInput = 0
    writeInput = 0
    writeOutput = 1
    compareRef = 0
    numFrames = 1
    preProcType = 4
    netBinFile = ..\tidl_net_jdetNet_ssd.bin
    outputNetBinFile =
    paramsBinFile = ..\tidl_param_jdetNet_ssd.bin
    inData = ..\trace_dump_0_768x320.y
    outData = .\stats_tool_out_eve.bin
    traceDumpBaseName = .\trace_dump_
    testCaseName =
    testCaseDesc =
    performanceTestcase = 0
    layersGroupId = 1
    writeQ = 1
    readQ = 1
    runFullNet = 0
    read layer 0 param
    read layer 1 param
    read layer 2 param
    read layer 3 param
    read layer 4 param
    read layer 5 param
    read layer 6 param
    read layer 7 param
    read layer 8 param
    read layer 9 param
    read layer 10 param
    read layer 11 param
    read layer 12 param
    read layer 13 param
    read layer 14 param
    read layer 15 param
    read layer 16 param
    read layer 17 param
    read layer 18 param
    read layer 19 param
    read layer 20 param
    read layer 21 param
    read layer 22 param
    read layer 23 param
    read layer 24 param
    read layer 25 param
    read layer 26 param
    read layer 27 param
    read layer 28 param
    read layer 29 param
    read layer 30 param
    read layer 31 param
    read layer 32 param
    read layer 33 param
    read layer 34 param
    read layer 35 param
    read layer 36 param
    read layer 37 param
    read layer 38 param
    read layer 39 param
    read layer 40 param
    read layer 41 param
    read layer 42 param
    read layer 43 param
    read layer 44 param

    weightsElementSize = 1
    slopeElementSize   = 1
    biasElementSize    = 2
    dataElementSize    = 1
    interElementSize   = 4
    quantizationStyle  = 1
    strideOffsetMethod = 0
    reserved           = 0

    Layer ID    ,inBlkWidth  ,inBlkHeight ,inBlkPitch  ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs    ,numOutChs   ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs  ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
          2           72           72           72           32           32           32            3           32            3            1            8            1            3           12            5         5184         1024            1
          3           40           34           40           32           32           32            8            8            8            4            8            1            2           12            5         1360         1024            1
          4           40           22           40           32           20           32           32           64           32            8            8            1            4            6            4          880          640            1
          5           40           22           40           32           20           32           16           16           16            8            8            1            2            6            4          880          640            1
          6           40           22           40           32           20           32           64          128           64            8            8            1            8            3            2          880          640            1
          7           40           22           40           32           20           32           32           32           32            8            8            1            4            3            2          880          640            1
          8           56           22           56           48           20           48          128          256          128            7            8            1           19            1            1         1232          960            1
          9           56           22           56           48           20           48           64           64           64            7            8            1           10            1            1         1232          960            1
         11           40           12           40           32           10           32          256          512          256            8            8            1           32            1            1          480          320            1
         12           40           12           40           32           10           32          128          128          128            8            8            1           16            1            1          480          320            1
         16           48            4           48           48            4           48          256          256          256           32            8            1            8            1            5          192          192            1
         17           24           10           24           24           10           24          512          256          512           32           32            1           16            1            1          240          240            1
         18           12            5           12           12            5           12          512          256          512           32           32            1           16            1            1           60           60            1
         19            6            3            6            6            3            6          512          256          512           32           32            1           16            1            1           18           18            1
         20            3            2            3            3            2            3          512          256          512           32           32            1           16            1            1            6            6            1
         21           48            4           48           48            4           48          256           16          256           32            8            1            8            1            5          192          192            1
         23           48            4           48           48            4           48          256          112          256           32            8            1            8            1            5          192          192            1
         25           24           10           24           24           10           24          256           24          256           32           24            1            8            1            1          240          240            1
         27           24           10           24           24           10           24          256          192          256           32           32            1            8            1            1          240          240            1
         29           12            5           12           12            5           12          256           24          256           32           24            1            8            1            1           60           60            1
         31           12            5           12           12            5           12          256          192          256           32           32            1            8            1            1           60           60            1
         33            6            3            6            6            3            6          256           24          256           32           24            1            8            1            1           18           18            1
         35            6            3            6            6            3            6          256          192          256           32           32            1            8            1            1           18           18            1
         37            3            2            3            3            2            3          256           16          256           32           16            1            8            1            1            6            6            1
         39            3            2            3            3            2            3          256          128          256           32           32            1            8            1            1            6            6            1

    Processing Frame Number : 0

    Not belongs to this group!
    inPtrs 0x1008ad0
     Layer    1 : Out Q :      254 , TIDL_BatchNormLayer  , PASSED  #MMACs =     0.74,     0.74, Sparsity :   0.00
     Layer    2 : Out Q :     6749 , TIDL_ConvolutionLayer, PASSED  #MMACs =   147.46,    87.74, Sparsity :  40.50
     Layer    3 : Out Q :     7904 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    53.82, Sparsity :  61.98
     Layer    4 : Out Q :    11927 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    95.11, Sparsity :  66.41
     Layer    5 : Out Q :    14430 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    67.46, Sparsity :  52.34
     Layer    6 : Out Q :    17579 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    99.79, Sparsity :  64.75
     Layer    7 : Out Q :    17542 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    56.52, Sparsity :  60.07
     Layer    8 : Out Q :    20111 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    90.46, Sparsity :  68.05
     Layer    9 : Out Q :    17480 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    51.04, Sparsity :  63.94
     Layer   10 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.06,     0.06, Sparsity :   0.00
     Layer   11 : Out Q :    23594 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    69.78, Sparsity :  75.35
     Layer   12 : Out Q :     8375 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    31.31, Sparsity :  77.88
     Layer   13 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.03,     0.03, Sparsity :   0.00
     Layer   14 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   15 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   16 : Out Q :    11730 , TIDL_ConvolutionLayer, PASSED  #MMACs =    62.91,    62.91, Sparsity :   0.00
     Layer   17 : Out Q :     8750 , TIDL_ConvolutionLayer, PASSED  #MMACs =    31.46,    31.46, Sparsity :   0.00
     Layer   18 : Out Q :     8418 , TIDL_ConvolutionLayer, PASSED  #MMACs =     7.86,     7.86, Sparsity :   0.00
     Layer   19 : Out Q :    24666 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.36,     2.36, Sparsity :   0.00
     Layer   20 : Out Q :    28673 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.79,     0.79, Sparsity :   0.00
     Layer   21 : Out Q :     4724 , TIDL_ConvolutionLayer, PASSED  #MMACs =     3.93,     3.93, Sparsity :   0.00
     Layer   22 :TIDL_FlattenLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   23 : Out Q :     2637 , TIDL_ConvolutionLayer, PASSED  #MMACs =    27.53,    27.53, Sparsity :   0.00
     Layer   24 :TIDL_FlattenLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
     Layer   25 : Out Q :     6462 , TIDL_ConvolutionLayer, PASSED  #MMACs =     1.47,     1.47, Sparsity :   0.00
     Layer   26 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   27 : Out Q :     2463 , TIDL_ConvolutionLayer, PASSED  #MMACs =    11.80,    11.80, Sparsity :   0.00
     Layer   28 :TIDL_FlattenLayer, PASSED  #MMACs =     0.04,     0.04, Sparsity :   0.00
     Layer   29 : Out Q :     9325 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.37,     0.37, Sparsity :   0.00
     Layer   30 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   31 : Out Q :     1843 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.95,     2.95, Sparsity :   0.00
     Layer   32 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   33 : Out Q :    12733 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
     Layer   34 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   35 : Out Q :     3238 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.88,     0.88, Sparsity :   0.00
     Layer   36 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   37 : Out Q :    13180 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   38 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   39 : Out Q :     3513 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.20,     0.20, Sparsity :   0.00
     Layer   40 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   41 : Out Q :     4743 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
     Layer   42 : Out Q :     1850 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity : -nan(ind)
    Not belongs to this group!
    Not belongs to this group!
    End of config list found !
    PS C:\Users\lh\Desktop\lh\SSD_JacintoNetV2\de_27k\infer> .\eve-release-infer.exe .\config_list.txt

    Processing config file .\config_tidl.txt !
    noZeroCoeffsPercentage = 100
    updateNetWithStats = 0
    rawImage = 1
    randInput = 0
    writeInput = 0
    writeOutput = 1
    compareRef = 0
    numFrames = 1
    preProcType = 4
    netBinFile = ..\tidl_net_jdetNet_ssd.bin
    outputNetBinFile =
    paramsBinFile = ..\tidl_param_jdetNet_ssd.bin
    inData = .\stats_tool_out_eve.bin
    outData = .\stats_tool_out.bin
    traceDumpBaseName = .\trace_dump_
    testCaseName =
    testCaseDesc =
    performanceTestcase = 0
    layersGroupId = 2
    writeQ = 1
    readQ = 1
    runFullNet = 0
    read layer 0 param
    read layer 1 param
    read layer 2 param
    read layer 3 param
    read layer 4 param
    read layer 5 param
    read layer 6 param
    read layer 7 param
    read layer 8 param
    read layer 9 param
    read layer 10 param
    read layer 11 param
    read layer 12 param
    read layer 13 param
    read layer 14 param
    read layer 15 param
    read layer 16 param
    read layer 17 param
    read layer 18 param
    read layer 19 param
    read layer 20 param
    read layer 21 param
    read layer 22 param
    read layer 23 param
    read layer 24 param
    read layer 25 param
    read layer 26 param
    read layer 27 param
    read layer 28 param
    read layer 29 param
    read layer 30 param
    read layer 31 param
    read layer 32 param
    read layer 33 param
    read layer 34 param
    read layer 35 param
    read layer 36 param
    read layer 37 param
    read layer 38 param
    read layer 39 param
    read layer 40 param
    read layer 41 param
    read layer 42 param
    read layer 43 param
    read layer 44 param

    weightsElementSize = 1
    slopeElementSize   = 1
    biasElementSize    = 2
    dataElementSize    = 1
    interElementSize   = 4
    quantizationStyle  = 1
    strideOffsetMethod = 0
    reserved           = 0


    Processing Frame Number : 0

    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    Not belongs to this group!
    inPtrs 0x1008ad0
    inPtrs 0x103b6c8
     Layer   43 :
    Target: number label value xmin  ymin  xmax  ymax
    Target:  0.00  1.00  0.91  0.02  0.57  0.12  0.76
    Target:  1.00  3.00  0.81  0.02  0.57  0.17  0.75
    Target:  2.00  3.00  0.35  0.33  0.56  0.36  0.61
    Target:  3.00  3.00  0.26  0.87  0.55  1.00  0.84
    Target:  4.00  6.00  0.20  0.87  0.55  1.00  0.84
    Target:  5.00 13.00  0.59  0.82  0.54  0.89  0.85
    Target:  6.00 14.00  0.53  0.55  0.56  0.58  0.71
    Target:  7.00 14.00  0.20  0.30  0.56  0.33  0.68
     #MMACs =     0.00,     0.00, Sparsity :   0.00
    Not belongs to this group!
    End of config list found !

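    For reference, the "Not belongs to this group!" lines correspond to the layersGroupId list in the import config: when the tool runs one group, every layer assigned to another group is skipped. An illustrative sketch (not TI code), assuming the 45-entry list from the import config above:

    ```python
    # Illustrative sketch: how the layersGroupId list from the import config
    # partitions the network between EVE (group 1) and DSP (group 2).
    layers_group_id = [0] + [1] * 42 + [2] + [0]  # layer 0 = data, layer 43 = detection output

    eve_layers = [i for i, g in enumerate(layers_group_id) if g == 1]  # run when layersGroupId = 1
    dsp_layers = [i for i, g in enumerate(layers_group_id) if g == 2]  # run when layersGroupId = 2

    print(dsp_layers)  # only the detection-output layer (43) runs on the DSP group
    ```

    This matches the logs above: the EVE pass processes layers 1-42 and skips layer 43, while the DSP pass skips layers 0-42 and runs only the detection-output layer.
    
    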
  • Hi Praveen,

    I also found an unexplained problem: when I use train_image_object_detection.sh to generate the deploy prototxt, all floating-point numbers come out as long decimal or scientific-notation values. For example, confidence_threshold should be 0.01 but mine is 0.00999999977648, and eps should be 0.0001 but mine is 9.99999974738e-05.

    Why does this happen? Will it affect the results?
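    These values look like ordinary IEEE-754 single-precision rounding rather than a bug: Caffe proto float fields are 32-bit, so the text dump prints the nearest representable float32 value, not the decimal literal. A quick Python check of the values seen in the prototxt:

    ```python
    import struct

    # Round-trip a Python float through IEEE-754 single precision, the storage
    # type used for Caffe prototxt float fields.
    def as_float32(x: float) -> float:
        return struct.unpack('f', struct.pack('f', x))[0]

    print(as_float32(0.01))    # ~0.00999999977648 (confidence_threshold)
    print(as_float32(0.0001))  # ~9.99999974738e-05 (eps)
    print(as_float32(0.99))    # ~0.990000009537 (moving_average_fraction)
    ```

    Since both values round-trip to the same float32 either way, this notation should be numerically harmless.
    
    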

    name: "ssdJacintoNetV2_deploy"
    input: "data"
    input_shape {
      dim: 1
      dim: 3
      dim: 320
      dim: 768
    }
    layer {
      name: "data/bias"
      type: "Bias"
      bottom: "data"
      top: "data/bias"
      param {
        lr_mult: 0.0
        decay_mult: 0.0
      }
      bias_param {
        filler {
          type: "constant"
          value: -128.0
        }
      }
    }
    layer {
      name: "conv1a"
      type: "Convolution"
      bottom: "data/bias"
      top: "conv1a"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 32
        bias_term: true
        pad: 2
        kernel_size: 5
        group: 1
        stride: 2
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "conv1a/bn"
      type: "BatchNorm"
      bottom: "conv1a"
      top: "conv1a"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "conv1a/relu"
      type: "ReLU"
      bottom: "conv1a"
      top: "conv1a"
    }
    layer {
      name: "conv1b"
      type: "Convolution"
      bottom: "conv1a"
      top: "conv1b"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 32
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 4
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "conv1b/bn"
      type: "BatchNorm"
      bottom: "conv1b"
      top: "conv1b"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "conv1b/relu"
      type: "ReLU"
      bottom: "conv1b"
      top: "conv1b"
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1b"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "res2a_branch2a"
      type: "Convolution"
      bottom: "pool1"
      top: "res2a_branch2a"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 64
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res2a_branch2a/bn"
      type: "BatchNorm"
      bottom: "res2a_branch2a"
      top: "res2a_branch2a"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res2a_branch2a/relu"
      type: "ReLU"
      bottom: "res2a_branch2a"
      top: "res2a_branch2a"
    }
    layer {
      name: "res2a_branch2b"
      type: "Convolution"
      bottom: "res2a_branch2a"
      top: "res2a_branch2b"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 64
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 4
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res2a_branch2b/bn"
      type: "BatchNorm"
      bottom: "res2a_branch2b"
      top: "res2a_branch2b"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res2a_branch2b/relu"
      type: "ReLU"
      bottom: "res2a_branch2b"
      top: "res2a_branch2b"
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "res2a_branch2b"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "res3a_branch2a"
      type: "Convolution"
      bottom: "pool2"
      top: "res3a_branch2a"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 128
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res3a_branch2a/bn"
      type: "BatchNorm"
      bottom: "res3a_branch2a"
      top: "res3a_branch2a"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res3a_branch2a/relu"
      type: "ReLU"
      bottom: "res3a_branch2a"
      top: "res3a_branch2a"
    }
    layer {
      name: "res3a_branch2b"
      type: "Convolution"
      bottom: "res3a_branch2a"
      top: "res3a_branch2b"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 128
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 4
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res3a_branch2b/bn"
      type: "BatchNorm"
      bottom: "res3a_branch2b"
      top: "res3a_branch2b"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res3a_branch2b/relu"
      type: "ReLU"
      bottom: "res3a_branch2b"
      top: "res3a_branch2b"
    }
    layer {
      name: "pool3"
      type: "Pooling"
      bottom: "res3a_branch2b"
      top: "pool3"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "res4a_branch2a"
      type: "Convolution"
      bottom: "pool3"
      top: "res4a_branch2a"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res4a_branch2a/bn"
      type: "BatchNorm"
      bottom: "res4a_branch2a"
      top: "res4a_branch2a"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res4a_branch2a/relu"
      type: "ReLU"
      bottom: "res4a_branch2a"
      top: "res4a_branch2a"
    }
    layer {
      name: "res4a_branch2b"
      type: "Convolution"
      bottom: "res4a_branch2a"
      top: "res4a_branch2b"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 4
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res4a_branch2b/bn"
      type: "BatchNorm"
      bottom: "res4a_branch2b"
      top: "res4a_branch2b"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res4a_branch2b/relu"
      type: "ReLU"
      bottom: "res4a_branch2b"
      top: "res4a_branch2b"
    }
    layer {
      name: "pool4"
      type: "Pooling"
      bottom: "res4a_branch2b"
      top: "pool4"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "res5a_branch2a"
      type: "Convolution"
      bottom: "pool4"
      top: "res5a_branch2a"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 512
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res5a_branch2a/bn"
      type: "BatchNorm"
      bottom: "res5a_branch2a"
      top: "res5a_branch2a"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res5a_branch2a/relu"
      type: "ReLU"
      bottom: "res5a_branch2a"
      top: "res5a_branch2a"
    }
    layer {
      name: "res5a_branch2b"
      type: "Convolution"
      bottom: "res5a_branch2a"
      top: "res5a_branch2b"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 512
        bias_term: true
        pad: 1
        kernel_size: 3
        group: 4
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "res5a_branch2b/bn"
      type: "BatchNorm"
      bottom: "res5a_branch2b"
      top: "res5a_branch2b"
      batch_norm_param {
        moving_average_fraction: 0.990000009537
        eps: 9.99999974738e-05
        scale_bias: true
      }
    }
    layer {
      name: "res5a_branch2b/relu"
      type: "ReLU"
      bottom: "res5a_branch2b"
      top: "res5a_branch2b"
    }
    layer {
      name: "pool6"
      type: "Pooling"
      bottom: "res5a_branch2b"
      top: "pool6"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
        pad: 0
      }
    }
    layer {
      name: "pool7"
      type: "Pooling"
      bottom: "pool6"
      top: "pool7"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
        pad: 0
      }
    }
    layer {
      name: "pool8"
      type: "Pooling"
      bottom: "pool7"
      top: "pool8"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
        pad: 0
      }
    }
    layer {
      name: "ctx_output1"
      type: "Convolution"
      bottom: "res3a_branch2b"
      top: "ctx_output1"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output1/relu"
      type: "ReLU"
      bottom: "ctx_output1"
      top: "ctx_output1"
    }
    layer {
      name: "ctx_output2"
      type: "Convolution"
      bottom: "res5a_branch2b"
      top: "ctx_output2"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output2/relu"
      type: "ReLU"
      bottom: "ctx_output2"
      top: "ctx_output2"
    }
    layer {
      name: "ctx_output3"
      type: "Convolution"
      bottom: "pool6"
      top: "ctx_output3"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output3/relu"
      type: "ReLU"
      bottom: "ctx_output3"
      top: "ctx_output3"
    }
    layer {
      name: "ctx_output4"
      type: "Convolution"
      bottom: "pool7"
      top: "ctx_output4"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output4/relu"
      type: "ReLU"
      bottom: "ctx_output4"
      top: "ctx_output4"
    }
    layer {
      name: "ctx_output5"
      type: "Convolution"
      bottom: "pool8"
      top: "ctx_output5"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 256
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output5/relu"
      type: "ReLU"
      bottom: "ctx_output5"
      top: "ctx_output5"
    }
    layer {
      name: "ctx_output1/relu_mbox_loc"
      type: "Convolution"
      bottom: "ctx_output1"
      top: "ctx_output1/relu_mbox_loc"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 16
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_loc_perm"
      type: "Permute"
      bottom: "ctx_output1/relu_mbox_loc"
      top: "ctx_output1/relu_mbox_loc_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_loc_flat"
      type: "Flatten"
      bottom: "ctx_output1/relu_mbox_loc_perm"
      top: "ctx_output1/relu_mbox_loc_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_conf"
      type: "Convolution"
      bottom: "ctx_output1"
      top: "ctx_output1/relu_mbox_conf"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 112
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_conf_perm"
      type: "Permute"
      bottom: "ctx_output1/relu_mbox_conf"
      top: "ctx_output1/relu_mbox_conf_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_conf_flat"
      type: "Flatten"
      bottom: "ctx_output1/relu_mbox_conf_perm"
      top: "ctx_output1/relu_mbox_conf_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output1/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output1"
      bottom: "data"
      top: "ctx_output1/relu_mbox_priorbox"
      prior_box_param {
        min_size: 14.720000267
        max_size: 36.7999992371
        aspect_ratio: 2.0
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_loc"
      type: "Convolution"
      bottom: "ctx_output2"
      top: "ctx_output2/relu_mbox_loc"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 24
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_loc_perm"
      type: "Permute"
      bottom: "ctx_output2/relu_mbox_loc"
      top: "ctx_output2/relu_mbox_loc_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_loc_flat"
      type: "Flatten"
      bottom: "ctx_output2/relu_mbox_loc_perm"
      top: "ctx_output2/relu_mbox_loc_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_conf"
      type: "Convolution"
      bottom: "ctx_output2"
      top: "ctx_output2/relu_mbox_conf"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 168
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_conf_perm"
      type: "Permute"
      bottom: "ctx_output2/relu_mbox_conf"
      top: "ctx_output2/relu_mbox_conf_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_conf_flat"
      type: "Flatten"
      bottom: "ctx_output2/relu_mbox_conf_perm"
      top: "ctx_output2/relu_mbox_conf_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output2/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output2"
      bottom: "data"
      top: "ctx_output2/relu_mbox_priorbox"
      prior_box_param {
        min_size: 36.7999992371
        max_size: 110.400001526
        aspect_ratio: 2.0
        aspect_ratio: 3.0
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_loc"
      type: "Convolution"
      bottom: "ctx_output3"
      top: "ctx_output3/relu_mbox_loc"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 24
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_loc_perm"
      type: "Permute"
      bottom: "ctx_output3/relu_mbox_loc"
      top: "ctx_output3/relu_mbox_loc_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_loc_flat"
      type: "Flatten"
      bottom: "ctx_output3/relu_mbox_loc_perm"
      top: "ctx_output3/relu_mbox_loc_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_conf"
      type: "Convolution"
      bottom: "ctx_output3"
      top: "ctx_output3/relu_mbox_conf"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 168
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_conf_perm"
      type: "Permute"
      bottom: "ctx_output3/relu_mbox_conf"
      top: "ctx_output3/relu_mbox_conf_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_conf_flat"
      type: "Flatten"
      bottom: "ctx_output3/relu_mbox_conf_perm"
      top: "ctx_output3/relu_mbox_conf_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output3/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output3"
      bottom: "data"
      top: "ctx_output3/relu_mbox_priorbox"
      prior_box_param {
        min_size: 110.400001526
        max_size: 184.0
        aspect_ratio: 2.0
        aspect_ratio: 3.0
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_loc"
      type: "Convolution"
      bottom: "ctx_output4"
      top: "ctx_output4/relu_mbox_loc"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 24
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_loc_perm"
      type: "Permute"
      bottom: "ctx_output4/relu_mbox_loc"
      top: "ctx_output4/relu_mbox_loc_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_loc_flat"
      type: "Flatten"
      bottom: "ctx_output4/relu_mbox_loc_perm"
      top: "ctx_output4/relu_mbox_loc_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_conf"
      type: "Convolution"
      bottom: "ctx_output4"
      top: "ctx_output4/relu_mbox_conf"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 168
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_conf_perm"
      type: "Permute"
      bottom: "ctx_output4/relu_mbox_conf"
      top: "ctx_output4/relu_mbox_conf_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_conf_flat"
      type: "Flatten"
      bottom: "ctx_output4/relu_mbox_conf_perm"
      top: "ctx_output4/relu_mbox_conf_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output4/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output4"
      bottom: "data"
      top: "ctx_output4/relu_mbox_priorbox"
      prior_box_param {
        min_size: 184.0
        max_size: 257.600006104
        aspect_ratio: 2.0
        aspect_ratio: 3.0
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_loc"
      type: "Convolution"
      bottom: "ctx_output5"
      top: "ctx_output5/relu_mbox_loc"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 16
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_loc_perm"
      type: "Permute"
      bottom: "ctx_output5/relu_mbox_loc"
      top: "ctx_output5/relu_mbox_loc_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_loc_flat"
      type: "Flatten"
      bottom: "ctx_output5/relu_mbox_loc_perm"
      top: "ctx_output5/relu_mbox_loc_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_conf"
      type: "Convolution"
      bottom: "ctx_output5"
      top: "ctx_output5/relu_mbox_conf"
      param {
        lr_mult: 1.0
        decay_mult: 1.0
      }
      param {
        lr_mult: 2.0
        decay_mult: 0.0
      }
      convolution_param {
        num_output: 112
        bias_term: true
        pad: 0
        kernel_size: 1
        group: 1
        stride: 1
        weight_filler {
          type: "msra"
        }
        bias_filler {
          type: "constant"
          value: 0.0
        }
        dilation: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_conf_perm"
      type: "Permute"
      bottom: "ctx_output5/relu_mbox_conf"
      top: "ctx_output5/relu_mbox_conf_perm"
      permute_param {
        order: 0
        order: 2
        order: 3
        order: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_conf_flat"
      type: "Flatten"
      bottom: "ctx_output5/relu_mbox_conf_perm"
      top: "ctx_output5/relu_mbox_conf_flat"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "ctx_output5/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output5"
      bottom: "data"
      top: "ctx_output5/relu_mbox_priorbox"
      prior_box_param {
        min_size: 257.600006104
        max_size: 331.200012207
        aspect_ratio: 2.0
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }
    layer {
      name: "mbox_loc"
      type: "Concat"
      bottom: "ctx_output1/relu_mbox_loc_flat"
      bottom: "ctx_output2/relu_mbox_loc_flat"
      bottom: "ctx_output3/relu_mbox_loc_flat"
      bottom: "ctx_output4/relu_mbox_loc_flat"
      bottom: "ctx_output5/relu_mbox_loc_flat"
      top: "mbox_loc"
      concat_param {
        axis: 1
      }
    }
    layer {
      name: "mbox_conf"
      type: "Concat"
      bottom: "ctx_output1/relu_mbox_conf_flat"
      bottom: "ctx_output2/relu_mbox_conf_flat"
      bottom: "ctx_output3/relu_mbox_conf_flat"
      bottom: "ctx_output4/relu_mbox_conf_flat"
      bottom: "ctx_output5/relu_mbox_conf_flat"
      top: "mbox_conf"
      concat_param {
        axis: 1
      }
    }
    layer {
      name: "mbox_priorbox"
      type: "Concat"
      bottom: "ctx_output1/relu_mbox_priorbox"
      bottom: "ctx_output2/relu_mbox_priorbox"
      bottom: "ctx_output3/relu_mbox_priorbox"
      bottom: "ctx_output4/relu_mbox_priorbox"
      bottom: "ctx_output5/relu_mbox_priorbox"
      top: "mbox_priorbox"
      concat_param {
        axis: 2
      }
    }
    layer {
      name: "mbox_conf_reshape"
      type: "Reshape"
      bottom: "mbox_conf"
      top: "mbox_conf_reshape"
      reshape_param {
        shape {
          dim: 0
          dim: -1
          dim: 28
        }
      }
    }
    layer {
      name: "mbox_conf_softmax"
      type: "Softmax"
      bottom: "mbox_conf_reshape"
      top: "mbox_conf_softmax"
      softmax_param {
        axis: 2
      }
    }
    layer {
      name: "mbox_conf_flatten"
      type: "Flatten"
      bottom: "mbox_conf_softmax"
      top: "mbox_conf_flatten"
      flatten_param {
        axis: 1
      }
    }
    layer {
      name: "detection_out"
      type: "DetectionOutput"
      bottom: "mbox_loc"
      bottom: "mbox_conf_flatten"
      bottom: "mbox_priorbox"
      top: "detection_out"
      include {
        phase: TEST
      }
      detection_output_param {
        num_classes: 28
        share_location: true
        background_label_id: 0
        nms_param {
          nms_threshold: 0.449999988079
          top_k: 400
        }
        save_output_param {
          output_directory: ""
          output_name_prefix: "comp4_det_test_"
          output_format: "VOC"
          label_map_file: "labelmap.prototxt"
          name_size_file: "test_name_size.txt"
          num_test_image: 6005
        }
        code_type: CENTER_SIZE
        keep_top_k: 200
        confidence_threshold: 0.00999999977648
      }
    }
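For reference, the conf/loc head widths in the prototxt above are consistent with a 28-class detector. A minimal sketch (Python; the priors-per-location formula assumes the standard Caffe-SSD convention of one min_size box, one sqrt(min_size*max_size) box, and two boxes per flipped aspect_ratio):

```python
# Sanity-check the SSD head channel counts in the prototxt against num_classes.
# Assumption: Caffe-SSD prior count per location = 1 (min_size box)
# + 1 (sqrt(min*max) box) + 2 * len(extra aspect ratios) when flip is true.

NUM_CLASSES = 28  # from detection_output_param in the prototxt

# (head, extra aspect ratios, conf num_output, loc num_output), copied from the prototxt
heads = [
    ("ctx_output1", [2.0],      112, 16),
    ("ctx_output2", [2.0, 3.0], 168, 24),
    ("ctx_output3", [2.0, 3.0], 168, 24),
    ("ctx_output4", [2.0, 3.0], 168, 24),
    ("ctx_output5", [2.0],      112, 16),
]

for name, ars, conf_ch, loc_ch in heads:
    priors = 2 + 2 * len(ars)          # min + sqrt(min*max) + flipped ratios
    assert conf_ch == priors * NUM_CLASSES, name
    assert loc_ch == priors * 4, name  # 4 box offsets per prior
    print(f"{name}: {priors} priors per location")
```

Every head checks out against 28 classes, which is also the `dim: 28` in mbox_conf_reshape and `num_classes: 28` in detection_out.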

  • Ok, got it. 

    >> All the tests use the same input image, so why is the result of my model on the EVM different?

    What is the output you are getting on the EVM? How different is it from the import tool output?

    Thanks,

    Praveen

    1. I used the openvx_tidl usecase to test the same single image; the score column is obviously wrong (every detection reports 1.00):

    [IPU1-0]    119.151680 s: Thread #1: Create graph ...
    [IPU1-0]    119.152900 s: Thread #1: Create input and output tensors for node 1 ...
    [IPU1-0]    119.153876 s: Thread #1: Create node 1 ...
    [IPU1-0]    119.170103 s: Thread #1: Create output tensors for node 2 ...
    [IPU1-0]    119.170591 s: Thread #1: Create node 2 ...
    [IPU1-0]    119.186787 s:
    [IPU1-0]    119.186909 s: Thread #1: Verify graph ...
    [IPU1-0]    122.410388 s:
    [IPU1-0]    122.410541 s: Thread #1: Start graph ...
    [IPU1-0]    122.410693 s:
    [IPU1-0]    122.410846 s: Thread #1: Wait for graph ...
    [IPU1-0]    122.718019 s:
    [IPU1-0]    122.718141 s: Thread #1: Results
    [IPU1-0]    122.718233 s: ---------------------
    [IPU1-0]    122.718446 s:
    [IPU1-0]    122.718599 s: ObjId|label|score| xmin| ymin| xmax| ymax|
    [IPU1-0]    122.718782 s: ------------------------------------------
    [IPU1-0]    122.719056 s:     0|   13| 1.00| 0.43| 0.56| 0.45| 0.68|
    [IPU1-0]    122.719270 s:     1|   14| 1.00| 0.79| 0.70| 0.85| 0.79|
    [IPU1-0]    122.719514 s:     2|    2| 1.00| 0.12| 0.69| 0.18| 0.77|
    [IPU1-0]    122.719727 s:     3|   12| 1.00| 0.46| 0.60| 0.51| 0.68|
    [IPU1-0]    122.719971 s:     4|   12| 1.00| 0.78| 0.52| 0.84| 0.59|
    [IPU1-0]    122.720185 s:     5|   13| 1.00| 0.12| 0.90| 0.16| 1.01|
    [IPU1-0]    122.720429 s:     6|   12| 1.00| 0.46| 0.76| 0.50| 0.93|
    [IPU1-0]    122.720642 s:     7|   16| 1.00| 0.92|-0.01| 0.96| 0.10|
    [IPU1-0]    122.720886 s:     8|   12| 1.00| 0.77| 0.50| 0.85| 0.65|
    [IPU1-0]    122.721100 s:     9|   20| 1.00| 0.72| 0.16| 0.81| 0.33|
    [IPU1-0]    122.721344 s:    10|   17| 1.00| 0.46| 0.78| 0.50| 0.91|
    [IPU1-0]    122.721557 s:    11|    1| 1.00| 0.79| 0.51| 0.83| 0.62|
    [IPU1-0]    122.721801 s:    12|    4| 1.00| 0.92| 0.08| 0.97| 0.22|
    [IPU1-0]    122.722015 s:    13|    1| 1.00| 0.90| 0.90| 0.91| 0.93|
    [IPU1-0]    122.722259 s:    14|    2| 1.00| 0.90| 0.75| 0.91| 0.78|
    [IPU1-0]    122.722473 s:    15|   22| 1.00| 0.21| 0.74| 0.22| 0.79|
    [IPU1-0]    122.722686 s:    16|    6| 1.00| 0.23| 0.69| 0.25| 0.73|
    [IPU1-0]    122.723113 s:    17|    1| 1.00| 0.57| 0.60| 0.58| 0.71|
    [IPU1-0]    122.723357 s:    18|    1| 1.00| 0.23| 0.65| 0.25| 0.68|
    [IPU1-0]    122.723571 s:    19|    1| 1.00| 0.21| 0.64| 0.22| 0.68|
    [IPU1-0]    122.723632 s:
    [IPU1-0]    122.723784 s: Number of detected objects: 20
    [IPU1-0]    122.723845 s:
    [IPU1-0]    122.723937 s:
    [IPU1-0]    122.724120 s: ---- Thread #1: Node 1 (EVE-1) Execution time: 190.651000 ms
    [IPU1-0]    122.724364 s: ---- Thread #1: Node 2 (DSP-1) Execution time: 116.187000 ms
    [IPU1-0]    122.724608 s: ---- Thread #1: Total Graph Execution time: 307.290000 ms

    2. The last-layer output (trace_dump_49_560x1.y_float.txt) generated by both the import tool and the inference tool looks like the sample below. Is that correct?

    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -nan(ind)
    -inf
    inf
    -nan(ind)
    inf
    inf
    inf
    inf
    inf
    -inf
    inf
    inf
    inf
    inf
    inf
    -inf
    -inf
    -inf
    inf
    inf
    -inf
    inf
    inf
    -nan(ind)
    -nan(ind)
    -inf
    inf
    -nan(ind)
    -nan(ind)
    inf

  • Hi,

    Can you share your deploy.prototxt and voc0712_ssdJacintoNetV2_iter_120000_spare.caffemodel model, as well as your input data and expected output data? You can share them with us in private if you accept my invitation to connect.

    regards,

    Victor

  • Hi Victor,

    I have sent you a private message, please check!

    Thank you!

  • Hi,

    Yes thank you, I received it !

    regards,

    Victor

  • Is there any result from the model I sent?

  • Hi,

    We are looking at it right now.

    regards,

    Victor

  • Hi Focus lee,

    I took your model from Victor for debugging and found that the issue is caused by the 28-class detection: TIDL currently supports at most 21 detection classes. The import tool will work for any number of classes, but inference on the DSP is optimized for a maximum of 21 classes.

    Could you please try reducing the detection classes to 21 and check? We will extend support for more than 21 classes in the next release.
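Reducing to 21 classes touches more than detection_output_param: the conf heads and the mbox_conf_reshape dim must shrink to match, and the network must be retrained. A rough sketch of the affected prototxt values, assuming the standard Caffe-SSD layout (`ssd_head_sizes` is a hypothetical helper for illustration, not a TIDL API):

```python
# Hypothetical helper: compute the prototxt fields that change when the
# detector is retrained for 21 classes (20 object classes + background).
def ssd_head_sizes(num_classes, priors_per_location):
    return {
        "mbox_conf num_output": priors_per_location * num_classes,
        "mbox_loc num_output": priors_per_location * 4,  # unchanged by class count
    }

NUM_CLASSES = 21
for priors in (4, 6):  # the two priors-per-location counts used by the ctx_output heads
    print(priors, ssd_head_sizes(NUM_CLASSES, priors))
# mbox_conf_reshape "dim" and detection_output_param "num_classes" also become 21.
```

So the 4-prior heads (ctx_output1/5) would carry 84 conf channels instead of 112, and the 6-prior heads (ctx_output2/3/4) 126 instead of 168; the loc heads stay at 16 and 24.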

    Thanks,

    Praveen

  • OK , thank you !

    I see that you have released a new version, VSDK0308. Does this version extend support for this function?

  • No, because the code freeze happened long before we fixed this issue.

    Thanks,

    Praveen