TDA2EXEVM: TIDL run ssd model issues

Part Number: TDA2EXEVM


I trained a demo SSD model using caffe-jacinto ("voc0712od-ssd512x512_jdetnet21v2_iter_120000.caffemodel").

Because of some unsupported layers, I deleted the last few layers in the deploy prototxt file ("deploy_1024x512_delete_layers.prototxt").

Then I imported it into TIDL ("import.log").

Issues:

1. There are entries like "Max PASS : -2147483648"; this value is NOT right. Why?

2. I matched the results layer by layer between TIDL and caffe-jacinto, BUT the results of the last layers like "ctx_output1/relu_mbox_loc" or "ctx_output1/relu_mbox_conf" are wrong; the values are always "255" or "0".

I checked the inputs of "ctx_output1/relu_mbox_loc" and "ctx_output1/relu_mbox_conf" and they are right, BUT the outputs are wrong.

I don't know why TIDL produces this result.
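
For reference, here is a minimal sketch (not TI code, and the dump layout is an assumption) of how a raw 8-bit trace dump such as "trace_dump_21_64x32.y" could be scanned channel by channel to spot planes that are stuck at "255" or "0". It assumes the dump is a plain concatenation of unpadded 8-bit channel planes, with the channel count and plane size passed on the command line:

/* trace_minmax.c - print per-channel min/max of a raw 8-bit trace dump.
 * Sketch only; assumes the dump is channels * height * width bytes with
 * no padding between lines or channels. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s <dump.y> <channels> <height> <width>\n", argv[0]);
        return 1;
    }
    int ch = atoi(argv[2]);
    int h  = atoi(argv[3]);
    int w  = atoi(argv[4]);
    FILE *fp = fopen(argv[1], "rb");
    if (fp == NULL) { perror(argv[1]); return 1; }

    unsigned char *plane = malloc((size_t)h * (size_t)w);
    if (plane == NULL) { fclose(fp); return 1; }

    for (int c = 0; c < ch; c++) {
        if (fread(plane, 1, (size_t)h * (size_t)w, fp) != (size_t)h * (size_t)w)
            break;                          /* short file: stop */
        unsigned char mn = 255, mx = 0;
        for (int i = 0; i < h * w; i++) {
            if (plane[i] < mn) mn = plane[i];
            if (plane[i] > mx) mx = plane[i];
        }
        printf("channel %3d: min = %3u, max = %3u\n", c, mn, mx);
    }
    free(plane);
    fclose(fp);
    return 0;
}

For the layer in question (layer 21, 84 channels of 32x64 in this network), the usage would be something like "trace_minmax trace_dump_21_64x32.y 84 32 64"; a healthy layer should show a spread of values rather than min = max = 255 in every channel.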

deploy file:

name: "jdetnet21v2_deploy"
input: "data"
input_shape {
dim: 1
dim: 3
dim: 512
dim: 1024
}
layer {
name: "data/bias"
type: "Bias"
bottom: "data"
top: "data/bias"
param {
lr_mult: 0
decay_mult: 0
}
bias_param {
filler {
type: "constant"
value: -128
}
}
}
layer {
name: "conv1a"
type: "Convolution"
bottom: "data/bias"
top: "conv1a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
bias_term: true
pad: 2
kernel_size: 5
group: 1
stride: 2
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "conv1a/bn"
type: "BatchNorm"
bottom: "conv1a"
top: "conv1a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "conv1a/relu"
type: "ReLU"
bottom: "conv1a"
top: "conv1a"
}
layer {
name: "conv1b"
type: "Convolution"
bottom: "conv1a"
top: "conv1b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "conv1b/bn"
type: "BatchNorm"
bottom: "conv1b"
top: "conv1b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "conv1b/relu"
type: "ReLU"
bottom: "conv1b"
top: "conv1b"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1b"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res2a_branch2a"
type: "Convolution"
bottom: "pool1"
top: "res2a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res2a_branch2a/bn"
type: "BatchNorm"
bottom: "res2a_branch2a"
top: "res2a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res2a_branch2a/relu"
type: "ReLU"
bottom: "res2a_branch2a"
top: "res2a_branch2a"
}
layer {
name: "res2a_branch2b"
type: "Convolution"
bottom: "res2a_branch2a"
top: "res2a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res2a_branch2b/bn"
type: "BatchNorm"
bottom: "res2a_branch2b"
top: "res2a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res2a_branch2b/relu"
type: "ReLU"
bottom: "res2a_branch2b"
top: "res2a_branch2b"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "res2a_branch2b"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res3a_branch2a"
type: "Convolution"
bottom: "pool2"
top: "res3a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res3a_branch2a/bn"
type: "BatchNorm"
bottom: "res3a_branch2a"
top: "res3a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res3a_branch2a/relu"
type: "ReLU"
bottom: "res3a_branch2a"
top: "res3a_branch2a"
}
layer {
name: "res3a_branch2b"
type: "Convolution"
bottom: "res3a_branch2a"
top: "res3a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res3a_branch2b/bn"
type: "BatchNorm"
bottom: "res3a_branch2b"
top: "res3a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res3a_branch2b/relu"
type: "ReLU"
bottom: "res3a_branch2b"
top: "res3a_branch2b"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "res3a_branch2b"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res4a_branch2a"
type: "Convolution"
bottom: "pool3"
top: "res4a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res4a_branch2a/bn"
type: "BatchNorm"
bottom: "res4a_branch2a"
top: "res4a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res4a_branch2a/relu"
type: "ReLU"
bottom: "res4a_branch2a"
top: "res4a_branch2a"
}
layer {
name: "res4a_branch2b"
type: "Convolution"
bottom: "res4a_branch2a"
top: "res4a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res4a_branch2b/bn"
type: "BatchNorm"
bottom: "res4a_branch2b"
top: "res4a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res4a_branch2b/relu"
type: "ReLU"
bottom: "res4a_branch2b"
top: "res4a_branch2b"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "res4a_branch2b"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 1
stride: 1
}
}
layer {
name: "res5a_branch2a"
type: "Convolution"
bottom: "pool4"
top: "res5a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 2
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 2
}
}
layer {
name: "res5a_branch2a/bn"
type: "BatchNorm"
bottom: "res5a_branch2a"
top: "res5a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res5a_branch2a/relu"
type: "ReLU"
bottom: "res5a_branch2a"
top: "res5a_branch2a"
}
layer {
name: "res5a_branch2b"
type: "Convolution"
bottom: "res5a_branch2a"
top: "res5a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 2
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 2
}
}
layer {
name: "res5a_branch2b/bn"
type: "BatchNorm"
bottom: "res5a_branch2b"
top: "res5a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "res5a_branch2b/relu"
type: "ReLU"
bottom: "res5a_branch2b"
top: "res5a_branch2b"
}
layer {
name: "pool6"
type: "Pooling"
bottom: "res5a_branch2b"
top: "pool6"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
pad: 1
}
}
layer {
name: "pool7"
type: "Pooling"
bottom: "pool6"
top: "pool7"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
pad: 1
}
}
layer {
name: "pool8"
type: "Pooling"
bottom: "pool7"
top: "pool8"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
pad: 1
}
}
layer {
name: "ctx_output1"
type: "Convolution"
bottom: "res5a_branch2b"
top: "ctx_output1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output1/bn"
type: "BatchNorm"
bottom: "ctx_output1"
top: "ctx_output1"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "ctx_output1/relu"
type: "ReLU"
bottom: "ctx_output1"
top: "ctx_output1"
}
layer {
name: "ctx_output2"
type: "Convolution"
bottom: "pool6"
top: "ctx_output2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/bn"
type: "BatchNorm"
bottom: "ctx_output2"
top: "ctx_output2"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "ctx_output2/relu"
type: "ReLU"
bottom: "ctx_output2"
top: "ctx_output2"
}
layer {
name: "ctx_output3"
type: "Convolution"
bottom: "pool7"
top: "ctx_output3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/bn"
type: "BatchNorm"
bottom: "ctx_output3"
top: "ctx_output3"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "ctx_output3/relu"
type: "ReLU"
bottom: "ctx_output3"
top: "ctx_output3"
}
layer {
name: "ctx_output4"
type: "Convolution"
bottom: "pool8"
top: "ctx_output4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/bn"
type: "BatchNorm"
bottom: "ctx_output4"
top: "ctx_output4"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
scale_bias: true
}
}
layer {
name: "ctx_output4/relu"
type: "ReLU"
bottom: "ctx_output4"
top: "ctx_output4"
}
layer {
name: "ctx_output1/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output1"
top: "ctx_output1/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output1/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output1"
top: "ctx_output1/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 84
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output2"
top: "ctx_output2/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output2"
top: "ctx_output2/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 84
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output3"
top: "ctx_output3/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output3"
top: "ctx_output3/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 84
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output4"
top: "ctx_output4/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output4"
top: "ctx_output4/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 84
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}

import file:

Caffe Network File : deploy_1024x512_delete_layers.prototxt
Caffe Model File : voc0712od-ssd512x512_jdetnet21v2_iter_120000.caffemodel
TIDL Network File : NET.BIN
TIDL Model File : PRM.BIN
Name of the Network : jdetnet21v2_deploy
Num Inputs : 1
Num of Layer Detected : 28
0, TIDL_DataLayer , data 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 512 , 1024 , 0 ,
1, TIDL_BatchNormLayer , data/bias 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 512 , 1024 , 1 , 3 , 512 , 1024 , 1572864 ,
2, TIDL_ConvolutionLayer , conv1a 1, 1 , 1 , 1 , x , x , x , x , x , x , x , 2 , 1 , 3 , 512 , 1024 , 1 , 32 , 256 , 512 , 314572800 ,
3, TIDL_ConvolutionLayer , conv1b 1, 1 , 1 , 2 , x , x , x , x , x , x , x , 3 , 1 , 32 , 256 , 512 , 1 , 32 , 128 , 256 , 301989888 ,
4, TIDL_ConvolutionLayer , res2a_branch2a 1, 1 , 1 , 3 , x , x , x , x , x , x , x , 4 , 1 , 32 , 128 , 256 , 1 , 64 , 128 , 256 , 603979776 ,
5, TIDL_ConvolutionLayer , res2a_branch2b 1, 1 , 1 , 4 , x , x , x , x , x , x , x , 5 , 1 , 64 , 128 , 256 , 1 , 64 , 64 , 128 , 301989888 ,
6, TIDL_ConvolutionLayer , res3a_branch2a 1, 1 , 1 , 5 , x , x , x , x , x , x , x , 6 , 1 , 64 , 64 , 128 , 1 , 128 , 64 , 128 , 603979776 ,
7, TIDL_ConvolutionLayer , res3a_branch2b 1, 1 , 1 , 6 , x , x , x , x , x , x , x , 7 , 1 , 128 , 64 , 128 , 1 , 128 , 32 , 64 , 301989888 ,
8, TIDL_ConvolutionLayer , res4a_branch2a 1, 1 , 1 , 7 , x , x , x , x , x , x , x , 8 , 1 , 128 , 32 , 64 , 1 , 256 , 32 , 64 , 603979776 ,
9, TIDL_ConvolutionLayer , res4a_branch2b 1, 1 , 1 , 8 , x , x , x , x , x , x , x , 9 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 , 301989888 ,
10, TIDL_PoolingLayer , pool4 1, 1 , 1 , 9 , x , x , x , x , x , x , x , 10 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 , 524288 ,
11, TIDL_ConvolutionLayer , res5a_branch2a 1, 1 , 1 , 10 , x , x , x , x , x , x , x , 11 , 1 , 256 , 32 , 64 , 1 , 512 , 32 , 64 ,2415919104 ,
12, TIDL_ConvolutionLayer , res5a_branch2b 1, 1 , 1 , 11 , x , x , x , x , x , x , x , 12 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,1207959552 ,
13, TIDL_PoolingLayer , pool6 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 13 , 1 , 512 , 32 , 64 , 1 , 512 , 17 , 33 , 2585088 ,
14, TIDL_PoolingLayer , pool7 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 14 , 1 , 512 , 17 , 33 , 1 , 512 , 9 , 17 , 705024 ,
15, TIDL_PoolingLayer , pool8 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 15 , 1 , 512 , 9 , 17 , 1 , 512 , 5 , 9 , 207360 ,
16, TIDL_ConvolutionLayer , ctx_output1 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 16 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 , 536870912 ,
17, TIDL_ConvolutionLayer , ctx_output2 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 17 , 1 , 512 , 17 , 33 , 1 , 512 , 17 , 33 , 147062784 ,
18, TIDL_ConvolutionLayer , ctx_output3 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 18 , 1 , 512 , 9 , 17 , 1 , 512 , 9 , 17 , 40108032 ,
19, TIDL_ConvolutionLayer , ctx_output4 1, 1 , 1 , 15 , x , x , x , x , x , x , x , 19 , 1 , 512 , 5 , 9 , 1 , 512 , 5 , 9 , 11796480 ,
20, TIDL_ConvolutionLayer , ctx_output1/relu_mbox_loc 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 20 , 1 , 512 , 32 , 64 , 1 , 16 , 32 , 64 , 150994944 ,
21, TIDL_ConvolutionLayer , ctx_output1/relu_mbox_conf 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 21 , 1 , 512 , 32 , 64 , 1 , 84 , 32 , 64 , 792723456 ,
22, TIDL_ConvolutionLayer , ctx_output2/relu_mbox_loc 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 22 , 1 , 512 , 17 , 33 , 1 , 16 , 17 , 33 , 41361408 ,
23, TIDL_ConvolutionLayer , ctx_output2/relu_mbox_conf 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 23 , 1 , 512 , 17 , 33 , 1 , 84 , 17 , 33 , 217147392 ,
24, TIDL_ConvolutionLayer , ctx_output3/relu_mbox_loc 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 24 , 1 , 512 , 9 , 17 , 1 , 16 , 9 , 17 , 11280384 ,
25, TIDL_ConvolutionLayer , ctx_output3/relu_mbox_conf 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 25 , 1 , 512 , 9 , 17 , 1 , 84 , 9 , 17 , 59222016 ,
26, TIDL_ConvolutionLayer , ctx_output4/relu_mbox_loc 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 26 , 1 , 512 , 5 , 9 , 1 , 16 , 5 , 9 , 3317760 ,
27, TIDL_ConvolutionLayer , ctx_output4/relu_mbox_conf 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 27 , 1 , 512 , 5 , 9 , 1 , 84 , 5 , 9 , 17418240 ,

Processing config file .\tempDir\qunat_stats_config.txt !
0, TIDL_DataLayer , 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 512 , 1024 ,
1, TIDL_BatchNormLayer , 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 512 , 1024 , 1 , 3 , 512 , 1024 ,
2, TIDL_ConvolutionLayer , 1, 1 , 1 , 1 , x , x , x , x , x , x , x , 2 , 1 , 3 , 512 , 1024 , 1 , 32 , 256 , 512 ,
3, TIDL_ConvolutionLayer , 1, 1 , 1 , 2 , x , x , x , x , x , x , x , 3 , 1 , 32 , 256 , 512 , 1 , 32 , 128 , 256 ,
4, TIDL_ConvolutionLayer , 1, 1 , 1 , 3 , x , x , x , x , x , x , x , 4 , 1 , 32 , 128 , 256 , 1 , 64 , 128 , 256 ,
5, TIDL_ConvolutionLayer , 1, 1 , 1 , 4 , x , x , x , x , x , x , x , 5 , 1 , 64 , 128 , 256 , 1 , 64 , 64 , 128 ,
6, TIDL_ConvolutionLayer , 1, 1 , 1 , 5 , x , x , x , x , x , x , x , 6 , 1 , 64 , 64 , 128 , 1 , 128 , 64 , 128 ,
7, TIDL_ConvolutionLayer , 1, 1 , 1 , 6 , x , x , x , x , x , x , x , 7 , 1 , 128 , 64 , 128 , 1 , 128 , 32 , 64 ,
8, TIDL_ConvolutionLayer , 1, 1 , 1 , 7 , x , x , x , x , x , x , x , 8 , 1 , 128 , 32 , 64 , 1 , 256 , 32 , 64 ,
9, TIDL_ConvolutionLayer , 1, 1 , 1 , 8 , x , x , x , x , x , x , x , 9 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 ,
10, TIDL_PoolingLayer , 1, 1 , 1 , 9 , x , x , x , x , x , x , x , 10 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 ,
11, TIDL_ConvolutionLayer , 1, 1 , 1 , 10 , x , x , x , x , x , x , x , 11 , 1 , 256 , 32 , 64 , 1 , 512 , 32 , 64 ,
12, TIDL_ConvolutionLayer , 1, 1 , 1 , 11 , x , x , x , x , x , x , x , 12 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,
13, TIDL_PoolingLayer , 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 13 , 1 , 512 , 32 , 64 , 1 , 512 , 17 , 33 ,
14, TIDL_PoolingLayer , 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 14 , 1 , 512 , 17 , 33 , 1 , 512 , 9 , 17 ,
15, TIDL_PoolingLayer , 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 15 , 1 , 512 , 9 , 17 , 1 , 512 , 5 , 9 ,
16, TIDL_ConvolutionLayer , 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 16 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,
17, TIDL_ConvolutionLayer , 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 17 , 1 , 512 , 17 , 33 , 1 , 512 , 17 , 33 ,
18, TIDL_ConvolutionLayer , 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 18 , 1 , 512 , 9 , 17 , 1 , 512 , 9 , 17 ,
19, TIDL_ConvolutionLayer , 1, 1 , 1 , 15 , x , x , x , x , x , x , x , 19 , 1 , 512 , 5 , 9 , 1 , 512 , 5 , 9 ,
20, TIDL_ConvolutionLayer , 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 20 , 1 , 512 , 32 , 64 , 1 , 16 , 32 , 64 ,
21, TIDL_ConvolutionLayer , 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 21 , 1 , 512 , 32 , 64 , 1 , 84 , 32 , 64 ,
22, TIDL_ConvolutionLayer , 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 22 , 1 , 512 , 17 , 33 , 1 , 16 , 17 , 33 ,
23, TIDL_ConvolutionLayer , 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 23 , 1 , 512 , 17 , 33 , 1 , 84 , 17 , 33 ,
24, TIDL_ConvolutionLayer , 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 24 , 1 , 512 , 9 , 17 , 1 , 16 , 9 , 17 ,
25, TIDL_ConvolutionLayer , 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 25 , 1 , 512 , 9 , 17 , 1 , 84 , 9 , 17 ,
26, TIDL_ConvolutionLayer , 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 26 , 1 , 512 , 5 , 9 , 1 , 16 , 5 , 9 ,
27, TIDL_ConvolutionLayer , 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 27 , 1 , 512 , 5 , 9 , 1 , 84 , 5 , 9 ,
28, TIDL_DataLayer , 0, 1 , -1 , 27 , x , x , x , x , x , x , x , 0 , 1 , 84 , 5 , 9 , 0 , 0 , 0 , 0 ,
Layer ID ,inBlkWidth ,inBlkHeight ,inBlkPitch ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs ,numOutChs ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
2 72 72 72 32 32 32 3 32 3 1 8 1 3 16 8 5184 1024 1
3 40 34 40 32 32 32 8 8 8 4 8 1 2 16 8 1360 1024 1
4 40 34 40 32 32 32 32 64 32 6 8 1 6 8 4 1360 1024 1
5 40 34 40 32 32 32 16 16 16 6 8 1 3 8 4 1360 1024 1
6 40 34 40 32 32 32 64 128 64 6 8 1 11 4 2 1360 1024 1
7 40 34 40 32 32 32 32 32 32 6 8 1 6 4 2 1360 1024 1
8 40 34 40 32 32 32 128 256 128 6 8 1 22 2 1 1360 1024 1
9 40 34 40 32 32 32 64 64 64 6 8 1 11 2 1 1360 1024 1
11 40 20 40 32 16 32 256 512 256 8 8 1 32 2 2 800 512 1
12 40 36 40 32 32 32 128 128 128 5 8 1 26 2 1 1440 1024 1
16 32 16 32 32 16 32 512 512 512 8 8 1 64 2 2 512 512 1
17 32 17 32 32 17 32 512 512 512 8 8 1 64 2 1 544 544 1
18 32 9 32 32 9 32 512 512 512 8 8 1 64 1 1 288 288 1
19 16 5 16 16 5 16 512 512 512 8 8 1 64 1 1 80 80 1
20 40 18 40 32 16 32 512 16 256 8 2 2 32 2 2 720 512 1
21 40 18 40 32 16 32 512 84 256 8 2 2 32 2 2 720 512 1
22 40 18 40 32 16 32 512 16 256 8 2 2 32 2 2 720 512 1
23 40 18 40 32 16 32 512 84 256 8 2 2 32 2 2 720 512 1
24 40 11 40 32 9 32 512 16 512 8 8 1 64 1 1 440 288 1
25 40 11 40 32 9 32 512 84 512 8 8 1 64 1 1 440 288 1
26 24 7 24 16 5 16 512 16 512 8 8 1 64 1 1 168 80 1
27 24 7 24 16 5 16 512 84 512 8 8 1 64 1 1 168 80 1

Processing Frame Number : 0

Layer 1 : Max PASS : -2147483648 : 15301 Out Q : 254 , 43861, TIDL_BatchNormLayer, PASSED #MMACs = 1.57, 0.00, 1.57, Sparsity : 0.00, 100.00
Layer 2 : Max PASS : -2147483648 : 105994 Out Q : 11467 , 106410, TIDL_ConvolutionLayer, PASSED #MMACs = 314.57, 279.97, 300.94, Sparsity : 4.33, 11.00
Layer 3 : Max PASS : -2147483648 : 19171 Out Q : 11355 , 19246, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 261.88, 272.63, Sparsity : 9.72, 13.28
Layer 4 : Max PASS : -2147483648 : 48257 Out Q : 22744 , 48446, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 558.43, 578.81, Sparsity : 4.17, 7.54
Layer 5 : Max PASS : -2147483648 : 41123 Out Q : 15555 , 41284, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 284.26, 292.16, Sparsity : 3.26, 5.87
Layer 6 : Max PASS : -2147483648 : 71895 Out Q : 19140 , 72177, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 563.72, 581.21, Sparsity : 3.77, 6.67
Layer 7 : Max PASS : -2147483648 : 57803 Out Q : 18455 , 58030, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 288.17, 298.35, Sparsity : 1.20, 4.58
Layer 8 : Max PASS : -2147483648 : 112088 Out Q : 20457 , 112528, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 571.37, 588.81, Sparsity : 2.51, 5.40
Layer 9 : Max PASS : -2147483648 : 57972 Out Q : 22662 , 58199, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 287.32, 296.04, Sparsity : 1.97, 4.86
Layer 10 :TIDL_PoolingLayer, PASSED #MMACs = 0.52, 0.00, 0.52, Sparsity : 0.00, 100.00
Layer 11 : Max PASS : -2147483648 : 52315 Out Q : 39664 , 52520, TIDL_ConvolutionLayer, PASSED #MMACs = 2415.92, 2246.56, 2297.30, Sparsity : 4.91, 7.01
Layer 12 : Max PASS : -2147483648 : 50735 Out Q : 3152 , 50934, TIDL_ConvolutionLayer, PASSED #MMACs = 1207.96, 1094.96, 1135.71, Sparsity : 5.98, 9.35
Layer 13 :TIDL_PoolingLayer, PASSED #MMACs = 0.29, 0.00, 0.29, Sparsity : 0.00, 100.00
Layer 14 :TIDL_PoolingLayer, PASSED #MMACs = 0.08, 0.00, 0.08, Sparsity : 0.00, 100.00
Layer 15 :TIDL_PoolingLayer, PASSED #MMACs = 0.02, 0.00, 0.02, Sparsity : 0.00, 100.00
Layer 16 : Max PASS : -2147483648 : 42410 Out Q : 22305 , 42576, TIDL_ConvolutionLayer, PASSED #MMACs = 536.87, 512.22, 536.80, Sparsity : 0.01, 4.59
Layer 17 : Max PASS : -2147483648 : 51785 Out Q : 28329 , 51988, TIDL_ConvolutionLayer, PASSED #MMACs = 147.06, 140.29, 147.03, Sparsity : 0.02, 4.61
Layer 18 : Max PASS : -2147483648 : 54221 Out Q : 25702 , 54434, TIDL_ConvolutionLayer, PASSED #MMACs = 40.11, 38.12, 40.10, Sparsity : 0.02, 4.95
Layer 19 : Max PASS : -2147483648 : 63420 Out Q : 24673 , 63669, TIDL_ConvolutionLayer, PASSED #MMACs = 11.80, 11.33, 11.80, Sparsity : 0.01, 3.92
Layer 20 : Max PASS : -2147483648 : 93195 Out Q : 3094 , 353782, TIDL_ConvolutionLayer, PASSED #MMACs = 150.99, 126.10, 129.25, Sparsity : 14.40, 16.49
Layer 21 : Max PASS : -2147483648 : 717460 Out Q : 1965 , 1446219, TIDL_ConvolutionLayer, PASSED #MMACs = 792.72, 734.32, 750.56, Sparsity : 5.32, 7.37
Layer 22 : Max PASS : -2147483648 : 100259 Out Q : 7823 , 302286, TIDL_ConvolutionLayer, PASSED #MMACs = 41.36, 36.99, 37.83, Sparsity : 8.53, 10.57
Layer 23 : Max PASS : -2147483648 : 661541 Out Q : 2025 , 1333500, TIDL_ConvolutionLayer, PASSED #MMACs = 217.15, 197.41, 201.89, Sparsity : 7.03, 9.09
Layer 24 : Max PASS : -2147483648 : 132514 Out Q : 9805 , 267115, TIDL_ConvolutionLayer, PASSED #MMACs = 11.28, 10.32, 10.56, Sparsity : 6.39, 8.51
Layer 25 : Max PASS : -2147483648 : 618447 Out Q : 2022 , 1246633, TIDL_ConvolutionLayer, PASSED #MMACs = 59.22, 57.15, 58.45, Sparsity : 1.30, 3.50
Layer 26 : Max PASS : -2147483648 : 186789 Out Q : 10431 , 376520, TIDL_ConvolutionLayer, PASSED #MMACs = 3.32, 3.08, 3.15, Sparsity : 5.03, 7.15
Layer 27 : Max PASS : -2147483648 : 837833 Out Q : 1679 , 1688860, TIDL_ConvolutionLayer, PASSED #MMACs = 17.42, 16.92, 17.28, Sparsity : 0.78, 2.87
End of config list found !

Total Giga Macs : 8.9932
Total Giga Macs : 134.8987 @15 fps
Total Giga Macs : 269.7975 @30 fps

  • Hi,

    For issue 1: You can ignore the "Max PASS" values during import; you will get proper Max PASS values during actual TIDL execution.

    For issue 2: We will replicate the issue at our end and let you know.

    Thanks,
    Praveen
  • Hi,

    We are not able to reproduce this issue. We are getting the right output at this layer. Can you clarify the points below?

    Are you observing the below log during import or during actual processing?

    Are you observing a mismatch in the trace generated by the import as well? If yes, are you using the import executable (tidlStatsTool = "..\quantStatsTool\eve_test_dl_algo.out.exe") from the release package, or did you rebuild it?

    Layer 1 : Max PASS : -2147483648 : 15301 Out Q : 254 , 43861, TIDL_BatchNormLayer, PASSED #MMACs = 1.57, 0.00, 1.57, Sparsity : 0.00, 100.00

    Layer 2 : Max PASS : -2147483648 : 105994 Out Q : 11467 , 106410, TIDL_ConvolutionLayer, PASSED #MMACs = 314.57, 279.97, 300.94, Sparsity : 4.33, 11.00

    Thanks,

    Praveen

  • Hi Praveen,
    This is my import config file. I did not rebuild the tidlStatsTool, and you can see I am using vsdk_03_01.
    My log is the same as yours.
    By the way, I built the EVE debug mode (TIDL File I/O usecase) and used CCS to compare the results.
    Are you using my deploy file?

    # Default - 0
    randParams = 0

    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType = 0

    # 0: Fixed quantization By tarininng Framework, 1: Dyanamic quantization by TIDL, Default - 1
    quantizationStyle = 1

    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd = 25

    numParamBits = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType = 0

    inputNetFile = "deploy_1024x512_delete_layers.prototxt"
    inputParamsFile = "voc0712od-ssd512x512_jdetnet21v2_iter_120000.caffemodel"

    outputNetFile = "NET.BIN"
    outputParamsFile = "PRM.BIN"

    rawSampleInData = 1
    sampleInData = "000100_1024x512_bgr.y"
    tidlStatsTool = "C:\PROCESSOR_SDK_VISION_03_01_00_00\ti_components\algorithms\REL.TIDL.00.08.00.00\modules\ti_dl\utils\quantStatsTool\eve_test_dl_algo.out.exe"
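
    (For reference, and as an assumption about the usual TIDL import flow rather than something shown in this thread: a config file like this is normally passed as the single command-line argument to the import tool, e.g. "tidl_model_import.out.exe import_config.txt", and the import tool then invokes the eve_test_dl_algo.out.exe pointed to by tidlStatsTool above to generate the quantization statistics in .\tempDir.)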
  • Hi,

    We are still not able to replicate the issue with the files you shared.
    I am using all the files that you shared and the pre-built executables from the release:
    1. eve_test_dl_algo.out.exe ("..\quantStatsTool\eve_test_dl_algo.out.exe")
    2. tidl_model_import.out.exe ("ti_dl\utils\tidlModelImport\tidl_model_import.out.exe") to run the import, and I have attached my import console output.

    Can you please retry with both pre-built executables from the release and share some trace dumps (inputs and outputs) of the mismatched conv layer for debugging?

    My console:
    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Caffe Network File : ..\..\test\testvecs\config\caffe_models\SSD\deploy_e2e.prototxt
    Caffe Model File : ..\..\test\testvecs\config\caffe_models\SSD\voc0712od-ssd512x512_jdetnet21v2_iter_120000.caffemodel
    TIDL Network File : ..\..\test\testvecs\config\tidl_models\tidl_net_imagenet_jacintonet11v2_ssd.bin
    TIDL Model File : ..\..\test\testvecs\config\tidl_models\tidl_param_imagenet_jacintonet11v2_ssd.bin
    Name of the Network : jdetnet21v2_deploy
    Num Inputs : 1
    Num of Layer Detected : 28
    0, TIDL_DataLayer , data 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 512 , 1024 , 0 ,
    1, TIDL_BatchNormLayer , data/bias 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 512 , 1024 , 1 , 3 , 512 , 1024 , 1572864 ,
    2, TIDL_ConvolutionLayer , conv1a 1, 1 , 1 , 1 , x , x , x , x , x , x , x , 2 , 1 , 3 , 512 , 1024 , 1 , 32 , 256 , 512 , 314572800 ,
    3, TIDL_ConvolutionLayer , conv1b 1, 1 , 1 , 2 , x , x , x , x , x , x , x , 3 , 1 , 32 , 256 , 512 , 1 , 32 , 128 , 256 , 301989888 ,
    4, TIDL_ConvolutionLayer , res2a_branch2a 1, 1 , 1 , 3 , x , x , x , x , x , x , x , 4 , 1 , 32 , 128 , 256 , 1 , 64 , 128 , 256 , 603979776 ,
    5, TIDL_ConvolutionLayer , res2a_branch2b 1, 1 , 1 , 4 , x , x , x , x , x , x , x , 5 , 1 , 64 , 128 , 256 , 1 , 64 , 64 , 128 , 301989888 ,
    6, TIDL_ConvolutionLayer , res3a_branch2a 1, 1 , 1 , 5 , x , x , x , x , x , x , x , 6 , 1 , 64 , 64 , 128 , 1 , 128 , 64 , 128 , 603979776 ,
    7, TIDL_ConvolutionLayer , res3a_branch2b 1, 1 , 1 , 6 , x , x , x , x , x , x , x , 7 , 1 , 128 , 64 , 128 , 1 , 128 , 32 , 64 , 301989888 ,
    8, TIDL_ConvolutionLayer , res4a_branch2a 1, 1 , 1 , 7 , x , x , x , x , x , x , x , 8 , 1 , 128 , 32 , 64 , 1 , 256 , 32 , 64 , 603979776 ,
    9, TIDL_ConvolutionLayer , res4a_branch2b 1, 1 , 1 , 8 , x , x , x , x , x , x , x , 9 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 , 301989888 ,
    10, TIDL_PoolingLayer , pool4 1, 1 , 1 , 9 , x , x , x , x , x , x , x , 10 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 , 524288 ,
    11, TIDL_ConvolutionLayer , res5a_branch2a 1, 1 , 1 , 10 , x , x , x , x , x , x , x , 11 , 1 , 256 , 32 , 64 , 1 , 512 , 32 , 64 ,2415919104 ,
    12, TIDL_ConvolutionLayer , res5a_branch2b 1, 1 , 1 , 11 , x , x , x , x , x , x , x , 12 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,1207959552 ,
    13, TIDL_PoolingLayer , pool6 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 13 , 1 , 512 , 32 , 64 , 1 , 512 , 17 , 33 , 2585088 ,
    14, TIDL_PoolingLayer , pool7 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 14 , 1 , 512 , 17 , 33 , 1 , 512 , 9 , 17 , 705024 ,
    15, TIDL_PoolingLayer , pool8 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 15 , 1 , 512 , 9 , 17 , 1 , 512 , 5 , 9 , 207360 ,
    16, TIDL_ConvolutionLayer , ctx_output1 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 16 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 , 536870912 ,
    17, TIDL_ConvolutionLayer , ctx_output2 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 17 , 1 , 512 , 17 , 33 , 1 , 512 , 17 , 33 , 147062784 ,
    18, TIDL_ConvolutionLayer , ctx_output3 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 18 , 1 , 512 , 9 , 17 , 1 , 512 , 9 , 17 , 40108032 ,
    19, TIDL_ConvolutionLayer , ctx_output4 1, 1 , 1 , 15 , x , x , x , x , x , x , x , 19 , 1 , 512 , 5 , 9 , 1 , 512 , 5 , 9 , 11796480 ,
    20, TIDL_ConvolutionLayer , ctx_output1/relu_mbox_loc 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 20 , 1 , 512 , 32 , 64 , 1 , 16 , 32 , 64 , 150994944 ,
    21, TIDL_ConvolutionLayer , ctx_output1/relu_mbox_conf 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 21 , 1 , 512 , 32 , 64 , 1 , 84 , 32 , 64 , 792723456 ,
    22, TIDL_ConvolutionLayer , ctx_output2/relu_mbox_loc 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 22 , 1 , 512 , 17 , 33 , 1 , 16 , 17 , 33 , 41361408 ,
    23, TIDL_ConvolutionLayer , ctx_output2/relu_mbox_conf 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 23 , 1 , 512 , 17 , 33 , 1 , 84 , 17 , 33 , 217147392 ,
    24, TIDL_ConvolutionLayer , ctx_output3/relu_mbox_loc 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 24 , 1 , 512 , 9 , 17 , 1 , 16 , 9 , 17 , 11280384 ,
    25, TIDL_ConvolutionLayer , ctx_output3/relu_mbox_conf 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 25 , 1 , 512 , 9 , 17 , 1 , 84 , 9 , 17 , 59222016 ,
    26, TIDL_ConvolutionLayer , ctx_output4/relu_mbox_loc 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 26 , 1 , 512 , 5 , 9 , 1 , 16 , 5 , 9 , 3317760 ,
    27, TIDL_ConvolutionLayer , ctx_output4/relu_mbox_conf 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 27 , 1 , 512 , 5 , 9 , 1 , 84 , 5 , 9 , 17418240 ,
    Total Giga Macs : 8.9932
    Total Giga Macs : 134.8987 @15 fps
    Total Giga Macs : 269.7975 @30 fps
    1 file(s) copied.

    Processing config file .\tempDir\qunat_stats_config.txt !
    0, TIDL_DataLayer , 0, -1 , 1 , x , x , x , x , x , x , x , x , 0 , 0 , 0 , 0 , 0 , 1 , 3 , 512 , 1024 ,
    1, TIDL_BatchNormLayer , 1, 1 , 1 , 0 , x , x , x , x , x , x , x , 1 , 1 , 3 , 512 , 1024 , 1 , 3 , 512 , 1024 ,
    2, TIDL_ConvolutionLayer , 1, 1 , 1 , 1 , x , x , x , x , x , x , x , 2 , 1 , 3 , 512 , 1024 , 1 , 32 , 256 , 512 ,
    3, TIDL_ConvolutionLayer , 1, 1 , 1 , 2 , x , x , x , x , x , x , x , 3 , 1 , 32 , 256 , 512 , 1 , 32 , 128 , 256 ,
    4, TIDL_ConvolutionLayer , 1, 1 , 1 , 3 , x , x , x , x , x , x , x , 4 , 1 , 32 , 128 , 256 , 1 , 64 , 128 , 256 ,
    5, TIDL_ConvolutionLayer , 1, 1 , 1 , 4 , x , x , x , x , x , x , x , 5 , 1 , 64 , 128 , 256 , 1 , 64 , 64 , 128 ,
    6, TIDL_ConvolutionLayer , 1, 1 , 1 , 5 , x , x , x , x , x , x , x , 6 , 1 , 64 , 64 , 128 , 1 , 128 , 64 , 128 ,
    7, TIDL_ConvolutionLayer , 1, 1 , 1 , 6 , x , x , x , x , x , x , x , 7 , 1 , 128 , 64 , 128 , 1 , 128 , 32 , 64 ,
    8, TIDL_ConvolutionLayer , 1, 1 , 1 , 7 , x , x , x , x , x , x , x , 8 , 1 , 128 , 32 , 64 , 1 , 256 , 32 , 64 ,
    9, TIDL_ConvolutionLayer , 1, 1 , 1 , 8 , x , x , x , x , x , x , x , 9 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 ,
    10, TIDL_PoolingLayer , 1, 1 , 1 , 9 , x , x , x , x , x , x , x , 10 , 1 , 256 , 32 , 64 , 1 , 256 , 32 , 64 ,
    11, TIDL_ConvolutionLayer , 1, 1 , 1 , 10 , x , x , x , x , x , x , x , 11 , 1 , 256 , 32 , 64 , 1 , 512 , 32 , 64 ,
    12, TIDL_ConvolutionLayer , 1, 1 , 1 , 11 , x , x , x , x , x , x , x , 12 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,
    13, TIDL_PoolingLayer , 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 13 , 1 , 512 , 32 , 64 , 1 , 512 , 17 , 33 ,
    14, TIDL_PoolingLayer , 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 14 , 1 , 512 , 17 , 33 , 1 , 512 , 9 , 17 ,
    15, TIDL_PoolingLayer , 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 15 , 1 , 512 , 9 , 17 , 1 , 512 , 5 , 9 ,
    16, TIDL_ConvolutionLayer , 1, 1 , 1 , 12 , x , x , x , x , x , x , x , 16 , 1 , 512 , 32 , 64 , 1 , 512 , 32 , 64 ,
    17, TIDL_ConvolutionLayer , 1, 1 , 1 , 13 , x , x , x , x , x , x , x , 17 , 1 , 512 , 17 , 33 , 1 , 512 , 17 , 33 ,
    18, TIDL_ConvolutionLayer , 1, 1 , 1 , 14 , x , x , x , x , x , x , x , 18 , 1 , 512 , 9 , 17 , 1 , 512 , 9 , 17 ,
    19, TIDL_ConvolutionLayer , 1, 1 , 1 , 15 , x , x , x , x , x , x , x , 19 , 1 , 512 , 5 , 9 , 1 , 512 , 5 , 9 ,
    20, TIDL_ConvolutionLayer , 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 20 , 1 , 512 , 32 , 64 , 1 , 16 , 32 , 64 ,
    21, TIDL_ConvolutionLayer , 1, 1 , 1 , 16 , x , x , x , x , x , x , x , 21 , 1 , 512 , 32 , 64 , 1 , 84 , 32 , 64 ,
    22, TIDL_ConvolutionLayer , 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 22 , 1 , 512 , 17 , 33 , 1 , 16 , 17 , 33 ,
    23, TIDL_ConvolutionLayer , 1, 1 , 1 , 17 , x , x , x , x , x , x , x , 23 , 1 , 512 , 17 , 33 , 1 , 84 , 17 , 33 ,
    24, TIDL_ConvolutionLayer , 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 24 , 1 , 512 , 9 , 17 , 1 , 16 , 9 , 17 ,
    25, TIDL_ConvolutionLayer , 1, 1 , 1 , 18 , x , x , x , x , x , x , x , 25 , 1 , 512 , 9 , 17 , 1 , 84 , 9 , 17 ,
    26, TIDL_ConvolutionLayer , 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 26 , 1 , 512 , 5 , 9 , 1 , 16 , 5 , 9 ,
    27, TIDL_ConvolutionLayer , 1, 1 , 1 , 19 , x , x , x , x , x , x , x , 27 , 1 , 512 , 5 , 9 , 1 , 84 , 5 , 9 ,
    28, TIDL_DataLayer , 0, 1 , -1 , 27 , x , x , x , x , x , x , x , 0 , 1 , 84 , 5 , 9 , 0 , 0 , 0 , 0 ,
    Layer ID ,inBlkWidth ,inBlkHeight ,inBlkPitch ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs ,numOutChs ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot
    2 72 72 72 32 32 32 3 32 3 1 8 1 3 16 8 5184 1024 1
    3 40 34 40 32 32 32 8 8 8 4 8 1 2 16 8 1360 1024 1
    4 40 34 40 32 32 32 32 64 32 6 8 1 6 8 4 1360 1024 1
    5 40 34 40 32 32 32 16 16 16 6 8 1 3 8 4 1360 1024 1
    6 40 34 40 32 32 32 64 128 64 6 8 1 11 4 2 1360 1024 1
    7 40 34 40 32 32 32 32 32 32 6 8 1 6 4 2 1360 1024 1
    8 40 34 40 32 32 32 128 256 128 6 8 1 22 2 1 1360 1024 1
    9 40 34 40 32 32 32 64 64 64 6 8 1 11 2 1 1360 1024 1
    11 40 20 40 32 16 32 256 512 256 8 8 1 32 2 2 800 512 1
    12 40 36 40 32 32 32 128 128 128 5 8 1 26 2 1 1440 1024 1
    16 32 16 32 32 16 32 512 512 512 8 8 1 64 2 2 512 512 1
    17 32 17 32 32 17 32 512 512 512 8 8 1 64 2 1 544 544 1
    18 32 9 32 32 9 32 512 512 512 8 8 1 64 1 1 288 288 1
    19 16 5 16 16 5 16 512 512 512 8 8 1 64 1 1 80 80 1
    20 40 10 40 32 8 32 512 16 512 8 8 1 64 2 4 400 256 1
    21 40 10 40 32 8 32 512 88 512 8 8 1 64 2 4 400 256 1
    22 40 10 40 32 8 32 512 16 512 8 8 1 64 2 3 400 256 1
    23 40 10 40 32 8 32 512 88 512 8 8 1 64 2 3 400 256 1
    24 40 11 40 32 9 32 512 16 512 8 8 1 64 1 1 440 288 1
    25 40 11 40 32 9 32 512 88 512 8 8 1 64 1 1 440 288 1
    26 24 7 24 16 5 16 512 16 512 8 8 1 64 1 1 168 80 1
    27 24 7 24 16 5 16 512 88 512 8 8 1 64 1 1 168 80 1

    Processing Frame Number : 0

    Layer 1 : Max PASS : -2147483648 : 15301 Out Q : 254 , 43861, TIDL_BatchNormLayer, PASSED #MMACs = 1.57, 0.00, 1.57, Sparsity : 0.00, 100.00
    Layer 2 : Max PASS : -2147483648 : 139279 Out Q : 8271 , 139825, TIDL_ConvolutionLayer, PASSED #MMACs = 314.57, 256.11, 276.82, Sparsity : 12.00, 18.58
    Layer 3 : Max PASS : -2147483648 : 18288 Out Q : 9672 , 18360, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 250.87, 263.72, Sparsity : 12.67, 16.93
    Layer 4 : Max PASS : -2147483648 : 61258 Out Q : 18054 , 61498, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 566.43, 585.89, Sparsity : 2.99, 6.22
    Layer 5 : Max PASS : -2147483648 : 57873 Out Q : 11225 , 58100, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 285.44, 294.26, Sparsity : 2.56, 5.48
    Layer 6 : Max PASS : -2147483648 : 64639 Out Q : 14112 , 64892, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 561.64, 578.78, Sparsity : 4.17, 7.01
    Layer 7 : Max PASS : -2147483648 : 42566 Out Q : 13764 , 42733, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 284.09, 293.63, Sparsity : 2.77, 5.93
    Layer 8 : Max PASS : -2147483648 : 51681 Out Q : 18130 , 51884, TIDL_ConvolutionLayer, PASSED #MMACs = 603.98, 545.17, 562.28, Sparsity : 6.90, 9.74
    Layer 9 : Max PASS : -2147483648 : 76130 Out Q : 14555 , 76429, TIDL_ConvolutionLayer, PASSED #MMACs = 301.99, 287.03, 295.72, Sparsity : 2.08, 4.95
    Layer 10 :TIDL_PoolingLayer, PASSED #MMACs = 0.52, 0.00, 0.52, Sparsity : 0.00, 100.00
    Layer 11 : Max PASS : -2147483648 : 99140 Out Q : 13880 , 99529, TIDL_ConvolutionLayer, PASSED #MMACs = 2415.92, 2186.35, 2237.01, Sparsity : 7.41, 9.50
    Layer 12 : Max PASS : -2147483648 : 22559 Out Q : 6512 , 22647, TIDL_ConvolutionLayer, PASSED #MMACs = 1207.96, 1009.58, 1050.52, Sparsity : 13.03, 16.42
    Layer 13 :TIDL_PoolingLayer, PASSED #MMACs = 0.29, 0.00, 0.29, Sparsity : 0.00, 100.00
    Layer 14 :TIDL_PoolingLayer, PASSED #MMACs = 0.08, 0.00, 0.08, Sparsity : 0.00, 100.00
    Layer 15 :TIDL_PoolingLayer, PASSED #MMACs = 0.02, 0.00, 0.02, Sparsity : 0.00, 100.00
    Layer 16 : Max PASS : -2147483648 : 39772 Out Q : 24789 , 39928, TIDL_ConvolutionLayer, PASSED #MMACs = 536.87, 508.54, 536.63, Sparsity : 0.04, 5.28
    Layer 17 : Max PASS : -2147483648 : 72622 Out Q : 24508 , 72907, TIDL_ConvolutionLayer, PASSED #MMACs = 147.06, 141.03, 147.05, Sparsity : 0.01, 4.10
    Layer 18 : Max PASS : -2147483648 : 50744 Out Q : 26739 , 50943, TIDL_ConvolutionLayer, PASSED #MMACs = 40.11, 38.00, 40.10, Sparsity : 0.03, 5.25
    Layer 19 : Max PASS : -2147483648 : 62411 Out Q : 23126 , 62656, TIDL_ConvolutionLayer, PASSED #MMACs = 11.80, 11.31, 11.79, Sparsity : 0.01, 4.12
    Layer 20 : Max PASS : -2147483648 : 75037 Out Q : 3748 , 287532, TIDL_ConvolutionLayer, PASSED #MMACs = 150.99, 117.70, 120.82, Sparsity : 19.99, 22.05
    Layer 21 : Max PASS : -2147483648 : 774990 Out Q : 1573 , 1562185, TIDL_ConvolutionLayer, PASSED #MMACs = 792.72, 745.48, 762.02, Sparsity : 3.87, 5.96
    Layer 22 : Max PASS : -2147483648 : 96298 Out Q : 7532 , 247645, TIDL_ConvolutionLayer, PASSED #MMACs = 41.36, 36.70, 37.57, Sparsity : 9.16, 11.26
    Layer 23 : Max PASS : -2147483648 : 723780 Out Q : 1893 , 1458958, TIDL_ConvolutionLayer, PASSED #MMACs = 217.15, 211.20, 215.99, Sparsity : 0.53, 2.74
    Layer 24 : Max PASS : -2147483648 : 106717 Out Q : 10944 , 234732, TIDL_ConvolutionLayer, PASSED #MMACs = 11.28, 10.26, 10.50, Sparsity : 6.96, 9.04
    Layer 25 : Max PASS : -2147483648 : 628473 Out Q : 2189 , 1266843, TIDL_ConvolutionLayer, PASSED #MMACs = 59.22, 57.22, 58.47, Sparsity : 1.26, 3.38
    Layer 26 : Max PASS : -2147483648 : 135248 Out Q : 10831 , 319786, TIDL_ConvolutionLayer, PASSED #MMACs = 3.32, 3.06, 3.13, Sparsity : 5.73, 7.82
    Layer 27 : Max PASS : -2147483648 : 683268 Out Q : 1826 , 1377296, TIDL_ConvolutionLayer, PASSED #MMACs = 17.42, 16.82, 17.18, Sparsity : 1.37, 3.45
    End of config list found !
    Press any key to continue . . .
    -------------------------------------------------------------------------------------------------------------------------------------------------------
    Thanks,
    Praveen
  • I am using the pre-built executables from the release too. I did NOT rebuild them.

    You say that your result is right. I think you mean that the result in ../tempDir (like "trace_dump_21_64x32.y") is right.

    BUT I mean the result obtained from the TDA2xx EVM board.

    I printed the result like this:

    0x8DE4AB00 is the address of layer 16's output; the result is right:

    addr = (unsigned char*)(0x8DE4AB00 + 4*(64 + 4*2) + 4);

    [EVE1 ] 20.589428 s: addr[0] = 99
    [EVE1 ] 20.589641 s: addr[1] = 53
    [EVE1 ] 20.589855 s: addr[2] = 99
    [EVE1 ] 20.590068 s: addr[3] = 67
    [EVE1 ] 20.590251 s: addr[4] = 94
    [EVE1 ] 20.590465 s: addr[5] = 92
    [EVE1 ] 20.590678 s: addr[6] = 89
    [EVE1 ] 20.590861 s: addr[7] = 89
    [EVE1 ] 20.591075 s: addr[8] = 87
    [EVE1 ] 20.591288 s: addr[9] = 91
    [EVE1 ] 20.591502 s: addr[10] = 93
    [EVE1 ] 20.591715 s: addr[11] = 91
    [EVE1 ] 20.591929 s: addr[12] = 95
    [EVE1 ] 20.592142 s: addr[13] = 82
    [EVE1 ] 20.592356 s: addr[14] = 70
    [EVE1 ] 20.592569 s: addr[15] = 39

    0x8DFB2B80 is the address of layer 21's output; the result is wrong:

    addr = (unsigned char*)(0x8DFB2B80 + 4*(64 + 4*2) + 4);
    [EVE1 ] 20.592783 s: addr[0] = 255
    [EVE1 ] 20.592997 s: addr[1] = 255
    [EVE1 ] 20.593210 s: addr[2] = 255
    [EVE1 ] 20.593424 s: addr[3] = 255
    [EVE1 ] 20.593637 s: addr[4] = 255
    [EVE1 ] 20.593851 s: addr[5] = 255
    [EVE1 ] 20.594125 s: addr[6] = 255
    [EVE1 ] 20.594369 s: addr[7] = 255
    [EVE1 ] 20.594583 s: addr[8] = 255
    [EVE1 ] 20.594796 s: addr[9] = 255
    [EVE1 ] 20.595010 s: addr[10] = 255
    [EVE1 ] 20.595223 s: addr[11] = 255
    [EVE1 ] 20.595467 s: addr[12] = 255
    [EVE1 ] 20.595681 s: addr[13] = 255
    [EVE1 ] 20.595894 s: addr[14] = 255
    [EVE1 ] 20.596138 s: addr[15] = 255
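
    For what it's worth, here is a small sketch of my reading of the offset arithmetic above. It assumes each 8-bit output line is stored with a 4-pixel pad on both sides (so the line pitch is 64 + 2*4 bytes for these 64-wide layers) and that the prints read row 4, column 0 of the valid region; the pad size is an assumption, not something confirmed in this thread:

    /* Sketch only: layer output widths and base addresses are taken from
     * the log above; the 4-pixel horizontal pad per side is an assumption. */
    #define LAYER_OUT_WIDTH  64U
    #define LAYER_OUT_PAD    4U
    #define LAYER_OUT_PITCH  (LAYER_OUT_WIDTH + 2U * LAYER_OUT_PAD)

    static volatile unsigned char *layerOutPixel(unsigned int base,
                                                 unsigned int row,
                                                 unsigned int col)
    {
        /* step down 'row' padded lines, then skip the left pad */
        return (volatile unsigned char *)(base + row * LAYER_OUT_PITCH +
                                          LAYER_OUT_PAD + col);
    }

    /* layerOutPixel(0x8DE4AB00, 4, 0) and layerOutPixel(0x8DFB2B80, 4, 0)
     * give the same addresses as the two expressions used above. */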

  • Hi, 

    I just ran your test case on the EVM board, and the outputs from this also look good. I am not seeing either the "255" or "0" values that you are reporting.

    Can you please dump traces from the EVM board as well, by setting "#define ENABLE_TRACE_DUMP      (1)" in the tidl_alg.c file at line #39?

    Please rebuild the EVE mode (TIDL File I/O usecase) with this change and run. This will dump the outputs of each layer, similar to the import tool. Please note that it will take some time to run and dump the outputs over CCS.
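
    For clarity, the requested change is just flipping the value of the existing trace macro near the top of tidl_alg.c (the macro name, value, and line number are as quoted above; the surrounding code is not shown in this thread):

    /* tidl_alg.c, around line 39: enable per-layer output dumps on the EVM */
    #define ENABLE_TRACE_DUMP      (1)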

    Thanks,

    Praveen

  • What are your versions of VSDK and TIDL?
  • My TIDL version is 01.00.00.00
  • Hi,

    It looks like you are using an older TIDL version. Can you please pick up the latest version and try?

    Thanks,
    Praveen
  • It works correctly when using vsdk_03.02.00.00 and TIDL_01.00.00.00.

    Thanks!

  • Hi,

    Thanks for the update. Good to hear that.

    Regards,

    Praveen

  • Hi Praveen,

    I measured the TIDL run time as 1736 ms, and the sparse model as 796 ms (8.9932 GMACs).
    The seg model takes only 338 ms (8.442 GMACs).
    Why is there such a big difference?

    When I set "conv2dKernelType = 1", the converted model can't run in the TIDL usecase; maybe there is a bug?
    Also, I can NOT access the link:
    cdds.ext.ti.com/.../emxTree.jsp

    Thanks
  • Hi,

    1. Can you build TIDL in release mode and try?

    2. Please try this below link 

    https://cdds.ext.ti.com/ematrix/common/TIemxNavigator.jsp?objectId=28670.42872.40084.8275

    Thanks,

    Praveen

  • Now I am using the TIDL from that link; the result is similar.

    There are some details:

    I imported the seg and ssd models into TIDL using "tidl_model_import.out.exe" (I did NOT rebuild it).

    I did NOT change any code and measured the TIDL run time.

    The input image size is 1024x512.

                             seg model      ssd model
    no sparsity model        1251 ms        1738 ms
    sparsity model           397 ms         698 ms
    dense convolution        397 ms         import failed

    So:

    1. Why is there such a big difference between the seg and ssd models with NO sparsity?

    2. Why does 80% sparsity give only a 3.15x speedup on the seg model (I know about 4x speedup is achievable) and a 2.49x speedup on the ssd model?

    3. For dense convolution, I set "conv2dKernelType = 1", but there is no difference for the seg model, and the import failed for the ssd model.

    4. When I use the "NET.BIN" and "PRM.BIN" in the "..\PROCESSOR_SDK_VISION_03_02_00_00\vision_sdk\apps\tools\TIDL_files" folder, the run time is only 310 ms. How did you get that? (I can only get 397 ms.)

    Thanks !

  • Hi,

    1. We are working on importing the SSD network and implementing the new SSD layers in TIDL, so please wait for some time to test the SSD network with TIDL.

    2. Please share the import config file and prototxt used for importing the seg net; we will review and check.

    3. Dense convolution will be set only when the convolution layer output width or height is < 64, so maybe in this case there are no conv layers with output width or height < 64.

    4. Please share the import config file and prototxt used for importing the seg net. Also, you can check that the seg net bin files ("NET.BIN" and "PRM.BIN") imported with "tidl_model_import.out.exe" match the existing bin files in the package.

    Thanks,
    Praveen
  • Hi Praveen,

    I deleted some layers and ran the ssd model as in my first question. The ssd model was trained with the caffe-jacinto demo without changing any config (the default is 80% sparsity). The seg model is from caffe-jacinto-models: the no-sparsity model is from the "initial" folder and the sparse one is from the "sparse" folder.

    1. Could you tell me roughly how long the wait will be? I hope to hear good news about faster run times on the SSD network.

    2. You can just use the models trained with caffe-jacinto-models.

    3. When I change the "dilation" value in the deploy prototxt from "2" to "1", the import succeeds.

    4. I don't know how to deal with the imported bin files.

    Thanks

  • Hi,

    1. SSD support in TIDL will be available in 1Q18 release.

    2. We used a custom data set for training, so such a deviation in seg net performance is expected.

    3. Okay, thank you.

    4. Please refer to section 3.6 in user guide.

    Thanks,
    Praveen