
TDA2EVM5777: TIDL problem

Part Number: TDA2EVM5777

Hi FAE:
We are using the TIDL tools in PROCESSOR_SDK_VISION_03_07_00_00. They perform well on the MNIST test, whose convolution blocks contain only convolution and ReLU layers, but when we use the tools on our model, which contains convolution, normalization, normalization scale and ReLU layers, the error increases rapidly after a few layers. We compared the trace dumps in the tempDir directory with our floating-point Caffe results and found some differences: after the normalization scale layer, a few channels are inverted with respect to the Caffe result, i.e. the maximum point of the Caffe result corresponds to the minimum point in the TIDL trace dump. If we multiply these channels by -1 (XOR them with 0xFF), the results of the following layers seem to improve. We wonder whether this phenomenon is related to the bad results from our model. Would you please help us improve the result? (A sketch of such a per-channel check is shown below, after the attachment.)
Our model is in the attachment. The example channel is channel 4 of layer 0; the Caffe result is in the directory \caffe_result_float, and it is the negative of channel 4 of trace_dump_1_427x240.y.

retinaReport.rar
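
The per-channel check described above can be sketched roughly as follows. This is illustrative only: it assumes the trace dump is raw signed 8-bit planar data of shape (channels, height, width), and the channel count and the Caffe result file name are placeholders to be adapted to the actual model.

    import numpy as np

    C, H, W = 32, 240, 427   # placeholder channel count for this layer
    tidl  = np.fromfile("tempDir/trace_dump_1_427x240.y", dtype=np.int8).reshape(C, H, W)
    caffe = np.fromfile("caffe_result_float/layer0_channels.bin", dtype=np.float32).reshape(C, H, W)

    for c in range(C):
        a = tidl[c].astype(np.float64).ravel()
        b = caffe[c].astype(np.float64).ravel()
        corr = np.corrcoef(a, b)[0, 1]
        if corr < -0.9:                       # strongly anti-correlated channel
            print("channel %d looks inverted: corr = %.3f" % (c, corr))
            # flipping the sign (roughly what XOR with 0xFF does to int8 data)
            print("  after * -1: corr = %.3f" % np.corrcoef(-a, b)[0, 1])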

  • Hi,

    Did you check that the input to the Caffe inference matches "trace_dump_0_853x480.y" from the TIDL import tool output? For details on how to check, refer to section 3.8 (Matching TIDL inference result) in the TIDL user guide. (A small sketch of such a comparison is shown at the end of this reply.)

    Thanks,

    Praveen
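
    A minimal sketch of that comparison, assuming the layer-0 trace dump is raw 8-bit planar data and that a matching raw dump of the preprocessed Caffe input has been written out (both file names below are placeholders):

        import numpy as np

        # Placeholder paths and dtype; the Caffe-side dump must be produced with the
        # same preprocessing/quantization that the import tool applies.
        tidl_in  = np.fromfile("tempDir/trace_dump_0_853x480.y", dtype=np.uint8)
        caffe_in = np.fromfile("caffe_input_853x480.raw", dtype=np.uint8)

        print("sizes:", tidl_in.size, caffe_in.size)
        print("identical:", np.array_equal(tidl_in, caffe_in))
        if tidl_in.size == caffe_in.size:
            print("max abs diff:", np.abs(tidl_in.astype(int) - caffe_in.astype(int)).max())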

  • Yes, we have checked. The following is from the Caffe input layer; it is the same as trace_dump_0_853x480.y.

    inputdata.rar

  • Okay, next the first layer output (trace_dump_1_427x240.y) from the import tool should be compared with the first ReLU layer output from the Caffe inference, because in the TIDL import tool the "BatchNorm", "Scale" and "ReLU" layers are merged with the convolution layer.
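
    For reference, this merging is mathematically equivalent to folding the BatchNorm and Scale parameters into the convolution weights and bias, with ReLU applied afterwards. A minimal numpy sketch of that folding (illustrative only, not the import tool's actual code):

        import numpy as np

        def fold_bn_scale(W, b, mean, var, gamma, beta, eps=1e-5):
            # W: (out_ch, in_ch, kh, kw) conv weights, b: (out_ch,) conv bias.
            # mean/var are the BatchNorm statistics (in Caffe, divide the stored blobs
            # by the BatchNorm scale factor, blob 2, first); gamma/beta are the Scale
            # layer's parameters.
            factor   = gamma / np.sqrt(var + eps)          # per-output-channel multiplier
            W_folded = W * factor[:, None, None, None]     # scale each output channel's filters
            b_folded = (b - mean) * factor + beta          # fold the shift into the bias
            return W_folded, b_folded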


  • Have you begun running the model in Caffe?
    Of course we have already done this comparison step in the first stage. For this problem we have even developed a series of tools that calculate the similarity between the Caffe results and the TI results. We have compared not only the ReLU output of Caffe with trace_dump_1_427x240.y, but also the convolution layer outputs and the scale layer outputs (by removing one or more of the BatchNorm/BatchNormScale/ReLU layers from the prototxt and then running TIDL; a sketch of such a prototxt edit is shown after the attached results). We conclude that some channels are abnormal because the similarity we calculated between the ReLU output of Caffe and trace_dump_1_427x240.y is -1. The detailed comparison results follow.

    6320.result.rar
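
    The prototxt edit mentioned above (dropping BatchNorm/Scale/ReLU so the raw convolution output can be compared in isolation) can be sketched with the Caffe protobuf bindings. This assumes those layers are in-place (bottom == top), as is common; otherwise the bottoms of later layers would need rewiring. File names are placeholders.

        from google.protobuf import text_format
        from caffe.proto import caffe_pb2   # pycaffe's generated protobuf module

        DROP = {"BatchNorm", "Scale", "ReLU"}

        net = caffe_pb2.NetParameter()
        with open("deploy.prototxt") as f:               # placeholder path
            text_format.Merge(f.read(), net)

        pruned = caffe_pb2.NetParameter()
        pruned.CopyFrom(net)
        del pruned.layer[:]
        for l in net.layer:
            # keep a layer unless it is an in-place BatchNorm/Scale/ReLU
            if not (l.type in DROP and list(l.bottom) == list(l.top)):
                pruned.layer.add().CopyFrom(l)

        with open("deploy_conv_only.prototxt", "w") as f:
            f.write(text_format.MessageToString(pruned))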

  • >>stat1.txt: conv+batchnorm+scale+relu(caffe) vs conv+batchnorm+scale+relu(TI)  (default) 

    For stat1, can you share the layer-1 outputs for all the channels from Caffe and from the TI import tool, so that we can compare and check the issue at our end?

    Thanks,

    Praveen 

  • We have found the problem. The original Caffe model uses the pad_h, pad_w, stride_h and stride_w form, which the TIDL tools do not support, so we changed these to the pad and stride form in the .prototxt file (the .caffemodel file was kept unchanged). Then the output is wrong, and no warning or error message is given during the run.

    We suggest improving the tools to support the _h and _w parameters. (A sketch of the prototxt conversion is shown below.)
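
    For reference, the conversion described above can be sketched with the Caffe protobuf bindings. It is only meaning-preserving when pad_h == pad_w and stride_h == stride_w, it only touches Convolution layers, and the file names are placeholders; the binary .caffemodel is also a NetParameter and could be rewritten the same way via ParseFromString/SerializeToString.

        from google.protobuf import text_format
        from caffe.proto import caffe_pb2

        net = caffe_pb2.NetParameter()
        with open("deploy.prototxt") as f:               # placeholder path
            text_format.Merge(f.read(), net)

        for layer in net.layer:
            if layer.type != "Convolution":
                continue
            cp = layer.convolution_param
            if cp.HasField("pad_h") or cp.HasField("pad_w"):
                assert cp.pad_h == cp.pad_w, "asymmetric padding cannot be collapsed"
                del cp.pad[:]
                cp.pad.append(cp.pad_h)
                cp.ClearField("pad_h")
                cp.ClearField("pad_w")
            if cp.HasField("stride_h") or cp.HasField("stride_w"):
                assert cp.stride_h == cp.stride_w, "asymmetric stride cannot be collapsed"
                del cp.stride[:]
                cp.stride.append(cp.stride_h)
                cp.ClearField("stride_h")
                cp.ClearField("stride_w")

        with open("deploy_pad_stride.prototxt", "w") as f:
            f.write(text_format.MessageToString(net))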

  • Okay, thanks for pointing this out. We will take care of it in a future release.