
TDA2EVM5777: TIDL problem

Part Number: TDA2EVM5777

Hi FAE:
We are using the TIDL tools in PROCESSOR_SDK_VISION_03_07_00_00. They perform well on the MNIST test, whose convolution blocks contain only convolution and ReLU layers, but when we use the tools on our model, which contains convolution, normalization, normalization scale and ReLU layers, the error increases rapidly after a few layers. We compared the trace dumps in the tempDir directory with our floating-point Caffe results and found some differences: after the normalization scale layer, a few channels are inverted with respect to the Caffe result, i.e. the maximum point of the Caffe result corresponds to the minimum point in the TIDL trace dump. If we multiply these channels by -1 (XOR them with 0xFF), the results of the following layers seem to improve. We wonder whether this phenomenon is related to the bad results from our model. Would you please help us improve the result? (A sketch of such a per-channel check is shown below, after the attachment.)
Our model is in the attachment. The example channel is channel 4 of layer 0; the Caffe result is in the directory \caffe_result_float, and it is the negative of channel 4 of trace_dump_1_427x240.y.

retinaReport.rar
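
The per-channel check described above can be sketched roughly as follows. This is illustrative only: it assumes the trace dump is raw signed 8-bit planar data of shape (channels, height, width), and the channel count and the Caffe result file name are placeholders to be adapted to the actual model.

    import numpy as np

    C, H, W = 32, 240, 427   # placeholder channel count for this layer
    tidl  = np.fromfile("tempDir/trace_dump_1_427x240.y", dtype=np.int8).reshape(C, H, W)
    caffe = np.fromfile("caffe_result_float/layer0_channels.bin", dtype=np.float32).reshape(C, H, W)

    for c in range(C):
        a = tidl[c].astype(np.float64).ravel()
        b = caffe[c].astype(np.float64).ravel()
        corr = np.corrcoef(a, b)[0, 1]
        if corr < -0.9:                       # strongly anti-correlated channel
            print("channel %d looks inverted: corr = %.3f" % (c, corr))
            # flipping the sign (roughly what XOR with 0xFF does to int8 data)
            print("  after * -1: corr = %.3f" % np.corrcoef(-a, b)[0, 1])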

  • Hi,

    Did you check that the input to the Caffe inference matches "trace_dump_0_853x480.y" from the TIDL import tool output? For details on how to check, refer to section 3.8 (Matching TIDL inference result) in the TIDL user guide. (A small sketch of such a comparison is shown at the end of this reply.)

    Thanks,

    Praveen
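
    A minimal sketch of that comparison, assuming the layer-0 trace dump is raw 8-bit planar data and that a matching raw dump of the preprocessed Caffe input has been written out (both file names below are placeholders):

        import numpy as np

        # Placeholder paths and dtype; the Caffe-side dump must be produced with the
        # same preprocessing/quantization that the import tool applies.
        tidl_in  = np.fromfile("tempDir/trace_dump_0_853x480.y", dtype=np.uint8)
        caffe_in = np.fromfile("caffe_input_853x480.raw", dtype=np.uint8)

        print("sizes:", tidl_in.size, caffe_in.size)
        print("identical:", np.array_equal(tidl_in, caffe_in))
        if tidl_in.size == caffe_in.size:
            print("max abs diff:", np.abs(tidl_in.astype(int) - caffe_in.astype(int)).max())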

  • Yes, we have checked. The following is from the Caffe input layer; it is the same as trace_dump_0_853x480.y.

    inputdata.rar

  • Okay, next the first layer output (trace_dump_1_427x240.y) from the import tool should be compared with the first ReLU layer output from the Caffe inference, because in the TIDL import tool the "BatchNorm", "Scale" and "ReLU" layers are merged with the convolution layer.
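
    For reference, this merging is mathematically equivalent to folding the BatchNorm and Scale parameters into the convolution weights and bias, with ReLU applied afterwards. A minimal numpy sketch of that folding (illustrative only, not the import tool's actual code):

        import numpy as np

        def fold_bn_scale(W, b, mean, var, gamma, beta, eps=1e-5):
            # W: (out_ch, in_ch, kh, kw) conv weights, b: (out_ch,) conv bias.
            # mean/var are the BatchNorm statistics (in Caffe, divide the stored blobs
            # by the BatchNorm scale factor, blob 2, first); gamma/beta are the Scale
            # layer's parameters.
            factor   = gamma / np.sqrt(var + eps)          # per-output-channel multiplier
            W_folded = W * factor[:, None, None, None]     # scale each output channel's filters
            b_folded = (b - mean) * factor + beta          # fold the shift into the bias
            return W_folded, b_folded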


  • Have you begun running the model in Caffe?
    Of course we have already done this comparison step in the first stage. For this problem we have even developed a series of tools that calculate the similarity between the Caffe results and the TI results. We have compared not only the ReLU output of Caffe with trace_dump_1_427x240.y, but also the convolution layer outputs and the scale layer outputs (by removing one or more of the BatchNorm/BatchNormScale/ReLU layers from the prototxt and then running TIDL; a sketch of such a prototxt edit is shown after the attached results). We conclude that some channels are abnormal because the similarity we calculated between the ReLU output of Caffe and trace_dump_1_427x240.y is -1. The detailed comparison results follow.

    6320.result.rar
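
    The prototxt edit mentioned above (dropping BatchNorm/Scale/ReLU so the raw convolution output can be compared in isolation) can be sketched with the Caffe protobuf bindings. This assumes those layers are in-place (bottom == top), as is common; otherwise the bottoms of later layers would need rewiring. File names are placeholders.

        from google.protobuf import text_format
        from caffe.proto import caffe_pb2   # pycaffe's generated protobuf module

        DROP = {"BatchNorm", "Scale", "ReLU"}

        net = caffe_pb2.NetParameter()
        with open("deploy.prototxt") as f:               # placeholder path
            text_format.Merge(f.read(), net)

        pruned = caffe_pb2.NetParameter()
        pruned.CopyFrom(net)
        del pruned.layer[:]
        for l in net.layer:
            # keep a layer unless it is an in-place BatchNorm/Scale/ReLU
            if not (l.type in DROP and list(l.bottom) == list(l.top)):
                pruned.layer.add().CopyFrom(l)

        with open("deploy_conv_only.prototxt", "w") as f:
            f.write(text_format.MessageToString(pruned))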

  • >>stat1.txt: conv+batchnorm+scale+relu(caffe) vs conv+batchnorm+scale+relu(TI)  (default) 

    For stat1, can you share the layer-1 outputs for all the channels from Caffe and from the TI import tool, so that we can compare and check the issue at our end?

    Thanks,

    Praveen 

  • We have found the problem. The original Caffe model uses the pad_h, pad_w, stride_h and stride_w form, which the TIDL tools do not support, so we changed these to the pad and stride form in the .prototxt file (the .caffemodel file was kept unchanged). Then the output is wrong, and no warning or error message is given during the run.

    We suggest improving the tools to support the _h and _w parameters. (A sketch of the prototxt conversion is shown below.)
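
    For reference, the conversion described above can be sketched with the Caffe protobuf bindings. It is only meaning-preserving when pad_h == pad_w and stride_h == stride_w, it only touches Convolution layers, and the file names are placeholders; the binary .caffemodel is also a NetParameter and could be rewritten the same way via ParseFromString/SerializeToString.

        from google.protobuf import text_format
        from caffe.proto import caffe_pb2

        net = caffe_pb2.NetParameter()
        with open("deploy.prototxt") as f:               # placeholder path
            text_format.Merge(f.read(), net)

        for layer in net.layer:
            if layer.type != "Convolution":
                continue
            cp = layer.convolution_param
            if cp.HasField("pad_h") or cp.HasField("pad_w"):
                assert cp.pad_h == cp.pad_w, "asymmetric padding cannot be collapsed"
                del cp.pad[:]
                cp.pad.append(cp.pad_h)
                cp.ClearField("pad_h")
                cp.ClearField("pad_w")
            if cp.HasField("stride_h") or cp.HasField("stride_w"):
                assert cp.stride_h == cp.stride_w, "asymmetric stride cannot be collapsed"
                del cp.stride[:]
                cp.stride.append(cp.stride_h)
                cp.ClearField("stride_h")
                cp.ClearField("stride_w")

        with open("deploy_pad_stride.prototxt", "w") as f:
            f.write(text_format.MessageToString(net))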

  • Okay, thanks for pointing this out. We will take care of it in a future release.