
Linux/AM5749: TIDL import parameter

Part Number: AM5749

Tool/software: Linux

Hi,

I am verifying CIFAR-10 classification and have been following the web page below.

http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components_TIDL.html


The CIFAR-10 images are 32x32.
Please tell me how to use the following setting.

3.15.3.7.2. Sample configuration file (tidl_import_j11_v2.txt)

# Calibration image file
sampleInData = "import/test.raw"

① Should I use the file in the following directory?
ti-processor-sdk-linux-am57xx-evm-05.02.00.10\filesys\usr\share\ti\tidl\examples\test\testvecs\input\preproc_3_32x32.y
② Or must it be created separately from the dataset?

Please tell me the details of how to create this data.

Best Regards,
Shigehiro Tsuda

  • Hi,

    The sampleInData is the first image that you would like to run inference on. It should be in the same format that the network model was trained on.

    As mentioned in section 3.15.3.7 of the document, the import process is a two-step process.

    • The first step parses the model parameters and network topology and converts them into a custom format that the TIDL library can understand.
    • The second step calibrates the dynamic quantization process by finding the range of activations for each layer. This is accomplished by invoking a simulation (using a native C implementation) that estimates initial values needed for the quantization process. These values are later updated on a per-frame basis, assuming strong temporal correlation between input frames.

    The sampleInData is used in the second step to calibrate the dynamic quantization, i.e., to find the range of activations for each layer.
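
    As a sanity check, a raw calibration image should be exactly width x height x channels bytes for an 8-bit image; for a 32x32x3 CIFAR-10 image that is 3072 bytes. A minimal check (the file name is just an example):

    import os
    assert os.path.getsize("import/test.raw") == 32 * 32 * 3  # 3072 bytes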

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for the quick reply.
    I understand that the file needs to be created from the training data.

    What is the data format?
    I obtained the CIFAR-10 dataset from the following site, but I did not understand how to create this file from it.
    Please let me know if there is any information on this.
    www.cs.toronto.edu/.../cifar.html

    Best Regards,
    Shigehiro Tsuda
  • Hi Tsuda-san,

    CIFAR-10 images are 32x32x3, raw data.

    You can convert PNG or JPG images to .raw format using the ImageMagick "convert" utility.

    The following operations need to be performed on the .png/.jpg image:
    • SWAP RGB->BGR channels (if channel swap needed): convert $filename -separate +channel -swap 0,2 -combine -colorspace sRGB ./sample_bgr_256x256.png
    • RESIZE (if required): convert ./sample_bgr_256x256.png -resize 224x224 ./sample_bgr.png
    • SPLIT PLANES (required): convert ./sample_bgr.png -interlace plane BGR:sample_img_256x256.raw
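
    Since CIFAR-10 ships as binary batch files rather than PNG/JPG, below is a minimal sketch (my own, not from the SDK) that pulls one record out of a batch file and writes it directly as a 32x32 planar-BGR .raw, the same layout the "BGR:" step above produces. It assumes the binary version of the dataset (each record is 1 label byte followed by 1024 R, 1024 G, and 1024 B bytes); the file names and record index are examples.

    RECORD = 1 + 32 * 32 * 3  # 1 label byte + 3072 pixel bytes

    def cifar10_to_raw(batch_file, index, out_file):
        with open(batch_file, "rb") as f:
            f.seek(index * RECORD)
            rec = f.read(RECORD)
        pixels = rec[1:]                       # skip the label byte
        r = pixels[0:1024]                     # red plane
        g = pixels[1024:2048]                  # green plane
        b = pixels[2048:3072]                  # blue plane
        with open(out_file, "wb") as f:        # write planes in BGR order
            f.write(b + g + r)

    cifar10_to_raw("data_batch_1.bin", 0, "cifar10_img_32x32.raw")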

    Regards,
    Manisha
  • Hi Manisha,

    Thank you for the quick reply.
    I will try the approach you described.
    Am I correct in understanding that one image is selected from the multiple images in the dataset?

    Best Regards,
    Shigehiro Tsuda
  • What do you mean by "one of the multiple image data is selected"?
  • Hi Manisha,

    Thank you for the quick reply.

    CIFAR-10 has the following classes.
    Each class has 1000 images.
    Are there criteria or conditions for choosing the sampleInData image?
    Or is the choice entirely random?

    1.airplane
    2.automobile
    3.bird
    4.cat
    5.deer
    6.dog
    7.frog
    8.horse
    9.ship
    10.truck

    Best Regards,
    Shigehiro Tsuda

  • Hi Manisha,

    Thank you for your support.

    I have run the classification with CIFAR-10, but it does not recognize images correctly.
    Could you check whether the attached files are incorrect?
    1.cnn.txt
    2.cnn_config.txt

    I ran the check with the following commands.

    root@am57xx-evm:/home/cifar10# tidl_model_import.out cnn.txt
    root@am57xx-evm:/home/cifar10# /usr/share/ti/tidl/examples/classification/tidl_classification -g 1 -d 2 -e 2 -l ./cifar10.txt -s ./classlist_cifar10.txt -i 1 -c ./cnn_config.txt

    cnn.txt:

    # Default - 0
    randParams         = 0 
    
    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType          = 0 
    
    # 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
    quantizationStyle  = 1 
    
    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd      = 25
    
    numParamBits       = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType      = 0
    
    inputNetFile       = "/home/cifar10/deploy.prototxt"
    inputParamsFile    = "/home/cifar10/cifar10_jacintonet11v2_iter_64000.caffemodel"
    outputNetFile      = "/home/cifar10/tidl_net_cifar10_convert.bin"
    outputParamsFile   = "/home/cifar10/tidl_param_cifar10_convert.bin"
    
    rawSampleInData = 1
    sampleInData = "/home/cifar10/cifar10_img_32x32.raw"
    #sampleInData = "/home/cifar10/cifar10_img_224x224.raw"
    tidlStatsTool = "/usr/bin/eve_test_dl_algo.out"

    cnn_config.txt:
    numFrames     = 10000
    preProcType   = 0
    #inData        = /home/cifar10/cifar10_img_224x224.raw
    inData        = /home/cifar10/cifar10_img_32x32.raw
    outData       = "/usr/bin/stats_tool_out.bin"
    netBinFile    = "/home/cifar10/tidl_net_cifar10_convert.bin"
    paramsBinFile = "/home/cifar10/tidl_param_cifar10_convert.bin"
    #inWidth       = 224
    #inHeight      = 224
    inWidth       = 32
    inHeight      = 32
    inNumChannels = 3
    #inNumChannels = 1
    #layerIndex2LayerGroupId = { {12, 2}, {13, 2}, {14, 2} }
    
    


    [result]
    Everything is always recognized as a cat, even when no cat image is shown.
    It also seems that nothing other than cats and dogs is ever detected; is that correct?

    root@am57xx-evm:/home/cifar10# /usr/share/ti/tidl/examples/classification/tidl_classification -g 1 -d 2 -e 2 -l ./cifar10.txt -s ./classlist_cat_only.txt -i 1 -c ./cnn_config.txt
    [ 6173.073962] omap-iommu 58882000.mmu: 58882000.mmu: version 2.1
    [ 6173.118221] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [ 6173.124114] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [ 6173.131037] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [ 6173.144562] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [ 6173.150452] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [ 6173.156409] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    ==Total of 11 items!
    0) airplane
    1) automobile
    2) bird
    3) cat
    4) deer
    5) dog
    6) frog
    7) horse
    8) ship
    9) truck
    10)
    Searching for cat
    Found: 3
    Searching for dog
    Found: 5
    Run single configuration: ./cnn_config.txt
    [ 6173.791247] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.797577] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.804791] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.811095] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.838112] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.845307] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 00004040
    [ 6173.851602] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.858800] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.865094] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    init done
    Using Wayland-EGL
    wlpvr: PVR Services Initialised
    Using the 'xdg-shell-v5' shell integration
    Capture camera with 30 fps, 640x480 px
    Rect[10, 10]
    About to start ProcessFrame loop!!
    Frame:796,788 ROI[0]: rank=1, outval=0.12549, cat
    Frame:804,796 ROI[0]: rank=1, outval=0.133333, cat
    Frame:805,797 ROI[0]: rank=2, outval=0.145098, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:345.608
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:199.241
    Frame:808,800 ROI[0]: rank=2, outval=0.196078, cat
    Frame:811,803 ROI[0]: rank=1, outval=0.109804, cat
    Frame:812,804 ROI[0]: rank=2, outval=0.266667, cat
    ROI(0)(3)=cat
    Device:EVE0 eops(8), EVES(2) DSPS(2) FPS:141.853
    Frame:813,805 ROI[0]: rank=2, outval=0.137255, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:110.08
    Frame:814,806 ROI[0]: rank=1, outval=0.109804, cat
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:89.2373
    ROI(0)(3)=cat
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:74.8635
    Frame:816,808 ROI[0]: rank=1, outval=0.431373, cat
    ROI(0)(3)=cat


    Best Regards,
    Shigehiro Tsuda

  • Hi,

    I will look into this and get back to you in the next couple of days.

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for the quick reply.
    I look forward to the results of your investigation.

    Best Regards,
    Shigehiro Tsuda
  • Hi,

    It seems to me that you are violating the constraints on network parameter configuration. A common one to violate is the number of input nodes to the InnerProduct layer.

    Can you please generate the network model traces and graph for your model and share them with us?

  • Hi Manisha,

    Thank you for the quick reply.

    How can I generate traces and a graph of the network model?

    I have attached the output of the network graph viewer.

    I could not figure out how to use the execution graph viewer.
    Please tell me how to generate the execution graph.

    cifar10.pdf

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    The generated network graph looks good. I wonder whether you are feeding the image in the same format that you used for training (RGB vs. BGR); that matters.

    Also, if you can share the network binary, parameter binary, and configuration file (if it has changed from the previous one), we can try to reproduce and analyze the problem on our end.

    Regards,
    Manisha
  • Hi Manisha,

    Thank you for your support.
    I have attached the network binary and parameter binary.
    The configuration file has not changed from the previous one.

    tidl_cifar10_convert.zip

    Best Regards,
    Shigehiro Tsuda

  • Thanks Tsuda-san,

    We will study this and get back.

    Regards,
    Manisha
  • Hi Tsuda-san,

    We are studying the failure of your network model. We need the .prototxt and .caffemodel files to investigate further. Can you please share them?

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.
    I have attached the .prototxt and .caffemodel files.
    For CIFAR-10, I found that I need to set preProcType = 3 in the config file.

    Recognition now works, but it seems slow to start recognizing after the classification app is launched.
    I am using the classification sample as-is.
    Do you know what the cause might be?

    caffemodel_deploy.zip

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Are you running your network model entirely on the EVE subsystem, or on the DSP cores? There are a few layers that run faster on the DSP (the SoftMax, Flatten, and Concat layers). You may want to consider splitting your network model between the EVE and DSP cores, as shown below.
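
    For example, the line that is commented out in your cnn_config.txt is the mechanism for this; enabling it assigns the listed layer indices to layer group 2, which the classification example typically schedules on the DSP (the indices below are simply the ones from your file and may need adjusting for your network):

    layerIndex2LayerGroupId = { {12, 2}, {13, 2}, {14, 2} }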

  • Hi Manisha,

    Thank you for your support.

    Yes, I am running the network model on both the EVE and DSP cores.
    I tried processing on the EVE and DSP separately, but it does not change anything.
    The log is below.
    It was captured with a cat image.

    About to start ProcessFrame loop!!
    Frame:8,0 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:8,0 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:8,0 ROI[0]: rank=1, outval=0.0980392, bird
    Frame:9,1 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:9,1 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:9,1 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:162.404
    Frame:10,2 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:10,2 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:10,2 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:124.756
    Frame:11,3 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:11,3 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:11,3 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:108.252
    Frame:12,4 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:12,4 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:12,4 ROI[0]: rank=1, outval=0.0980392, bird


    Several consecutive frames show the same values:

    ROI[0]: rank=3, outval=0.0980392, airplane
    ROI[0]: rank=2, outval=0.0980392, automobile
    ROI[0]: rank=1, outval=0.0980392, bird

    Recognition begins after about 30s.

    Device:EVE0 eops(8), EVES(2) DSPS(2) FPS:30.064
    Frame:833,825 ROI[0]: rank=3, outval=0, automobile
    Frame:833,825 ROI[0]: rank=2, outval=0, bird
    Frame:833,825 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:30.6044
    Frame:834,826 ROI[0]: rank=3, outval=0, automobile
    Frame:834,826 ROI[0]: rank=2, outval=0, bird
    Frame:834,826 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:30.0183
    Frame:835,827 ROI[0]: rank=3, outval=0.0196078, airplane
    Frame:835,827 ROI[0]: rank=2, outval=0.0235294, bird
    Frame:835,827 ROI[0]: rank=1, outval=0.941176, cat
    ROI(0)(3)=cat
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:29.8626
    Frame:836,828 ROI[0]: rank=3, outval=0, automobile
    Frame:836,828 ROI[0]: rank=2, outval=0, bird
    Frame:836,828 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Okay, now I understand what you meant by a slow start. You are saying that the classification results are incorrect in the beginning and improve after some time. I think dynamic quantization might be contributing to this. When you imported the network model, did you provide a cat image as input so that the import step could find the right dynamic range for quantization?
  • Hi Manisha,

    Thank you for the quick reply.

    I understand that dynamic quantization may be the cause.
    I did not provide a cat image to the import step; I provided one dog image from the CIFAR-10 set.

    Best Regards,
    Shigehiro Tsuda


  • Hi Tsuda-san,

    Please see the guidelines below on matching the TIDL inference results and tuning the quantization parameters.

    Matching TIDL inference result

    The TIDL import step runs inference on the PC, and the result should match the expected output (from Caffe or TensorFlow inference). If you observe a difference at this stage, please follow the steps below to debug.

    1. The Caffe inference input and the TIDL inference input shall match. The import step dumps the input of the first layer to “trace_dump_0_*”; make sure this is the same for Caffe as well.

    2. If the input matches, then dump layer-level features from Caffe and match them against the TIDL import traces.

    3. The TIDL trace is in fixed point and can be converted to floating point using the OutQ value printed in the import log (see the sketch after this list). Due to quantization the results will not match exactly, but they will be similar.

    4. Check the parameters of the layer where the mismatch is observed.

    5. Share the input and parameters with TI for further debug.
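
    For step 3, a minimal dequantization sketch (my own, assuming an 8-bit unsigned trace and that OutQ is the scale factor such that float ~ fixed / OutQ; use int8 for signed layers, and take the OutQ value for the layer from the import log. The file name and OutQ value below are examples.):

    import numpy as np

    def trace_to_float(trace_file, out_q):
        fixed = np.fromfile(trace_file, dtype=np.uint8)  # raw trace dump
        return fixed.astype(np.float32) / out_q          # back to float scale

    features = trace_to_float("trace_dump_0_32x32.y", out_q=128.0)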

    We use the statistics collected from previously processed frames to quantize the activations dynamically in the current frame. So, for the same input images, the results observed on the target will NOT be the same as in the import step (but they will be similar). We have validated this logic with a semantic segmentation application on an input video sequence.

    TIDL maintains range statistics for previously processed frames. It quantizes the current inference activations using the range statistics from history (a weighted-average range).

    Below are the parameters that control quantization.

    quantMargin is the margin added to the average, in percent.

    quantHistoryParam1 is the weight given to previously processed inferences during application boot time.

    quantHistoryParam2 is the weight given to previously processed inferences during application execution (after the initial few frames).
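
    As an illustrative sketch (not TIDL source code), the history weighting can be thought of roughly as follows, assuming the history parameters are percentage weights and quantMargin is a percentage margin on top:

    def update_range(history_max, current_max, quant_history, quant_margin):
        # Blend the current frame's activation range with the history
        # (weights are percentages), then add a safety margin.
        blended = (quant_history * history_max
                   + (100 - quant_history) * current_max) / 100.0
        return blended * (1.0 + quant_margin / 100.0)

    Setting the history weights to zero means only the current frame is used, which is why the settings below reproduce the import-step result for a single image.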

    To get the same result on the TIDL target as in the import step for an image, please set the parameters below during algorithm creation.

    createParams.quantHistoryParam1 = 0;
    createParams.quantHistoryParam2 = 0;
    createParams.quantMargin = 0;

    Set the parameters below for running on a video sequence:
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 10;
    createParams.quantMargin = 20;

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.
    After adding the following settings to the import configuration parameter file, recognition starts after about 3 s.
    Is there a guideline for setting these values?
    With large values, it seems that images are hardly recognized.
    quantHistoryParam1 = 50
    quantHistoryParam2 = 50
    quantMargin = 20
    What are the default values?

    Best Regards,
    Shigehiro Tsuda
  • Hi Tsuda-san,

    Thanks for the update. You can find the default values in our example application.

    Copying the description of the dynamic quantization parameter tuning below again:

    TIDL maintains range statistics for previously processed frames. It quantizes the current inference activations using the range statistics from history (a weighted-average range).

    Below are the parameters that control quantization:

    • quantMargin is the margin added to the average, in percent.
    • quantHistoryParam1 is the weight given to previously processed inferences during application boot time.
    • quantHistoryParam2 is the weight given to previously processed inferences during application execution (after the initial few frames).

    The above parameters are set based on how correlated the images in the sequence passed to TIDL are to one another. For a video sequence, the current image will usually be similar to the previous one, and the quantization parameters from history can be used. If each image belongs to a new class (whether only during boot time or throughout execution), the images are not correlated, and quantHistoryParam1 and/or quantHistoryParam2 can be set to zero.

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.

    Sorry for the late reply.
    Are the defaults the values set in the following source?

    tidl\tidl_api\src\configuration.cpp

    Configuration::Configuration(): numFrames(0), inHeight(0), inWidth(0),
                                    inNumChannels(0),
                                    noZeroCoeffsPercentage(100),
                                    preProcType(0),
                                    runFullNet(false),
                                    NETWORK_HEAP_SIZE(64 << 20), // 64MB for inceptionNetv1
                                    PARAM_HEAP_SIZE(9 << 20),    // 9MB for mobileNet1
                                    enableOutputTrace(false),
                                    enableApiTrace(false),
                                    showHeapStats(false),
                                    quantHistoryParam1(20),
                                    quantHistoryParam2(5),
                                    quantMargin(0)
    {
    }

    quantHistoryParam1=20
    quantHistoryParam2=5
    quantMargin=0

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    You got the default settings right. Repeating the recommendations on dynamic quantization settings below:

    Debugging the dynamic quantization settings is mostly trial and error. Below are some high-level guidelines:

    • To get the same result on the TIDL target as in the import step for an image, set the parameters below during algorithm creation:
    createParams.quantHistoryParam1 = 0;
    createParams.quantHistoryParam2 = 0;
    createParams.quantMargin = 0;

    • Set the parameters below for running on a video sequence:
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 10;
    createParams.quantMargin = 20;

    • The default values should be OK in many cases too (copied from the testbench, tidl_tb.c):
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 5;
    createParams.quantMargin = 0;


    Regards,
    Manisha
  • Hi Manisha,

    Thank you for the quick reply and your kind support.

    I understand that debugging of dynamic quantization settings is mostly trial and error.
    Where does the testbench file tidl_tb.c come from?
    I searched the Processor SDK but could not find it.

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Currently, the host emulation tool source code is not packaged in the Processor SDK Linux. Why would you like to refer to this file?

    Regards,
    Manisha