
Linux/AM5749: TIDL import parameter

Part Number: AM5749

Tool/software: Linux

Hi,

I am verifying CIFAR-10 classification and have been following the web page below.

http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components_TIDL.html


The CIFAR-10 images are 32x32.
Please tell me how to use the following setting.

3.15.3.7.2. Sample configuration file (tidl_import_j11_v2.txt)

# Calibration image file
sampleInData = "import/test.raw"

① Should I use the file in the following directory?
ti-processor-sdk-linux-am57xx-evm-05.02.00.10\filesys\usr\share\ti\tidl\examples\test\testvecs\input\preproc_3_32x32.y
② Or must it be created separately from the dataset?

Please tell me the details of how to create this data.

Best Regards,
Shigehiro Tsuda

  • Hi,

    The sampleInData is the first image that you would like to run inference on. It should be in the same format that the network model was trained on.

    As mentioned in section 3.15.3.7 of the document, the import process is a two-step process.

    • The first step parses the model parameters and network topology and converts them into a custom format that the TIDL library can understand.
    • The second step calibrates the dynamic quantization process by finding the range of activations for each layer. This is accomplished by invoking a simulation (using a native C implementation) that estimates initial values needed for the quantization process. These values are later updated on a per-frame basis, assuming strong temporal correlation between input frames.

    The sampleInData is used in the second step to calibrate the dynamic quantization, i.e., to find the range of activations for each layer.
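
    As a sanity check, a raw calibration image should be exactly width x height x channels bytes for an 8-bit image; for a 32x32x3 CIFAR-10 image that is 3072 bytes. A minimal check (the file name is just an example):

    import os
    assert os.path.getsize("import/test.raw") == 32 * 32 * 3  # 3072 bytes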

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for the quick reply.
    I understand that the file needs to be created from the training data.

    What is the data format?
    I obtained the CIFAR-10 dataset from the following site, but I did not understand how to create this file from it.
    Please let me know if there is any information on this.
    www.cs.toronto.edu/.../cifar.html

    Best Regards,
    Shigehiro Tsuda
  • Hi Tsuda-san,

    CIFAR-10 images are 32x32x3, raw data.

    You can convert PNG or JPG images to .raw format using the ImageMagick "convert" utility.

    The following operations need to be performed on the .png/.jpg image:
    • SWAP RGB->BGR channels (if channel swap needed): convert $filename -separate +channel -swap 0,2 -combine -colorspace sRGB ./sample_bgr_256x256.png
    • RESIZE (if required): convert ./sample_bgr_256x256.png -resize 224x224 ./sample_bgr.png
    • SPLIT PLANES (required): convert ./sample_bgr.png -interlace plane BGR:sample_img_256x256.raw
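
    Since CIFAR-10 ships as binary batch files rather than PNG/JPG, below is a minimal sketch (my own, not from the SDK) that pulls one record out of a batch file and writes it directly as a 32x32 planar-BGR .raw, the same layout the "BGR:" step above produces. It assumes the binary version of the dataset (each record is 1 label byte followed by 1024 R, 1024 G, and 1024 B bytes); the file names and record index are examples.

    RECORD = 1 + 32 * 32 * 3  # 1 label byte + 3072 pixel bytes

    def cifar10_to_raw(batch_file, index, out_file):
        with open(batch_file, "rb") as f:
            f.seek(index * RECORD)
            rec = f.read(RECORD)
        pixels = rec[1:]                       # skip the label byte
        r = pixels[0:1024]                     # red plane
        g = pixels[1024:2048]                  # green plane
        b = pixels[2048:3072]                  # blue plane
        with open(out_file, "wb") as f:        # write planes in BGR order
            f.write(b + g + r)

    cifar10_to_raw("data_batch_1.bin", 0, "cifar10_img_32x32.raw")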

    Regards,
    Manisha
  • Hi Manisha,

    Thank you for the quick reply.
    I will try the approach you described.
    Am I correct in understanding that one image is selected from the multiple images in the dataset?

    Best Regards,
    Shigehiro Tsuda
  • What do you mean by "one of the multiple image data is selected"?
  • Hi Manisha,

    Thank you for the quick reply.

    CIFAR-10 has the following classes.
    Each class has 1000 images.
    Are there criteria or conditions for choosing the sampleInData image?
    Or is the choice entirely random?

    1.airplane
    2.automobile
    3.bird
    4.cat
    5.deer
    6.dog
    7.frog
    8.horse
    9.ship
    10.truck

    Best Regards,
    Shigehiro Tsuda

  • Hi Manisha,

    Thank you for your support.

    I have run the classification with CIFAR-10, but it does not recognize images correctly.
    Could you check whether the attached files are incorrect?
    1.cnn.txt
    2.cnn_config.txt

    I ran the check with the following commands.

    root@am57xx-evm:/home/cifar10# tidl_model_import.out cnn.txt
    root@am57xx-evm:/home/cifar10# /usr/share/ti/tidl/examples/classification/tidl_classification -g 1 -d 2 -e 2 -l ./cifar10.txt -s ./classlist_cifar10.txt -i 1 -c ./cnn_config.txt

    cnn.txt:

    # Default - 0
    randParams         = 0 
    
    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType          = 0 
    
    # 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
    quantizationStyle  = 1 
    
    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd      = 25
    
    numParamBits       = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType      = 0
    
    inputNetFile       = "/home/cifar10/deploy.prototxt"
    inputParamsFile    = "/home/cifar10/cifar10_jacintonet11v2_iter_64000.caffemodel"
    outputNetFile      = "/home/cifar10/tidl_net_cifar10_convert.bin"
    outputParamsFile   = "/home/cifar10/tidl_param_cifar10_convert.bin"
    
    rawSampleInData = 1
    sampleInData = "/home/cifar10/cifar10_img_32x32.raw"
    #sampleInData = "/home/cifar10/cifar10_img_224x224.raw"
    tidlStatsTool = "/usr/bin/eve_test_dl_algo.out"

    cnn_config.txt:
    numFrames     = 10000
    preProcType   = 0
    #inData        = /home/cifar10/cifar10_img_224x224.raw
    inData        = /home/cifar10/cifar10_img_32x32.raw
    outData       = "/usr/bin/stats_tool_out.bin"
    netBinFile    = "/home/cifar10/tidl_net_cifar10_convert.bin"
    paramsBinFile = "/home/cifar10/tidl_param_cifar10_convert.bin"
    #inWidth       = 224
    #inHeight      = 224
    inWidth       = 32
    inHeight      = 32
    inNumChannels = 3
    #inNumChannels = 1
    #layerIndex2LayerGroupId = { {12, 2}, {13, 2}, {14, 2} }
    
    


    [result]
    Everything is always recognized as a cat, even when no cat image is shown.
    It also seems that nothing other than cats and dogs is ever detected; is that correct?

    root@am57xx-evm:/home/cifar10# /usr/share/ti/tidl/examples/classification/tidl_classification -g 1 -d 2 -e 2 -l ./cifar10.txt -s ./classlist_cat_only.txt -i 1 -c ./cnn_config.txt
    [ 6173.073962] omap-iommu 58882000.mmu: 58882000.mmu: version 2.1
    [ 6173.118221] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [ 6173.124114] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [ 6173.131037] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [ 6173.144562] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [ 6173.150452] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [ 6173.156409] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    ==Total of 11 items!
    0) airplane
    1) automobile
    2) bird
    3) cat
    4) deer
    5) dog
    6) frog
    7) horse
    8) ship
    9) truck
    10)
    Searching for cat
    Found: 3
    Searching for dog
    Found: 5
    Run single configuration: ./cnn_config.txt
    [ 6173.791247] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.797577] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.804791] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.811095] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.838112] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.845307] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 00004040
    [ 6173.851602] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    [ 6173.858800] [drm:omap_crtc_error_irq] *ERROR* lcd: errors: 000040e2
    [ 6173.865094] [drm:omap_irq_handler] *ERROR* FIFO underflow on gfx (0x00000040)
    init done
    Using Wayland-EGL
    wlpvr: PVR Services Initialised
    Using the 'xdg-shell-v5' shell integration
    Capture camera with 30 fps, 640x480 px
    Rect[10, 10]
    About to start ProcessFrame loop!!
    Frame:796,788 ROI[0]: rank=1, outval=0.12549, cat
    Frame:804,796 ROI[0]: rank=1, outval=0.133333, cat
    Frame:805,797 ROI[0]: rank=2, outval=0.145098, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:345.608
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:199.241
    Frame:808,800 ROI[0]: rank=2, outval=0.196078, cat
    Frame:811,803 ROI[0]: rank=1, outval=0.109804, cat
    Frame:812,804 ROI[0]: rank=2, outval=0.266667, cat
    ROI(0)(3)=cat
    Device:EVE0 eops(8), EVES(2) DSPS(2) FPS:141.853
    Frame:813,805 ROI[0]: rank=2, outval=0.137255, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:110.08
    Frame:814,806 ROI[0]: rank=1, outval=0.109804, cat
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:89.2373
    ROI(0)(3)=cat
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:74.8635
    Frame:816,808 ROI[0]: rank=1, outval=0.431373, cat
    ROI(0)(3)=cat


    Best Regards,
    Shigehiro Tsuda

  • Hi,

    I will look into this and get back to you in the next couple of days.

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for the quick reply.
    I look forward to the results of your investigation.

    Best Regards,
    Shigehiro Tsuda
  • Hi,

    It seems to me that you are violating the constraints on network parameter configuration. A common one to violate is the number of input nodes to the InnerProduct layer.

    Can you please generate the network model traces and graph for your model and share them with us?

  • Hi Manisha,

    Thank you for the quick reply.

    How can I generate traces and a graph of the network model?

    I have attached the output of the network graph viewer.

    I could not figure out how to use the execution graph viewer.
    Please tell me how to generate the execution graph.

    cifar10.pdf

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    The generated network graph looks good. I wonder whether you are feeding the image in the same format that you used for training (RGB vs. BGR); that matters.

    Also, if you can share the network binary, parameter binary, and configuration file (if it has changed from the previous one), we can try to reproduce and analyze the problem on our end.

    Regards,
    Manisha
  • Hi Manisha,

    Thank you for your support.
    I have attached the network binary and parameter binary.
    The configuration file has not changed from the previous one.

    tidl_cifar10_convert.zip

    Best Regards,
    Shigehiro Tsuda

  • Thanks Tsuda-san,

    We will study this and get back.

    Regards,
    Manisha
  • Hi Tsuda-san,

    We are studying the failure of your network model. We need the .prototxt and .caffemodel files to investigate further. Can you please share them?

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.
    I have attached the .prototxt and .caffemodel files.
    For CIFAR-10, I found that I need to set preProcType = 3 in the config file.

    Recognition now works, but it seems slow to start recognizing after the classification app is launched.
    I am using the classification sample as-is.
    Do you know what the cause might be?

    caffemodel_deploy.zip

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Are you running your network model entirely on the EVE subsystem, or on the DSP cores? There are a few layers that run faster on the DSP (the SoftMax, Flatten, and Concat layers). You may want to consider splitting your network model between the EVE and DSP cores, as shown below.
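
    For example, the line that is commented out in your cnn_config.txt is the mechanism for this; enabling it assigns the listed layer indices to layer group 2, which the classification example typically schedules on the DSP (the indices below are simply the ones from your file and may need adjusting for your network):

    layerIndex2LayerGroupId = { {12, 2}, {13, 2}, {14, 2} }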

  • Hi Manisha,

    Thank you for your support.

    Yes, I am running the network model on both the EVE and DSP cores.
    I tried processing on the EVE and DSP separately, but it does not change anything.
    The log is below.
    It was captured with a cat image.

    About to start ProcessFrame loop!!
    Frame:8,0 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:8,0 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:8,0 ROI[0]: rank=1, outval=0.0980392, bird
    Frame:9,1 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:9,1 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:9,1 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:162.404
    Frame:10,2 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:10,2 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:10,2 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:124.756
    Frame:11,3 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:11,3 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:11,3 ROI[0]: rank=1, outval=0.0980392, bird
    ROI(0)(2)=bird
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:108.252
    Frame:12,4 ROI[0]: rank=3, outval=0.0980392, airplane
    Frame:12,4 ROI[0]: rank=2, outval=0.0980392, automobile
    Frame:12,4 ROI[0]: rank=1, outval=0.0980392, bird


    Several consecutive frames show the same values:

    ROI[0]: rank=3, outval=0.0980392, airplane
    ROI[0]: rank=2, outval=0.0980392, automobile
    ROI[0]: rank=1, outval=0.0980392, bird

    Recognition begins after about 30s.

    Device:EVE0 eops(8), EVES(2) DSPS(2) FPS:30.064
    Frame:833,825 ROI[0]: rank=3, outval=0, automobile
    Frame:833,825 ROI[0]: rank=2, outval=0, bird
    Frame:833,825 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat
    Device:EVE1 eops(8), EVES(2) DSPS(2) FPS:30.6044
    Frame:834,826 ROI[0]: rank=3, outval=0, automobile
    Frame:834,826 ROI[0]: rank=2, outval=0, bird
    Frame:834,826 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat
    Device:DSP0 eops(8), EVES(2) DSPS(2) FPS:30.0183
    Frame:835,827 ROI[0]: rank=3, outval=0.0196078, airplane
    Frame:835,827 ROI[0]: rank=2, outval=0.0235294, bird
    Frame:835,827 ROI[0]: rank=1, outval=0.941176, cat
    ROI(0)(3)=cat
    Device:DSP1 eops(8), EVES(2) DSPS(2) FPS:29.8626
    Frame:836,828 ROI[0]: rank=3, outval=0, automobile
    Frame:836,828 ROI[0]: rank=2, outval=0, bird
    Frame:836,828 ROI[0]: rank=1, outval=0.996078, cat
    ROI(0)(3)=cat

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Okay, now I understand what you meant by a slow start. You are saying that the classification results are incorrect in the beginning and improve after some time. I think dynamic quantization might be contributing to this. When you imported the network model, did you provide a cat image as input so that the import step could find the right dynamic range for quantization?
  • Hi Manisha,

    Thank you for the quick reply.

    I understand that dynamic quantization may be the cause.
    I did not provide a cat image to the import step; I provided one dog image from the CIFAR-10 set.

    Best Regards,
    Shigehiro Tsuda


  • Hi Tsuda-san,

    Please see the guidelines below on matching the TIDL inference results and tuning the quantization parameters.

    Matching TIDL inference result

    The TIDL import step runs inference on the PC, and the result should match the expected output (from Caffe or TensorFlow inference). If you observe a difference at this stage, please follow the steps below to debug.

    1. The Caffe inference input and the TIDL inference input shall match. The import step dumps the input of the first layer to “trace_dump_0_*”; make sure this is the same for Caffe as well.

    2. If the input matches, then dump layer-level features from Caffe and match them against the TIDL import traces.

    3. The TIDL trace is in fixed point and can be converted to floating point using the OutQ value printed in the import log (see the sketch after this list). Due to quantization the results will not match exactly, but they will be similar.

    4. Check the parameters of the layer where the mismatch is observed.

    5. Share the input and parameters with TI for further debug.
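
    For step 3, a minimal dequantization sketch (my own, assuming an 8-bit unsigned trace and that OutQ is the scale factor such that float ~ fixed / OutQ; use int8 for signed layers, and take the OutQ value for the layer from the import log. The file name and OutQ value below are examples.):

    import numpy as np

    def trace_to_float(trace_file, out_q):
        fixed = np.fromfile(trace_file, dtype=np.uint8)  # raw trace dump
        return fixed.astype(np.float32) / out_q          # back to float scale

    features = trace_to_float("trace_dump_0_32x32.y", out_q=128.0)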

    We use the statistics collected from previously processed frames to quantize the activations dynamically in the current frame. So, for the same input images, the results observed on the target will NOT be the same as in the import step (but they will be similar). We have validated this logic with a semantic segmentation application on an input video sequence.

    TIDL maintains range statistics for previously processed frames. It quantizes the current inference activations using the range statistics from history (a weighted-average range).

    Below are the parameters that control quantization.

    quantMargin is the margin added to the average, in percent.

    quantHistoryParam1 is the weight given to previously processed inferences during application boot time.

    quantHistoryParam2 is the weight given to previously processed inferences during application execution (after the initial few frames).
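
    As an illustrative sketch (not TIDL source code), the history weighting can be thought of roughly as follows, assuming the history parameters are percentage weights and quantMargin is a percentage margin on top:

    def update_range(history_max, current_max, quant_history, quant_margin):
        # Blend the current frame's activation range with the history
        # (weights are percentages), then add a safety margin.
        blended = (quant_history * history_max
                   + (100 - quant_history) * current_max) / 100.0
        return blended * (1.0 + quant_margin / 100.0)

    Setting the history weights to zero means only the current frame is used, which is why the settings below reproduce the import-step result for a single image.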

    To get the same result on the TIDL target as in the import step for an image, please set the parameters below during algorithm creation.

    createParams.quantHistoryParam1 = 0;
    createParams.quantHistoryParam2 = 0;
    createParams.quantMargin = 0;

    Set the parameters below for running on a video sequence:
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 10;
    createParams.quantMargin = 20;

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.
    After adding the following settings to the import configuration parameter file, recognition starts after about 3 s.
    Is there a guideline for setting these values?
    With large values, it seems that images are hardly recognized.
    quantHistoryParam1 = 50
    quantHistoryParam2 = 50
    quantMargin = 20
    What are the default values?

    Best Regards,
    Shigehiro Tsuda
  • Hi Tsuda-san,

    Thanks for the update. You can find the default values in our example application.

    Copying the description of the dynamic quantization parameter tuning below again:

    TIDL maintains range statistics for previously processed frames. It quantizes the current inference activations using the range statistics from history (a weighted-average range).

    Below are the parameters that control quantization:

    • quantMargin is the margin added to the average, in percent.
    • quantHistoryParam1 is the weight given to previously processed inferences during application boot time.
    • quantHistoryParam2 is the weight given to previously processed inferences during application execution (after the initial few frames).

    The above parameters are set based on how correlated the images in the sequence passed to TIDL are to one another. For a video sequence, the current image will usually be similar to the previous one, and the quantization parameters from history can be used. If each image belongs to a new class (whether only during boot time or throughout execution), the images are not correlated, and quantHistoryParam1 and/or quantHistoryParam2 can be set to zero.

    Regards,

    Manisha

  • Hi Manisha,

    Thank you for your support.

    Sorry for the late reply.
    Are the defaults the values set in the following source?

    tidl\tidl_api\src\configuration.cpp

    Configuration::Configuration(): numFrames(0), inHeight(0), inWidth(0),
                                    inNumChannels(0),
                                    noZeroCoeffsPercentage(100),
                                    preProcType(0),
                                    runFullNet(false),
                                    NETWORK_HEAP_SIZE(64 << 20), // 64MB for inceptionNetv1
                                    PARAM_HEAP_SIZE(9 << 20),    // 9MB for mobileNet1
                                    enableOutputTrace(false),
                                    enableApiTrace(false),
                                    showHeapStats(false),
                                    quantHistoryParam1(20),
                                    quantHistoryParam2(5),
                                    quantMargin(0)
    {
    }

    quantHistoryParam1=20
    quantHistoryParam2=5
    quantMargin=0

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    You got the default settings right. Repeating the recommendations on dynamic quantization settings below:

    Debugging the dynamic quantization settings is mostly trial and error. Below are some high-level guidelines:

    • To get the same result on the TIDL target as in the import step for an image, set the parameters below during algorithm creation:
    createParams.quantHistoryParam1 = 0;
    createParams.quantHistoryParam2 = 0;
    createParams.quantMargin = 0;

    • Set the parameters below for running on a video sequence:
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 10;
    createParams.quantMargin = 20;

    • The default values should be OK in many cases too (copied from the testbench, tidl_tb.c):
    createParams.quantHistoryParam1 = 20;
    createParams.quantHistoryParam2 = 5;
    createParams.quantMargin = 0;


    Regards,
    Manisha
  • Hi Manisha,

    Thank you for the quick reply and your kind support.

    I understand that debugging of dynamic quantization settings is mostly trial and error.
    Where does the testbench file tidl_tb.c come from?
    I searched the Processor SDK but could not find it.

    Best Regards,
    Shigehiro Tsuda

  • Hi Tsuda-san,

    Currently, the host emulation tool source code is not packaged in the Processor SDK Linux. Why would you like to refer to this file?

    Regards,
    Manisha