CCS: TDA4x: Layer dump size issue

Tool/software: Code Composer Studio

Dear Sir,

Observation: For TDA2x:

Resolution is 512x512x3

Layer dump 0 (data layer): trace_dump_0_512x512.zip

Size: 512x512x3 = 786432 bytes

Layer 0 visualized through YUView is below:

Observation: For TDA4x:

Resolution is 512x512x3

Layer dump 0 (data layer): caffe_tidl_infer_msi_mobilenet_pd.txt_0000_00003_00512x00512.zip

Size: 512x512x3 x 2 = 1572864 bytes (double the expected size)

Layer 0 visualized through YUView is below:

For TDA4x: 

Why does the layer dump have double the size of the input (512x512x3)?

How can we confirm that the input to the model is correct?

Kindly do the needful.

Thanks and Regards,

Vyom Mishra

  • Is featureParamBits set to 16? That might lead to double the buffer size for a tensor.

    Let me have a look at the files you have sent and get back to you.

  • Dear Sir,

    As per the experiment,

    "numfeaturebits"  and "numparambits" is not affecting the size of the Data Layer dump(.y).

    It is constant as 512x512x3 (x2)extra.

    Thanks and Regards,

    Vyom Mishra

  • Gentle Reminder!

  • Vyom Mishra,

    I am attaching an image. Can you verify that this is your intended output, with the black pads in the top and bottom rows?

    This is a 512 x 512 image with size = 786432 bytes, derived from the file that you attached.

    In your file, each feature is 16 bits wide, and I had to run the following code to get the correct image:

    #include <stdint.h>

    /* buffer holds the raw 16-bit trace read from the file; new_buffer is
     * assumed to be uint8_t[512*512*3]. The narrowing assignment keeps the
     * low byte of each 16-bit element, one 512x512 plane at a time. */
    uint8_t new_buffer[512 * 512 * 3];
    for (int i = 0; i < 512 * 512; i++) {
        new_buffer[i]                 = ((uint16_t *)buffer)[i];
        new_buffer[512 * 512 + i]     = ((uint16_t *)buffer)[512 * 512 + i];
        new_buffer[512 * 512 * 2 + i] = ((uint16_t *)buffer)[512 * 512 * 2 + i];
    }

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/81/caffe_5F00_tidl_5F00_infer_5F00_msi_5F00_mobilenet_5F00_pd.txt_5F00_0000_5F00_00003_5F00_00512x00512_2D00_modified.y

    - Subhajit

  • Dear Sir,

    Now the image visualized is correct, with black pads in the top and bottom rows.

    Can you please let us know where these changes need to be made so that the trace dumps are generated correctly and we can do the layer-by-layer matching?

    I have several queries regarding the same.

    With reference to the post at https://e2e.ti.com/support/processors/f/791/t/876743#pi320966=2 (a parallel query running for the same padded model):

    We are getting some detections (some are missed), but no FPs, with the below configuration:

    numParamBits = numFeatureBits = 8 and quantizationStyle = 3

    However, by increasing the above parameters to 12/16, we are getting detections with FPs.

    Can this increase in the above parameters (from 8 to 12/16) be a reason for the FPs?

    How can we tackle this FP issue?

    Kindly do the needful.

    Thanks and Regards,

    Vyom Mishra

  • Vyom Mishra,

    Can you share your import and infer config files?

    From the trace that you have shared, it looks like the model was imported for the 16-bit flow.

  • Dear Sir,

    I am sharing the import config file and the infer config file for your reference.

    Import config:

    modelType          = 0
    inputNetFile       = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/deploy.prototxt"
    inputParamsFile    = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/mob.caffemodel"
    outputNetFile      = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/tidl_net_msi_mobilenet_pd_l1.bin"
    outputParamsFile   = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/tidl_io_msi_mobilenet_pd_l1"
    numParamBits = 12
    numFeatureBits = 12
    quantizationStyle = 2
    inDataFormat = 0
    inElementType  = 0 
    inWidth = 512
    inHeight = 512
    inNumChannels = 3
    perfSimConfig = "../../test/testvecs/config/import/perfsim_base.cfg"
    inData = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/config/det_bck.txt"
    numFrames = 1
    postProcType = 2
    inFileFormat = 2

    Infer config:

    inFileFormat    = 2
    postProcType = 2
    numFrames   = 1
    padInBuffInTB = 1
    netBinFile      = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/tidl_net_msi_mobilenet_pd_l1.bin"
    ioConfigFile    = "/home/vyom/psdk_rtos_auto_j7_06_01_00_15/tidl_j7_01_00_00_00/ti_dl/test/testvecs/models/mando/fvc/od/L1_mob_final/tidl_io_msi_mobilenet_pd_l11.bin"
    outData =   "testvecs/output/msi_mobilenet.bin"
    inData  =   "testvecs/config/det_bck.txt"
    debugTraceLevel = 1
    writeTraceLevel = 3
    numFrames = 1

    Just informing you again regarding the size of the first layer dump:

    a) numParamBits = numFeatureBits = 8

        - The first layer dump has the expected size, i.e., 512x512x3.

    b) numParamBits = numFeatureBits = 12/16

        - The dump has size 512x512x3x2.

    So the traces shared earlier were with numParamBits = numFeatureBits = 12.

    Could you please share the changes Subhajit made to generate the correct layer dumps? We need to compare the PC and import tool dumps.

    Thanks and Regards,

    Vyom Mishra

  • Vyom,

    For numParamBits > 8 or numFeatureBits > 8, the dump is going to use W x H x 3 x 2 bytes, as each element is 2 bytes (16 bits).
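
    For illustration, here is a minimal sketch of that size relation, assuming (as stated above) 1-byte elements when numFeatureBits <= 8 and 2-byte elements otherwise; expected_dump_bytes is a hypothetical helper, not a TIDL API:

    #include <stdio.h>

    /* Hypothetical helper: expected trace size in bytes for a W x H x C
     * tensor, under the element-size assumption described above. */
    static size_t expected_dump_bytes(size_t w, size_t h, size_t c,
                                      int numFeatureBits)
    {
        size_t elem_bytes = (numFeatureBits <= 8) ? 1 : 2;
        return w * h * c * elem_bytes;
    }

    int main(void)
    {
        printf("%zu\n", expected_dump_bytes(512, 512, 3, 8));  /* 786432  */
        printf("%zu\n", expected_dump_bytes(512, 512, 3, 12)); /* 1572864 */
        return 0;
    }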

    - Subhajit

  • Dear Sir,

    As per the experiments, I am getting WxHx3x2 only for configuration (b) below:

    a) numParamBits = numFeatureBits = 8

        - The first layer dump has the expected size, i.e., 512x512x3.

    b) numParamBits = numFeatureBits = 12/16

        - The dump has size 512x512x3x2.

    Thanks and Regards,

    Vyom Mishra

  • Refer to the below page for debugging accuracy-mismatch issues.

    Comparing the input tensor to TIDL with the reference:

    • It is important to match the input tensor of the TIDL net with the input tensor of the network as it was trained.
    • Save the input tensor from the training code that you are using in float format.
    • Use writeTraceLevel = 3 to write the layer-level traces from TIDL to files.
    • By default, the data-normalizing batchNorm layer is merged into the following convolution layer, so set foldPreBnConv2D = 0 to avoid this.
    • Compare the output of this batchNorm layer with the input tensor from the training code (a minimal comparison sketch follows after the link). Refer Link

    http://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/latest/exports/docs/tidl_j7_01_01_00_10/ti_dl/docs/user_guide_html/md_tidl_fsg_steps_to_debug_mismatch.html
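
    As a minimal sketch of that comparison, assuming both the reference tensor and the TIDL trace have been saved as raw float32 files with the same element count (the file names and float32 layout here are assumptions, not TIDL conventions):

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Reads two raw float32 tensor dumps and reports the maximum
     * absolute element-wise difference. */
    int main(int argc, char **argv)
    {
        if (argc != 4) {
            fprintf(stderr, "usage: %s ref.bin trace.bin num_elems\n", argv[0]);
            return 1;
        }
        size_t n = strtoul(argv[3], NULL, 10);
        float *a = malloc(n * sizeof *a);
        float *b = malloc(n * sizeof *b);
        FILE *fa = fopen(argv[1], "rb");
        FILE *fb = fopen(argv[2], "rb");
        if (!a || !b || !fa || !fb)
            return 1;
        if (fread(a, sizeof *a, n, fa) != n || fread(b, sizeof *b, n, fb) != n)
            return 1;
        float max_diff = 0.0f;
        for (size_t i = 0; i < n; i++) {
            float d = fabsf(a[i] - b[i]);
            if (d > max_diff)
                max_diff = d;
        }
        printf("max abs diff = %g\n", max_diff);
        return 0;
    }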

  • Dear Sir,

    We have followed the suggestion to set "foldPreBnConv2D = 0", but the results were the same as before, i.e., multiple boxes on a single object.

    We have also compared the output of this batchNorm layer with the input tensor; the results can be found in the below folder for your reference:

    TI Query.zip

    Observations:

    The data layer and batch norm layer output data match for both the PC and the import tool.

    Kindly provide feedback on the same.

    I have a query regarding the "Feature Map Scale Analysis".

    As observed, the min/max values for layer 0 (data layer) and layer 1 (batch norm) are greater than 32, which means these layers have the maximum quantization loss. For your reference, please find the console output in the zip file shared above.

    But, as observed earlier when we followed the suggestion of the "Weights Quantization Statistic Analysis", we found and shared with you that the depthwise convolution (dw) has the maximum quantization loss.

    The above two conclusions are different. Kindly help us understand if we have misunderstood something.
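
    For intuition on how the value range relates to quantization loss: for a fixed bit depth, the quantization step grows with the range, roughly step = range / (2^bits - 1). A back-of-envelope sketch (illustrative only; TIDL's actual per-layer scale selection may differ):

    #include <stdio.h>

    /* Quantization step for a given value range and bit depth:
     * step = range / (2^bits - 1). */
    int main(void)
    {
        double range = 64.0; /* e.g. a feature map spanning roughly [-32, 32] */
        for (int bits = 8; bits <= 16; bits += 4)
            printf("%2d bits -> step = %g\n", bits, range / ((1 << bits) - 1));
        return 0;
    }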

    Kindly do the needful.

    Thanks and Regards,

    Vyom Mishra

  • Dear Sir,

    We are trying to do layer-by-layer matching between the target and the PC.

    Below are the visual observations so far:

    a) The pooling layer output from the board side matches the PC.

    b) Out of the first 5 convolutions, the first convolution layer output doesn't match the PC, but the other four convolution outputs match.

    c) Detection output layer (the final layer of the model):

        - Target side: we have two bounding boxes, one of which matches the PC, whereas the other does not, as there are multiple bounding boxes on the same object.

    What could be a possible reason for point (a)?

    Kindly do the needful.

    Thanks and Regards,

    Vyom Mishra

  • Layer-level traces of the PC and target execution are supposed to match (for the same input).

    Is the trace of the input tensor (data ID 0) matching? If yes, can you share the sample model (the layers with the issue) so that we can reproduce the issue at our end?

  • The model has been shared with Sujith and Kartik.

    We have made certain changes to the aspect ratios of the prior boxes. These changes are present in the deploy.prototxt. We are not sure if these changes are being taken forward into the model after running the import tool.

    Following is a snippet of the deploy.prototxt showing the kind of aspect ratio values for the prior boxes:

    layer {
      name: "ctx_output1/sep/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output1/sep"
      bottom: "data"
      top: "ctx_output1/sep/relu_mbox_priorbox"
      prior_box_param {
        min_size: 35.0
        max_size: 109.800003052
        aspect_ratio: 0.5
        aspect_ratio: 0.333333343267

    Regards,

    Sankalp

  • Hi Sankalp,

    Can you share one input image and the expected output from Caffe for the same (along with the layer-level float tensors from Caffe)?

    We will try to reproduce this issue at our end using the model that you have shared.

    Regarding the aspect_ratio, the import tool is expected to read these aspect ratios from deploy.prototxt during import.

    Regards,

    Kumar.D

  • Hi Sankalp,

    I have also noticed the following in the model that you have shared:

    layer {
      name: "ctx_output2/sep/relu_mbox_priorbox"
      type: "PriorBox"
      bottom: "ctx_output2/sep"
      bottom: "data"
      top: "ctx_output2/sep/relu_mbox_priorbox"
      prior_box_param {
        min_size: 109.800003052
        max_size: 184.600006104
        aspect_ratio: 0.5
        aspect_ratio: 0.40000000596
        aspect_ratio: 0.001
        aspect_ratio: 0.285714298487
        flip: true
        clip: false
        variance: 0.10000000149
        variance: 0.10000000149
        variance: 0.20000000298
        variance: 0.20000000298
        offset: 0.5
      }
    }

    This means an aspect ratio of 1:1000. Is this expected?
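
    For reference, the standard Caffe SSD PriorBox layer generates, for each aspect ratio ar, a box of width min_size * sqrt(ar) and height min_size / sqrt(ar) (assuming the stock Caffe SSD semantics). A quick sketch of what ar = 0.001 produces for this layer's min_size:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double s = 109.800003052; /* min_size from the prototxt above */
        double ars[] = { 0.5, 0.40000000596, 0.001, 0.285714298487 };
        for (int i = 0; i < 4; i++) {
            /* box width = s * sqrt(ar), box height = s / sqrt(ar) */
            printf("ar=%g -> w=%.1f, h=%.1f\n",
                   ars[i], s * sqrt(ars[i]), s / sqrt(ars[i]));
        }
        return 0;
    }

    With ar = 0.001 this gives a prior of roughly 3.5 x 3472 pixels, far taller than the 512-pixel input, which is why the value looks suspicious.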

    Regards,

    Kumar.D