AM5749: SSD model speed issue

Part Number: AM5749

I have an AM5749 board and am testing SSD on it. With the model that ships in the filesystem, which uses both the EVE and the DSP, inference takes 197 ms. With the model I imported myself, it takes 1.448e+04 ms (about 14.5 s). The model I imported is /caffe-jacinto-models/trained/object_detection/voc0712/JDetNet/ssd768x320_ds_PSP_dsFac_32_hdDS8_0/sparse/voc0712_ssdJacintoNetV2_iter_120000.caffemodel (from the git repository), and my tidl_import.txt is the same as the .txt file in the filesystem. Could you give me some advice?

  • Is there any update?

  • Part Number: AM5749

    I imported the SSD model, which produced net.bin and param.bin, but it runs far too slowly: 1.487e+04 ms. With the net.bin and param.bin that are already in the filesystem, it takes 167 ms. Did I miss a step?

    Below is the import file. The model I used is from caffe-jacinto-models.

    # Default - 0
    randParams         = 0 
    
    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType          = 0 
    
    # 0: Fixed quantization By training Framework, 1: Dynamic quantization by TIDL, Default - 1
    quantizationStyle  = 1 
    
    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd      = 25
    
    numParamBits       = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType      = 0 
    
    inputNetFile      = "../test_quantize/deploy.prototxt"
    
    inputParamsFile    = "./voc0712_ssdJacintoNetV2_iter_120000.caffemodel"
    
    outputNetFile      = "./tidl_net_jdetNet_ssd_768x320.bin"
    outputParamsFile   = "./tidl_param_jdetNet_ssd_768x320.bin"
    
    rawSampleInData = 1
    preProcType   = 4
    sampleInData = "./trace_dump_0_768x320.y"
    tidlStatsTool = "eve_test_dl_algo_ref.out"
    layersGroupId = 0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	2	0
    conv2dKernelType = 0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1

  • layersGroupId 1 is for EVE. You are not using the DSP for processing; all the layers are running on EVE only. Please offload the last few layers to the DSP. Make sure you are using two groups and utilizing both EVEs and one DSP via the command-line arguments.

    Also, check this post -
    https://e2e.ti.com/support/processors/f/791/t/857974#pi320966=1
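
    One way to check how an import file distributes layers is to tally the entries of its layersGroupId line. The following is a minimal sketch in Python, not part of the TIDL tools: the helper names `parse_import_config` and `group_histogram` are made up for illustration, and the sample text is a shortened, hypothetical config rather than the full file from the post. It assumes the `key = value` layout with `#` comments seen in the posted import file.

```python
from collections import Counter


def parse_import_config(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip().strip('"')
    return params


def group_histogram(params):
    """Count how many layersGroupId entries carry each group id."""
    ids = [int(tok) for tok in params.get("layersGroupId", "").split()]
    return Counter(ids)


# Shortened, hypothetical sample; NOT the full line from the post.
sample = """
# 0: Caffe, 1: TensorFlow, Default - 0
modelType          = 0
layersGroupId = 0\t1\t1\t1\t1\t2\t0
"""

hist = group_histogram(parse_import_config(sample))
for group_id, count in sorted(hist.items()):
    print(f"group {group_id}: {count} entries")
```

    Run against a real import file (for example by reading it with `open(path).read()`), a tally dominated by group 1 with a single group 2 entry would match the situation described in this reply.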
  • I have checked the import.txt file, and the last layer is on the DSP: its layersGroupId is 2.

    I also read the post, and I have confirmed that the import file is right.

  • Here are my import file and the model. The import log follows:

    [linux-devkit]:~/code/11-tidl/caffe-jacinto-models/trained/object_detection/voc0712/JDetNet/ssd768x320_ds_PSP_dsFac_32_hdDS8_0/sparse> tidl_model_import.out tidl_import_JDetNet_768x320.txt 
    Caffe Network File : ./deploy.prototxt  
    Caffe Model File   : ./voc0712_ssdJacintoNetV2_iter_120000.caffemodel  
    TIDL Network File  : ./tidl_net_jdetNet_ssd_768x320.bin  
    TIDL Model File    : ./tidl_param_jdetNet_ssd_768x320.bin  
    Name of the Network : ssdJacintoNetV2_deploy 
    Num Inputs :               1 
    Could not find detection_out Params
     Num of Layer Detected :  50 
      0, TIDL_DataLayer                , data                                      0,  -1 ,  1 ,   x ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  0 ,       0 ,       0 ,       0 ,       0 ,       1 ,       3 ,     320 ,     768 ,         0 ,
      1, TIDL_BatchNormLayer           , data/bias                                 1,   1 ,  1 ,   0 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  1 ,       1 ,       3 ,     320 ,     768 ,       1 ,       3 ,     320 ,     768 ,    737280 ,
      2, TIDL_ConvolutionLayer         , conv1a                                    1,   1 ,  1 ,   1 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  2 ,       1 ,       3 ,     320 ,     768 ,       1 ,      32 ,     160 ,     384 , 147456000 ,
      3, TIDL_ConvolutionLayer         , conv1b                                    1,   1 ,  1 ,   2 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  3 ,       1 ,      32 ,     160 ,     384 ,       1 ,      32 ,      80 ,     192 , 141557760 ,
      4, TIDL_ConvolutionLayer         , res2a_branch2a                            1,   1 ,  1 ,   3 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  4 ,       1 ,      32 ,      80 ,     192 ,       1 ,      64 ,      80 ,     192 , 283115520 ,
      5, TIDL_ConvolutionLayer         , res2a_branch2b                            1,   1 ,  1 ,   4 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  5 ,       1 ,      64 ,      80 ,     192 ,       1 ,      64 ,      40 ,      96 , 141557760 ,
      6, TIDL_ConvolutionLayer         , res3a_branch2a                            1,   1 ,  1 ,   5 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  6 ,       1 ,      64 ,      40 ,      96 ,       1 ,     128 ,      40 ,      96 , 283115520 ,
      7, TIDL_ConvolutionLayer         , res3a_branch2b                            1,   1 ,  1 ,   6 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  7 ,       1 ,     128 ,      40 ,      96 ,       1 ,     128 ,      20 ,      48 , 141557760 ,
      8, TIDL_ConvolutionLayer         , res4a_branch2a                            1,   1 ,  1 ,   7 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  8 ,       1 ,     128 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 , 283115520 ,
      9, TIDL_ConvolutionLayer         , res4a_branch2b                            1,   1 ,  1 ,   8 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  9 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 , 141557760 ,
     10, TIDL_PoolingLayer             , pool4                                     1,   1 ,  1 ,   9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 10 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      10 ,      24 ,    245760 ,
     11, TIDL_ConvolutionLayer         , res5a_branch2a                            1,   1 ,  1 ,  10 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 11 ,       1 ,     256 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 283115520 ,
     12, TIDL_ConvolutionLayer         , res5a_branch2b                            1,   1 ,  1 ,  11 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 12 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,      10 ,      24 , 141557760 ,
     13, TIDL_PoolingLayer             , pool6                                     1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 13 ,       1 ,     512 ,      10 ,      24 ,       1 ,     512 ,       5 ,      12 ,    122880 ,
     14, TIDL_PoolingLayer             , pool7                                     1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 14 ,       1 ,     512 ,       5 ,      12 ,       1 ,     512 ,       3 ,       6 ,     36864 ,
     15, TIDL_PoolingLayer             , pool8                                     1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 15 ,       1 ,     512 ,       3 ,       6 ,       1 ,     512 ,       2 ,       3 ,     12288 ,
     16, TIDL_ConvolutionLayer         , ctx_output1                               1,   1 ,  1 ,   9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 16 ,       1 ,     256 ,      20 ,      48 ,       1 ,     256 ,      20 ,      48 ,  62914560 ,
     17, TIDL_ConvolutionLayer         , ctx_output2                               1,   1 ,  1 ,  12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 17 ,       1 ,     512 ,      10 ,      24 ,       1 ,     256 ,      10 ,      24 ,  31457280 ,
     18, TIDL_ConvolutionLayer         , ctx_output3                               1,   1 ,  1 ,  13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 18 ,       1 ,     512 ,       5 ,      12 ,       1 ,     256 ,       5 ,      12 ,   7864320 ,
     19, TIDL_ConvolutionLayer         , ctx_output4                               1,   1 ,  1 ,  14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 19 ,       1 ,     512 ,       3 ,       6 ,       1 ,     256 ,       3 ,       6 ,   2359296 ,
     20, TIDL_ConvolutionLayer         , ctx_output5                               1,   1 ,  1 ,  15 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 20 ,       1 ,     512 ,       2 ,       3 ,       1 ,     256 ,       2 ,       3 ,    786432 ,
     21, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_loc                 1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 21 ,       1 ,     256 ,      20 ,      48 ,       1 ,      16 ,      20 ,      48 ,   3932160 ,
     22, TIDL_FlattenLayer             , ctx_output1/relu_mbox_loc_perm            1,   1 ,  1 ,  21 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 22 ,       1 ,      16 ,      20 ,      48 ,       1 ,       1 ,       1 ,   15360 ,         1 ,
     23, TIDL_ConvolutionLayer         , ctx_output1/relu_mbox_conf                1,   1 ,  1 ,  16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 23 ,       1 ,     256 ,      20 ,      48 ,       1 ,      84 ,      20 ,      48 ,  20643840 ,
     24, TIDL_FlattenLayer             , ctx_output1/relu_mbox_conf_perm           1,   1 ,  1 ,  23 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 24 ,       1 ,      84 ,      20 ,      48 ,       1 ,       1 ,       1 ,   80640 ,         1 ,
     26, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_loc                 1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 26 ,       1 ,     256 ,      10 ,      24 ,       1 ,      24 ,      10 ,      24 ,   1474560 ,
     27, TIDL_FlattenLayer             , ctx_output2/relu_mbox_loc_perm            1,   1 ,  1 ,  26 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 27 ,       1 ,      24 ,      10 ,      24 ,       1 ,       1 ,       1 ,    5760 ,         1 ,
     28, TIDL_ConvolutionLayer         , ctx_output2/relu_mbox_conf                1,   1 ,  1 ,  17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 28 ,       1 ,     256 ,      10 ,      24 ,       1 ,     126 ,      10 ,      24 ,   7741440 ,
     29, TIDL_FlattenLayer             , ctx_output2/relu_mbox_conf_perm           1,   1 ,  1 ,  28 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 29 ,       1 ,     126 ,      10 ,      24 ,       1 ,       1 ,       1 ,   30240 ,         1 ,
     31, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_loc                 1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 31 ,       1 ,     256 ,       5 ,      12 ,       1 ,      24 ,       5 ,      12 ,    368640 ,
     32, TIDL_FlattenLayer             , ctx_output3/relu_mbox_loc_perm            1,   1 ,  1 ,  31 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 32 ,       1 ,      24 ,       5 ,      12 ,       1 ,       1 ,       1 ,    1440 ,         1 ,
     33, TIDL_ConvolutionLayer         , ctx_output3/relu_mbox_conf                1,   1 ,  1 ,  18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 33 ,       1 ,     256 ,       5 ,      12 ,       1 ,     126 ,       5 ,      12 ,   1935360 ,
     34, TIDL_FlattenLayer             , ctx_output3/relu_mbox_conf_perm           1,   1 ,  1 ,  33 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 34 ,       1 ,     126 ,       5 ,      12 ,       1 ,       1 ,       1 ,    7560 ,         1 ,
     36, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_loc                 1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 36 ,       1 ,     256 ,       3 ,       6 ,       1 ,      24 ,       3 ,       6 ,    110592 ,
     37, TIDL_FlattenLayer             , ctx_output4/relu_mbox_loc_perm            1,   1 ,  1 ,  36 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 37 ,       1 ,      24 ,       3 ,       6 ,       1 ,       1 ,       1 ,     432 ,         1 ,
     38, TIDL_ConvolutionLayer         , ctx_output4/relu_mbox_conf                1,   1 ,  1 ,  19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 38 ,       1 ,     256 ,       3 ,       6 ,       1 ,     126 ,       3 ,       6 ,    580608 ,
     39, TIDL_FlattenLayer             , ctx_output4/relu_mbox_conf_perm           1,   1 ,  1 ,  38 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 39 ,       1 ,     126 ,       3 ,       6 ,       1 ,       1 ,       1 ,    2268 ,         1 ,
     41, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_loc                 1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 41 ,       1 ,     256 ,       2 ,       3 ,       1 ,      16 ,       2 ,       3 ,     24576 ,
     42, TIDL_FlattenLayer             , ctx_output5/relu_mbox_loc_perm            1,   1 ,  1 ,  41 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 42 ,       1 ,      16 ,       2 ,       3 ,       1 ,       1 ,       1 ,      96 ,         1 ,
     43, TIDL_ConvolutionLayer         , ctx_output5/relu_mbox_conf                1,   1 ,  1 ,  20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 43 ,       1 ,     256 ,       2 ,       3 ,       1 ,      84 ,       2 ,       3 ,    129024 ,
     44, TIDL_FlattenLayer             , ctx_output5/relu_mbox_conf_perm           1,   1 ,  1 ,  43 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 44 ,       1 ,      84 ,       2 ,       3 ,       1 ,       1 ,       1 ,     504 ,         1 ,
     46, TIDL_ConcatLayer              , mbox_loc                                  1,   5 ,  1 ,  22 , 27 , 32 , 37 , 42 ,  x ,  x ,  x , 46 ,       1 ,       1 ,       1 ,   15360 ,       1 ,       1 ,       1 ,   23088 ,         1 ,
     47, TIDL_ConcatLayer              , mbox_conf                                 1,   5 ,  1 ,  24 , 29 , 34 , 39 , 44 ,  x ,  x ,  x , 47 ,       1 ,       1 ,       1 ,   80640 ,       1 ,       1 ,       1 ,  121212 ,         1 ,
     49, TIDL_DetectionOutputLayer     , detection_out                             1,   2 ,  1 ,  46 , 47 ,  x ,  x ,  x ,  x ,  x ,  x , 49 ,       1 ,       1 ,       1 ,   23088 ,       1 ,       1 ,       1 ,    5600 ,         1 ,
    Total Giga Macs : 2.1312
    
    Processing config file ./tempDir/qunat_stats_config.txt !
      0, TIDL_DataLayer                ,  0,  -1 ,  1 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  0 ,    0 ,    0 ,    0 ,    0 ,    1 ,    3 ,  320 ,  768 ,
      1, TIDL_BatchNormLayer           ,  1,   1 ,  1 ,  0 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  1 ,    1 ,    3 ,  320 ,  768 ,    1 ,    3 ,  320 ,  768 ,
      2, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  1 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  2 ,    1 ,    3 ,  320 ,  768 ,    1 ,   32 ,  160 ,  384 ,
      3, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  2 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  3 ,    1 ,   32 ,  160 ,  384 ,    1 ,   32 ,   80 ,  192 ,
      4, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  3 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  4 ,    1 ,   32 ,   80 ,  192 ,    1 ,   64 ,   80 ,  192 ,
      5, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  4 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  5 ,    1 ,   64 ,   80 ,  192 ,    1 ,   64 ,   40 ,   96 ,
      6, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  5 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  6 ,    1 ,   64 ,   40 ,   96 ,    1 ,  128 ,   40 ,   96 ,
      7, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  6 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  7 ,    1 ,  128 ,   40 ,   96 ,    1 ,  128 ,   20 ,   48 ,
      8, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  7 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  8 ,    1 ,  128 ,   20 ,   48 ,    1 ,  256 ,   20 ,   48 ,
      9, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  8 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  9 ,    1 ,  256 ,   20 ,   48 ,    1 ,  256 ,   20 ,   48 ,
     10, TIDL_PoolingLayer             ,  1,   1 ,  1 ,  9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 10 ,    1 ,  256 ,   20 ,   48 ,    1 ,  256 ,   10 ,   24 ,
     11, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 10 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 11 ,    1 ,  256 ,   10 ,   24 ,    1 ,  512 ,   10 ,   24 ,
     12, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 11 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 12 ,    1 ,  512 ,   10 ,   24 ,    1 ,  512 ,   10 ,   24 ,
     13, TIDL_PoolingLayer             ,  1,   1 ,  1 , 12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 13 ,    1 ,  512 ,   10 ,   24 ,    1 ,  512 ,    5 ,   12 ,
     14, TIDL_PoolingLayer             ,  1,   1 ,  1 , 13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 14 ,    1 ,  512 ,    5 ,   12 ,    1 ,  512 ,    3 ,    6 ,
     15, TIDL_PoolingLayer             ,  1,   1 ,  1 , 14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 15 ,    1 ,  512 ,    3 ,    6 ,    1 ,  512 ,    2 ,    3 ,
     16, TIDL_ConvolutionLayer         ,  1,   1 ,  1 ,  9 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 16 ,    1 ,  256 ,   20 ,   48 ,    1 ,  256 ,   20 ,   48 ,
     17, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 12 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 17 ,    1 ,  512 ,   10 ,   24 ,    1 ,  256 ,   10 ,   24 ,
     18, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 13 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 18 ,    1 ,  512 ,    5 ,   12 ,    1 ,  256 ,    5 ,   12 ,
     19, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 14 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 19 ,    1 ,  512 ,    3 ,    6 ,    1 ,  256 ,    3 ,    6 ,
     20, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 15 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 20 ,    1 ,  512 ,    2 ,    3 ,    1 ,  256 ,    2 ,    3 ,
     21, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 21 ,    1 ,  256 ,   20 ,   48 ,    1 ,   16 ,   20 ,   48 ,
     22, TIDL_FlattenLayer             ,  1,   1 ,  1 , 21 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 22 ,    1 ,   16 ,   20 ,   48 ,    1 ,    1 ,    1 ,15360 ,
     23, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 16 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 23 ,    1 ,  256 ,   20 ,   48 ,    1 ,   84 ,   20 ,   48 ,
     24, TIDL_FlattenLayer             ,  1,   1 ,  1 , 23 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 24 ,    1 ,   84 ,   20 ,   48 ,    1 ,    1 ,    1 ,80640 ,
     25, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 26 ,    1 ,  256 ,   10 ,   24 ,    1 ,   24 ,   10 ,   24 ,
     26, TIDL_FlattenLayer             ,  1,   1 ,  1 , 26 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 27 ,    1 ,   24 ,   10 ,   24 ,    1 ,    1 ,    1 , 5760 ,
     27, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 17 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 28 ,    1 ,  256 ,   10 ,   24 ,    1 ,  126 ,   10 ,   24 ,
     28, TIDL_FlattenLayer             ,  1,   1 ,  1 , 28 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 29 ,    1 ,  126 ,   10 ,   24 ,    1 ,    1 ,    1 ,30240 ,
     29, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 31 ,    1 ,  256 ,    5 ,   12 ,    1 ,   24 ,    5 ,   12 ,
     30, TIDL_FlattenLayer             ,  1,   1 ,  1 , 31 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 32 ,    1 ,   24 ,    5 ,   12 ,    1 ,    1 ,    1 , 1440 ,
     31, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 18 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 33 ,    1 ,  256 ,    5 ,   12 ,    1 ,  126 ,    5 ,   12 ,
     32, TIDL_FlattenLayer             ,  1,   1 ,  1 , 33 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 34 ,    1 ,  126 ,    5 ,   12 ,    1 ,    1 ,    1 , 7560 ,
     33, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 36 ,    1 ,  256 ,    3 ,    6 ,    1 ,   24 ,    3 ,    6 ,
     34, TIDL_FlattenLayer             ,  1,   1 ,  1 , 36 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 37 ,    1 ,   24 ,    3 ,    6 ,    1 ,    1 ,    1 ,  432 ,
     35, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 19 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 38 ,    1 ,  256 ,    3 ,    6 ,    1 ,  126 ,    3 ,    6 ,
     36, TIDL_FlattenLayer             ,  1,   1 ,  1 , 38 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 39 ,    1 ,  126 ,    3 ,    6 ,    1 ,    1 ,    1 , 2268 ,
     37, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 41 ,    1 ,  256 ,    2 ,    3 ,    1 ,   16 ,    2 ,    3 ,
     38, TIDL_FlattenLayer             ,  1,   1 ,  1 , 41 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 42 ,    1 ,   16 ,    2 ,    3 ,    1 ,    1 ,    1 ,   96 ,
     39, TIDL_ConvolutionLayer         ,  1,   1 ,  1 , 20 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 43 ,    1 ,  256 ,    2 ,    3 ,    1 ,   84 ,    2 ,    3 ,
     40, TIDL_FlattenLayer             ,  1,   1 ,  1 , 43 ,  x ,  x ,  x ,  x ,  x ,  x ,  x , 44 ,    1 ,   84 ,    2 ,    3 ,    1 ,    1 ,    1 ,  504 ,
     41, TIDL_ConcatLayer              ,  1,   5 ,  1 , 22 , 27 , 32 , 37 , 42 ,  x ,  x ,  x , 46 ,    1 ,    1 ,    1 ,15360 ,    1 ,    1 ,    1 ,23088 ,
     42, TIDL_ConcatLayer              ,  1,   5 ,  1 , 24 , 29 , 34 , 39 , 44 ,  x ,  x ,  x , 47 ,    1 ,    1 ,    1 ,80640 ,    1 ,    1 ,    1 ,121212 ,
     43, TIDL_DetectionOutputLayer     ,  1,   2 ,  1 , 46 , 47 ,  x ,  x ,  x ,  x ,  x ,  x , 49 ,    1 ,    1 ,    1 ,23088 ,    1 ,    1 ,    1 , 5600 ,
     44, TIDL_DataLayer                ,  0,   1 , -1 , 49 ,  x ,  x ,  x ,  x ,  x ,  x ,  x ,  0 ,    1 ,    1 ,    1 , 5600 ,    0 ,    0 ,    0 ,    0 ,
    Layer ID    ,inBlkWidth  ,inBlkHeight ,inBlkPitch  ,outBlkWidth ,outBlkHeight,outBlkPitch ,numInChs    ,numOutChs   ,numProcInChs,numLclInChs ,numLclOutChs,numProcItrs ,numAccItrs  ,numHorBlock ,numVerBlock ,inBlkChPitch,outBlkChPitc,alignOrNot 
          2           72           72           72           32           32           32            3           32            3            1            8            1            3           12            5         5184         1024            1    
          3           40           34           40           32           32           32            8            8            8            4            8            1            2           12            5         1360         1024            1    
          4           40           22           40           32           20           32           32           64           32            8            8            1            4            6            4          880          640            1    
          5           40           22           40           32           20           32           16           16           16            8            8            1            2            6            4          880          640            1    
          6           40           22           40           32           20           32           64          128           64            8            8            1            8            3            2          880          640            1    
          7           40           22           40           32           20           32           32           32           32            8            8            1            4            3            2          880          640            1    
          8           56           22           56           48           20           48          128          256          128            7            8            1           19            1            1         1232          960            1    
          9           56           22           56           48           20           48           64           64           64            7            8            1           10            1            1         1232          960            1    
         11           40           12           40           32           10           32          256          512          256            8            8            1           32            1            1          480          320            1    
         12           40           12           40           32           10           32          128          128          128            8            8            1           16            1            1          480          320            1    
         16           48            4           48           48            4           48          256          256          256           32            8            1            8            1            5          192          192            1    
         17           24           10           24           24           10           24          512          256          512           32           32            1           16            1            1          240          240            1    
         18           12            5           12           12            5           12          512          256          512           32           32            1           16            1            1           60           60            1    
         19            6            3            6            6            3            6          512          256          512           32           32            1           16            1            1           18           18            1    
         20            3            2            3            3            2            3          512          256          512           32           32            1           16            1            1            6            6            1    
         21           48            4           48           48            4           48          256           16          256           32            8            1            8            1            5          192          192            1    
         23           48            4           48           48            4           48          256           88          256           32            8            1            8            1            5          192          192            1    
         25           24           10           24           24           10           24          256           24          256           32           24            1            8            1            1          240          240            1    
         27           24           10           24           24           10           24          256          128          256           32           32            1            8            1            1          240          240            1    
         29           12            5           12           12            5           12          256           24          256           32           24            1            8            1            1           60           60            1    
         31           12            5           12           12            5           12          256          128          256           32           32            1            8            1            1           60           60            1    
         33            6            3            6            6            3            6          256           24          256           32           24            1            8            1            1           18           18            1    
         35            6            3            6            6            3            6          256          128          256           32           32            1            8            1            1           18           18            1    
         37            3            2            3            3            2            3          256           16          256           32           16            1            8            1            1            6            6            1    
         39            3            2            3            3            2            3          256           96          256           32           32            1            8            1            1            6            6            1    
    
    Processing Frame Number : 0 
    
     Layer    1 : Out Q :      254 , TIDL_BatchNormLayer  , PASSED  #MMACs =     0.74,     0.74, Sparsity :   0.00
     Layer    2 : Out Q :     6021 , TIDL_ConvolutionLayer, PASSED  #MMACs =   147.46,    92.65, Sparsity :  37.17
     Layer    3 : Out Q :     6168 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    53.33, Sparsity :  62.33
     Layer    4 : Out Q :    11702 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    83.44, Sparsity :  70.53
     Layer    5 : Out Q :    10597 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    66.11, Sparsity :  53.30
     Layer    6 : Out Q :    13807 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    91.59, Sparsity :  67.65
     Layer    7 : Out Q :    16861 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    57.32, Sparsity :  59.51
     Layer    8 : Out Q :    18642 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    96.27, Sparsity :  66.00
     Layer    9 : Out Q :    12901 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    52.28, Sparsity :  63.07
     Layer   10 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.06,     0.06, Sparsity :   0.00
     Layer   11 : Out Q :    20342 , TIDL_ConvolutionLayer, PASSED  #MMACs =   283.12,    76.31, Sparsity :  73.04
     Layer   12 : Out Q :     5763 , TIDL_ConvolutionLayer, PASSED  #MMACs =   141.56,    31.40, Sparsity :  77.82
     Layer   13 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.03,     0.03, Sparsity :   0.00
     Layer   14 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   15 :TIDL_PoolingLayer,     PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   16 : Out Q :    18571 , TIDL_ConvolutionLayer, PASSED  #MMACs =    62.91,    62.91, Sparsity :   0.00
     Layer   17 : Out Q :    13599 , TIDL_ConvolutionLayer, PASSED  #MMACs =    31.46,    31.46, Sparsity :   0.00
     Layer   18 : Out Q :    17793 , TIDL_ConvolutionLayer, PASSED  #MMACs =     7.86,     7.86, Sparsity :   0.00
     Layer   19 : Out Q :    18851 , TIDL_ConvolutionLayer, PASSED  #MMACs =     2.36,     2.36, Sparsity :   0.00
     Layer   20 : Out Q :    26620 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.79,     0.79, Sparsity :   0.00
     Layer   21 : Out Q :     4438 , TIDL_ConvolutionLayer, PASSED  #MMACs =     3.93,     3.93, Sparsity :   0.00
     Layer   22 :TIDL_FlattenLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   23 : Out Q :     3888 , TIDL_ConvolutionLayer, PASSED  #MMACs =    21.63,    21.63, Sparsity :   0.00
     Layer   24 :TIDL_FlattenLayer, PASSED  #MMACs =     0.08,     0.08, Sparsity :   0.00
     Layer   25 : Out Q :     8255 , TIDL_ConvolutionLayer, PASSED  #MMACs =     1.47,     1.47, Sparsity :   0.00
     Layer   26 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   27 : Out Q :     2918 , TIDL_ConvolutionLayer, PASSED  #MMACs =     7.86,     7.86, Sparsity :   0.00
     Layer   28 :TIDL_FlattenLayer, PASSED  #MMACs =     0.03,     0.03, Sparsity :   0.00
     Layer   29 : Out Q :    10803 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.37,     0.37, Sparsity :   0.00
     Layer   30 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   31 : Out Q :     2794 , TIDL_ConvolutionLayer, PASSED  #MMACs =     1.97,     1.97, Sparsity :   0.00
     Layer   32 :TIDL_FlattenLayer, PASSED  #MMACs =     0.01,     0.01, Sparsity :   0.00
     Layer   33 : Out Q :     6802 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.11,     0.11, Sparsity :   0.00
     Layer   34 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   35 : Out Q :     3384 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.59,     0.59, Sparsity :   0.00
     Layer   36 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   37 : Out Q :     9125 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.02,     0.02, Sparsity :   0.00
     Layer   38 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   39 : Out Q :     4334 , TIDL_ConvolutionLayer, PASSED  #MMACs =     0.15,     0.15, Sparsity :   0.00
     Layer   40 :TIDL_FlattenLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   0.00
     Layer   41 : Out Q :     4455 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   -nan
     Layer   42 : Out Q :     2805 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   -nan
     Layer   43 : #MMACs =     0.01,     0.01, Sparsity :   0.00
    End of config list found !
    [linux-devkit]:~/code/11-tidl/caffe-jacinto-models/trained/object_detection/voc0712/JDetNet/ssd768x320_ds_PSP_dsFac_32_hdDS8_0/sparse> 
    

    ssd.zip

  • Any update here?

  • Hi, echoing Manisha's comments, it seems to me the performance issue could be related to the distribution of layers between the DSP and EVE. 

    From http://downloads.ti.com/mctools/esd/docs/tidl-api/example.html#ssd we have:

    The ssd network used in both categories has 43 layers. Input to the network is RGB image of size 768x320. Output is a list of boxes (up to 20), each box has information about the box coordinates, and which pre-trained category that the object inside the box belongs to. The example will take the network output, draw boxes accordingly, and create an output image. The network can be run entirely on either EVE or C66x. However, the best performance comes with running the first 30 layers as a group on EVE and the next 13 layers as another group on C66x. Our end-to-end example shows how easy it is to assign a Layer Group id to an Executor and how easy it is to construct an ExecutionObjectPipeline to connect the output of one Executor’s ExecutionObject to the input of another Executor’s ExecutionObject


    It looks to me as if only the last layer is offloaded to the DSP. I would suggest using the Network graph viewer, which is useful for checking how layers are distributed. With this viewer, you can compare the layer distribution of the OOB demo SSD JDetNet with yours.
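
If the graph viewer is not at hand, the same comparison can be roughed out in text. The snippet below is only a minimal sketch: it assumes you have already extracted the per-layer group ids of both networks into two Python lists (the values shown are made up for illustration, not taken from either model), and `diff_group_assignments` is a hypothetical helper, not part of any TI tool.

```python
def diff_group_assignments(reference, candidate):
    """Return (layer index, reference group, candidate group) wherever the two differ.

    A missing entry (lists of different length) is reported as None.
    """
    diffs = []
    for i in range(max(len(reference), len(candidate))):
        ref = reference[i] if i < len(reference) else None
        cand = candidate[i] if i < len(candidate) else None
        if ref != cand:
            diffs.append((i, ref, cand))
    return diffs


# Illustrative values only: group ids per layer for a reference and an imported network.
reference_groups = [0, 1, 1, 2, 1, 2]
candidate_groups = [0, 1, 1, 1, 1, 2]

for index, ref, cand in diff_group_assignments(reference_groups, candidate_groups):
    print("layer %d: reference group %s, imported group %s" % (index, ref, cand))
```

Any layer reported here is one where the two networks are assigned to different groups, which is a reasonable first place to look when the speeds differ this much.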

    You could also check this E2E (Dec 5 post) for an example TIDL SSD JDetNet configuration file, as a reference. 

    Also, I couldn't open your previously posted ssd.zip. When you have a chance please upload it again.

    thank you,

    Paula

  • Thanks for your reply. Could you share your import file?

  • Hi, I had an example import configuration file in the post mentioned above (E2E - Dec 5 post). Attached also here for your convenience.

    # Default - 0
    randParams         = 0 
    
    # 0: Caffe, 1: TensorFlow, Default - 0
    modelType          = 0 
    
    # 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
    quantizationStyle  = 1 
    
    # quantRoundAdd/100 will be added while rounding to integer, Default - 50
    quantRoundAdd      = 25
    
    numParamBits       = 8
    # 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
    inElementType      = 0 
    
    inputNetFile      = "/home/root/TIDL_utils_bin/OD_SSD_JDetNet/ssd512x512_ds_PSP_dsFac_32_fc_0_hdDS8_1_kerMbox_3_1stHdSameOpCh_1/sparse/deploy.prototxt"
    
    inputParamsFile    = "/home/root/TIDL_utils_bin/OD_SSD_JDetNet/ssd512x512_ds_PSP_dsFac_32_fc_0_hdDS8_1_kerMbox_3_1stHdSameOpCh_1/sparse/voc0712_ssdJacintoNetV2_iter_104000.caffemodel"
    
    outputNetFile      = "/home/root/TIDL_utils_bin/tidl_net_jdetNet_512x512.bin"
    outputParamsFile   = "/home/root/TIDL_utils_bin/tidl_param_jdetNet_512x512.bin"
    
    rawSampleInData = 1
    preProcType   = 4
    sampleInData = "/home/root/TIDL_utils_bin/testvecs/input/pexels_2_515x512_frame.y"
    tidlStatsTool = "eve_test_dl_algo_ref.out"
    layersGroupId = 0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	2	1	2	1	2	1	2	1	2	1	2	1	2	1	2	1	2	1	2	1	2	1	2	2	2	2	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1	1
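
The `layersGroupId` line is hard to read by eye, so a quick way to inspect it is sketched below. This assumes the line is a whitespace-separated list of integers, one per layer, as in the file above; the helper name `summarize_layers_group_id` and the shortened sample line are made up for illustration. The meaning of group 0 is not explained in this thread, so the script only counts it.

```python
from collections import Counter


def summarize_layers_group_id(config_text):
    """Parse the layersGroupId line of an import config.

    Returns (list of group ids in layer order, Counter of layers per group).
    """
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "layersGroupId":
            ids = [int(token) for token in value.split()]
            return ids, Counter(ids)
    raise ValueError("no layersGroupId line found")


# Shortened, illustrative line; paste the real config text here instead.
sample_config = "preProcType   = 4\nlayersGroupId = 0\t1\t1\t1\t2\t1\t2\t2\t0\t0\n"

group_ids, per_group = summarize_layers_group_id(sample_config)
for layer_index, group in enumerate(group_ids):
    print(layer_index, group)
print(dict(per_group))
```

The per-group counts show at a glance whether almost everything ended up in a single group.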
    
    
    thank you,

    Paula

  • Hi,

    Please check this section "Splitting the network layers between available accelerators (EVE subsystem and/or C66x cores)" in the user guide document. 

    A few layers, such as SoftMax, Flatten and Concat, run faster on the DSP. 

    The layer grouping can be configured either when importing the model or when running the inference. I suggest you skip the layer grouping configuration at import time and do it at inference time instead. You can follow the configuration shown in the example config file below -

    An example for your use case would be -

    layerIndex2LayerGroupId = { {22, 2}, {24, 2}, {26, 2} ..and so on... }

    The first number in each pair is the layer id and the second is the core id. The DSP has core id 2 and EVE has core id 1. Layers that are not mentioned in
    layerIndex2LayerGroupId run on EVE by default.
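
Writing the pairs out by hand is error-prone, so the list can be generated from the import log instead. The following is a rough sketch, not an official tool: it assumes the layer numbers printed by the import tool match the layer ids expected at inference time, it picks layers by looking for "flatten", "concat" or "softmax" in the printed `TIDL_...` type name, and `build_group_map` and `format_group_map` are hypothetical helper names. Please check the generated line against your network before using it.

```python
import re

# Matches log lines such as "Layer   24 :TIDL_FlattenLayer, PASSED ..."
LAYER_RE = re.compile(r"Layer\s+(\d+)\s*:.*?(TIDL_\w+)")
# Layer kinds reported above to run faster on the DSP (matched case-insensitively).
DSP_KEYWORDS = ("flatten", "concat", "softmax")


def build_group_map(log_text, dsp_group=2):
    """Return [(layer id, dsp_group), ...] for layers whose type matches DSP_KEYWORDS."""
    pairs = []
    for line in log_text.splitlines():
        match = LAYER_RE.search(line)
        if match and any(key in match.group(2).lower() for key in DSP_KEYWORDS):
            pairs.append((int(match.group(1)), dsp_group))
    return pairs


def format_group_map(pairs):
    """Format the pairs in the layerIndex2LayerGroupId style shown above."""
    body = ", ".join("{%d, %d}" % pair for pair in pairs)
    return "layerIndex2LayerGroupId = { " + body + " }"


# A few lines copied from the import log earlier in this thread.
sample_log = """
 Layer   23 : Out Q :     3888 , TIDL_ConvolutionLayer, PASSED  #MMACs =    21.63,    21.63, Sparsity :   0.00
 Layer   24 :TIDL_FlattenLayer, PASSED  #MMACs =     0.08,     0.08, Sparsity :   0.00
 Layer   41 : Out Q :     4455 , TIDL_ConcatLayer, PASSED  #MMACs =     0.00,     0.00, Sparsity :   -nan
 Layer   43 : #MMACs =     0.01,     0.01, Sparsity :   0.00
"""

group_map_line = format_group_map(build_group_map(sample_log))
print(group_map_line)
```

Layers whose log line carries no `TIDL_...` type name (layer 43 in the log above) are skipped by this sketch and would stay on the default core unless added by hand.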

    Hope this helps. Let me know your findings.

    Regards,
    Manisha
  • Thanks for your reply. And I will test it.

  • Thanks for your reply and it worked.