This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

TDA4VH-Q1: seg model infer fail

Part Number: TDA4VH-Q1

Tool/software:

Issue Description: The same model, inferring the same image. The result obtained using the TIDL simulation tool is normal. However, when running the inference on the evaluation board, the result is abnormal. Comparing the binary results of both, it is found that most parts are the same, with the matching parts corresponding to the segmented results, which validate correctly. The parts where the binary results differ between the two correspond to the erroneous results.

Original Model:
segod_res18_410.zip
Converted Model:
tidl_io_seg_1.zip

tidl_net_seg.zip
Simulation Inference Result:
seg_out.zip
Evaluation Board Inference Result:
em_seg_output_nopad.zip
Binary Comparison Result:

Sample Image:

Floating-point numbers of each layer output:
seg_double_result.zip

  • Additional Information:
    Inference results obtained on the PC, which we believe are correct
    import config:

    #Basic Configuration Parameters
    modelType          = 2
    numParamBits       = 8
    
    
    inputNetFile      = "../../test/testvecs/models/public/onnx/segod_res18_410.onnx"
    outputNetFile      = "../../test/testvecs/output/seg/tidl_net_seg.bin"
    outputParamsFile   = "../../test/testvecs/output/seg/tidl_io_seg_"
    
    inDataNamesList = "data"
    outDataNamesList = "od_pred_center_heat od_reg_map od_points_refine_map feature index"
    inWidth  = 384
    inHeight = 384
    inNumChannels = 3
    
    # Input Pre-Processing Parameters
    inDataNorm  = 1
    inMean = 123.675 116.28 103.53
    inScale = 0.0171247538 0.0175070028 0.0174291939
    inDataFormat = 0
    
    # Performance Simulation Parameters
    #perfSimTool = 	../../utils/perfsim/ti_cnnperfsim.out
    #perfSimConfig = ../../test/testvecs/config/import/device_config.cfg
    #graphVizTool = ../../utils/tidlModelGraphviz/out/tidl_graphVisualiser.out
    
    # Range Collection (Quant) Parameters
    #tidlStatsTool = ../../test/PC_dsp_test_dl_algo.out
    inData = ../../test/testvecs/config/quant/quant_list_seg.txt
    inFileFormat 		    = 2
    postProcType 		    = 0
    CalibrationOption = 7
    quantizationStyle = 3
    debugTraceLevel = 2
    writeTraceLevel = 3

    import log:
    seg_0527.log

  • Hi Shinting,

    Floating-point numbers of each layer output:

    1. How did you get the results of each layer? In pc emulation mode or target evm mode? And how did you convert from bin to the txt files?

    2. Could you also provide the infer config?

    Regards,

    Adam

  • I obtained the results of each layer by adding the parameter writeTraceLevel = 3 during model conversion on the PC. I converted every 4 bytes in the binary file to a float, and then stored them in a txt file. As shown in the figure below:

    infer config:

    ./seg/tidl_infer_seg.txt
    0
    tidl_infer_seg.zip
    ./seg/20210313_000506_000025.bmp

  • Hi 

    Can you also try this on evm? 

    I obtained the results of each layer by adding the parameter writeTraceLevel = 3 during model conversion on the PC

    You may see a folder called "trace" where you execute the executable and the layer result will be saved into it. You may also check the /tmp folder if there is no "trace" folder

    The target for these steps is to know which layer starts to produce this mismatch and will help further debug

    Regards,

    Adam

  • On the evm, we perform inference using the encapsulated vision_apps.
    How do I set writeTraceLevel = 3 to output the results of each layer?
    Key code snippet:



    Source code:

    #include "em_api.h"
    #include <stdlib.h>
    #include <stdio.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    //#define NOPAD
    
    int InferPld(const char* inputDataPath)
    {
        int ret = 0;
        uint8_t *pldInputP = (uint8_t *)malloc(768*768*3);
        uint8_t* tmpP = (uint8_t *)malloc(768*768*3);
        FILE *inputFp = fopen(inputDataPath, "r");
        if(inputFp != NULL)
        {
            fseek(inputFp, 54, SEEK_SET);
            fread(tmpP, 768*768*3, 1, inputFp);
            fclose(inputFp);
            /* Deinterleave RGB to planar */
            uint32_t chOffset = 768 * 768;
            for(int i = 0; i < 768; i ++)
            {
                uint8_t* pData = tmpP + ((768-1-i)*(768*3));
                for(int j = 0; j < 768; j ++)
                {
                    pldInputP[(0 * chOffset) + (i * 768) + j] = *pData++;
                    pldInputP[(1 * chOffset) + (i * 768) + j] = *pData++;
                    pldInputP[(2 * chOffset) + (i * 768) + j] = *pData++;
                }
            }
            printf("---------pld test input\n");
        }
    
    
        uint8_t *outputP[10];
        for(int i=0; i<10; i++){
            outputP[i] = (uint8_t *)malloc(100000);
        }
    
        ret = em_tidl_init();
        printf("0======%d\n", ret);
        int netId = em_tidl_net_init("/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_io_pld_1.bin", "/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_net_pld.bin");
        printf("1======%d\n", netId);
    
        ret = em_tidl_infer(netId, pldInputP);
        ret = em_tidl_getOutput(netId, outputP);
        printf("2======%d\n", ret);
    
        FILE *outputFp = fopen("./em_pld_output", "w+");
    
    #ifdef NOPAD
        for(int i=0;i<3;i++) //corner_cls corner_angle corner_reg  192*221*2
        {
            int index = 0;
            int dataSize = 192*192;
            int padSize = 192*29;
            fwrite(outputP[i]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
            fwrite(outputP[i]+index, dataSize, 1, outputFp);
        }
        
        //points_map 48*99*8
        int index = 0;
        int dataSize = 48*48;
        //int padSize = 48*51;
        int padSize = 0x980;
        int channel = 8;
        int outputIndex=3;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
    
        //heat_map 48*99*3
        index = 0;
        dataSize = 48*48;
        //padSize = 48*51;  0x990
        padSize = 0x980;
        channel = 3;
        outputIndex=4;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
    
        //first_pt_map 48*99*4
        index = 0;
        dataSize = 48*48;
        //padSize = 48*51;
        padSize = 0x980;
        channel = 4;
        outputIndex=5;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
    #else
        fwrite(outputP[0], 192*221*2, 1, outputFp); //corner_cls
        fwrite(outputP[1], 192*221*2, 1, outputFp); //corner_angle
        fwrite(outputP[2], 192*221*2, 1, outputFp); //corner_reg
        fwrite(outputP[3], 48*99*8, 1, outputFp); //points_map
        fwrite(outputP[4], 48*99*3, 1, outputFp); //heat_map
        fwrite(outputP[5], 48*99*4, 1, outputFp); //first_pt_map
    #endif
        
        fclose(outputFp);
    
        ret = em_tidl_net_deinit(netId); 
        ret = em_tidl_deinit(); 
        printf("3======%d\n", ret);
    }
    
    int InferOd(const char* inputDataPath)
    {
        int ret = 0;
        /*inWidth  = 512
    inHeight = 320
    inNumChannels = 3*/
     
        uint8_t *odInputP = (uint8_t *)malloc(512*320*3);
        uint8_t* tmpP = (uint8_t *)malloc( 512*320*3);
        FILE *inputFp = fopen(inputDataPath, "r");
        if(inputFp != NULL)
        {
            fseek(inputFp, 54, SEEK_SET);
            fread(tmpP, 512*320*3, 1, inputFp);
            fclose(inputFp);
            /* Deinterleave RGB to planar */
            uint32_t chOffset = 512 * 320;
            for(int i = 0; i < 320; i ++)
            {
                uint8_t* pData = tmpP + ((320-1-i)*(512*3));
                for(int j = 0; j < 512; j ++)
                {
                    odInputP[(0 * chOffset) + (i * 512) + j] = *pData++;
                    odInputP[(1 * chOffset) + (i * 512) + j] = *pData++;
                    odInputP[(2 * chOffset) + (i * 512) + j] = *pData++;
                }
            }
            printf("---------od test input\n");
        }
    
    
        uint8_t *outputP[10];
        for(int i=0; i<10; i++){
            outputP[i] = (uint8_t *)malloc(100000);
        }
    
        ret = em_tidl_init();
        printf("0======%d\n", ret);
        int netId = em_tidl_net_init("/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_io_od_1.bin", "/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_net_od.bin");
        printf("1======%d\n", netId);
    
        ret = em_tidl_infer(netId, odInputP);
        ret = em_tidl_getOutput(netId, outputP);
        printf("2======%d\n", ret);
    
        FILE *outputFp = fopen("./em_od_output", "w+");
    
    #ifdef NOPAD
        /*
        hm      128*110*5
        hm_det  128*110*6
        wh      128*110*2
        reg     128*110*2
        hps_coord   128*110*8
        hps_conf    128*110*4
        hm_hp_car4  128*110*4
        hp_offset   128*110*2
        */
    
        //hm      128*110*5
        int index = 0;
        int dataSize = 128*80;
        int padSize = 128*30; //128*30
        int channel = 5;
        int outputIndex=0;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //hm_det  128*110*6
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 6;
        outputIndex=1;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //wh      128*110*2
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 2;
        outputIndex=2;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //reg     128*110*2
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 2;
        outputIndex=3;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //hps_coord   128*110*8
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 8;
        outputIndex=4;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //hps_conf    128*110*4
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 4;
        outputIndex=5;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //hm_hp_car4  128*110*4
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 4;
        outputIndex=6;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //hp_offset   128*110*2
        index = 0;
        dataSize = 128*80;
        padSize = 128*30; //128*30
        channel = 2;
        outputIndex=7;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
    #else
        fwrite(outputP[0], 128*110*5, 1, outputFp); //hm
        fwrite(outputP[1], 128*110*6, 1, outputFp); //hm_det
        fwrite(outputP[2], 128*110*2, 1, outputFp); //wh
        fwrite(outputP[3], 128*110*2, 1, outputFp); //reg
        fwrite(outputP[4], 128*110*8, 1, outputFp); //hps_coord
        fwrite(outputP[5], 128*110*4, 1, outputFp); //hps_conf
        fwrite(outputP[6], 128*110*4, 1, outputFp); //hm_hp_car4
        fwrite(outputP[7], 128*110*2, 1, outputFp); //hp_offset
    #endif
        fclose(outputFp);
    
        ret = em_tidl_net_deinit(netId); 
        ret = em_tidl_deinit(); 
        printf("3======%d\n", ret);
    }
    
    int InferSeg(const char* inputDataPath)
    {
        int ret = 0;
        /*inWidth  = 384
    inHeight = 384
    inNumChannels = 3*/
        uint8_t *segInputP = (uint8_t *)malloc(384*384*3);
        uint8_t* tmpP = (uint8_t *)malloc( 384*384*3);
        FILE *inputFp = fopen(inputDataPath, "r");
        if(inputFp != NULL)
        {
            fseek(inputFp, 54, SEEK_SET);
            fread(tmpP, 384*384*3, 1, inputFp);
            fclose(inputFp);
            /* Deinterleave RGB to planar */
            uint32_t chOffset = 384 * 384;
            for(int i = 0; i < 384; i ++)
            {
                uint8_t* pData = tmpP + ((384-1-i)*(384*3));
                for(int j = 0; j < 384; j ++)
                {
                    segInputP[(0 * chOffset) + (i * 384) + j] = *pData++;
                    segInputP[(1 * chOffset) + (i * 384) + j] = *pData++;
                    segInputP[(2 * chOffset) + (i * 384) + j] = *pData++;
                }
            }
            printf("---------seg test input\n");
        }
    
        uint8_t *outputP[10];
        for(int i=0; i<10; i++){
            outputP[i] = (uint8_t *)malloc(150000);
            memset(outputP[i], 0, 150000); // 将申请的空间全部设为0
        }
    
        ret = em_tidl_init();
        printf("0======%d\n", ret);
        int netId = em_tidl_net_init("/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_io_seg_1.bin", "/run/media/rootfs-mmcblk0p1/ai_test/tidl_test_lnx/test_data/tidl_net_seg.bin");
        printf("1======%d\n", netId);
    
        ret = em_tidl_infer(netId,segInputP);
        ret = em_tidl_getOutput(netId, outputP);
        printf("2======%d\n", ret);
    
        FILE *outputFp = fopen("./em_seg_output", "w+");
    
    #ifdef NOPAD
        /*
          od_pred_center_heat    96*(96+51)*5
          od_reg_map             96*(96+51)*2
          od_points_refine_map   96*(96+51)*4
          feature                384*(384+11)*9
          index                  384*384*1
        */
    
        //od_pred_center_heat    96*(96+51)*5
        int index = 0;
        int dataSize = 96*96;
        int padSize = 96*51; //96*51
        int channel = 5;
        int outputIndex=0;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //od_reg_map             96*(96+51)*2
        index = 0;
        dataSize = 96*96;
        padSize = 96*51; //96*51
        channel = 2;
        outputIndex=1;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //od_points_refine_map   96*(96+51)*4
        index = 0;
        dataSize = 96*96;
        padSize = 96*51; //96*51
        channel = 4;
        outputIndex=2;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //feature                384*(384+11)*9
        index = 0;
        dataSize = 384*384;
        padSize = 384*11 - 0x80; //384*11
        channel = 9;
        outputIndex=3;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
        //index                  384*384*1
        index = 0;
        dataSize = 384*384;
        padSize = 0; //no pad
        channel = 1;
        outputIndex=4;
        for(int i=0;i<channel;i++)
        {
            fwrite(outputP[outputIndex]+index, dataSize, 1, outputFp);
            index += dataSize;
            index += padSize;
        }
    #else
        fwrite(outputP[0], 96*147*5, 1, outputFp); //od_pred_center_heat
        fwrite(outputP[1], 96*147*2, 1, outputFp); //od_reg_map
        fwrite(outputP[2], 96*147*4, 1, outputFp); //od_points_refine_map
        fwrite(outputP[3], 384*395*9, 1, outputFp); //feature
        fwrite(outputP[4], 384*384*1, 1, outputFp); //index
    #endif
        fclose(outputFp);
    
        ret = em_tidl_net_deinit(netId); 
        ret = em_tidl_deinit(); 
        printf("3======%d\n", ret);
    }
    
    
    
    int main(int argc, char *argv[])
    {
        
    
        InferPld("./test_data/pld/20210403_101047_000050.bmp");
        //InferOd("./test_data/od/20210706_154150_226_right.bmp");
        //InferSeg("./test_data/seg/20210313_000506_000025.bmp");
    
        return 0;
    }
    
    app_tidl_em.zip

  • Hi 

    As we communicate, you can search for traceWriteLevel in your code and set it to 3 to get the intermediate results of each layer. 

    And it is good to know the results of using ti rtos executable produces the same results in PC and EVM. The issue seems to be in postprocessing of customer code. There isn't mismatch problem now. 

    This ticket can be closed then. 

    Regards,

    Adam