
Model binary converted by TIDL importer generates saturated output for TDA4VL, TDA4VH, and TDA4VM (TI Edge Cloud)

Other Parts Discussed in Thread: TDA4VM, TDA4VL, TDA4VH, AM68A

Environment:

  1. TI Edge AI Cloud (which connects to a TDA4VM board).
  2. Ubuntu 20.04, PROCESSOR-SDK-RTOS-J721S2, 08.00.04.04, for TDA4VL.
  3. Ubuntu 20.04, PROCESSOR-SDK-RTOS-J784S4, 08.02.02.02, for TDA4VH.

        We tried the three environments above separately and ran into the same problem described below.

Model:

  • MobileNetV2 + 1fc layer. 
  • Inputs are 96x96x3 images, uint8. Pixels are normalized to [-1,1] during training, i.e., y = (x - 127.5)/127.5 (see the sketch after this list).
  • Outputs are angles in the [0,180] range.
  • The float tflite model runs well, and we get the expected outputs in the [0,180] range for any input image.
  • All layers are supported by TIDL. The same architecture was used for a 0/1 classification task and could be converted and run without issue.
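
For reference, here is a minimal sketch of the preprocessing and the float-model check described above; the file names, image loading, and use of the TFLite Python interpreter are illustrative assumptions, not our exact scripts:

    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # Load the float tflite model (file name is a placeholder).
    interpreter = tf.lite.Interpreter(model_path="angle_mobilenetv2_float.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Read a 96x96x3 uint8 image and normalize to [-1, 1], i.e. y = (x - 127.5) / 127.5.
    img = np.asarray(Image.open("sample.jpg").resize((96, 96)), dtype=np.uint8)
    x = (img.astype(np.float32) - 127.5) / 127.5

    interpreter.set_tensor(inp["index"], x[None, ...])   # add batch dimension
    interpreter.invoke()
    angle = interpreter.get_tensor(out["index"])          # expected to be in [0, 180]
    print(angle)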

Problem Description:

        We are generating 16-bit model binaries from a well-trained tflite file for TDA4VH. The calibration data consist of 119 images with ground truth covering the range [0, 180]. The testing data consist of 60 images with ground truth covering the same range. Testing is run in emulation mode on a PC. The importer and inference configurations are shown below.

importer_config

inference_config

The 16-bit binaries can be generated without any issue. When we evaluated the performance, however, we found that the outputs are cut off at around 132. In other words, the model works well on any image with ground truth < 132, but for input images with ground truth > 132 it outputs a fixed value such as 132.1528. It seems the value overflows somewhere. The output predictions look like below.

binary_predictions

We then uploaded the float tflite model and images to TI Edge Cloud to generate model binaries, and found that it has exactly the same problem, as shown below.

We then tested it with the TIDL importer under the TDA4VL SDK; the result was the same again. Our float tflite model does not have this problem. Below is the output from our float tflite model; everything is good.

What would be a solution to this problem? Thanks.

Hello, I am looking for technical help. Could you review my problem described above and, if possible, suggest any potential solutions? Our team needs this solved ASAP. Please let me know if I need to provide more details for you to investigate. Thanks.

  • Hi Yongquan, I will check with our experts. But if it is possible to share a model to reproduce the issue, that would be great.

    thank you,

    Paula

  • Hi Paula, thanks for replying. What is the best way to share a file with you? The model is under NDA between us and TI.

  • Hi Yongquan, I just sent you an E2E friend request.

    thank you,

    Paula

  • Thanks. I sent the files to you via private message.

  • Hi Yongquan, thanks a lot for sharing your files. I was able to reproduce your results. However, I don't think it is an output saturation issue at 132.32. In your notebook I increased advanced_options:calibration_iterations to 50, and the output for sample files with higher ground truth (>140) now shows 140.17.

    thank you,

    Paula

  • When advanced_options:calibration_iterations = 3, the outputs were capped at around 132. When advanced_options:calibration_iterations = 50, the outputs were capped at around 140, as shown in the reply above. No matter what calibration_iterations value we tried, the outputs were capped at some value. In addition, for 15 selected samples with ground truth < 60 (whose outputs were not large enough to be capped), the MAE between ground truth and predictions (computed as in the sketch after this post) showed an obvious increase when changing calibration_iterations from 3 to 50.

    I am still expecting a solution to the problem.  Thanks.
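
    (For clarity, the MAE quoted above is the mean absolute error between the ground-truth angles and the 16-bit model's predictions; a minimal numpy sketch, with dummy values for illustration only:)

        import numpy as np

        def mean_absolute_error(gt, pred):
            """MAE between ground-truth angles and predicted angles, in degrees."""
            gt = np.asarray(gt, dtype=np.float32)
            pred = np.asarray(pred, dtype=np.float32)
            return float(np.mean(np.abs(gt - pred)))

        # Dummy values for illustration only.
        print(mean_absolute_error([10.0, 25.0, 40.0], [10.8, 24.1, 41.5]))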

  • Yongquan, sorry for the delay. I am getting the right expert to help us.

    Thank you,

    Paula

  • Hi Yongquan, I believe you are using PSDK 8.2 (or 8.0). Could you please try the latest 8.6? A few fixes for 16-bit were introduced in our latest SDK:

    PROCESSOR-SDK-J721S2 Software development kit (SDK) | TI.com

    PROCESSOR-SDK-J721E Software development kit (SDK) | TI.com

    If you still see the same issue, please try the script (posted in the link below), which generates some plots to compare the TIDL float output with the TIDL fixed-point output. Those plots can give us some clues about where the issue is.

    software-dl.ti.com/.../md_tidl_fsg_steps_to_debug_mismatch.html

    thank you,

    Paula

  • Hi Paula, thanks for the suggestion. Our team began upgrading the PSDKs to the latest versions last week. Once we have results I will let you know.

    I have a question regarding the TI Edge Cloud. What version of the SDK is it running? We assumed it had already been upgraded to the latest.

    Thank you. 

  • Hi Yongquan, the cloud is currently running PSDK 8.2 for TDA4VM (a.k.a. AM68PA). Soon, in about a week, we will update the cloud to PSDK 8.6, and we will have more devices available (AM62A, AM68A, AM68PA) for evaluation.

    Thank you,

    Paula

  • Hi Paula, we tried PSDK 8.5 and 8.6. The problem was not solved; we got exactly the same outputs, capped at a value. While I am looking into the script provided above, could you or your expert continue to help with the diagnosis? Thanks.

  • Setting quantRangeExpansionFactor to a value larger than the default, e.g. 1.5, solved our capped-prediction problem. The catch is that this worked only for some models; for others the predictions were still capped, and we then had to keep increasing quantRangeExpansionFactor until we got normal outputs.
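
    (For anyone hitting the same issue, the relevant line in the TIDL-RT import config looks roughly like the following; this is only a sketch, and the rest of the config stays as in the importer_config shown earlier in the thread:)

        quantRangeExpansionFactor = 1.5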

  • There is an option to disable histogram-based range estimation and use min/max-based range estimation instead. That will also help to capture a larger range.

  • Hi Manu, which specific option disables the histogram-based range estimation and uses min/max estimation instead?

  • If you are using TIDL-RT, calibrationOption is an import config option that can be used to control this. It is a bit-field, and the 0th bit can be switched off to disable histogram-based activation clipping.

    If you are using calibrationOption = 7, then use calibrationOption = 6 to disable this.
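
    (A small Python illustration of the bit-field arithmetic, just to make the 7 -> 6 change explicit; the bit position is taken from the explanation above:)

        # calibrationOption is a bit-field; per the above, bit 0 enables
        # histogram-based activation clipping.
        cal_opt = 7            # 0b111: histogram-based clipping plus the other calibration bits
        cal_opt &= ~(1 << 0)   # clear bit 0 to disable histogram-based clipping
        print(cal_opt)         # prints 6 (0b110)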

  • Hi Manu and Paula, after some tests I found that the correct settings should be "calibrationOption = 7" and "activationRangeMethod = 1". That gives us sound results, and we no longer need to tune quantRangeExpansionFactor.

    Now, my question is: how can I add the two parameters "calibrationOption" and "activationRangeMethod" to the "compile_options" when using the TI Edge AI Cloud?

    I tried (note the last two options):

    compile_options = {
        'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
        'artifacts_folder' : output_dir,
        'tensor_bits' : num_bits,
        'accuracy_level' : accuracy,
        'advanced_options:calibration_frames' : len(calib_images),
        'advanced_options:calibration_iterations' : 3,
        'calibrationOption' : 7,
        'activationRangeMethod' : 1
    }

    and 

    compile_options = {
        'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
        'artifacts_folder' : output_dir,
        'tensor_bits' : num_bits,
        'accuracy_level' : accuracy,
        'advanced_options:calibration_frames' : len(calib_images),
        'advanced_options:calibration_iterations' : 3,
        'advanced_options:calibrationOption' : 7,
        'advanced_options:activationRangeMethod' : 1
    }

    But neither of them worked. I'd appreciate more detailed instructions on how to add these parameter settings to compile_options.

    Regards and Thanks.   

    Yongquan

  • calibrationOption and activationRangeMethod are not directly accessible from the Python interface; instead, equivalent options are provided.

    These are the options that we typically use - when accuracy_level is set to 1, it is equivalent to setting calibrationOption to 7.

    github.com/.../config_settings.py

            compile_options = {
                ##################################
                # basic_options
                #################################
                'tensor_bits': 8,
                'accuracy_level': 1,
                # debug level
                'debug_level': 0,
                'priority': 0,
                ##################################
                # advanced_options
                #################################
                # model optimization options
                'advanced_options:high_resolution_optimization': 0,
                'advanced_options:pre_batchnorm_fold': 1,
                # quantization/calibration options
                'advanced_options:calibration_frames': 25,
                # note that calibration_iterations has effect only if accuracy_level>0
                'advanced_options:calibration_iterations': 25,
                # 0 (non-power of 2, default), 1 (power of 2, might be helpful sometimes, needed for qat models)
                'advanced_options:quantization_scale_type': 1,
                # further quantization/calibration options - these take effect
                # only if the accuracy_level in basic options is set to 9
                'advanced_options:activation_clipping': 1,
                'advanced_options:weight_clipping': 1,
                # if bias_clipping is set to 0 (default), weight scale will be adjusted to avoid bias_clipping
                # if bias_clipping is set to 1, weight scale is computed solely based on weight range.
                # this should only affect the mode where the bias is clipped to 16bits (default in TDA4VM).
                #'advanced_options:bias_clipping': 1,
                'advanced_options:bias_calibration': 1,
                'advanced_options:channel_wise_quantization': 0,
                # mixed precision options - this is just a placeholder
                # output/params names need to be specified according to a particular model
                'advanced_options:output_feature_16bit_names_list':'',
                'advanced_options:params_16bit_names_list':'',
                # optimize data conversion options by moving them from arm to c7x
                'advanced_options:add_data_convert_ops': 3,
            }
    If you would like additional flexibility, you can set accuracy_level to 9 and then try changing:
            'advanced_options:activation_clipping'
            'advanced_options:weight_clipping'
            'advanced_options:bias_clipping'
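
    A minimal sketch of that variant, reusing only the option names already listed above; which clipping options to change, and to what values, depends on the model, so the values below are just an illustrative starting point:

            compile_options = {
                'tensor_bits': 16,
                'accuracy_level': 9,                        # enables the clipping options below
                'advanced_options:calibration_frames': 119,
                'advanced_options:calibration_iterations': 25,
                # example: relax clipping to avoid capped/saturated outputs
                'advanced_options:activation_clipping': 0,
                'advanced_options:weight_clipping': 0,
                # 'advanced_options:bias_clipping': 0,
            }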