AM68A: Layer trace inspector for debugging model

Part Number: AM68A


Hi,

I am trying to find out why my model does not perform well on the TDA platform. I am integrating a transformer-based model and am seeing a significant accuracy issue with int16 inference.

Here is the inference log, which also includes a comparison of the final feature maps:
3755.inference.log

You can see that only 38% of the feature values fall within the ±10% error margin.
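
To make the comparison concrete, the check I mean is along these lines (a rough sketch; the file names are placeholders for the actual dumps):

import numpy as np

# Compare a TIDL output against the float reference, elementwise.
ref = np.fromfile("final_output_ref.bin", dtype=np.float32)   # placeholder: float reference
out = np.fromfile("final_output_tidl.bin", dtype=np.float32)  # placeholder: TIDL result

# Fraction of values whose relative error is within +-10%
# (small epsilon avoids division by zero).
rel_err = np.abs(out - ref) / (np.abs(ref) + 1e-6)
print(f"{(rel_err <= 0.10).mean() * 100:.1f}% of values within +-10%")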

To debug the issue, I've tried using layer_trace_inspector.py (https://github.com/TexasInstruments/edgeai-tidl-tools/blob/95ba2c7ec62bbedeb637d7a5c0273fcede21cac9/scripts/tidl_debug_scripts/layer_trace_inspector.py), but I don't know where to get the ONNX traces; I've only managed to find the TIDL traces for my model.

Still, I tried to visualize the TIDL traces and discovered that there is some scaling between the _float.bin and .y values (I assume due to quantization). Please advise how I can compare these. In particular, I am using the following configuration for quantization:

'tensor_bits': 16,
'accuracy_level': 9,
'advanced_options:add_data_convert_ops': 1,
'advanced_options:quantization_scale_type': 4,
'advanced_options:high_resolution_optimization': 0,
'advanced_options:activation_clipping': 1,
'advanced_options:weight_clipping': 1,
'advanced_options:bias_calibration': 1,
'advanced_options:per_channel_quantization': 1
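
For reference, I pass these options to the TIDL compilation execution provider roughly as below (a sketch following the edgeai-tidl-tools examples; the paths are placeholders, and my understanding is that 'debug_level': 3 is what makes TIDL dump the per-layer traces):

import onnxruntime as rt

compile_options = {
    'tidl_tools_path': '/path/to/tidl_tools',   # placeholder
    'artifacts_folder': '/path/to/artifacts',   # placeholder
    'tensor_bits': 16,
    'accuracy_level': 9,
    'advanced_options:add_data_convert_ops': 1,
    'advanced_options:quantization_scale_type': 4,
    'advanced_options:high_resolution_optimization': 0,
    'advanced_options:activation_clipping': 1,
    'advanced_options:weight_clipping': 1,
    'advanced_options:bias_calibration': 1,
    'advanced_options:per_channel_quantization': 1,
    'debug_level': 3,   # assumption: level 3 dumps per-layer traces
}

sess = rt.InferenceSession(
    "model.onnx",
    providers=['TIDLCompilationProvider', 'CPUExecutionProvider'],
    provider_options=[compile_options, {}],
)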


Thanks,
Roman

  • Here are examples of the .y and _float.bin files:

    a.zip

  • Hi Roman,

    The .y files you receive are what comes out of the accelerator. 

    The float.bin files are the dequantized versions of those. Effectively, TIDL takes the .y file/trace and undoes the quantization (adds the bias, multiplies by the scale) to get the float version.

    The float.bin traces should be compared against the float.bin traces from running with tensor_bits=32, or against traces from the runtime itself (e.g., adding an output for every layer and saving those same tensors as float32).
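
    As a rough illustration of that dequantization (not exact TIDL code: the scale/bias must be taken from the layer info for the specific trace, and the raw dtype depends on tensor_bits):

    import numpy as np

    # Placeholder values: take the real scale/bias from the TIDL layer
    # info for this particular trace.
    scale = 0.00125
    bias = 0.0

    # Raw accelerator output; int16 here since tensor_bits=16.
    q = np.fromfile("0001_trace.y", dtype=np.int16)    # placeholder file name

    # Undo quantization as described above: add bias, multiply by scale.
    deq = (q.astype(np.float32) + bias) * scale

    ref = np.fromfile("0001_trace_float.bin", dtype=np.float32)  # placeholder file name
    print(np.abs(deq - ref).max())   # should be near zero if scale/bias match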

    Br,
    Reese

  • Hi,

    My co-worker has shared a script for adding outputs to your ONNX model; it may help you. Please take a look.

    import onnx

    def convert(model_name, modified_model_name):
        """
        Modify the ONNX model so that every intermediate tensor is also
        exposed as a graph output. This is useful for debugging and for
        layer-by-layer comparison.
        """
        # Load the original ONNX model
        onnx_model = onnx.load(model_name)

        # Register every node output as a graph output. protobuf's
        # append() copies the message, so each tensor gets its own entry.
        for node in onnx_model.graph.node:
            for output_name in node.output:
                value_info = onnx.helper.ValueInfoProto()
                value_info.name = output_name
                onnx_model.graph.output.append(value_info)

        onnx.save(onnx_model, modified_model_name)

    model_name = "test_model.onnx"
    modified_model_name = model_name.replace(".onnx", "_ref.onnx")
    convert(model_name, modified_model_name)
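
    Once the model is modified, you can run it with onnxruntime on the CPU and save every output as a float32 binary, for comparison against the TIDL _float.bin traces. A minimal sketch continuing from the script above (the input file name is a placeholder for your preprocessed input tensor):

    import os
    import numpy as np
    import onnxruntime as rt

    os.makedirs("onnx_traces", exist_ok=True)

    sess = rt.InferenceSession(modified_model_name, providers=['CPUExecutionProvider'])
    input_name = sess.get_inputs()[0].name
    input_data = np.load("preprocessed_input.npy")   # placeholder

    # Save each intermediate output as float32, one file per tensor.
    outputs = sess.run(None, {input_name: input_data})
    for info, out in zip(sess.get_outputs(), outputs):
        fname = info.name.replace('/', '_') + "_float32.bin"
        np.asarray(out, dtype=np.float32).tofile(os.path.join("onnx_traces", fname))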

    You may also look into this document, if you have not already:

    https://github.com/TexasInstruments/edgeai-tidl-tools/blob/95ba2c7ec62bbedeb637d7a5c0273fcede21cac9/docs/tidl_osr_debug.md

    We will provide more info later. 

    Thanks and regards

    Wen Li 

  • Thanks a lot!