AM68A: Layer trace inspector for debugging model

Part Number: AM68A


Hi,

I am trying to find out why my model does not perform well on the TDA platform. I am integrating a transformer-based model and am seeing a significant accuracy issue with int16 inference.

Here is the inference log, which also includes a comparison of the final feature maps:
3755.inference.log

You can see that only 38% of the feature values fall within the ±10% error margin.
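
To make the comparison concrete, the check I mean is along these lines (a rough sketch; the file names are placeholders for the actual dumps):

import numpy as np

# Compare a TIDL output against the float reference, elementwise.
ref = np.fromfile("final_output_ref.bin", dtype=np.float32)   # placeholder: float reference
out = np.fromfile("final_output_tidl.bin", dtype=np.float32)  # placeholder: TIDL result

# Fraction of values whose relative error is within +-10%
# (small epsilon avoids division by zero).
rel_err = np.abs(out - ref) / (np.abs(ref) + 1e-6)
print(f"{(rel_err <= 0.10).mean() * 100:.1f}% of values within +-10%")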

To debug the issue, I've tried using layer_trace_inspector.py (https://github.com/TexasInstruments/edgeai-tidl-tools/blob/95ba2c7ec62bbedeb637d7a5c0273fcede21cac9/scripts/tidl_debug_scripts/layer_trace_inspector.py), but I don't know where to get the ONNX traces; I've only managed to find the TIDL traces for my model.

Still, I tried to visualize the TIDL traces and discovered that there is some scaling between the _float.bin and .y values (I assume due to quantization). Please advise how I can compare these. In particular, I am using the following configuration for quantization:

'tensor_bits': 16,
'accuracy_level': 9,
'advanced_options:add_data_convert_ops': 1,
'advanced_options:quantization_scale_type': 4,
'advanced_options:high_resolution_optimization': 0,
'advanced_options:activation_clipping': 1,
'advanced_options:weight_clipping': 1,
'advanced_options:bias_calibration': 1,
'advanced_options:per_channel_quantization': 1
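
For reference, I pass these options to the TIDL compilation execution provider roughly as below (a sketch following the edgeai-tidl-tools examples; the paths are placeholders, and my understanding is that 'debug_level': 3 is what makes TIDL dump the per-layer traces):

import onnxruntime as rt

compile_options = {
    'tidl_tools_path': '/path/to/tidl_tools',   # placeholder
    'artifacts_folder': '/path/to/artifacts',   # placeholder
    'tensor_bits': 16,
    'accuracy_level': 9,
    'advanced_options:add_data_convert_ops': 1,
    'advanced_options:quantization_scale_type': 4,
    'advanced_options:high_resolution_optimization': 0,
    'advanced_options:activation_clipping': 1,
    'advanced_options:weight_clipping': 1,
    'advanced_options:bias_calibration': 1,
    'advanced_options:per_channel_quantization': 1,
    'debug_level': 3,   # assumption: level 3 dumps per-layer traces
}

sess = rt.InferenceSession(
    "model.onnx",
    providers=['TIDLCompilationProvider', 'CPUExecutionProvider'],
    provider_options=[compile_options, {}],
)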


Thanks,
Roman

  • Here are examples of the .y and _float.bin files:

    a.zip

  • Hi Roman,

    The .y files you receive are what comes out of the accelerator. 

    The float.bin files are the dequantized versions of those. Effectively, TIDL takes the .y file/trace and undoes the quantization (adds the bias, multiplies by the scale) to get the float version.

    The float.bin traces should be compared against the float.bin traces from running with tensor_bits=32, or against traces from the runtime itself (e.g., adding an output for every layer and saving those same tensors as float32).
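
    As a rough illustration of that dequantization (not exact TIDL code: the scale/bias must be taken from the layer info for the specific trace, and the raw dtype depends on tensor_bits):

    import numpy as np

    # Placeholder values: take the real scale/bias from the TIDL layer
    # info for this particular trace.
    scale = 0.00125
    bias = 0.0

    # Raw accelerator output; int16 here since tensor_bits=16.
    q = np.fromfile("0001_trace.y", dtype=np.int16)    # placeholder file name

    # Undo quantization as described above: add bias, multiply by scale.
    deq = (q.astype(np.float32) + bias) * scale

    ref = np.fromfile("0001_trace_float.bin", dtype=np.float32)  # placeholder file name
    print(np.abs(deq - ref).max())   # should be near zero if scale/bias match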

    Br,
    Reese

  • Hi,

    My co-worker has shared a script for adding outputs to your ONNX model; it may help you. Please take a look.

    import onnx

    def convert(model_name, modified_model_name):
        """
        Modify the ONNX model so that every intermediate tensor is also
        exposed as a graph output. This is useful for debugging and for
        layer-by-layer comparison.
        """
        # Load the original ONNX model
        onnx_model = onnx.load(model_name)

        # Register every node output as a graph output. protobuf's
        # append() copies the message, so each tensor gets its own entry.
        for node in onnx_model.graph.node:
            for output_name in node.output:
                value_info = onnx.helper.ValueInfoProto()
                value_info.name = output_name
                onnx_model.graph.output.append(value_info)

        onnx.save(onnx_model, modified_model_name)

    model_name = "test_model.onnx"
    modified_model_name = model_name.replace(".onnx", "_ref.onnx")
    convert(model_name, modified_model_name)
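
    Once the model is modified, you can run it with onnxruntime on the CPU and save every output as a float32 binary, for comparison against the TIDL _float.bin traces. A minimal sketch continuing from the script above (the input file name is a placeholder for your preprocessed input tensor):

    import os
    import numpy as np
    import onnxruntime as rt

    os.makedirs("onnx_traces", exist_ok=True)

    sess = rt.InferenceSession(modified_model_name, providers=['CPUExecutionProvider'])
    input_name = sess.get_inputs()[0].name
    input_data = np.load("preprocessed_input.npy")   # placeholder

    # Save each intermediate output as float32, one file per tensor.
    outputs = sess.run(None, {input_name: input_data})
    for info, out in zip(sess.get_outputs(), outputs):
        fname = info.name.replace('/', '_') + "_float32.bin"
        np.asarray(out, dtype=np.float32).tofile(os.path.join("onnx_traces", fname))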

    You may also look into this document, if you have not already:

    https://github.com/TexasInstruments/edgeai-tidl-tools/blob/95ba2c7ec62bbedeb637d7a5c0273fcede21cac9/docs/tidl_osr_debug.md

    We will provide more info later. 

    Thanks and regards

    Wen Li 

  • Thanks a lot!