AM67A: When tensor_bits = 16, the compiled model becomes completely unusable

Part Number: AM67A

Hi, I have a question. When I compile my model with edgeai-tidl-tools version 11_00_06_00 and set tensor_bits = 8, the compiled model works fine (though the accuracy is too low). But when I set tensor_bits = 16, the compiled model becomes completely unusable.

I also tried adding some layers to output_feature_16bit_names_list, but as soon as I add anything there, the compiled model cannot be used at all. Do you know why this happens?

tensor_bits = 8:

tensor_bits = 16:

  • Hello Ylinly Yi,

    This is certainly not expected behavior. We'll find a resolution to this.

    Please assist with answers to the following:

    1. Was there any warning or error log shown when compiling the model for tensor_bits=16? If you are comfortable sharing compile logs from PC, please do.
      1. I would recommend setting debug_level=2 for your model (either as a runtime_option in your model_configs.py entry or as a global setting in common_utils.py in edgeai-tidl-tools/examples/osrt_python)
      2. Please also share inference logs with the same debug_level=2
    2. I assume this behavior is seen on the AM67A target device. Do you see the same result with x86 PC emulation? If target and PC do not give the same result, then this is a bug on the device side.
      1. Some error logs may indicate that a model may work on PC but not on device.
    3. Can you supply me the model-artifacts for 8 and 16 bit modes? I understand this can be sensitive IP -- it would be helpful to me but not 100% required. The SVG files in your artifacts/tempDir directory would also suffice.
    4. Are you using an object detection meta-architecture? Relevant documentation [1]. I ask because the angled boxes are an interesting case, but I do not believe our meta-architectures are equipped for this.
    5. When using output_feature_16bit_names_list, are you still using tensor_bits=16? This option is only valid when used with tensor_bits=8.

    the compiled model cannot be used at all

    By this, do you mean that the accuracy is so bad it is useless, or that the model cannot run at all?

    It is strange that adding any layer to the 16-bit-names-list would result in this behavior. Note that you should only use that option when tensor_bits=8. If you set that names-list parameter to a layer name that does not exist, does the error show then too?
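
To make the intended pairing concrete, here is a minimal sketch of a mixed-precision runtime_options fragment (the option keys follow the edgeai-tidl-tools delegate-option convention; the layer names are purely illustrative, not taken from your model):

```python
# Sketch of a mixed-precision runtime_options dict for a model_configs.py entry.
# output_feature_16bit_names_list only takes effect together with tensor_bits=8;
# with tensor_bits=16 the whole network is already 16-bit and the list is moot.
runtime_options_8bit_mixed = {
    "tensor_bits": 8,   # base quantization for the whole network
    "debug_level": 2,   # verbose compile/inference logging, as suggested above
    # Comma-separated layer output names to keep at 16 bits -- illustrative names:
    "advanced_options:output_feature_16bit_names_list": "conv_head_1, conv_head_2",
}
```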

    [1] https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/tidl_fsg_od_meta_arch.md

    BR,
    Reese

  • 1. Below is the log when I set tensor_bits = 16 and debug_level = 2, running python3 onnxrt_ep.py -c -o. I didn't see any obvious errors.

    2. For x86 PC simulation, should I run python3 onnxrt_ep.py? If yes, then the results on x86 PC are correct, but on AM67A they are not.

    3. The model artifacts for 8-bit mode are as follows:


    The model artifacts for 16-bit mode are as follows:

    4. I did not use meta_arch_type.

    5. When using output_feature_16bit_names_list, I set tensor_bits = 8.
    When tensor_bits = 16, the compiled model runs on AM67A but produces incorrect bounding boxes, like this:

  • Hi Ylinly Yi,

    It looks like you uploaded some log files / artifacts that I cannot view or access. Can you drag those into the comment window in your next reply? Perhaps they are private to your user account.

    2. For x86 PC simulation, should I run python3 onnxrt_ep.py? If yes, then the results on x86 PC are correct, but on AM67A they are not.

    Yes, this is correct. If you run like "python3 onnxrt_ep.py -m MODEL_NAME", then it should emulate the accelerator. This requires that the model-artifacts be present (meaning you compiled beforehand) -- the default location is fine. If I see a log file, I can verify whether it was running with TIDL emulation or not.

    • Running with the '-d' flag as well will "disable offload" and run on the CPU. We do not want to do that at this stage.

    Now assuming it did run in an emulated mode, then this is useful information -- it suggests PC and target have some difference in execution, where they typically should not. 

    • Most commonly, difference here means an issue related to memory handling (e.g. unable to allocate) or a bug within some layer.
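
Assuming you save the raw output tensors from the PC-emulation run and from the device run (e.g. with np.save -- the file handling here is hypothetical, the tools do not dump these for you automatically), a short script can quantify the divergence:

```python
import numpy as np

def compare_outputs(pc_out: np.ndarray, target_out: np.ndarray) -> dict:
    """Quantify divergence between PC-emulation and device output tensors.

    Both arrays are assumed to hold the same raw network output for the same
    input image, saved on each platform.
    """
    diff = np.abs(pc_out.astype(np.float32) - target_out.astype(np.float32))
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        # Fraction of elements differing by more than a small tolerance:
        "mismatch_ratio": float((diff > 1e-3).mean()),
    }
```

Identical tensors give a max_abs_diff of 0.0; a large mismatch_ratio points at a systematic execution difference (memory handling, a broken layer) rather than mere rounding noise.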

    4. I did not use meta_arch_type.

    5. When using output_feature_16bit_names_list, I set tensor_bits = 8.
    When tensor_bits = 16, the compiled model runs on AM67A but produces incorrect bounding boxes, like this:

    Understood, thanks. 

    So please confirm for me: When you run on PC with tensor_bits=16 or with anything defined in output_feature_16bit_names_list, there is no issue compiling. When you run on PC, you see correct results. When you run on target, you get noisy, nonsensical output. 

    Is this correct?

    Once you provide your logs and model-artifacts, I will take a deeper look and seek a workaround. If the above points are true, there is a bug that I will file with our team as well.

    BR,
    Reese

  • Please also supply your model_configs.py entry for this model. Please include any non-default options you are using in the runtime_options dictionary.

    e.g. https://github.com/TexasInstruments/edgeai-tidl-tools/blob/23b72b5781569a261792d98f6c17503b30b4a283/examples/osrt_python/model_configs.py#L1030 

  • Sorry, can you see it like this? The log with tensor_bits = 16 and debug_level = 2.

    edgeai-tidl-tools.log

  • The runtime_options dictionary was not set in model_configs.py:

    "best_iketest35": {
        "model_path": os.path.join(models_base_path, "best_iketest35.onnx"),
        "num_images": numImages,
        "task_type": "detection",
        "preprocess": dict(
            resize=448,
            crop=448,
            data_layout="NCHW",
            resize_with_pad=False,
            reverse_channels=False,
        ),
        "session": dict(
            session_name="onnxrt",
            model_path=os.path.join(models_base_path, "best_iketest35.onnx"),
            input_mean=[0, 0, 0],
            input_scale=[0.003921568627, 0.003921568627, 0.003921568627],
            input_optimization=True,
        ),
        "extra_info": dict(
            num_images=numImages,
            num_classes=15,
            framework="",
            label_offset_type="n2n+1",
            od_type="YoloV11_OBB",
        ),
    },
  • Hi Ylinly,

    Sorry, can you see it like this? The log with tensor_bits = 16 and debug_level = 2.

    Yes, I see your logs now. Including files in this way worked well.

    I agree that the logs look fine -- nothing concerning; they are entirely normal for a model compiled at this debug level.

    Please also supply the runtime log on target/EVM with debug_level set to 2. Please also supply again the 8-bit and 16-bit versions of the model artifacts -- these were not accessible in the previous message.

    Another question about the output with tensor_bits = 16:

    • Are the values all within the expected ranges? i.e., confidence levels in [0, 1), valid class numbers, and box dimensions within the frame.
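
A quick way to answer that range question programmatically -- assuming each decoded detection row is [cx, cy, w, h, angle, score, class_id], which is an assumed layout for a YOLOv11-OBB style head, so adjust the indices to your actual output:

```python
import numpy as np

def detections_in_range(dets: np.ndarray, frame: int = 448, num_classes: int = 15) -> bool:
    """Rough sanity check on decoded detections.

    Assumes each row is [cx, cy, w, h, angle, score, class_id] -- this layout
    is an assumption, not taken from your model; adapt the indices as needed.
    frame=448 matches the resize/crop in the model_configs.py entry above.
    """
    if dets.size == 0:
        return True
    cx, cy, w, h = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3]
    score, cls = dets[:, 5], dets[:, 6]
    ok_boxes = ((cx >= 0) & (cx <= frame) & (cy >= 0) & (cy <= frame)
                & (w > 0) & (w <= frame) & (h > 0) & (h <= frame))
    ok_scores = (score >= 0) & (score < 1)
    ok_classes = (cls >= 0) & (cls < num_classes)
    return bool((ok_boxes & ok_scores & ok_classes).all())
```

If this returns False on the device output but True on the PC output for the same image, that is strong evidence of a numerical bug on the device side rather than a post-processing/decoding issue.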

    BR,
    Reese