This thread has been locked.


AM68A: Compilation does not work properly on SDK 9.1

Part Number: AM68A

We compiled our model using SDK 8.6 and it works well for us. After switching to the newer version (SDK 9.1), we encountered the following issues:

* a large output mismatch after calibration compared to the ONNX model when add_data_ops: 0

* closer values, but still a large mismatch after calibration compared to the ONNX model when add_data_ops: 1

It feels like the default add_data_ops: 0 does not work and produces bad outputs even for a simple neural network (only Add, Conv, and Relu operations).
This model should compile easily on SDK 9.1, but it does not.

Here I provide the assets to compile, a README explaining how to run this example, the inference script, the ONNX model, and logs for both variants. I believe it should work with add_data_ops: 0 as well; could somebody help me determine what is wrong?
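For reference, a minimal sketch of the compilation options behind this setup (key names follow this thread's shorthand, e.g. add_data_ops, and the paths are placeholders; the documented key names in your SDK's OSRT example scripts may differ):

```python
# Sketch of the TIDL compilation options for this experiment.
# Key names follow this thread's shorthand and may not match the
# documented option names -- verify against the OSRT example scripts
# shipped with the SDK before use.
compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",  # placeholder path
    "artifacts_folder": "./model-artifacts",   # placeholder path
    "tensor_bits": 16,   # 16-bit quantization, as used in this thread
    "add_data_ops": 0,   # the default value under investigation
}
```

Per TI's OSRT flow, a dict like this is typically passed as the first entry of provider_options when creating an onnxruntime InferenceSession with the TIDLCompilationProvider.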

assets.zip

  • Hi,

Can you share the exact SDK tags you used for the 8.6 and 9.1 SDKs?

Moreover, does this issue occur for 16-bit quantization only? Can you share your observations for 8-bit?

  • Hi, the tags used are 09_01_00_02 and 08_06_00_05. For 8-bit I don't have results right now, but I will share them once I test it. Here you can see how large the degradation is between SDK 8.6 and 9.1 (add_data_ops: 0) for 16-bit compilation with the same setup (same number of frames, number of iterations, model, calibration frames, and inference images).
    Moreover, it also produces a large mismatch when add_data_ops: 1 and the denylist contains Transpose, Resize, Reshape, and MatMul. It feels like there is some problem with data handling in SDK 9.1, but that is a topic for a separate post; right now it would be good to understand why this simple neural network cannot be compiled properly on SDK 9.1 when add_data_ops equals 0.



  • Let me note the observations:

1. You are working with a 16-bit network

2. On the 9.1 SDK (09_01_00_02 tag) you are facing a model compilation issue when add_data_ops is set to 1; however, the same observation does not occur with the 08_06_00_05 tag?

Can you confirm my understanding of this issue is correct? In particular, I am trying to understand how inference worked if model compilation itself is failing, as there would be no model artifacts generated.

Can you elaborate on how you are comparing the results? Are the 16-bit quantized values dequantized to float32 and compared with the ONNX float32 values?

If you could add a comparison table of 8.6 and 9.1 with the add_data_ops flag set and not set, that would help our understanding.

  • 1. Yes, I am working with a 16-bit network

2. On the 9.1 SDK (09_01_00_02 tag) I am facing the issue when add_data_ops is set to 0, and yes, the same observation does not occur with the 08_06_00_05 tag

3. Yes, I take the 16-bit quantized values, convert them to float32, and compare them to the original ONNX float32 values

Here the name column gives the setup under test, and each error margin <number> column shows how similar the outputs of the original ONNX float32 model are to the 16-bit model for that setup, where <number> is the threshold (see compare_float_3d_arrays in the assets). The numbers are approximate and vary between samples, but you can clearly see that 09_01_00_02 (add_data_ops: 0) produces outputs that are not relevant at all. For all of these experiments I used the same ONNX model, the same calibration images, and the same options (except add_data_ops), so it feels like something is wrong under the hood in the new SDK when add_data_ops: 0.

| name | error margin 0.1 | error margin 0.01 | error margin 0.001 | error margin 0.0001 |
|------|------------------|-------------------|--------------------|---------------------|
| 08_06_00_05 | 99.70% | 51.25% | 5.87% | 0.59% |
| 09_01_00_02 (add_data_ops: 1) | 99.59% | 48.27% | 5.56% | 0.54% |
| 09_01_00_02 (add_data_ops: 0) | 10.35% | 1.10% | 0.11% | 0.01% |
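The comparison methodology described above can be sketched as follows (a minimal illustration assuming a symmetric dequantization scheme; compare_float_3d_arrays in the shared assets is the authoritative implementation):

```python
import numpy as np

def dequantize(q, scale, zero_point=0):
    """Map quantized integer values back to float32 (symmetric scheme assumed)."""
    return (q.astype(np.float32) - zero_point) * scale

def match_fraction(ref, out, margin):
    """Fraction of elements whose absolute difference from the ONNX
    float32 reference stays within `margin` -- one cell of the table above."""
    return float(np.mean(np.abs(ref - out) <= margin))
```

Each row of the table would then be something like `{m: match_fraction(onnx_out, tidl_out, m) for m in (0.1, 0.01, 0.001, 0.0001)}`.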

     

  • Thanks for sharing the info.

Let me check this internally and get back to you within a one-week time frame.

  • Hi Roman,

The observations you pointed out are correct; I have verified them at my end.

As I see it, when you set add_data_ops to 1 you get a 99.59% correlation percentage; however, with the default, i.e. 0 (the same default is reflected in the OSRT example script), it is poor. With the default add_data_ops set to 0 this operation is scheduled on the ARM core, and it shows a functional mismatch in the current case; with the flag set to 1 the same operation is scheduled on the DSP core and works fine.
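Until the fix lands, the workaround implied above is a one-line change to the compilation options (key name follows this thread's shorthand and should be checked against the SDK's documented option names):

```python
# Workaround sketch: schedule the data-convert ops on the DSP core instead
# of ARM by enabling the flag explicitly (key name per this thread).
compile_options = {
    "add_data_ops": 1,  # 0 (default) -> ARM scheduling, currently mismatching
}
```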

I have filed a JIRA to track this issue; it will tentatively be fixed in the upcoming 9.2 release.

    Adding JIRA link for TI's internal tracking purpose.

    jira.itg.ti.com/.../TIDL-3572

  • Hi Roman,

I would need the model that shows this behavior, so that I can try to reproduce the issue at my end and add a fix.

If you are not able to share the exact same model, you can share a toy model (dummy model) where this issue is reproducible.

    Thank you  

  • Sure! I have shared a dummy model in the assets, thank you

  • Thank you! Also, in case it helps, I wanted to point out that the same error occurred even with add_data_ops: 1 when the denyList is not empty (for example: Reshape, Transpose, MatMul, etc.; I am not sure about other operations). It feels like these are related and that scheduling to ARM does not work properly (maybe because of the 6-dimensional tensors?)

  • Sure! Shared a dummy model in assets, thank you

I didn't understand what "assets" refers to here! Generally, users attach the model as zip files here.

Could you please attach the model here so I can check it?

sub_graph.onnx is inside assets.zip. Also, is there a link to the JIRA so we can track its status? This is an important issue for us.
    4452.assets.zip

  • Hi,

Thanks for sharing the model.

I will attach it to the JIRA shared above so that the developer can reproduce this issue.

Shall we close this thread until then?