TDA4VM: TIDL-RT inference error when the input channel number is greater than 4

Renf Shi

Part Number: TDA4VM

Hello TI:

We have found that when the number of input channels is greater than 4, the inference results of the TIDL model are incorrect.

However, when the number of input channels is less than 4, the inference results of the TIDL model are correct compared with the PyTorch model.

Here is our test:

When the input is a 3-channel RGB image, the inference outputs of the Python side and the TDA4 board side are roughly aligned:

The first 200 values of the PyTorch inference output:

```
tensor([0.2725, 1.8240, 0.9287, 1.0627, 1.4536, 1.3616, 0.9387, 1.0476, 1.8022,
1.8972, 1.7944, 1.9753, 2.0522, 1.6511, 1.3245, 1.2046, 1.6687, 2.0710,
1.8052, 1.4341, 1.5159, 1.3245, 1.4829, 1.6804, 1.4365, 1.7385, 1.6130,
1.3274, 1.6187, 1.7466, 2.0071, 2.1467, 2.0046, 1.9263, 1.8596, 1.9434,
1.7725, 1.8657, 2.3823, 2.6248, 2.2700, 1.6094, 1.6526, 1.5845, 1.5388,
1.8826, 1.5706, 1.2964, 1.7742, 1.6318, 1.6179, 2.2485, 2.1445, 1.4136,
1.3721, 1.9043, 1.7217, 1.6313, 1.8491, 1.7224, 2.1184, 2.7771, 2.4487,
1.9482, 2.0022, 1.9531, 1.7512, 1.6338, 1.5212, 1.1211, 1.2781, 1.3647,
1.5173, 1.9783, 2.0759, 1.4543, 1.0547, 1.2500, 1.0984, 0.9580, 1.2556,
1.5652, 1.2031, 1.4524, 1.6968, 2.1414, 2.0488, 1.4937, 1.6997, 1.7988,
1.3606, 1.0713, 1.5337, 1.0088, 1.2561, 1.0376, 0.4551, 2.0996, 1.4224,
1.6277, 2.4272, 2.2100, 0.6223, 1.2932, 3.0981, 2.4563, 1.1636, 2.5801,
4.0732, 2.2795, 1.2170, 1.1143, 2.1606, 2.6863, 1.8450, 1.2229, 1.4094,
0.9294, 1.5471, 1.7346, 0.7766, 1.6616, 2.5457, 1.8667, 2.5105, 1.5776,
1.5364, 2.8516, 1.8926, 1.4551, 1.8564, 2.5530, 2.6946, 2.3096, 2.2207,
2.2153, 2.1553, 1.9778, 2.7310, 2.2983, 1.2312, 2.0168, 2.3958, 1.2305,
2.1570, 2.5898, 1.5598, 3.1360, 3.1733, 1.5193, 1.4409, 3.0139, 2.2844,
1.8755, 3.4807, 3.0547, 2.7522, 3.6755, 2.7910, 2.1860, 2.1313, 1.6392,
2.4373, 1.9514, 1.8430, 1.5222, 1.5061, 1.6863, 1.9619, 2.7542, 3.1331,
1.2903, 0.5537, 2.2219, 1.8372, 0.5667, 1.0918, 2.1479, 1.3037, 1.6404,
1.3792, 2.1606, 2.7366, 1.1772, 2.5527, 2.2637, 1.3894, 1.1377, 2.1436,
1.0078, 1.2209, 0.3113, 0.2517, 1.6677, 1.6304, 1.9866, 2.1826, 2.0413,
0.8716, 1.7761], device='cuda:0')
```

The first 200 values of the TIDL model inference output:

```
0.272461 1.82422 0.928711 1.0625 1.4541 1.36133 0.938477 1.04688 1.80176 1.89746
1.79395 1.97559 2.05273 1.65137 1.3252 1.20508 1.66895 2.07031 1.80469 1.43359
1.51562 1.3252 1.4834 1.68066 1.43652 1.73828 1.61328 1.32715 1.61816 1.74609
2.00684 2.14648 2.00391 1.92578 1.85938 1.94336 1.77246 1.86523 2.38281 2.62402
2.27051 1.60938 1.65234 1.58496 1.53906 1.88281 1.57031 1.29688 1.77441 1.63184
1.61816 2.24902 2.14453 1.41406 1.37207 1.9043 1.72168 1.63086 1.84961 1.72266
2.11816 2.77734 2.44824 1.94824 2.00195 1.95312 1.75098 1.63379 1.52148 1.12109
1.27832 1.36426 1.51758 1.97852 2.0752 1.4541 1.05566 1.25 1.09863 0.958008
1.25586 1.56543 1.20312 1.45215 1.69629 2.1416 2.04883 1.49414 1.7002 1.79883
1.36133 1.07129 1.5332 1.00879 1.25586 1.03711 0.455078 2.09961 1.42188 1.62793
2.42773 2.20996 0.62207 1.29297 3.09668 2.45703 1.16406 2.58008 4.07324 2.2793
1.2168 1.11426 2.16016 2.68555 1.8457 1.22363 1.40918 0.929688 1.54688 1.73438
0.777344 1.66211 2.5459 1.86719 2.51074 1.57812 1.53613 2.85156 1.89258 1.45508
1.85645 2.55273 2.69434 2.30859 2.2207 2.21484 2.15527 1.97754 2.73047 2.29883
1.23145 2.0166 2.39551 1.23047 2.15723 2.58984 1.55957 3.13574 3.17285 1.51953
1.44141 3.01367 2.28418 1.875 3.48047 3.05371 2.75195 3.67578 2.79102 2.18555
2.13184 1.63867 2.43652 1.95117 1.84277 1.52246 1.50586 1.68555 1.96191 2.75391
3.13281 1.29004 0.553711 2.22168 1.83691 0.566406 1.0918 2.14746 1.30371 1.64062
1.37891 2.16113 2.73633 1.17676 2.55273 2.2627 1.38965 1.1377 2.14355 1.00781
1.2207 0.311523 0.251953 1.66797 1.63086 1.9873 2.18262 2.04102 0.87207 1.77637
```
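To put a number on "roughly aligned", the two dumps can be compared element by element. The following is only a minimal sketch: the helper names `parse_dump` and `max_abs_diff` are hypothetical, and only the first five values of each 3-channel output above are used for illustration.

```python
def parse_dump(text):
    """Parse a whitespace-separated dump of floats, as printed for the TIDL output."""
    return [float(token) for token in text.split()]


def max_abs_diff(reference, candidate):
    """Return the largest absolute element-wise difference between two sequences."""
    if len(reference) != len(candidate):
        raise ValueError("both outputs must have the same number of values")
    return max(abs(r - c) for r, c in zip(reference, candidate))


# First five values of the 3-channel outputs quoted in this post.
pytorch_head = [0.2725, 1.8240, 0.9287, 1.0627, 1.4536]
tidl_head = parse_dump("0.272461 1.82422 0.928711 1.0625 1.4541")

print("max abs diff:", max_abs_diff(pytorch_head, tidl_head))
```

A small maximum difference is what one would expect from quantization and float formatting alone; a difference on the order of the output values themselves indicates a real mismatch.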

With the same model structure, when only the input is changed to 6 channels (the left and right images are concatenated), the outputs of the Python side and the TDA4 side are very different and cannot be aligned:

The first 200 values of the PyTorch inference output:

```
tensor([0.0000e+00, 2.7285e+00, 2.8269e+00, 1.6304e+00, 1.3997e+00, 1.3689e+00,
1.4131e+00, 1.3860e+00, 1.3345e+00, 1.3909e+00, 1.4070e+00, 1.3889e+00,
1.3545e+00, 1.3674e+00, 1.3391e+00, 1.3613e+00, 1.3481e+00, 1.3748e+00,
1.4243e+00, 1.4846e+00, 1.4036e+00, 1.4185e+00, 1.3950e+00, 1.3835e+00,
1.3638e+00, 1.3652e+00, 1.4065e+00, 1.3867e+00, 1.3027e+00, 1.3511e+00,
1.3777e+00, 1.3601e+00, 1.3704e+00, 1.3643e+00, 1.3557e+00, 1.3911e+00,
1.4158e+00, 1.4163e+00, 1.3875e+00, 1.4126e+00, 1.4490e+00, 1.4751e+00,
1.5117e+00, 1.5244e+00, 1.5005e+00, 1.4155e+00, 1.4963e+00, 1.4980e+00,
1.4241e+00, 1.4448e+00, 1.4675e+00, 1.4526e+00, 1.4204e+00, 1.4573e+00,
1.4790e+00, 1.4539e+00, 1.4944e+00, 1.4873e+00, 1.4543e+00, 1.5115e+00,
1.5195e+00, 1.4326e+00, 1.4438e+00, 1.5266e+00, 1.4805e+00, 1.4639e+00,
1.4380e+00, 1.4746e+00, 1.4836e+00, 1.5225e+00, 1.5859e+00, 1.5027e+00,
1.4927e+00, 1.6277e+00, 1.6060e+00, 1.4963e+00, 1.5686e+00, 1.5256e+00,
1.4050e+00, 1.4802e+00, 1.4988e+00, 1.5071e+00, 1.4541e+00, 1.4600e+00,
1.4753e+00, 1.5767e+00, 1.4890e+00, 1.5300e+00, 1.5010e+00, 1.4807e+00,
1.5544e+00, 1.5212e+00, 1.4985e+00, 1.5979e+00, 1.6865e+00, 2.4600e+00,
0.0000e+00, 7.2339e-01, 0.0000e+00, 6.0815e-01, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 3.5156e-02, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 3.7598e-02, 0.0000e+00, 0.0000e+00,
0.0000e+00, 1.4404e-02, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 9.6924e-02, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
5.1514e-02, 1.4648e-03, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.9053e-02, 0.0000e+00,
0.0000e+00, 1.1248e+00, 3.2275e-01, 4.7437e-01, 4.4336e-01, 3.3008e-01,
3.9404e-01, 4.8340e-01], device='cuda:0')
```

The first 200 values of the TIDL model inference output:

```
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
```

When the ONNX model has more than 4 input channels, we use the TIDL import tool to convert the ONNX model and run inference with the converted model on the TDA4VM board, but the results are incorrect.

Could you suggest how to convert an ONNX model with more than 4 input channels using the TIDL model import tool?

over 2 years ago

Anshu Jain over 2 years ago

TI__Guru 56820 points

Hi,

How are you feeding input data to TIDL when the number of channels is 4? You should be using inFileFormat = 1 and providing a raw binary input. Can you confirm that you are doing this?
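For illustration, a raw binary input for the 6-channel case could be produced roughly as follows. This is only a sketch under stated assumptions, not a confirmed TIDL recipe: it assumes the raw file is expected in planar (channel-first) order with one unsigned byte per element, and the helper names `pack_six_channel_planar` and `save_raw` are hypothetical. Check the layout and element type against your import configuration.

```python
def pack_six_channel_planar(left_rgb, right_rgb, height, width):
    """Concatenate two interleaved RGB images (HWC bytes) into one planar
    6-channel buffer (CHW bytes): left R, G, B planes, then right R, G, B.

    Assumption: the consumer expects channel-first, 8-bit unsigned data.
    """
    expected = height * width * 3
    for name, image in (("left", left_rgb), ("right", right_rgb)):
        if len(image) != expected:
            raise ValueError(f"{name} image must hold {expected} bytes, got {len(image)}")

    planes = []
    for image in (left_rgb, right_rgb):
        for channel in range(3):
            # Every third byte starting at `channel` forms one full H x W plane.
            planes.append(bytes(image[channel::3]))
    return b"".join(planes)


def save_raw(path, data):
    """Write the packed buffer to disk with no header, as a plain raw binary file."""
    with open(path, "wb") as f:
        f.write(data)


# Tiny 1 x 2 example images; real inputs would be the left and right camera frames.
left = bytes([1, 2, 3, 4, 5, 6])
right = bytes([7, 8, 9, 10, 11, 12])
raw = pack_six_channel_planar(left, right, height=1, width=2)
```

Calling `save_raw("input_6ch.bin", raw)` (the file name is arbitrary) would then produce a file that can be referenced as the input when inFileFormat = 1 is set. Any mean subtraction or scaling applied on the PyTorch side would also have to be matched.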

Regards,

Anshu
