Part Number: TDA4VH-Q1
Tool/software:
Hello,
I am trying to deploy an object detection neural network on a TI chip. I am using the TIDL software to convert an ONNX model into TI's deployable format. To briefly describe the operation that causes the problem: multi-camera approaches in the literature use a representation called Bird's-Eye View (BEV). To unify the cameras, all camera information is projected onto a single common tensor whose cells, called voxels, divide real-world coordinates. Each voxel represents, for example, a cubic meter of the real world. The simplest way to do this is to compute the geometry from camera parameters to real-world coordinates and copy information directly from the image features to the BEV tensor. It is as simple as that: copy a few thousand values and that's it.
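To make the offline geometry step concrete, here is a minimal sketch of how such a lookup table could be built. It is not our actual code: it assumes a plain pinhole camera with voxel centres already expressed in the camera frame, and the intrinsics `K`, the feature stride of 8, the height range and the helper name `build_lookup` are all hypothetical. NumPy is used only to keep the sketch self-contained.

```python
import numpy as np

def build_lookup(K, feat_h, feat_w, stride, zs, xs, ys):
    """For every voxel centre, return the flat feature-map index
    (row * feat_w + col) it projects to, or -1 if it falls outside."""
    # Assumed camera-frame convention: x right, y down, z forward.
    zz, xx, yy = np.meshgrid(zs, xs, ys, indexing="ij")  # depth x width x height
    pts = np.stack([xx, yy, zz], axis=-1).reshape(-1, 3)
    uvw = pts @ K.T                                      # pinhole projection
    u = uvw[:, 0] / uvw[:, 2] / stride                   # pixel -> feature column
    v = uvw[:, 1] / uvw[:, 2] / stride                   # pixel -> feature row
    col = np.floor(u).astype(np.int64)
    row = np.floor(v).astype(np.int64)
    valid = (col >= 0) & (col < feat_w) & (row >= 0) & (row < feat_h)
    flat = np.where(valid, row * feat_w + col, -1)
    return flat.reshape(len(zs), len(xs), len(ys))

# Hypothetical intrinsics for a 1056x384 image (feature map 132x48, stride 8).
K = np.array([[500.0, 0.0, 528.0],
              [0.0, 500.0, 192.0],
              [0.0, 0.0, 1.0]])
zs = np.linspace(0.25, 47.75, 96)    # depth: 96 bins over 0 to 48 m
xs = np.linspace(-15.75, 15.75, 64)  # width: 64 bins over -16 to 16 m
ys = np.linspace(-1.0, 2.5, 8)       # 8 height levels (range is an assumption)

lookup = build_lookup(K, 48, 132, 8, zs, xs, ys)  # shape (96, 64, 8)
```

In this sketch, voxels that project outside the image get the index -1 and would have to be masked or redirected before the copy; how our real table handles them is not shown here.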
The problem arises when we use more than one camera in this operation. Normally, the backbone of the network before the BEV projection takes 4 dimensions: batch, channel, height, width. The only way to fit multi-camera information into this part is to concatenate the cameras along the batch dimension. But when we do so, the very simple operation of copying image features to the BEV plane fails on TI.
Here is example code for the operation we are using. It flattens the image features and indexes them with the BEV locations. For testing purposes, the batch size (= number of cameras) is 2.
x_flatten = x.view(2, 6336, 64)  # image feature shape is 48x132
batch_size, h, w, c = 2, 48, 132, 64
flat_locations = locations.long()  # shape is 96x64x8: Depth (0 to 48m) x Width (-16 to 16m) x 8 separate heights
batch_idx = torch.arange(2).view(2, 1).repeat(1, 49152)
bev_flat = x_flatten[batch_idx, flat_locations]
bev_flat = bev_flat.view(batch_size, *bev_size, c)
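For reference, here is a small self-contained sketch of the result I expect from the snippet above. It uses NumPy, whose integer array indexing follows the same broadcasting rules as the PyTorch indexing, with random stand-in data. The values of `bev_size` and the `(2, 49152)` shape of the locations are assumptions taken from the comments in the snippet. It also shows an equivalent formulation that gathers along a single axis; I have not verified whether TIDL treats that form any differently.

```python
import numpy as np

num_cams, h, w, c = 2, 48, 132, 64
bev_size = (96, 64, 8)          # depth x width x height (assumed)
n_vox = int(np.prod(bev_size))  # 49152

rng = np.random.default_rng(0)
# Stand-in data: channels-last features and a random lookup table per camera.
x = rng.standard_normal((num_cams, h, w, c)).astype(np.float32)
locations = rng.integers(0, h * w, size=(num_cams, n_vox))

x_flatten = x.reshape(num_cams, h * w, c)
batch_idx = np.repeat(np.arange(num_cams)[:, None], n_vox, axis=1)

# Two-index form, as in the snippet: bev_a[b, n] = x_flatten[b, locations[b, n]]
bev_a = x_flatten[batch_idx, locations]

# Equivalent single-axis form: fold the camera index into one global row index.
global_idx = (batch_idx * (h * w) + locations).reshape(-1)
rows = x_flatten.reshape(num_cams * h * w, c)
bev_b = np.take(rows, global_idx, axis=0).reshape(num_cams, n_vox, c)

# Spot-check both forms against the element-wise definition.
for b in range(num_cams):
    for n in (0, 1, n_vox - 1):
        expected = x_flatten[b, locations[b, n]]
        assert np.array_equal(bev_a[b, n], expected)
        assert np.array_equal(bev_b[b, n], expected)

bev = bev_a.reshape(num_cams, *bev_size, c)  # (2, 96, 64, 8, 64)
```

With real features and locations in place of the random stand-ins, `bev` is the tensor that the ONNX model produces correctly and that the TI output should match.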
I am also attaching the ONNX and TI graphs, covering the path from the input of the feature-to-BEV locations (which are computed offline and fed into the network) to the output of the BEV-projected image features after a sigmoid. This is essentially the code I shared above, converted to ONNX and to the TI format.


The projected image features are meaningful in ONNX inference. The same features are all zeros in TI inference. They become zero during the operations whose code I have shared. The values before the BEV copy are also correct in TI inference.
I have no idea what is going on or how to fix it. I would be glad of any help and direction.
Thank you in advance,
Cem Tarhan