Tool/software:
This is an inquiry about loading the edgeai-yolov5 model yolov5m6_640_ti_lite_44p1_62p9 on a TI board.
When the yolov5m6_640_ti_lite_44p1_62p9.onnx model was compiled with the "python3 onnxrt_ep.py -c" command, a subgraph_0_tidl_net.bin file was created.
When I tried to load it on the TI board, I confirmed that it loads and runs well as long as the onnx file is present.
However, when I delete the onnx file, a message appears saying that the yolov5m6_640_ti_lite_44p1_62p9.onnx model is also required.
When loading a model with the TIDLExecutionProvider, I would like to ask whether inference can be performed by loading only the bin file, without the onnx file.
Hello Kim,
I understand the query. You have compiled the model successfully and produced TIDL-compatible binaries, but when the .onnx file is removed, the TIDLExecutionProvider fails.
This is expected. To use the ONNX Runtime APIs, the ONNX model must be present. When the TIDLExecutionProvider is used, it searches the artifacts directory for a file (allowedNode.txt) that identifies which layers will be accelerated. Any unlisted layers fall back to the CPUExecutionProvider.
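For reference, the artifacts directory produced during compilation typically contains files along these lines (exact names can vary by model and SDK version; the net .bin name below matches what your compilation produced, the rest is a typical layout):
<code>
artifacts/
  allowedNode.txt            # list of layers delegated to TIDL
  subgraph_0_tidl_net.bin    # compiled network
  subgraph_0_tidl_io_1.bin   # input/output buffer descriptors
</code>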
To use only the .bin files, it is necessary to use the TIDL-RT interface. There is an example in edgeai-tidl-tools/examples/tidlrt_cpp. This interface is only available for C/C++ (no Python). We have limited support available for this interface on AM62A, but the user guides/documentation for similar J7/TDA4 devices apply to AM62A w.r.t. TIDL-RT.
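For illustration, the overall TIDL-RT call flow from that example looks roughly like the sketch below. This is a minimal sketch, assuming the API names used in the tidlrt_cpp example (TIDLRT_create / TIDLRT_invoke / TIDLRT_deinit and the sTIDLRT_Params_t fields); please verify the exact signatures and header path against itidl_rt.h in your SDK:
<code>
#include <cstdio>
#include <vector>
#include "itidl_rt.h" // TIDL-RT API header (name/path per your SDK)

// Helper: read a whole binary file into memory for loading the .bin artifacts.
static std::vector<char> readBinFile(const char *path)
{
    std::vector<char> buf;
    FILE *fp = fopen(path, "rb");
    if (!fp) return buf;
    fseek(fp, 0, SEEK_END);
    buf.resize(ftell(fp));
    fseek(fp, 0, SEEK_SET);
    fread(buf.data(), 1, buf.size(), fp);
    fclose(fp);
    return buf;
}

int main()
{
    // Only the compiled artifacts are needed here -- no .onnx file.
    std::vector<char> net = readBinFile("artifacts/subgraph_0_tidl_net.bin");
    std::vector<char> io  = readBinFile("artifacts/subgraph_0_tidl_io_1.bin");

    sTIDLRT_Params_t prms;
    TIDLRT_setParamsDefault(&prms);
    prms.netPtr       = net.data();
    prms.ioBufDescPtr = io.data();
    prms.net_capacity = net.size();
    prms.io_capacity  = io.size();

    void *handle = nullptr;
    if (TIDLRT_create(&prms, &handle) != 0) return -1;

    // in[]/out[] are arrays of sTIDLRT_Tensor_t* pointing at pre-allocated
    // input/output buffers -- see the classification example for the setup.
    // TIDLRT_invoke(handle, in, out);

    TIDLRT_deinit(handle);
    return 0;
}
</code>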
BR,
Reese
Thank you very much for your answer.
I checked and the code was for classification.
Could you please provide the cpp and h files for running the yolov5m6_640_ti_lite_44p1_62p9 model?
The model is an object detection model.
Hello,
Correct, this example is for classification networks. We have limited TIDL-RT examples because this API is less supported than the OSRT interfaces such as ONNX Runtime / TFLite.
For object detection networks, TIDL-RT reference code does not exist in edgeai-tidl-tools. You may modify the classification example to match the input and output sizes of the yolov5 network, and you would need to add some postprocessing code that formats the output tensors to extract bounding-box information.
I can point you towards some instances where this output postprocessing is being handled in C code, but otherwise the example you're looking for is not present.
Can I ask why TIDL-RT is necessary in your case? Is it to remove the need for the original ONNX model?
BR,
Reese
If you can point me to some postprocessing C code that formats the output tensors to extract bounding-box information, I would appreciate it.
The onnx file is large, so I need to reduce the size when updating the model file.
Please..
status = preprocImage<uint8_t>(s->input_image_name, (uint8_t*)in[j]->ptr, 384, 640, 3, s->input_mean, s->input_std);
I need the latter part of the code above.
Hello,
status = preprocImage<uint8_t>(s->input_image_name, (uint8_t*)in[j]->ptr, 384, 640, 3, s->input_mean, s->input_std);
I need the latter part of the code above.
Are you referring to the source for this function? It is in the tidlrt_cpp/classification.cpp file.
If you can point me to some postprocessing C code that formats the output tensors to extract bounding-box information, I would appreciate it.
Of course. The best reference is probably the internals of the post-processing TIOVX kernel. It accepts a set of buffers from the previous node in the graph -- with TIDL-RT, you effectively get this output directly from the set of output pointers provided during I/O setup.
Understood on why the ONNX file must be removed.
BR,
Reese
<code>
// Context: handle, in, out, status, loop_count and preprocImage() come from
// the tidlrt_cpp classification example; this snippet also needs <vector>,
// <array> and <cstdio>.
float input_mean = 0.0;
float input_std = 1.0;
// Note: 640, 384 here vs. 384, 640 in the reference snippet above --
// confirm the (height, width) argument order expected by preprocImage.
status = preprocImage<uint8_t>(input_image_name, (uint8_t*)in[j]->ptr, 640, 384, 3, input_mean, input_std);
LOG(INFO) << "invoked \n";
for (int i = 0; i < loop_count; i++)
{
    TIDLRT_invoke(handle, in, out);
}
LOG(INFO) << "invoked success\n";

const float threshold = 0.5f; // Object detection confidence threshold (declared but not applied below)
float* output = (float*)out[j]->ptr;
int num_detections = 100;     // Assuming max 100 detections
std::vector<std::array<int, 4>> boxes;
std::vector<float> scores;
std::vector<int> labels;
// Each detection is 6 floats: xmin, ymin, xmax, ymax, score, label
for (int i = 0; i < num_detections; i++) {
    float score = output[i * 6 + 4];
    int label = (int)output[i * 6 + 5];
    std::array<int, 4> box;
    box[0] = (int)output[i * 6 + 0]; // xmin
    box[1] = (int)output[i * 6 + 1]; // ymin
    box[2] = (int)output[i * 6 + 2]; // xmax
    box[3] = (int)output[i * 6 + 3]; // ymax
    if (box[0] == -1 && box[1] == -1 && box[2] == -1 && box[3] == -1) {
        printf("end of objects at %d\n", i);
        break;
    }
    boxes.push_back(box);
    scores.push_back(score);
    labels.push_back(label);
}
</code>
<result>
Filtered Output: 87, 41, 225, 398, 0.00881985, 0 (person)
Filtered Output: 153, 43, 294, 397, 0.00781274, 0 (person)
Filtered Output: 154, 254, 293, 585, 0.00689718, 0 (person)
Filtered Output: 17, 212, 158, 484, 0.00689718, 0 (person)
Filtered Output: 95, 264, 216, 575, 0.00689718, 0 (person)
Filtered Output: 23, 45, 161, 402, 0.00680563, 0 (person)
Filtered Output: 222, 245, 360, 602, 0.0061037, 0 (person)
Filtered Output: 154, 2, 293, 205, 0.00601215, 0 (person)
Filtered Output: 226, 41, 364, 398, 0.00601215, 0 (person)
Filtered Output: -3, 192, 98, 503, 0.00589007, 0 (person)
Filtered Output: 333, 333, 383, 598, 0.00579852, 0 (person)
Filtered Output: 90, 2, 229, 205, 0.00531022, 0 (person)
Filtered Output: -3, 12, 98, 284, 0.00521867, 0 (person)
Filtered Output: 1, 187, 52, 420, 0.004944, 0 (person)
Filtered Output: 293, 0, 379, 311, 0.00476089, 0 (person)
Filtered Output: 281, 328, 384, 639, 0.00427259, 0 (person)
Filtered Output: 285, 178, 387, 532, 0.00415052, 0 (person)
Filtered Output: 7, 348, 168, 626, 0.00415052, 0 (person)
Filtered Output: -2, 7, 57, 212, 0.00402844, 0 (person)
Filtered Output: 61, 201, 154, 406, 0.00402844, 0 (person)
Filtered Output: 325, 41, 384, 374, 0.00396741, 0 (person)
Filtered Output: 17, 2, 158, 205, 0.00372326, 0 (person)
Filtered Output: 225, 2, 366, 205, 0.00372326, 0 (person)
Filtered Output: 193, 217, 286, 450, 0.0036317, 0 (person)
Filtered Output: 161, 189, 254, 422, 0.0036317, 0 (person)
Filtered Output: 97, 281, 187, 462, 0.0036317, 0 (person)
Filtered Output: 227, 217, 316, 454, 0.0036317, 0 (person)
Filtered Output: 3, 372, 92, 627, 0.00357067, 0 (person)
</result>
Above is the output of the object-detection code I wrote. The input is a photo with one person in it. However, the number of detections and the coordinate values look strange. Could it be that the Python code that infers with the onnx file works normally, but the C code that infers from the bin file alone produces strange results because the onnx file is missing? Or is my code wrong? If it is the latter, could you please look into the problem with my code?
Hello,
It is not uncommon to have many additional low-confidence boxes. These are ordinarily suppressed by NMS within the model -- the parameters for this are in the prototxt used during model compilation.
If you draw the boxes onto an image, do the results make sense? A typical scenario is NMS producing too many boxes that only partially overlap the object of interest.
The values look realistic, but I agree there are too many, and they all have very low confidence. The prototxt typically also contains a confidence threshold, such that nothing below, for example, 0.3 confidence will even show up in the model outputs from TIDL.
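For illustration, the relevant block of a yolov5 meta-architecture prototxt typically looks something like the snippet below. The field values here are placeholders, not your model's actual settings; check the prototxt that was used when compiling your model:
<code>
detection_output_param {
  num_classes: 80
  confidence_threshold: 0.3    # boxes below this score are dropped
  nms_param {
    nms_threshold: 0.45        # IoU threshold for suppression
    top_k: 200
  }
  keep_top_k: 200              # fixed number of detections in the output
  code_type: CODE_TYPE_YOLO_V5
}
</code>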
int num_detections = 100; // Assuming max 100 detections
Note that the output size of a network with TIDL is always fixed. You should be able to see this in the subgraph SVG file(s) generated for your model. The default is 200 boxes/detections.
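As a sketch, combining the two points above (fixed-size output, confidence threshold), the filtering loop could look like the following. The 6-float layout is taken from your snippet; the 200-slot count is an assumption to verify against your model's subgraph SVG:
<code>
#include <cstdio>

// Minimal sketch: iterate over the fixed-size TIDL detection output and
// apply a confidence threshold. Layout assumed per detection (6 floats):
// xmin, ymin, xmax, ymax, score, label.
void printDetections(const float *output /* = (float*)out[j]->ptr */)
{
    const int   max_detections = 200;  // fixed TIDL output size (default)
    const float threshold      = 0.5f; // confidence cutoff

    for (int i = 0; i < max_detections; i++) {
        const float *det = output + i * 6;
        if (det[0] == -1 && det[1] == -1 && det[2] == -1 && det[3] == -1)
            break;                     // -1 sentinel: no more valid boxes
        if (det[4] < threshold)
            continue;                  // skip low-confidence boxes
        printf("box: %d %d %d %d  score %f  label %d\n",
               (int)det[0], (int)det[1], (int)det[2], (int)det[3],
               det[4], (int)det[5]);
    }
}
</code>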