
TDA4VM: Inference Time too High for Yolov5 Object Detection on the Target Hardware

Part Number: TDA4VM

Hello,

I am using the run_od.sh script to measure the inference time on the TIDL node. I trained YOLOv5 with the edgeai-yolov5 repository, using the medium weights, because the small weights give very poor results after quantization. I am using 16-bit inference, because 8-bit also hurts the model's accuracy a lot. I then copied the .bin files to the hardware, but there the minimum inference time is 200 ms, which is not acceptable for us. I have tried many different approaches, including keeping some layers in 16-bit and full INT8 quantization, but the accuracy is still very poor. I am not sure what to do next and would appreciate your help on this.
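For reference, my mixed-precision attempt was along these lines (a sketch based on the OSRT compile options documented in edgeai-tidl-tools; the paths are placeholders and the layer names in the 16-bit list are made-up examples, so please check them against the docs for your SDK version):

```python
# Sketch of OSRT compile options for mixed-precision TIDL offload.
# Option names follow the edgeai-tidl-tools documentation; values and
# paths below are placeholders, not a verified configuration.
compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",   # placeholder path
    "artifacts_folder": "/path/to/artifacts",   # placeholder path
    "tensor_bits": 8,                           # global 8-bit quantization
    "accuracy_level": 1,                        # enable advanced calibration
    "advanced_options:calibration_frames": 25,
    "advanced_options:calibration_iterations": 25,
    # Keep the outputs of selected layers in 16-bit (mixed precision).
    # The names below are hypothetical output tensor names.
    "advanced_options:output_feature_16bit_names_list": "370,680,990",
}
```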

Thanks

  • Hi,

    Which SDK are you using? Which flow are you using, TIDL-RT or OSRT?

    Regarding model accuracy, you are welcome to train the model to get the best accuracy that suits your requirements.

    When running inference in 8/16-bit, there will be some deviation in model accuracy. If you suspect that this is the root cause, you can do a layer-wise dump of all the model layers in float32 vs INT8/INT16 and compare them.

    You can refer to our accuracy guide for debugging here : https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/tidl_osr_debug.md#steps-to-debug-functional-mismatch-in-host-emulation
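    As a sketch, the dumped float32 and fixed-point layer traces can be compared with a simple similarity metric (the trace file names below are hypothetical; the actual names depend on the debug settings described in the guide above):

```python
import numpy as np

def layer_similarity(float_trace, fixed_trace):
    """Cosine similarity between a float32 layer output and its
    dequantized INT8/INT16 counterpart (flattened)."""
    a = np.asarray(float_trace, dtype=np.float32).ravel()
    b = np.asarray(fixed_trace, dtype=np.float32).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 1.0

# Hypothetical trace files produced by the layer-wise dump; in practice
# the paths and names depend on your debug configuration.
# ref = np.fromfile("trace/float/layer_0012.bin", dtype=np.float32)
# qnt = np.fromfile("trace/int8/layer_0012.bin", dtype=np.float32)
# print(layer_similarity(ref, qnt))  # values well below 1.0 flag a bad layer
```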

    I am not completely sure about the 200 ms time; it could be that the model is running on the Arm cores instead of being offloaded. Can you share your test setup with us?

  • Hi, thanks for the response. 

    The model was compiled with OSRT and runs using the object detection example in vision apps.

    We are using SDK 8.5.

    It is basically a very simple object detection application with vision apps, using .yuv files as test data.

    The input image size is 640x640 and I haven't changed anything else; I only fine-tuned the weights I got from edgeai-yolov5.

    We have several TDA4VM boards on which we were able to perform some tests. If we use a segmentation network, the inference results are exactly the same across the different boards.

    But if we run the object detection on different TDA4VMs, it takes 200 ms on one, 70 ms on another, and 100 ms on the third. We suspected that the underlying firmware files could differ, so we updated all boards to the same files; all boards now have executables with the same hashes and the same firmware files, but this changed nothing. We have also found that, for the object detection task only, changing settings such as writing_latest_img affects the time on the TIDL node. Another strange observation: the segmentation network takes around 40 ms, while object detection shows the values I shared above.

    I am not really sure why we are seeing this behaviour. Is non-determinism in object detection a known issue? Is there something we can do to solve it?

  • Summarizing a few points:

    1. SDK 8.5, with vision apps object detection demo

    2. Artifacts are compiled using OSRT and copied to the SD card.

    3. Difference in inference time observed across TDA4VM EVMs with the exact same setup:

    "But if we run the object detection on different TDA4VMs, it takes 200 ms on one, 70 ms on another, and 100 ms on the third."

    This is quite strange behavior. Before jumping to any conclusions, can you try a few things:

    1. Please confirm: is this observation specific to the YOLOv5 model, or do you see the inference time change across other models as well with the exact same settings?

    2. Can you try compiling the model using the TIDL-RT model import tool, redo the experiment, and get the inference numbers?

    3. Since the team has migrated to the latest SDK, which has lots of fixes added, please share your observations on the latest 9.1 SDK so we can rule out SDK-version-specific issues.
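    For reference, a TIDL-RT import run is driven by a config file along these lines (a minimal sketch; the parameter names follow the TIDL importer documentation but should be verified against the tools in your SDK version, and all paths are placeholders):

```
# Sketch of a TIDL-RT import config for an ONNX model -- values and
# paths are placeholders, not a verified configuration.
modelType        = 2                    # 2 = ONNX
numParamBits     = 16                   # or 8 for full INT8
inputNetFile     = "yolov5m.onnx"
outputNetFile    = "out/tidl_net.bin"
outputParamsFile = "out/tidl_io_"
inWidth          = 640
inHeight         = 640
inNumChannels    = 3
inData           = "calib_list.txt"     # list of calibration inputs
numFrames        = 25
# For an OD model such as YOLOv5, the meta-architecture options
# (metaArchType, metaLayersNamesList) are also required -- see the
# TIDL object detection documentation for your SDK.
```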