
AM68: YOLOX Pose detection quality poor on AM68 (Edge AI 9.2)

Part Number: AM68
Other Parts Discussed in Thread: TDA4VM


Hello,

We are experiencing very poor detection results with the YOLOX pose model on the SK-AM68 board with Edge AI SDK 9.2, while the precompiled model performs well on a BeagleBoard with TDA4VM using Edge AI 8.2. The issue affects both TI's precompiled model and a self-compiled model (built with edgeai-tidl-tools).

Issue Description

  • Setup:
    • AM68: SK-AM68, Edge AI 9.2, YOLOX pose (TI’s r9.2 modelzoo + self-compiled).
    • TDA4VM: BeagleBone AI-64, Edge AI 8.2, YOLOX pose (TI’s r8.2 modelzoo + self-compiled).
    • Identical input data.
    • Inference using edge-ai-apps / edgeai-gst-apps

 

  • Problem: On AM68, bounding boxes are inaccurate, with very low confidence scores, and people are often completely undetected. Decreasing the confidence threshold results in false positives, and pose estimation is also inaccurate. TDA4VM performance is similar to the original .pth model's detection - see images below.
  • Experiments on AM68:
    • We tried experimenting with some compilation parameters like high_resolution, tensorbit, quantization_scale_type, and add_data_convert_ops, but detection results remained very poor.
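For reference, the compile-time options we varied look roughly like this (a minimal sketch of the standard edgeai-tidl-tools ONNX Runtime compilation flow; paths and values are placeholders, not our exact settings):

```python
# Sketch only: compile the ONNX model for TIDL by running calibration frames
# through an ONNX Runtime session that uses the TIDL compilation provider.
# Paths and option values below are placeholders, not our exact settings.
import onnxruntime as ort

compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",            # placeholder
    "artifacts_folder": "./artifacts/yolox_s_pose",      # placeholder
    "tensor_bits": 8,                                     # we also tried 16
    "advanced_options:quantization_scale_type": 0,        # one of the options we varied
    "advanced_options:high_resolution_optimization": 0,   # also tried 1
    "advanced_options:add_data_convert_ops": 3,           # also tried 0
    "advanced_options:calibration_frames": 25,
    "advanced_options:calibration_iterations": 25,
}

sess = ort.InferenceSession(
    "yolox_s_pose_ti_lite_640.onnx",                      # placeholder model path
    providers=["TIDLCompilationProvider", "CPUExecutionProvider"],
    provider_options=[compile_options, {}],
    sess_options=ort.SessionOptions(),
)
# Running sess.run(...) over the calibration images generates the TIDL artifacts.
```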

 

Image comparison:

yolox_s on AM68

 

yolox_s on TDA4VM

We would very much appreciate your help in resolving this issue.

Regards

  • Hello;

    Thanks for the question. 

    I am trying to understand your problem/question better.

    1. You are saying "We are experiencing very poor detection results with the YOLOX pose model on the SK-AM68 board with Edge AI SDK 9.2, while the precompiled model performs well on a BeagleBoard with TDA4VM using Edge AI 8.2." Does this mean Edge AI SDK 8.2 worked better than SDK 9.2, or that the precompiled model worked better? Can you compare the detection test on the same hardware (SK-AM68): your YOLOX pose model vs. the precompiled model?

    2. What is your definition of "very poor detection"? What is your detection accuracy rate? If you use the TI model make/train tool and feed enough images to it, you will get the trained model's accuracy. Did you do that? What is your rate? You can do this on PC/Linux.

    Please take a look at this link; some of the info there may help you. You can run some tests based on that info and see what you can achieve in emulation.

    https://dev.ti.com/edgeaistudio/

    Best regards

    Wen Li

      

  • Hello, 
    Thank you for your response. 

    1. The main issue is that the yolox-human-pose-s model, whether trained and compiled using edgeai-yolox and edgeai-tidl-tools, or downloaded precompiled from the model zoo, shows: 
    • Good detection accuracy on BeagleBoard BBAI-64 (TDA4VM) with SDK 8.2 and on SK-AM68 with SDK 8.6. 
    • Poor detection accuracy on SK-AM68 (AM68A) with SDK 9.2 and 10.1. 
    2. This applies to models trained on both the COCO keypoint dataset and a custom dataset. 
    3. By "poor detection," we mean that: 
    • Persons (or objects in the case of the custom dataset) are sometimes not detected at all. 
    • In other cases, detections are inaccurate: for example, bounding boxes do not fully enclose the person, there are false negatives and false positives, and keypoints can be incorrectly placed, as shown in the image above.

    • Here are videos of the detections on both boards:

    Human pose model on AM68A with edgeai gst-apps 9.2:

    Human pose model on beagleboard with TDA4VM with edgeai apps 8.2:

     

  • Hi Hans Peter;

    Thanks for providing the video. 

    After reviewing your videos, I think the first thing we would like to confirm is whether the frame rates and resolutions of the two input video streams are the same. If they are very different, or we are dealing with different bandwidth requirements, then the accuracy may be affected.

    Second, based on your feedback, the key factor affecting the performance is the SDK version. You are also using Edge AI Studio. Please note the compatibility requirements between the SDK and the Edge AI tools: when you load the application onto the target board, follow the compatibility table. Please see the following link for details.

    https://github.com/TexasInstruments/edgeai-tidl-tools/blob/10_00_07_00/docs/version_compatibility_table.md

    Usually the Edge AI Studio release comes later than the other TIDL tools, but we will have a new Studio version coming soon.

    Thanks and regards

    Wen Li

      

  • Hey Wen Li,

    at least for the AM68/J721S2 I can confirm that the "poor result" occurs with the test video as well as the example model and YAML-defined pipeline as provided by TI (edgeai-gst-apps, model_zoo, test_data, etc.), downloaded and built with meta-edgeai. So if there is a "compatibility issue", it would have to be found there. I suggest you take your SK-AM68 board with the default images based on SDK 8.6 and SDK 9.2; you will likely be able to reproduce exactly the two videos above.

    Regards

    Steffen

  • Hi Wen Li,

    Thank you for your suggestions.

    We've verified that the frame rates and resolutions of the input video streams are the same across the AM68A (SK-AM68) with SDK 8.6 and 9.2 and the TDA4VM with SDK 8.2. Additionally, the SDK versions are compatible with the precompiled YOLOX pose models from the model zoo, as per the compatibility table (github.com/.../version_compatibility_table.md).
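    For reference, this is roughly how we compared the two input clips (a small helper of our own using OpenCV, not part of the SDK; file names are placeholders):

    ```python
    # Quick sanity check that both input clips have the same resolution and
    # frame rate. File names are placeholders.
    import cv2

    def stream_info(path):
        cap = cv2.VideoCapture(path)
        info = (
            int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
            round(cap.get(cv2.CAP_PROP_FPS), 2),
        )
        cap.release()
        return info

    print("AM68A input :", stream_info("input_am68a.mp4"))
    print("TDA4VM input:", stream_info("input_tda4vm.mp4"))
    ```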

    As Steffen Hemer suggested, we would also appreciate it if you could try to reproduce the results using the default settings, as we did.

    Thank you and regards

  • Hi, that is a good suggestion. I will allocate an AM68A board to test on my end as well.

    Apparently, you did this test using Model Composer. Have you also used Model Analyzer or Model Maker, or both? I will try to use the exact same tools as you did.

    Thanks and regards

    Wen Li 

  • Hi Wen Li,

    at least for me, I just used the model as it comes from the model_zoo recipe https://git.ti.com/cgit/edgeai/meta-edgeai/tree/recipes-tisdk/edgeai-components/edgeai-tidl-models.bb?h=kirkstone

    Regards

    Steffen

  • Hello Wen Li,

    We did not use Model Composer, Model Analyzer, or Model Maker. Our model was trained and exported using this repo: github.com/.../edgeai-yolox with our custom training data. To simplify and ensure reliable testing, we suggest reproducing the issue using the precompiled YOLOX pose model from the model zoo.

    Thank you and regards

  • Hi Peter;

    I have set up the AM68 and run the YOLOX pose detection with the same model you are using. I can confirm your results.

    For version 10.1, when I set viz-threshold = 0.3, the bounding boxes are more stable. Please see the first video clip (viz=0.3).
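    In case it helps you reproduce the 10.1 run, this is roughly how I lowered the threshold (a small helper of my own, not part of edgeai-gst-apps; I am assuming the usual per-model viz_threshold field in the app config YAML, and the config path is a placeholder):

    ```python
    # Lower viz_threshold for every model entry in an edgeai-gst-apps config.
    # The config path is a placeholder; the per-model viz_threshold field is
    # assumed to be present, as in the stock app configs.
    import yaml

    cfg_path = "/opt/edgeai-gst-apps/configs/human_pose_estimation.yaml"  # placeholder
    with open(cfg_path) as f:
        cfg = yaml.safe_load(f)

    for name, model in cfg.get("models", {}).items():
        model["viz_threshold"] = 0.3   # value used for the first clip
        print(name, "-> viz_threshold =", model["viz_threshold"])

    with open(cfg_path, "w") as f:
        yaml.safe_dump(cfg, f)
    ```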

    Yes, version 8.6 is more stable/accurate; please refer to the second clip. But 10.1 has the red frame indication feature.

    Please take a look, we can discuss them later.

    Best regards 

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/output_5F00_video0_5F00_viz0p3.mkv

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/output_5F00_video0_5F00_8_5F00_6_5F00_short.mkv 

  • Started a TI internal tracking for this issue: TIDL-7591

  • Hello; 

    Could you provide your model's configuration info (like the info in the "model_configs.py" file, "create_model_config()" section), so I can do the same thing you did as a first step?

    The model I plan to use is in this link.

    https://github.com/TexasInstruments/edgeai-tensorlab/blob/r10.1/edgeai-modelzoo/modelartifacts/AM68A/8bits/kd-7060_onnxrt_coco_edgeai-yolox_yolox_s_pose_ti_lite_640_20220301_model_onnx.tar.gz.link
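    Something along these lines would be enough; the snippet below is purely illustrative of the kind of information I am looking for, and the field names are only guesses, not the exact create_model_config() arguments:

    ```python
    # Purely illustrative -- field names are guesses and values are placeholders.
    # It only shows the kind of configuration details being requested.
    model_config = dict(
        task_type="human_pose_estimation",
        preprocess=dict(resize=640, crop=640, data_layout="NCHW", reverse_channels=False),
        session=dict(
            session_name="onnxrt",
            model_path="yolox_s_pose_ti_lite_640.onnx",   # placeholder
            input_mean=[0.0, 0.0, 0.0],
            input_scale=[1.0, 1.0, 1.0],
        ),
        postprocess=dict(),
    )
    print(model_config)
    ```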

    Thanks and regards

    Wen Li

  • Hello,

    We tested the precompiled model, for which no model configuration was used on our side. This model already showed the detection problems.

    Since the issue already exists with this precompiled model, we do not think that the error is caused by wrong compilation settings.

    Nevertheless, we also compiled the model ourselves using edgeai-tidl-tools, with settings extracted from the configs in edgeai-benchmark (https://github.com/TexasInstruments/edgeai-tensorlab/blob/c710e3cd20b4e6164e107d4103bd65a7f264a40e/edgeai-benchmark/configs/human_pose_estimation.py#L58) and from the param.yaml (which is included in the archive and is also downloaded onto the target by the download_model.sh script), and got the same poor accuracy.
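    For completeness, this is roughly how we exercise the compiled artifacts directly with ONNX Runtime outside of the GStreamer apps (a sketch assuming the standard TIDL execution provider flow; paths are placeholders, and the preprocessing values come from the param.yaml shipped with the artifacts):

    ```python
    # Sketch only: run the compiled artifacts with the TIDL execution provider.
    # Paths are placeholders; preprocessing settings come from param.yaml.
    import numpy as np
    import onnxruntime as ort
    import yaml

    with open("param.yaml") as f:
        params = yaml.safe_load(f)      # preprocessing/postprocessing settings

    ep_options = {"artifacts_folder": "./artifacts"}   # folder with TIDL artifacts
    sess = ort.InferenceSession(
        "yolox_s_pose_ti_lite_640.onnx",               # placeholder model path
        providers=["TIDLExecutionProvider", "CPUExecutionProvider"],
        provider_options=[ep_options, {}],
        sess_options=ort.SessionOptions(),
    )

    inp = sess.get_inputs()[0]
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    dtype = np.uint8 if "uint8" in inp.type else np.float32
    dummy = np.zeros(shape, dtype=dtype)   # replace with a real preprocessed frame
    outputs = sess.run(None, {inp.name: dummy})
    print([o.shape for o in outputs])
    ```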

    Thanks and regards

  • Hello;

    Thank you very much for the information. 

    Regards

    Wen Li 

  • Hi Hans Peter,

    We have root-caused this issue to the preprocessing module.
    Please apply the attached patch to /opt/edgeai-tiovx-modules to fix the issue:

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/0001_2D00_tiovx_5F00_dl_5F00_pre_5F00_proc_5F00_module_2D00_Use_2D00_DSP_2D00_kernel_2D00_instead_2D00_of_2D00_A.patch

    Regards
    Rahul T R

  • Hey Rahul,

    I wonder why the fix hard-codes the 'target_type' string in line 369 to TIVX_TARGET_DSP1, making the method parameter 'target_string' completely obsolete. That makes me assume the fix is only temporary? Additionally, the duplicate module over at edgeai-tiovx-apps:modules also creates a node of type 'tivxDLPreProcArmv8Node'; does that mean an additional fix is needed there?

    Overall, can you explain in a little more detail why this issue was introduced in SDK versions later than 8.6? Commit b283ab53d7f77355e56835c1e01b2d73255a08ff introduces the change to Armv8 as a bugfix without further explanation; could you please explain why you are reverting it again, what the initial reason/bug was that it should have fixed, and so on? In general, it would be great to have a more coherent path for the user to choose a branch for downstream development, so commit messages with more explanation would be a start.

    Regards

    Steffen

  • Hi Steffen,

    Sorry for not explaining more about the fix.

    Here are the details

    In 9.0 we moved the pre-proc kernels from the DSP to Arm NEON for better scalability.
    It seems there is a bug in the Arm NEON pre-proc module that is reducing the DL input quality.

    As a temporary solution, I have provided a patch to edgeai-tiovx-modules to use the DSP version of the
    pre-proc kernel. We will work on fixing the Arm NEON kernel in an upcoming release.

    This is a temporary solution to rule out issues with the TIDL model.

    Regards
    Rahul T R