AM68A: Model performance lags under Optiflow versus Python

T S

Part Number: AM68A
Other Parts Discussed in Thread: TDA4VL,

I've got a relatively large semantic segmentation model that I've compiled and am running on my SK-AM68A dev board. I've got the AM68A part derated via 'k3conf' so that it behaves like a TDA4VL. The model is integrated with the TI model configs so that I can run it via "/opt/edgeai-gst-apps/optiflow/optiflow.sh" or "/opt/edgeai-gst-apps/apps_python/app_edgeai.py". I'm getting ~10fps using Optiflow and ~8fps using Python. While the Optiflow execution is giving me a better frame rate overall, I've noticed that the latency is pretty bad compared to using Python. Optiflow has a noticeable lag of around 500ms which is visibly apparent when using my IMX219 camera input. I can easily see that motion lags on the connected display. On the other hand, the Python implementation does not appear to suffer this same amount of lag.

I took a step back and decided to test out one of the stock models provided by TI, ONR-SS-8610-deeplabv3lite-mobv2-ade20k32-512x512, via /opt/edgeai-gst-apps/configs/imx219_cam_example.yaml. That has better performance than my custom model (30fps via Optiflow, 26fps via Python), but I still see a noticeable lag. If I enable gstreamer latency tracers for pipeline+element, I see a reported total latency of 198ms for Optiflow vs 59ms for Python. That aligns with what I observe on the connected display.

Is there any explanation why Optiflow would have this larger latency? Is there anything that can be done to address it?

For what it's worth, I'm using https://github.com/podborski/GStreamerLatencyPlotter to report and plot the latency info.

over 1 year ago

0 Pratik Kedar over 1 year ago

TI__Mastermind 24041 points

Hi,

We have assigned this thread to correct expert, please expect response from her.

Thanks

0 T S over 1 year ago in reply to Pratik Kedar

Intellectual 310 points

I haven't seen any updates on this, but I have a few of my own and figured I'd update this post:

- I've updated post_process.py to honor the dataset.yaml colormap settings for the segmentation overlay. Overall processing time from this consumes around 26ms for my model.

- I've updated infer_pipe.py to be multi-threaded: one thread for pulling the frame, one for doing inference, and one for producing the frame + inference result output image. Total time for these parallelized thread operations is ~104ms (model inference consumes about 90ms of that!) So, I'm a little sitting at a little less than 10fps which is pretty close to the Optiflow approach.

- With my segmentation model, the gstreamer side of things takes roughly 72ms. Totaling this with the Python code, I'm looking at about a 104ms+72ms=176ms latency measurement. This is better than using the OptiFlow approach (as noted above, Optiflow appears to have around 500ms of latency).

The long and short of all this is that I don't believe I'll proceed with using Optiflow. It'd still be nice to know why latency appears to be so high with it, but it's now less of a concern as I'll be using Python as my baseline for future development.

0 Fabiana Jaimes over 1 year ago in reply to T S

TI__Mastermind 19670 points

Hello,

Could you please share the commands you are running for these tests? Are you directly running the GStreamer pipeline that is generated by the Optiflow application?

Thank you,

Fabiana

0 T S over 1 year ago in reply to Fabiana Jaimes

Intellectual 310 points

Fabiana,

Here is a list of comprehensive instructions for replicating the issue:

Dev kit requires imx219 camera attached on CAM0 connector, monitor connected via HDMI, and ethernet connection

Program SD card w/09.02.00.05 (available here: https://www.ti.com/tool/download/PROCESSOR-SDK-LINUX-AM68A)

Modify SD card's "uEnv.txt" to enable imx219 cam by adding "k3-am68-sk-bb-rpi-cam-imx219.dtbo" to "name_overlays" overlay definition.

Insert SD card into dev kit and apply power

Wait for "Edge AI gallery" screen to appear on monitor

Login to dev kit via ssh

k3conf set clock 8 0 500000000 # derates dev kit to behave as TDA4VL

vi /opt/edgeai-gst-apps/configs/imx219_cam_example.yaml
-> change 'flow0' to reference model2: "flow0: [input0,model2,output0,[320,150,1280,720]]"

export GST_DEBUG_FILE=/run/trace-optiflow.log
export GST_DEBUG_NO_COLOR=1
export GST_DEBUG="GST_TRACER:7"
export GST_TRACERS="latency(flags=pipeline+element)"

./optiflow/optiflow.py ./configs/imx219_cam_example.yaml
-> Note framerate reported on screen at top-right corner: 30
-> With yourself in view, simply observe lag by moving your arm up and down. Lag is noticeable.
-> hit Ctrl-C to terminate after ~60 seconds of operation

export GST_DEBUG_FILE=/run/trace-python.log

./apps_python/app_edgeai.py ./configs/imx219_cam_example.yaml
-> Note framerate reported on screen at top-right corner: 25
-> With yourself in view, simply observe lag by moving your arm up and down. Visible lag is less than with Optiflow.
-> Observe reported "total time" from app_edgeai.py is ~40ms.
-> hit Ctrl-C to terminate after ~60 seconds of operation

Do the following from a Linux machine:

Clone this repo: github.com/.../GStreamerLatencyPlotter.git and follow install instructions

cd GStreamerLatencyPlotter

mkdir captures

Copy /run/trace-optiflow.log and /run/trace-python.log from dev kit to captures/ dir

node main.js captures/trace-optiflow.log
-> Observe "TOTAL" latency mean is ~210ms

node main.js captures/trace-python.log
-> Observe "TOTAL" latency mean is 33ms. Overall latency measure is 33ms + 40ms ("total time" reported by app_edgeai.py) = ~73ms.

Python app's 73ms latency is a lot less than the 210ms of Optiflow. Further, you can visibly see
the lag difference between the Optiflow and Python approach.

0 Fabiana Jaimes over 1 year ago in reply to T S

TI__Mastermind 19670 points

Thank you for providing detailed steps. I have ran this and observed similar results on my end. I am currently looking into what is causing this latency with my team and running a few tests.

Thanks,

Fabiana

0 T S over 1 year ago in reply to Fabiana Jaimes

Intellectual 310 points

Any updates on this? I'm curious what your team has found out.

0 Fabiana Jaimes over 1 year ago in reply to T S

TI__Mastermind 19670 points

Please allow me until the end of the week to get back to about this.

Thank you for your patience.

- Fabiana

Processors

Processors forum

AM68A: Model performance lags under Optiflow versus Python