TDA4VM: [TDA4 OpenGL] glReadPixels sometimes takes unusually long to finish

Part Number: TDA4VM

Hi,

When rendering the surround view with OpenGL, we use glReadPixels to read image pixels from the frame buffer into memory.

The image being read is 120*40 pixels, and the image format is BGRA.

Normally glReadPixels takes around 0.397 ms to finish reading 120*40 pixels, as shown in the red box below:

However, glReadPixels sometimes takes more than 100 ms to finish, which causes a 3-frame lag in our 30-fps surround view rendering.

Why does glReadPixels sometimes take such an unusually long time (> 100 ms) to read 120*40 pixels?
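As a rough check of the numbers above, here is a minimal C sketch. The helper names are hypothetical, and 4 bytes per BGRA pixel is an assumption (8 bits per channel). It illustrates that one readback moves only a small amount of data and that a 100 ms stall spans three frame periods at 30 fps.

```c
/* Bytes moved by one readback: width * height * bytes per pixel. */
static unsigned readback_bytes(unsigned width, unsigned height,
                               unsigned bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Number of frame periods covered by a stall, rounded up.
 * Integer math avoids floating-point rounding at exact multiples. */
static unsigned frames_stalled(unsigned stall_ms, unsigned fps)
{
    return (stall_ms * fps + 999u) / 1000u;
}
```

With these helpers, `readback_bytes(120, 40, 4)` gives 19200 bytes and `frames_stalled(100, 30)` gives 3, which matches the 3-frame lag reported here.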

Regards

Mark Kang.

  • Hello Mark,

    Can you please let us know which SDK you have tried this with? And if you have seen this issue across multiple SDK versions?

    The first thing we can do is try and investigate what is different during the lag-spikes. Are you familiar with collecting PVRTune data?

    Regards,

    Erick

  • Hi Erick

    (1) We used Vision SDK v08.04.00.02 and found this issue while implementing an OpenGL-based view function for our product. We have not tried other SDK versions.

    (2) We have used PVRTune to monitor the CPU/GPU load. The PVRTune data does not give us any clue about the 3-frame lag.

    Regards.

    Mark Kang.

  • Hello Mark,

    What is the FPS you are getting? Is there a stutter in the FPS?

    Sometimes, it is difficult to measure the time around functions due to the deferred-rendering nature of the GPU we use.
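    One way to see this effect is to time the wait for queued rendering separately from the pixel copy. The following is a minimal sketch for diagnosis only, not a confirmed TI or Imagination procedure. It assumes a current OpenGL ES 3 context and POSIX clock_gettime; the helper names are hypothetical, and GL_RGBA is used because it is the format the ES specification guarantees for an 8-bit color buffer (BGRA readback depends on an extension).

```c
#define _POSIX_C_SOURCE 199309L
#include <GLES3/gl3.h>
#include <time.h>

/* Monotonic time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

/* Hypothetical helper: splits the readback cost into two parts.
 * wait_ms: time spent waiting for previously queued GPU work.
 * copy_ms: time spent in glReadPixels once the GPU is idle. */
static void timed_readback(GLsizei w, GLsizei h, void *dst,
                           double *wait_ms, double *copy_ms)
{
    double t0 = now_ms();
    glFinish();   /* blocks until all queued GL commands have completed */
    double t1 = now_ms();
    glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, dst);
    double t2 = now_ms();

    *wait_ms = t1 - t0;
    *copy_ms = t2 - t1;
}
```

    If the long delay shows up in wait_ms rather than copy_ms, the time is being spent waiting for rendering to finish and not in the copy itself. Note that glFinish serializes the CPU and GPU, so it should be removed again after the measurement.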

    Could you share the PVRTune you captured, the actual file, so we can take a look? Hopefully you are able to capture for a stretch of time to see how the frames are behaving.

    Thanks,

    Erick

  • Hi Erick,

    (1) Normally we get 30 fps, but when glReadPixels takes more than 100 ms to finish, the frame rate drops to 26 fps at that time.

    (2) The PVRTune data we captured is at https://drive.google.com/drive/folders/1Tu9gelR9IaUAu7HW8xvlvPBT8S3c8vz3 . We observe that the issue occurs at time 2414.5 (see the figure below), where most of the load is 0%.

    Mark Kang.

  • Mark,

    I am unable to access Google Drive link from here. Would you be able to upload this file to the ticket? You can use the "Insert" tab to do so.

    Thanks,

    Erick

  • Hi Erick,

    Sorry, I have now enabled access rights on https://drive.google.com/drive/folders/1Tu9gelR9IaUAu7HW8xvlvPBT8S3c8vz3 . Please try again.

    Mark Kang.

  • Mark,

    It's not the access rights on the Google Drive folder. My company network blocks traffic to Google Drive, so I'll need another way to share. The E2E ticket allows uploads if the file is not too big, so zipping it up and uploading it here would be the best way.

    Regards,

    Erick

  • Hi Erick, 

    Please refer to the following .rar file. The file size is 26,713,011 bytes.

    PVRTune_data.rar

    Mark Kang.

  • Mark,

    Thank you, I can see this issue clearly here. I'm also requesting support from our vendor on this topic and hope to get an explanation soon.

    Meanwhile, a few questions for you:

    1) Reproducing the issue on our side is usually the best way to get a bug solved quickly, is there any opportunity to share the application that exhibits this behavior?

    2) Are there any other applications running in the background that could interfere with this app? Is this app part of a larger use case? We want to see if the lag is due to something else that could preempt the driver from running. I see this dip in memory utilization happening periodically:

    Any idea what this could be? It looks like the big time-gap happens around the time one of these happens.

    Regards,

    Erick

  • Hello,

    Any updates on the queries above? I've spoken to our GPU vendor. We have an advanced set of tools that will allow us to check more values and counters to help with debugging. Currently you are using PVRTuneDeveloper, but to get access to the advanced tool you will need to sign an NDA with our GPU vendor, Imagination Technologies.

    If you do not wish to go this route, we can run the tool for you, but we will need access to at least the binary to run this application. Would this be possible?

    Thanks,

    Erick

  • Hello Erick,

    (1) Providing an application that reproduces the issue requires some effort: our software team would have to port the code from our own board to the TDA4 EVM (mostly changing the BSP) and then try to reproduce the issue again. Sorry, we have not been able to prepare this TDA4 EVM application so far, because our software team has other, higher-priority TDA4-related SDK issues to solve.

    (2) Instead, last week our engineer tried the OpenGL ES3 PBO (Pixel Buffer Object) method as a workaround. As the figure below shows, an OpenGL ES3 PBO uses an asynchronous DMA transfer and can avoid blocking the CPU for a long time (100 ms in our case). The good news is that we have run this workaround for 3 hours without seeing any 100 ms CPU wait. We will run an overnight test and hope that no 100 ms CPU wait occurs. Could you also confirm with the GPU vendor whether our PBO method is the right choice to solve this issue?
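    For readers who want to try the same approach, a PBO readback could look roughly like the sketch below. This is not the code used in this thread, which was not posted. It assumes a current OpenGL ES 3.0 context, double-buffers two PBOs so each frame's pixels are collected one frame later, and reads GL_RGBA instead of BGRA because BGRA readback depends on an extension. The function and macro names are hypothetical.

```c
#include <GLES3/gl3.h>
#include <string.h>

#define READ_W     120
#define READ_H     40
#define READ_BYTES (READ_W * READ_H * 4)   /* assumes 4 bytes per pixel */

static GLuint   pbo[2];
static unsigned frame_index;

/* Create two pixel-pack buffers, each large enough for one readback. */
void readback_init(void)
{
    glGenBuffers(2, pbo);
    for (int i = 0; i < 2; ++i) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER, READ_BYTES, NULL, GL_STREAM_READ);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

/* Call once per frame after rendering. Queues a read of the current frame
 * into one PBO and copies the previous frame's pixels out of the other.
 * Returns 1 if dst was filled, 0 otherwise (always 0 on the first call). */
int readback_frame(unsigned char *dst)
{
    GLuint write_pbo = pbo[frame_index & 1u];
    GLuint read_pbo  = pbo[(frame_index + 1u) & 1u];
    int filled = 0;

    /* With a pack buffer bound, the last argument is a byte offset into
     * that buffer, so the driver can queue the copy instead of stalling. */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, write_pbo);
    glReadPixels(0, 0, READ_W, READ_H, GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);

    if (frame_index > 0) {
        /* Map the buffer that was filled by the previous frame's read. */
        glBindBuffer(GL_PIXEL_PACK_BUFFER, read_pbo);
        void *src = glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, READ_BYTES,
                                     GL_MAP_READ_BIT);
        if (src != NULL) {
            memcpy(dst, src, READ_BYTES);
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
            filled = 1;
        }
    }

    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    ++frame_index;
    return filled;
}
```

    The trade-off is one frame of latency on the pixel data. Mapping can still block if the previous transfer has not completed, so a fence (glFenceSync with glClientWaitSync) could be added to poll for completion before mapping.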

    Mark Kang.

  • Hello,

     Could you also confirm with the GPU vendor whether our PBO method is the right choice to solve this issue?

    Yes, I've requested that they verify this solution for you, and I will reply as soon as they do.

    Thanks,

    Erick

  • Hello,

    Yes, Imagination has confirmed that this is a valid approach.

    Regards,

    Erick