TDA2SX: Picture freeze observed during GPU Processing

Part Number: TDA2SX

Hi,

I am working on a camera-based application running on a TDA2x processor.

The application uses SYS/BIOS on the IPU0_1, IPU1_0, and DSP1 cores, and Linux on A15_1 (the A15_1 core supports the GPU, a PowerVR SGX 544MP, for rendering).

The application ran fine for a few seconds (video capture on IPU0_1, rendering with the GPU on A15_1, and video display on IPU0_1).

However, I observe a video freeze once A15 rendering starts. After debugging, this is what I found:

- Video capture: I ran a stopwatch in front of the camera and dumped the frames during the freeze. Capture was progressing fine.

- Video display: I added logs on the A15 to check that the video display driver completes its processing and that the A15 renderer receives the notification. The notification is received by the A15 core.

- GPU rendering: I added further logs in the A15 render implementation and observed that GPU processing does not complete (the wait in eglWaitGL does not return), which leads to the picture freeze. Please find the attached log for details ("t" is in milliseconds).

I observed the following errors from the GPU when the system was left running for a long time in the frozen state:

Please let me know why the GPU is not returning from its processing.

Is it a GPU hardware issue, or did I miss some GPU configuration?

Thanks and Regards,

Nithin Scaria

  • Hello Nithin,

    Could you provide some additional information about the issue you are seeing?

    - What is the TI baseline release that is being used here?

    - "I am working on camera based application running" - is this a custom application of yours?

    - Could you post the entire error log here?

    - "This application was running fine for few seconds" - how many frames were you able to see before you started seeing the issue?

    Thanks,

    Gowtham

  • Hello Gowtham,

    Thanks for your reply. Please find the complete log attached for your reference (HOST in the log refers to the A15 core).

    - What is the TI baseline release that is being used here?

       We are using VISION_SDK_02_08_01_00. The package versions are:

       bsp_01_04_00_08  

       edma3_lld_02_12_00_20

       starterware_01_04_00_10

       bios_6_42_02_29

       xdctools_3_31_02_38_core

     

    - "I am working on camera based application running" - is this a custom application of yours?

    This is our custom application, which uses 4 cameras to display several views.

    - "This application was running fine for few seconds" - how many frames were you able to see before you started seeing the issue?

       The issue is 100% reproducible, but the number of frames before the freeze varies randomly from one boot-up to the next.

       During boot-up, the initial rendering (default view) is done by the DSP core; once Linux is up, A15_1 starts rendering.

       We verified that no freeze is observed when only the DSP rendering mode is kept.

       The picture freeze is observed once the A15 has been rendering with the GPU for a few frames.

    Log_picture_freeze.txt

    Thanks,

    Nithin

  • Hello Nithin,

    Are these all the logs that you have? Is there any more information in the kernel logs before the portion you have provided? By the time the force-flip workaround kicks in, something has already gone wrong.

    Can you please share the complete set of logs?

    Regards

    Hemant

  • Hi Hemant,

    I've enabled kernel traces in our application (they were disabled by default) and observed some PVR dumps after the picture freeze.
    Please find the complete logs attached for your reference.

    complete_log.txt

  • Hello Nithin,

    Thank you for the logs.

    After reviewing the logs and as you mentioned, I see:

    [21.267788 0.007338] [HOST ] [INFO] 27.481536 s: Log3: After drmEglSwapBuffers() t: 19263 
    [21.274499 0.006711] [HOST ] [INFO] 27.489501 s: Log4: before eglWaitGL() t: 19268

    Yes - the application did submit a task to the GPU and is now waiting.
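The two markers quoted above can also be pulled out of a log mechanically, which helps confirm which call was the last one reached before the freeze. The Python sketch below assumes the marker format seen in those two lines (`LogN: <text> t: <ms>`); the helper names `parse_markers` and `last_marker` are made up for illustration and are not part of any TI tool.

```python
import re

# Matches application markers such as "Log4: before eglWaitGL() t: 19268".
# The format is inferred from the two lines quoted in this thread.
MARKER_RE = re.compile(r"(Log\d+): (.*?)\s+t:\s*(\d+)")


def parse_markers(lines):
    """Return (label, text, t_ms) for every line that carries a marker."""
    markers = []
    for line in lines:
        match = MARKER_RE.search(line)
        if match:
            label, text, t_ms = match.groups()
            markers.append((label, text, int(t_ms)))
    return markers


def last_marker(lines):
    """Return the final marker in the log, or None if there is none."""
    markers = parse_markers(lines)
    return markers[-1] if markers else None


SAMPLE = [
    "[21.267788 0.007338] [HOST ] [INFO] 27.481536 s: Log3: After drmEglSwapBuffers() t: 19263 ",
    "[21.274499 0.006711] [HOST ] [INFO] 27.489501 s: Log4: before eglWaitGL() t: 19268",
]

if __name__ == "__main__":
    print(last_marker(SAMPLE))
```

If the last marker in a full log is the one written just before the wait, the renderer entered the wait and never logged anything after it.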

    After some time:

    [70.011968 47.011224] [   68.647106]  remoteproc3: failed to load dra7-dsp2-fw.xe66

    Can we figure out why this happened?

    Looks like the host driver waited for a very long time and then eventually timed out:

    [520.330150 450.318182] PVR:(Error): WaitForRender: Trying force-flip workaround [117, /sgxif.c]
    [1020.351638 500.021488] PVR:(Error): WaitForRender: Timeout [125, /sgxif[ 1018.946824] PVR_K: User requested SGX debug info
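In the lines quoted in this post, the second number in the leading bracket matches the time elapsed since the previous log line (for example, 1020.351638 - 520.330150 = 500.021488). Assuming that holds for the whole capture, a short Python sketch like the one below can flag long silent periods. The 10-second threshold and the helper name `find_gaps` are illustrative choices, not something taken from the logs.

```python
import re

# Leading "[<absolute seconds> <delta seconds>]" stamp, as seen in the quoted lines.
STAMP_RE = re.compile(r"^\[(\d+\.\d+)\s+(\d+\.\d+)\]")


def find_gaps(lines, threshold_s=10.0):
    """Return (absolute_time, delta) for lines preceded by a long silence."""
    gaps = []
    for line in lines:
        match = STAMP_RE.match(line.strip())
        if not match:
            continue  # skip lines without the leading stamp
        absolute = float(match.group(1))
        delta = float(match.group(2))
        if delta >= threshold_s:
            gaps.append((absolute, delta))
    return gaps


SAMPLE = [
    "[21.274499 0.006711] [HOST ] [INFO] 27.489501 s: Log4: before eglWaitGL() t: 19268",
    "[70.011968 47.011224] [   68.647106]  remoteproc3: failed to load dra7-dsp2-fw.xe66",
    "[520.330150 450.318182] PVR:(Error): WaitForRender: Trying force-flip workaround [117, /sgxif.c]",
]

if __name__ == "__main__":
    for absolute, delta in find_gaps(SAMPLE):
        print(f"{delta:.1f} s of silence before the line stamped {absolute:.1f} s")
```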

    At the very end of the log, I see:

    [1022.366751 0.007319] [ 1020.961859] PVR_K: SGX Kernel CCB WO:0x35 RO:0x35

    At first glance, it looks like SGX itself is fine but there was an unmet dependency. Which EGL is being used here? How is the render output consumed? How is the rendered output synchronized with the consumer?
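One way to read the last log line quoted above: if `WO` and `RO` are the write and read offsets of the SGX kernel command circular buffer (an assumption here, since the log itself does not spell out the abbreviations), then equal values would mean that every queued command has been consumed. A small Python sketch of that check, with hypothetical helper names `ccb_offsets` and `ccb_drained`:

```python
import re

# Pattern taken from the PVR_K line quoted in this thread.
CCB_RE = re.compile(r"SGX Kernel CCB WO:(0x[0-9A-Fa-f]+) RO:(0x[0-9A-Fa-f]+)")


def ccb_offsets(line):
    """Return (write_offset, read_offset) parsed from a CCB line, or None."""
    match = CCB_RE.search(line)
    if not match:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def ccb_drained(line):
    """True when both offsets are equal, i.e. nothing appears to be queued.

    Interpreting WO/RO as write/read offsets is an assumption of this sketch.
    """
    offsets = ccb_offsets(line)
    return offsets is not None and offsets[0] == offsets[1]


LINE = "[1022.366751 0.007319] [ 1020.961859] PVR_K: SGX Kernel CCB WO:0x35 RO:0x35"

if __name__ == "__main__":
    print(ccb_offsets(LINE), ccb_drained(LINE))
```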

    I will forward this to Imagination as well, but in the meantime, can we ensure that buffer dependencies are met for SGX to complete its processing?

    Regards

    Hemant

  • Hi Hemant,

    We are using only DSP1, not DSP2, so I think that may be why we are getting this error log: [70.011968 47.011224] [   68.647106]  remoteproc3: failed to load dra7-dsp2-fw.xe66. Any updates from Imagination?

    Best Regards

    Nithin

  • Nithin,

    IMG's response was the same as the one I gave - there is no information about SGX going wrong anywhere.

    Can you please answer these questions:

    Which EGL is being used here? How is the render output consumed? How is the rendered output synchronized with the consumer?

    The first place to look is to ensure that the resources needed by the GPU (render buffer, textures, etc.) are indeed freed.

    In the meanwhile, I will check if we can get you debug binaries.

    Regards

    Hemant

  • Hi Hemant,

    EGL version details

    EGL: version 1.4
    EGL: using EGL_KHR_fence_sync extension
    EGL: GL Version = OpenGL ES 2.0 build 1.9@2253347
    EGL: GL Vendor = Imagination Technologies
    EGL: GL Renderer = PowerVR SGX 544MP

    About renderer output synchronisation

    The VOP display task, which runs on the IPU1_0 core, waits for the rendered output from the A15 core. Once it receives the rendered output frame, the display driver starts processing it for display.

    The A15 waits for the display-completed notification, which is sent when the display driver finishes its processing, before it begins rendering again.

    When the picture freeze happens, we verified that the VOP task sends the display-completed notification to the A15 core, but it does not get the next rendered output from the A15 (the A15 waits continuously in eglWaitGL(), which leads to the freeze).
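A stall like this is easier to catch if the blocking wait is bounded, or at least monitored. The Python sketch below is only a model of that idea: `blocking_wait` stands in for whatever call blocks on the GPU (eglWaitGL() in this thread), and `returns_within` is a hypothetical helper, not an EGL or TI API. In the C renderer itself, the EGL_KHR_fence_sync extension listed above provides eglClientWaitSyncKHR(), which takes a timeout, as one possible bounded alternative.

```python
import threading


def returns_within(blocking_wait, timeout_s):
    """Run blocking_wait in a worker thread; report whether it returned in time."""
    finished = threading.Event()

    def worker():
        blocking_wait()
        finished.set()

    # Daemon thread, so a wait that never returns cannot keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    return finished.wait(timeout_s)


if __name__ == "__main__":
    never = threading.Event()  # never set: models a GPU wait that does not return
    print("completing wait returned in time:", returns_within(lambda: None, 1.0))
    print("hanging wait returned in time:", returns_within(never.wait, 0.2))
```

A watchdog of this kind does not fix the stall; it only turns a silent freeze into an event that can be logged with a timestamp and correlated with the kernel messages.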

    You mentioned debug binaries in your last post. Are they for validating chip functionality?

    Best Regards

    Nithin

  • Nithin,

    Thank you for sharing the details. The GPU does not seem to have any operations pending. We are trying to get you a set of debug binaries for the SGX driver. You will, however, need to build the SGX kernel module.

    In the meantime, to get a better understanding of the use case, can you please capture a PVRTrace of your application scenario:

    https://www.imgtec.com/developers/powervr-sdk-tools/pvrtrace/

    Regards

    Hemant

  • Hello Nithin,

    Can you please provide us with PVRTrace?

    Regards

    Hemant

  • Hi Hemant,

    Please find the attached PVR trace file and corresponding terminal logs for your reference.

    We followed some predefined steps to capture the PVR trace (using the .so files from the PVR recorder libraries and a .json file), but with PVR trace enabled the views do not appear as expected and we get the message below in the terminal logs:

    "PVRTrace has disabled support for shader/program binaries in the API"

    Please let us know whether the attached trace is valid. If you find anything in it, kindly share the details or any suggestions.

    Regards,

    Nithin

    PVR_Trace.zip

  • Thank you Nithin,

    I will take a look at the trace and get back to you.

    In the meantime, we want to get you debug binaries for the setup. This is a two-stage process:

    1. Migrate to a later version of the graphics driver

    2. Use debug binaries corresponding to that version.

    Is this something you can do?

    Here is the link to kernel module source:

    https://git.ti.com/cgit/graphics/omap5-sgx-ddk-linux/log/?h=dra7/k3.14

    Here is the link to corresponding binaries:

    https://git.ti.com/cgit/graphics/omap5-sgx-ddk-um-linux/log/?h=dra7/k3.14

    Can you please try to migrate to this version? I recommend having a backup of your setup in case something breaks. Once you have given this a shot and confirmed that you can still see the problem, we will provide you with debug binaries to collect more information.

    Please note that it is still worth keeping a lookout for dependencies that may cause this.

    Regards

    Hemant