TDA4VEN-Q1: GPU hangs when surround view app is running

Part Number: TDA4VEN-Q1
Other Parts Discussed in Thread: TDA4VL, TDA4VM

Hi TI experts,

HW: custom board

SDK: j722s, 10.0

I am seeing a GPU hang after porting our surround view application code from TDA4VL SDK 9.2 to TDA4VEN SDK 10.0. Rendering sometimes stops while the surround view application is running. Please help me solve the problem. Thanks!

The following are dmesg logs and pvrlogdump logs:

[20241107_08:06:47:018]root@j722s-evm:/app# dmesg
[20241107_08:06:47:038][ 0.000000] OF: reserved mem: initialized node vision-apps-rtos-ipc-memory-region@a5000000, compatible id shared-dma-pool
[20241107_08:06:47:058][ 0.000000] OF: reserved mem: 0x00000000a5000000..0x00000000a6ffffff (32768 KiB) nomap non-reusable vision-apps-rtos-ipc-memory-region@a5000000
[20241107_08:06:47:058][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000a7000000, size 96 MiB
[20241107_08:06:47:068][ 0.000000] OF: reserved mem: initialized node vision-apps-dma-memory@a7000000, compatible id shared-dma-pool
[20241107_08:06:47:078][ 0.000000] OF: reserved mem: 0x00000000a7000000..0x00000000acffffff (98304 KiB) nomap non-reusable vision-apps-dma-memory@a7000000
[20241107_08:06:47:088][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000ad000000, size 1 MiB
[20241107_08:06:47:098][ 0.000000] OF: reserved mem: initialized node vision-apps-c71-dma-memory@ad000000, compatible id shared-dma-pool
[20241107_08:06:47:118][ 0.000000] OF: reserved mem: 0x00000000ad000000..0x00000000ad0fffff (1024 KiB) nomap non-reusable vision-apps-c71-dma-memory@ad000000
[20241107_08:06:47:118][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000ad100000, size 63 MiB
[20241107_08:06:47:128][ 0.000000] OF: reserved mem: initialized node vision-apps-c71_0-memory@ad100000, compatible id shared-dma-pool
[20241107_08:06:47:138][ 0.000000] OF: reserved mem: 0x00000000ad100000..0x00000000b0ffffff (64512 KiB) nomap non-reusable vision-apps-c71_0-memory@ad100000
[20241107_08:06:47:148][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000b1000000, size 1 MiB
[20241107_08:06:47:158][ 0.000000] OF: reserved mem: initialized node vision-apps-c71_1-dma-memory@b1000000, compatible id shared-dma-pool
[20241107_08:06:47:168][ 0.000000] OF: reserved mem: 0x00000000b1000000..0x00000000b10fffff (1024 KiB) nomap non-reusable vision-apps-c71_1-dma-memory@b1000000
[20241107_08:06:47:188][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000b1100000, size 63 MiB
[20241107_08:06:47:198][ 0.000000] OF: reserved mem: initialized node vision-apps-c71_1-memory1b1100000, compatible id shared-dma-pool
[20241107_08:06:47:208][ 0.000000] OF: reserved mem: 0x00000000b1100000..0x00000000b4ffffff (64512 KiB) nomap non-reusable vision-apps-c71_1-memory1b1100000
[20241107_08:06:47:208][ 0.000000] Reserved memory: created DMA memory pool at 0x00000000b5000000, size 24 MiB
[20241107_08:06:47:218][ 0.000000] OF: reserved mem: initialized node vision-apps-core-heap-memory-lo@b5000000, compatible id shared-dma-pool
[20241107_08:06:47:228][ 0.000000] OF: reserved mem: 0x00000000b5000000..0x00000000b67fffff (24576 KiB) nomap non-reusable vision-apps-core-heap-memory-lo@b5000000
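For readers checking a reserved-memory layout like the one in this log, it can help to confirm that each region's reported size matches its address range and that neighbouring regions do not overlap. The following is a minimal Python sketch, not a TI tool. The regular expression and the helper names (`parse_reserved_regions`, `size_mismatches`, `find_overlaps`) are assumptions made for illustration; the sample lines are copied from the dmesg output above.

```python
import re

# Pattern for "OF: reserved mem" range lines as they appear in the log above.
# Assumption: other kernel versions may format these lines differently.
RANGE_RE = re.compile(
    r"reserved mem: 0x([0-9a-fA-F]+)\.\.0x([0-9a-fA-F]+) \((\d+) KiB\)"
)


def parse_reserved_regions(lines):
    """Return one dict per reserved-memory range line."""
    regions = []
    for line in lines:
        m = RANGE_RE.search(line)
        if m is None:
            continue  # skips "initialized node" and "created DMA memory pool" lines
        regions.append({
            "name": line.split()[-1],      # node name is the last token on the line
            "start": int(m.group(1), 16),
            "end": int(m.group(2), 16),    # inclusive end address
            "kib": int(m.group(3)),
        })
    return regions


def size_mismatches(regions):
    """Names of regions whose address span disagrees with the reported KiB."""
    return [r["name"] for r in regions
            if r["end"] - r["start"] + 1 != r["kib"] * 1024]


def find_overlaps(regions):
    """Pairs of address-adjacent regions whose ranges overlap."""
    ordered = sorted(regions, key=lambda r: r["start"])
    return [(a["name"], b["name"])
            for a, b in zip(ordered, ordered[1:])
            if b["start"] <= a["end"]]


SAMPLE = [
    "[ 0.000000] OF: reserved mem: initialized node vision-apps-dma-memory@a7000000, compatible id shared-dma-pool",
    "[ 0.000000] OF: reserved mem: 0x00000000a5000000..0x00000000a6ffffff (32768 KiB) nomap non-reusable vision-apps-rtos-ipc-memory-region@a5000000",
    "[ 0.000000] OF: reserved mem: 0x00000000a7000000..0x00000000acffffff (98304 KiB) nomap non-reusable vision-apps-dma-memory@a7000000",
    "[ 0.000000] OF: reserved mem: 0x00000000ad000000..0x00000000ad0fffff (1024 KiB) nomap non-reusable vision-apps-c71-dma-memory@ad000000",
]

regions = parse_reserved_regions(SAMPLE)
print(len(regions), size_mismatches(regions), find_overlaps(regions))
```

To use it on a full capture, pass the lines of a saved dmesg file instead of `SAMPLE`. Empty lists from both checks only mean the ranges are self-consistent; they say nothing about whether the layout matches the firmware memory map.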

pvrlogdump_j722s-evm_2411070022.txt.gz

Thanks
Regards
quanfeng

  • Hello quanfeng, 

    And these errors are not present in SDK 9.2? Could you point me to where you are getting the surround view application you are using?

    Regards,
    Sarabesh S.

  • Hi Sarabesh,

    And these errors are not present in SDK 9.2?

    So far we have only tested on TDA4VEN with SDK 10.0, not with SDK 9.2.

    Could you point me to where you are getting the surround view application you are using?

    The surround view application is our custom application and works fine on TDA4VL without errors. 

    We tested again on TDA4VEN with SDK 10.0. The application ran for more than a full night, but the following PVR error was logged:

    [2024/11/8 8:55:07] [ 70.594826] PVR_K: 979: RGX Firmware image 'rgx.fw.36.53.104.796' loaded
    [2024/11/8 8:55:07] [ 70.621180] PVR_K: 979: Shader binary image 'rgx.sh.36.53.104.796' loaded
    [2024/11/8 8:55:07] [19620.759373] PVR_K: 421: ------------[ PVR DBG: START (High) ]------------
    [2024/11/8 8:55:07] [19620.766347] PVR_K: 421: OS kernel info: Linux 6.6.32 #2 SMP PREEMPT Thu Oct 17 09:54:25 CST 2024 aarch64
    [2024/11/8 8:55:07] [19620.776087] PVR_K: 421: DDK info: Rogue_DDK_Linux_WS rogueddk 24.1@6554834 (release) j722s_linux
    [2024/11/8 8:55:07] [19620.785058] PVR_K: 421: Time now: 19620785044us
    [2024/11/8 8:55:07] [19620.789710] PVR_K: 421: Services State: OK
    [2024/11/8 8:55:07] [19620.793904] PVR_K: 421: Server Errors: 0
    [2024/11/8 8:55:07] [19620.797921] PVR_K: 421: Connections Device ID:0(128) P966-V966-T979-avp_master.out
    [2024/11/8 8:55:07] [19620.805586] PVR_K: 421: ------[ Driver Info ]------
    [2024/11/8 8:55:07] [19620.810651] PVR_K: 421: Comparison of UM/KM components: MATCHING
    [2024/11/8 8:55:07] [19620.816779] PVR_K: 421: KM Arch: 64 Bit
    [2024/11/8 8:55:07] [19620.820707] PVR_K: 421: Driver Mode: Native
    [2024/11/8 8:55:07] [19620.824979] PVR_K: 421: UM Connected Clients: 64 Bit
    [2024/11/8 8:55:07] [19620.830029] PVR_K: 421: UM info: 24.1 @ 6554834 (release) build options: 0x80000810
    [2024/11/8 8:55:07] [19620.837858] PVR_K: 421: KM info: 24.1 @ 6554834 (release) build options: 0x00000810
    [2024/11/8 8:55:07] [19620.845739] PVR_K: 421: Window system: lws-generic
    [2024/11/8 8:55:07] [19620.850652] PVR_K: 421: Power lock status: Free
    [2024/11/8 8:55:07] [19620.855274] PVR_K: 421: ------[ Server Thread Summary ]------
    [2024/11/8 8:55:07] [19620.861107] PVR_K: 421: pvr_defer_free : Running
    [2024/11/8 8:55:07] [19620.865994] PVR_K: 421: Number of deferred cleanup items: QUEUED: 00000 CONNECTION : 00000 MMU : 00000 OSMEM : 00000 PMR : 00000
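When comparing several long runs, it can be convenient to pull the "key: value" lines of a PVR debug dump into a dictionary and check that the UM and KM builds agree. The sketch below is a hedged illustration in Python, not part of the PVR tooling. The field names ("Services State", "UM info", "KM info") are taken from the log above, while the regular expressions and helper names (`pvr_fields`, `ddk_build`, `um_km_builds_match`) are assumptions.

```python
import re

# Assumption: dump lines look like "PVR_K: <pid>: <text>", as in the log above.
PVR_RE = re.compile(r"PVR_K:\s*\d+:\s*(.*)$")
# Matches "24.1 @ 6554834" style version strings (version, then build number).
VERSION_RE = re.compile(r"(\d+\.\d+)\s*@\s*(\d+)")


def pvr_fields(lines):
    """Collect 'key: value' pairs from PVR_K debug-dump lines."""
    fields = {}
    for line in lines:
        m = PVR_RE.search(line.strip())
        if m is None:
            continue
        text = m.group(1).strip()
        # Skip section banners ("------[ Driver Info ]------") and plain messages.
        if text.startswith("-") or ": " not in text:
            continue
        key, value = text.split(": ", 1)
        fields[key.strip()] = value.strip()
    return fields


def ddk_build(value):
    """Return (version, build) from an 'info' value, or None if absent."""
    m = VERSION_RE.search(value)
    return (m.group(1), m.group(2)) if m else None


def um_km_builds_match(fields):
    """True when both UM and KM info are present and name the same build."""
    um = ddk_build(fields.get("UM info", ""))
    km = ddk_build(fields.get("KM info", ""))
    return um is not None and um == km


SAMPLE = [
    "[19620.789710] PVR_K: 421: Services State: OK",
    "[19620.793904] PVR_K: 421: Server Errors: 0",
    "[19620.805586] PVR_K: 421: ------[ Driver Info ]------",
    "[19620.830029] PVR_K: 421: UM info: 24.1 @ 6554834 (release) build options: 0x80000810",
    "[19620.837858] PVR_K: 421: KM info: 24.1 @ 6554834 (release) build options: 0x00000810",
]

fields = pvr_fields(SAMPLE)
print(fields["Services State"], um_km_builds_match(fields))
```

This only compares the version and build number; it deliberately ignores the "build options" value, whose meaning is not described in this thread.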

    Thanks
    Regards
    quanfeng

  • Ok, thanks for the information. I will investigate this and see if there is a known fix available.

    Regards,
    Sarabesh S.

  • Hi Sarabesh,

    I will investigate this and see if there is a known fix available.

    Is there a known fix for this?

    The "CCCB has not progressed" error appeared before on TDA4VM with SDK 8.x, where it could be resolved by setting the GPU QoS as described in the FAQ linked below. Can it be resolved the same way on TDA4VEN? Also, SDK 10.0 does not include the rgx_kicksync_test app. Where can I get it?

    [FAQ] TDA4VM: Are there any known bugs and patches that I should use in my GPU driver?

    Thanks
    Regards
    quanfeng

  • Hello, 

    I'll check with our SDK development team to see if there's a known fix for the error and the HWR you're seeing. If there isn't one, I'll escalate to our IP vendor. SDK 10.0 uses the latest DDK release, 24.1, so it should be easier to investigate.

    By the way, for 8.x, that QoS patch is a deprecated hack. The current fix is to revert the KM QoS patch, apply the KM cache-mapping patch, and then copy the stable UM binary onto your filesystem. Both are located here: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1316731/faq-tda4vl-q1-what-are-the-gpu-driver-bug-fixes-for-sdk-8-6-or-earlier

    Regards,
    Sarabesh S.

  • Hi Sarabesh,

    Below are our latest test logs, which show the GPU driver causing the kernel to hang. We hope they help you find the problem quickly.

    1124.log

    1127.log
    [20241126_21:34:19:134][ 3142.214515] Unable to handle kernel paging request at virtual address ffff007e7bbf52fe
    [20241126_21:34:19:364][ 3142.222499] Mem abort info:
    [20241126_21:34:19:364][ 3142.225282] ESR = 0x0000000096000005
    [20241126_21:34:19:364][ 3142.229019] EC = 0x25: DABT (current EL), IL = 32 bits
    [20241126_21:34:19:364][ 3142.234316] SET = 0, FnV = 0
    [20241126_21:34:19:364][ 3142.237358] EA = 0, S1PTW = 0
    [20241126_21:34:19:364][ 3142.240488] FSC = 0x05: level 1 translation fault
    [20241126_21:34:19:364][ 3142.245351] Data abort info:
    [20241126_21:34:19:364][ 3142.248220] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
    [20241126_21:34:19:364][ 3142.253691] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [20241126_21:34:19:364][ 3142.258727] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [20241126_21:34:19:364][ 3142.264024] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000082fba000
    [20241126_21:34:19:364][ 3142.270708] [ffff007e7bbf52fe] pgd=18000000fdff8003, p4d=18000000fdff8003, pud=0000000000000000
    [20241126_21:34:19:364][ 3142.279398] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
    [20241126_21:34:19:364][ 3142.285651] Modules linked in: pwm_tiehrpwm rpmsg_ctrl rpmsg_char ti_k3_dsp_remoteproc ti_k3_r5_remoteproc pvrsrvkm(O) drm backlight drm_panel_orientation_quirks
    [20241126_21:34:19:364][ 3142.300082] CPU: 3 PID: 177 Comm: avp_master.out Tainted: G O 6.6.32 #1
    [20241126_21:34:19:364][ 3142.308070] Hardware name: Texas Instruments J722S EVM (DT)
    [20241126_21:34:19:364][ 3142.313628] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [20241126_21:34:19:364][ 3142.320574] pc : remove_entity_load_avg+0x24/0x8c
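The "Mem abort info" lines above are the kernel's decoding of the ESR value 0x96000005. As a cross-check, the same fields can be extracted by hand using the AArch64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in bit 6 and the fault status code in bits 5:0). The Python sketch below is illustrative only; `decode_esr` and `describe_dfsc` are hypothetical helper names, and only the translation-fault codes are described.

```python
def decode_esr(esr):
    """Split an AArch64 ESR_ELx value into its main fields."""
    ec = (esr >> 26) & 0x3F      # exception class, bits 31:26
    il = (esr >> 25) & 0x1       # instruction length bit (1 = 32-bit instruction)
    iss = esr & 0x1FFFFFF        # instruction-specific syndrome, bits 24:0
    fields = {"EC": ec, "IL": il, "ISS": iss}
    if ec in (0x24, 0x25):       # data abort from a lower EL / the current EL
        fields["WnR"] = (iss >> 6) & 0x1   # 1 = write access, 0 = read access
        fields["DFSC"] = iss & 0x3F        # data fault status code
    return fields


def describe_dfsc(dfsc):
    """Describe translation-fault codes; other codes are left undecoded here."""
    if 0x04 <= dfsc <= 0x07:
        return f"level {dfsc & 0x3} translation fault"
    return f"fault status code 0x{dfsc:02x} (not decoded in this sketch)"


decoded = decode_esr(0x96000005)
print(hex(decoded["EC"]), decoded["IL"], decoded["WnR"], describe_dfsc(decoded["DFSC"]))
```

For the value in this log the fields agree with the kernel's own output: EC = 0x25 (data abort at the current EL), a 32-bit instruction, a read access (WnR = 0), and a level 1 translation fault.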

    We are porting the application to the EVM board for testing. Once that is done, we will send you the application so that you can test it in parallel.

    Thanks
    Regards
    quanfeng