Other Parts Discussed in Thread: TDA4VM
Hi, Erick
The customer is now hitting a GPU issue. They are running a power-cycle stress test: 30 seconds powered on, then 10 seconds powered off. The issue reproduces about once every 2000 cycles.
Scenario: after power-on, the issue occurs and nothing is displayed on the screen. About 17 minutes later, a core dump is generated. The screen stays blank for that entire period.
When the issue occurred, I had them capture the PVR log: pvrlogdump_error.txt
------------[ PVR DBG: START (High) ]------------
OS kernel info: Linux 5.10.120 #1 SMP PREEMPT Thu Mar 21 20:35:58 CST 2024 aarch64
DDK info: Rogue_DDK_Linux_WS rogueddk 1.15@6133109 (release) j721e_linux
Time now: 425641748us
Services State: OK
Server Errors: 0
Connections Device ID:0(128)
P289-V289-T313-avmMain, P353-V353-T391-mv_psd, P435-V435-T444-mv_remote
------[ Driver Info ]------
Comparison of UM/KM components: MATCHING
KM Arch: 64 Bit
UM Connected Clients: 64 Bit
UM info: 1.15 @ 6133109 (release) build options: 0x80000810
KM info: 1.15 @ 6133109 (release) build options: 0x00000810
Window system: lws-generic
------[ RGX Device ID:0 Start ]------
------[ RGX Info ]------
Device Node (Info): 0000000094b10562 (000000007d2b5739)
RGX BVNC: 22.104.208.318 (rogue)
RGX Device State: Active
RGX Power State: ON
FW info: 1.15 @ 6133109 (release) build options: 0x80000810
BIF0 - OK
RGX FW State: NOT RESPONDING - KCCB stalled (HWRState 0x00000001: HWR OK;)
RGX FW Power State: RGXFWIF_POW_ON (APM disabled: 0 ok, 0 denied, 0 non-idle, 0 retry, 0 other, 0 total. Latency: 100 ms)
RGX DVFS: 0 frequency changes. Current frequency: 750.000 MHz (sampled at 420828061435 ns). FW frequency: 100.000 MHz.
RGX FW OS 0 - State: active; Freelists: Ok; Priority: 0; MTS on;
RGX PHR configuration: (1) reset RD hardware
RGX Kernel CCB WO:0xE RO:0x0
RGX Firmware CCB WO:0x0 RO:0x0
RGX Kernel CCB commands executed = 0
RGX SLR: Forced UFO updates requested = 0
RGX Errors: WGP:0, TRP:0
FW System config flags = 0x00020000 (Ctx switch options: Medium CSW profile; VDM CS INDEX mode;)
FW OS config flags = 0x0000000F (Ctx switch: TDM; TA; 3D; CDM;)
------[ RGX registers ]------
RGX Register Base Address (Linear): 0x00000000f8d2fc51
RGX Register Base Address (Physical): 0x4E20000000
CORE_ID : 0x0000000008470000
CORE_REVISION : 0x00D0013E
DESIGNER_REV_FIELD1 : 0x00000000
DESIGNER_REV_FIELD2 : 0x00000000
CHANGESET_NUMBER : 0x0000000000000000
CLK_CTRL : 0x0aaaaa002a2aaaaa
CLK_STATUS : 0x0000000000600000
CLK_CTRL2 : 0x0000000000000000
CLK_STATUS2 : 0x0000000000000000
EVENT_STATUS : 0x00000400
TIMER : 0x000000004a209206
BIF_FAULT_BANK0_MMU_STATUS : 0x00000000
BIF_FAULT_BANK0_REQ_STATUS : 0x0000000000000000
BIF_FAULT_BANK1_MMU_STATUS : 0x00000000
BIF_FAULT_BANK1_REQ_STATUS : 0x0000000000000000
BIF_MMU_STATUS : 0x00000000
BIF_MMU_ENTRY : 0x00000000
BIF_MMU_ENTRY_STATUS : 0x0000000000000000
BIF_STATUS_MMU : 0x00000000
BIF_READS_EXT_STATUS : 0x00000000
BIF_READS_INT_STATUS : 0x00000000
BIFPM_STATUS_MMU : 0x00000000
BIFPM_READS_EXT_STATUS : 0x00000000
BIFPM_READS_INT_STATUS : 0x00000000
BIF_CAT_BASE_INDEX : 0x0000000000000000
BIF_CAT_BASE0 : 0x0000000000000000
BIF_CAT_BASE1 : 0x0000000000000000
BIF_CAT_BASE2 : 0x0000000000000000
BIF_CAT_BASE3 : 0x0000000000000000
BIF_CAT_BASE4 : 0x0000000000000000
BIF_CAT_BASE5 : 0x0000000000000000
BIF_CAT_BASE6 : 0x0000000000000000
BIF_CAT_BASE7 : 0x0000000000000000
BIF_CTRL_INVAL : 0x00000000
BIF_CTRL : 0x000000C0
BIF_PM_CAT_BASE_VCE0 : 0x0000000000000000
BIF_PM_CAT_BASE_TE0 : 0x0000000000000000
BIF_PM_CAT_BASE_ALIST0 : 0x0000000000000000
BIF_PM_CAT_BASE_VCE1 : 0x0000000000000000
BIF_PM_CAT_BASE_TE1 : 0x0000000000000000
BIF_PM_CAT_BASE_ALIST1 : 0x0000000000000000
PERF_TA_PHASE : 0x00000000
PERF_TA_CYCLE : 0x00000000
PERF_3D_PHASE : 0x00000000
PERF_3D_CYCLE : 0x00000000
PERF_TA_OR_3D_CYCLE : 0x00000000
PERF_TA_AND_3D_CYCLE : 0x00000000
PERF_COMPUTE_PHASE : 0x00000000
PERF_COMPUTE_CYCLE : 0x00000000
PM_PARTIAL_RENDER_ENABLE : 0x00000000
ISP_RENDER : 0x00000000
TLA_STATUS : 0x0000000000000000
MCU_FENCE : 0x0000000000000000
VDM_CONTEXT_STORE_STATUS : 0x00000001
VDM_CONTEXT_STORE_TASK0 : 0x0000000000000000
VDM_CONTEXT_STORE_TASK1 : 0x0000000000000000
VDM_CONTEXT_STORE_TASK2 : 0x0000000000000000
VDM_CONTEXT_RESUME_TASK0 : 0x0000000000000000
VDM_CONTEXT_RESUME_TASK1 : 0x0000000000000000
VDM_CONTEXT_RESUME_TASK2 : 0x0000000000000000
ISP_CTL : 0x00000000
ISP_STATUS : 0x00000000
MTS_INTCTX : 0x00000000
MTS_BGCTX : 0x00000001
MTS_BGCTX_COUNTED_SCHEDULE : 0x00000000
MTS_SCHEDULE : 0x00000000
MTS_GPU_INT_STATUS : 0x00000400
CDM_CONTEXT_STORE_STATUS : 0x00000000
CDM_CONTEXT_PDS0 : 0x0000000000000000
CDM_CONTEXT_PDS1 : 0x0000000000000000
CDM_TERMINATE_PDS : 0x0000000000000000
CDM_TERMINATE_PDS1 : 0x0000000000000000
SIDEKICK_IDLE : 0x0000007A
SLC_IDLE : 0x000000FF
SLC_STATUS0 : 0x00000000
SLC_STATUS1 : 0x0000000000000000
SLC_STATUS2 : 0x0000000000000000
SLC_CTRL_BYPASS : 0x00000000
SLC_CTRL_MISC : 0x0000000000200003
MIPS_ADDR_REMAP1_CONFIG1 : 0x1FC00001
MIPS_ADDR_REMAP1_CONFIG2 : 0x00000008abd5f00c
MIPS_ADDR_REMAP2_CONFIG1 : 0x1FC01001
MIPS_ADDR_REMAP2_CONFIG2 : 0x00000008abd4200c
MIPS_ADDR_REMAP3_CONFIG1 : 0x1FC02001
MIPS_ADDR_REMAP3_CONFIG2 : 0x00000008abd6000c
MIPS_ADDR_REMAP4_CONFIG1 : 0x1FC00000
MIPS_ADDR_REMAP4_CONFIG2 : 0x000000000000000c
MIPS_ADDR_REMAP5_CONFIG1 : 0x00000001
MIPS_ADDR_REMAP5_CONFIG2 : 0x00000008abd5f00c
MIPS_WRAPPER_CONFIG : 0x000000000001cf80
MIPS_EXCEPTION_STATUS : 0x00000000
---- [ MIPS internal state ] ----
PC : 0xC00073BC
STATUS_REGISTER : 0x00481004
CAUSE_REGISTER : 0x40800C08
BAD_REGISTER : 0xC0007934
EPC : 0xC0007934
SP : 0xCF600F40
BAD_INSTRUCTION : 0x00000000
TLB :
0) VA 0xCF800000 ( 64k) -> PA0 0xe20000000 DV , PA1 0x00000000 C
1) VA 0xCF000000 ( 16k) -> PA0 0x8abfb0000 DVGC, PA1 0x8abfb4000 DVGC
2) VA 0xCF600000 ( 4k) -> PA0 0x8abd41000 DV C, PA1 0x00000000 C
3) VA 0xC0032000 ( 4k) -> PA0 0x8abd45000 DVGC, PA1 0x8abd44000 DVGC
4) VA 0xC0006000 ( 4k) -> PA0 0x8abd72000 DVGC, PA1 0x8abd71000 DVGC
5) VA 0xC0016000 ( 4k) -> PA0 0x8abd62000 DVGC, PA1 0x8abd61000 DVGC
6) VA 0xC1FF0000 ( 4k) -> PA0 0x8abd96000 DVGC, PA1 0x8abd97000 DVGC
7) VA 0xC0020000 ( 4k) -> PA0 0x8abd57000 DVG , PA1 0x8abd30000 DVG
8) VA 0x00000000 ( 4k) -> PA0 0x8abd78000 DVGC, PA1 0x8abd77000 DVGC
BRN63553 WA present with a valid TLB entry mapping address 0x0.
9) VA 0xC001E000 ( 4k) -> PA0 0x8abd92000 VGC, PA1 0x8abd94000 DVG
10) VA 0xF0014000 ( 4k) -> PA0 0x00000000 C, PA1 0x00000000 C
11) VA 0xC1FD0000 ( 4k) -> PA0 0x8abd47000 DVG , PA1 0x8abd48000 DVG
12) VA 0xC0008000 ( 4k) -> PA0 0x8abd70000 DVGC, PA1 0x8abd6f000 DVGC
13) VA 0xF001A000 ( 4k) -> PA0 0x00000000 C, PA1 0x00000000 C
14) VA 0xC1FE0000 ( 4k) -> PA0 0x8abd7b000 DVGC, PA1 0x8abd7c000 DVGC
15) VA 0xC001A000 ( 4k) -> PA0 0x8abd8b000 DVG , PA1 0x8abd8d000 DVG
--------------------------------
------[ RGX FW Trace Info ]------
Debug log type: trace ( main )
------[ RGX FW thread 0 trace START ]------
FWT[traceptr]: 0
FWT[tracebufsize]: 2EE0
FWT[00000000]: 00000000 ... 00000000
FWT[END]: 400 lines were all zero
------[ RGX FW thread 0 trace END ]------
------[ Full CCB Status ]------
FWCtx 0xC0028300 (TQ_3D-P289-T313-avmMain)
|--Waiting TQ_3D @ 0 Int=1 Ext=1
|--Waiting UPDATE @ 200 Int=1 Ext=1
|  |--Addr:0xc002b000 Val=0x00000001
|  `--Addr:0xc002e001 Val=0x00000519
|--Waiting TQ_3D @ 256 Int=2 Ext=2
|--Waiting UPDATE @ 456 Int=2 Ext=2
|  `--Addr:0xc002b000 Val=0x00000002
|--Waiting TQ_3D @ 504 Int=3 Ext=3
|--Waiting UPDATE @ 704 Int=3 Ext=3
|  `--Addr:0xc002b000 Val=0x00000003
|--Waiting TQ_3D @ 752 Int=4 Ext=4
|--Waiting UPDATE @ 952 Int=4 Ext=4
|  `--Addr:0xc002b000 Val=0x00000004
|--Waiting TQ_3D @ 1000 Int=5 Ext=5
|--Waiting UPDATE @ 1200 Int=5 Ext=5
|  `--Addr:0xc002b000 Val=0x00000005
|--Waiting TQ_3D @ 1248 Int=6 Ext=6
`--Waiting UPDATE @ 1448 Int=6 Ext=6
   |--Addr:0xc002b000 Val=0x00000006
   `--Addr:0xc002e009 Val=0x00000519
FWCtx 0xC0028040 (TA-P289-T313-avmMain)
|--Waiting FENCE @ 0 Int=8 Ext=0
|  |--Addr:0xc002d000 Val=0x00000000
|  `--Addr:0xc002a000 Val=0x00000000
|--Waiting TA @ 56 Int=8 Ext=0
`--Waiting UPDATE @ 168 Int=8 Ext=0
   |--Addr:0xc002d000 Val=0x00000001
   |--Addr:0xc002a000 Val=0x00000001
   `--Addr:0xc002e031 Val=0x00000519
FWCtx 0xC00280E0 (3D-P289-T313-avmMain)
|--Waiting FENCE_PR @ 0 Int=8 Ext=0
|  `--Addr:0xc002d000 Val=0x00000001
|--Waiting 3D @ 48 Int=8 Ext=0
`--Waiting UPDATE @ 440 Int=8 Ext=0
   |--Addr:0xc002d000 Val=0x00000002
   |--Addr:0xc002a000 Val=0x00000002
   |--Addr:0xc002e029 Val=0x00000519
   `--Addr:0xc002e039 Val=0x00000519
FWCtx 0xC002F000 (TA-P353-T391-mv_psd)
|--Waiting FENCE @ 0 Int=7 Ext=0
|  |--Addr:0xc0031000 Val=0x00000000
|  `--Addr:0xc0030000 Val=0x00000000
|--Waiting TA @ 56 Int=7 Ext=0
`--Waiting UPDATE @ 168 Int=7 Ext=0
   |--Addr:0xc0031000 Val=0x00000001
   |--Addr:0xc0030000 Val=0x00000001
   `--Addr:0xc002e019 Val=0x00000519
FWCtx 0xC002F0A0 (3D-P353-T391-mv_psd)
|--Waiting FENCE_PR @ 0 Int=7 Ext=0
|  `--Addr:0xc0031000 Val=0x00000001
|--Waiting 3D @ 48 Int=7 Ext=0
`--Waiting UPDATE @ 440 Int=7 Ext=0
   |--Addr:0xc0031000 Val=0x00000002
   |--Addr:0xc0030000 Val=0x00000002
   |--Addr:0xc002e011 Val=0x00000519
   `--Addr:0xc002e021 Val=0x00000519
FWCtx 0xC002F500 (TA-P435-T444-mv_remote)
`--<Empty>
FWCtx 0xC002F5A0 (3D-P435-T444-mv_remote)
`--<Empty>
------[ RGX Device ID:0 End ]------
------[ System Summary Device ID:0 ]------
Device System Power State: ON
MaxHWTOut: 500000us, WtTryCt: 10000, WDGTOut(on,off): (10000ms,3600000ms)
------[ Server Thread Summary ]------
pvr_defer_free : Running
Number of deferred cleanup items : 0
pvr_device_wdg : Running
pvr_cacheop : Running
Configuration: QSZ: 16, UKT: -1, KDFT: 131072, LINESIZE: 64, PGSIZE: 4096, KDF: Yes, URBF: Yes
Pending deferred CacheOp entries : 0
------[ AppHint Settings ]------
Build Vars
EnableTrustedDeviceAceConfig: N
CleanupThreadPriority: 0x00000005
CacheOpThreadPriority: 0x00000001
WatchdogThreadPriority: 0x00000000
HWPerfClientBufferSize: 0x000c0000
Module Params
none
Debug Info Params
CacheOpConfig: 0x0000000c
CacheOpUMKMThresholdSize: 0xffffffff
Debug Info Params Device ID: 0
EnableLogGroup: main
------[ HTB Log state: Off ]------
------[ Active Sync Checkpoints ]------
- ID = 7, FWAddr = 0xc002e038, r1:e1:f0: es3_DoKick3D_0
- ID = 6, FWAddr = 0xc002e030, r1:e1:f0: es3_DoKickTA_0
- ID = 5, FWAddr = 0xc002e028, r1:e1:f0: update fence
- ID = 4, FWAddr = 0xc002e020, r1:e1:f0: es3_DoKick3D_0
- ID = 3, FWAddr = 0xc002e018, r1:e1:f0: es3_DoKickTA_0
- ID = 2, FWAddr = 0xc002e010, r1:e1:f0: update fence
- ID = 1, FWAddr = 0xc002e008, r1:e1:f0: TQM
- ID = 0, FWAddr = 0xc002e000, r1:e1:f0: TQM
------[ Native Fence Sync: timelines ]------
foreign_sync: @0 ctx=1 refs=1
sw: RM_SWTimeline-v_avm-avmMain-289 @0 cur=0
rogue-ta3d: @1 ctx=3 refs=2
  @0: (+-) refs=5 fwaddr=0xc002e029 enqueue=1 status=Active 0-update fence
rogue-tq3d: @0 ctx=5 refs=1
QE-mv_avm-avmMain-289: @2 ctx=6 refs=3
  @0: (+-) refs=2 fwaddr=0xc002e001 enqueue=1 status=Active 0-TQM
  @1: (+-) refs=2 fwaddr=0xc002e009 enqueue=1 status=Active 1-TQM
sw: RM_SWTimeline-mv_psd-353 @0 cur=0
rogue-ta3d: @1 ctx=8 refs=2
  @0: (+-) refs=6 fwaddr=0xc002e011 enqueue=1 status=Active 0-update fence
V3-mv_psd-353: @1 ctx=10 refs=2
  @0: (+-) refs=2 fwaddr=0xc002e019 enqueue=1 status=Active 0-es3_DoKickTA_0
P3-mv_psd-353: @1 ctx=11 refs=2
  @0: (+-) refs=2 fwaddr=0xc002e021 enqueue=1 status=Active 0-es3_DoKick3D_0
V3-mv_avm-avmMain-289: @1 ctx=12 refs=2
  @0: (+-) refs=2 fwaddr=0xc002e031 enqueue=1 status=Active 0-es3_DoKickTA_0
P3-mv_avm-avmMain-289: @1 ctx=13 refs=2
  @0: (+-) refs=2 fwaddr=0xc002e039 enqueue=1 status=Active 0-es3_DoKick3D_0
sw: RM_SWTimeline-mv_remote-435 @0 cur=0
rogue-ta3d: @0 ctx=15 refs=1
------------[ PVR DBG: END ]------------
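For comparing dumps across many power cycles, the few fields that look most relevant can be pulled out with a small script. The sketch below is only an illustration, not part of the PVR tooling: the function name `summarize_pvr_dump` and the `never_consumed` flag are hypothetical, and the regular expressions assume the field wording seen in this particular dump. In this log the Kernel CCB write offset is 0xE, the read offset is 0x0, zero commands were executed, and the FW trace buffer is all zero, which may indicate that the firmware never consumed any kernel command after boot.

```python
import re

def summarize_pvr_dump(text):
    """Extract firmware state and Kernel CCB offsets from a PVR debug dump.

    Works on both flattened and line-broken dumps because the patterns
    do not rely on line boundaries. Field names are taken from the dump
    in this thread; other DDK versions may print them differently.
    """
    summary = {}

    # Firmware state text up to the opening parenthesis of "(HWRState ...)".
    m = re.search(r"RGX FW State:\s*([^(]+?)\s*\(", text)
    summary["fw_state"] = m.group(1) if m else None

    # Kernel CCB write offset (host side) and read offset (firmware side).
    m = re.search(r"RGX Kernel CCB WO:(0x[0-9A-Fa-f]+)\s+RO:(0x[0-9A-Fa-f]+)", text)
    summary["kccb_wo"] = int(m.group(1), 16) if m else None
    summary["kccb_ro"] = int(m.group(2), 16) if m else None

    # Number of kernel commands the firmware reports as executed.
    m = re.search(r"RGX Kernel CCB commands executed\s*=\s*(\d+)", text)
    summary["kccb_executed"] = int(m.group(1)) if m else None

    # Hypothetical heuristic: commands were queued but none were ever read.
    summary["never_consumed"] = (
        summary["kccb_wo"] is not None
        and summary["kccb_wo"] > 0
        and summary["kccb_ro"] == 0
        and summary["kccb_executed"] == 0
    )
    return summary


# Excerpt copied from the dump above.
sample = (
    "RGX FW State: NOT RESPONDING - KCCB stalled (HWRState 0x00000001: HWR OK;) "
    "RGX Kernel CCB WO:0xE RO:0x0 RGX Firmware CCB WO:0x0 RO:0x0 "
    "RGX Kernel CCB commands executed = 0"
)
print(summarize_pvr_dump(sample))
```

To run it on the attached file, read pvrlogdump_error.txt into a string and pass it to `summarize_pvr_dump`; collecting the result for a passing boot and a failing boot makes the difference easy to see.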
Here is the GPU log from the terminal:
Here is the function call in their project code:
Here is the core dump log:
Please help us debug this further.
Regards
Zekun