DM3730: drm/omap: Kernel WARNING in omap_overlay_release() during suspend/resume cycles

Part Number: DM3730

Platform: DM3730
Kernel Version: 6.1.119
Subsystem: OMAP DRM / Display

I'm seeing a reproducible kernel WARNING (a WARN_ON backtrace, not a panic; the system keeps running) during suspend/resume cycles on DM3730. The first suspend/resume cycle is clean; the warning appears on the second cycle and on every subsequent one.

root@dev:~# echo mem > /sys/power/state
[  230.821838] PM: suspend entry (deep)
[  230.864044] Filesystems sync: 0.042 seconds
[  230.864776] Freezing user space processes
[  230.866363] Freezing user space processes completed (elapsed 0.001 seconds)
[  230.866394] OOM killer disabled.
[  230.866424] Freezing remaining freezable tasks
[  230.868011] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  230.868041] printk: Suspending console(s) (use no_console_suspend to debug)
[  230.985168] plane-0: visible 1 -> 0
[  230.985198] in gfx new_plane_state->visible
[  230.985198] gfx: release from plane plane-0
[  230.998138] gfx: disabled
[  231.118713] Disabling non-boot CPUs ...
[  231.118743] Successfully put all powerdomains to target state
[  231.162597] plane-0: visible 0 -> 1
[  231.162628] plane: plane-0 overlay_id: 0
[  231.162780] plane-0, crtc=b5ec2eb2 fb=e11e1f30
[  231.162811] gfx: 800x480 -> 800x480 (800)
root@dev:~# [  231.162811] 0,0 0x9e900000 0x00000000
[  231.510681] OOM killer enabled.
[  231.510681] Restarting tasks ... done.
[  231.528442] random: crng reseeded on system resumption
[  231.530090] PM: suspend exit
 
root@dev:~# 
root@dev:~# 
root@dev:~# echo mem > /sys/power/state
[  259.101531] PM: suspend entry (deep)
[  259.200714] Filesystems sync: 0.099 seconds
[  259.201416] Freezing user space processes
[  259.203002] Freezing user space processes completed (elapsed 0.001 seconds)
[  259.203033] OOM killer disabled.
[  259.203033] Freezing remaining freezable tasks
[  259.204650] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  259.204681] printk: Suspending console(s) (use no_console_suspend to debug)
[  259.324981] plane-0: visible 1 -> 0
[  259.325012] in gfx new_plane_state->visible
[  259.325012] ------------[ cut here ]------------
[  259.325012] WARNING: CPU: 0 PID: 585 at drivers/gpu/drm/omapdrm/omap_overlay.c:131 omap_plane_atomic_check+0x144/0x4f8
[  259.325073] Modules linked in:
[  259.325103] CPU: 0 PID: 585 Comm: sh Not tainted 6.1.119-rt45-baxter-dm3730-dirty #21
[  259.325134] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[  259.325164]  unwind_backtrace from show_stack+0x10/0x14
[  259.325195]  show_stack from dump_stack_lvl+0x40/0x4c
[  259.325225]  dump_stack_lvl from __warn+0x74/0x12c
[  259.325256]  __warn from warn_slowpath_fmt+0x118/0x220
[  259.325286]  warn_slowpath_fmt from omap_plane_atomic_check+0x144/0x4f8
[  259.325347]  omap_plane_atomic_check from drm_atomic_helper_check_planes+0xe4/0x228
[  259.325378]  drm_atomic_helper_check_planes from drm_atomic_helper_check+0x44/0x90
[  259.325408]  drm_atomic_helper_check from omap_atomic_check+0x18/0x22c
[  259.325439]  omap_atomic_check from drm_atomic_check_only+0x6a0/0xa88
[  259.325469]  drm_atomic_check_only from drm_atomic_commit+0x64/0xe8
[  259.325500]  drm_atomic_commit from drm_atomic_helper_disable_all+0x1bc/0x1cc
[  259.325531]  drm_atomic_helper_disable_all from drm_atomic_helper_suspend+0x98/0x1fc
[  259.325561]  drm_atomic_helper_suspend from drm_mode_config_helper_suspend+0x2c/0x80
[  259.325622]  drm_mode_config_helper_suspend from dpm_run_callback+0x4c/0x164
[  259.325653]  dpm_run_callback from __device_suspend+0x114/0x510
[  259.325683]  __device_suspend from dpm_suspend+0xf8/0x26c
[  259.325714]  dpm_suspend from dpm_suspend_start+0x88/0x98
[  259.325744]  dpm_suspend_start from suspend_devices_and_enter+0x13c/0x838
[  259.325805]  suspend_devices_and_enter from pm_suspend+0x2c8/0x398
[  259.325805]  pm_suspend from state_store+0x68/0xc8
[  259.325836]  state_store from kernfs_fop_write_iter+0x10c/0x1cc
[  259.325866]  kernfs_fop_write_iter from vfs_write+0x240/0x348
[  259.325897]  vfs_write from ksys_write+0x60/0xec
[  259.325927]  ksys_write from __sys_trace_return+0x0/0x10
[  259.325958] Exception stack(0xe0909fa8 to 0xe0909ff0)
[  259.325988] 9fa0:                   00000004 0013c950 00000001 0013c950 00000004 00000000
[  259.325988] 9fc0: 00000004 0013c950 b6f0bba0 00000004 b6f87080 b6f0c15c 00000000 00000000
[  259.326019] 9fe0: 001042f4 bef3f940 b6e0ffdc b6e7c768
[  259.326019] ---[ end trace 0000000000000000 ]---
[  259.330078] gfx: disabled
[  259.458618] Disabling non-boot CPUs ...
[  259.458648] Successfully put all powerdomains to target state
[  259.502410] plane-0: visible 0 -> 1
[  259.502471] plane: plane-0 overlay_id: 0
[  259.502624] plane-0, crtc=b5ec2eb2 fb=e11e1f30
[  259.502655] gfx: 800x480 -> 800x480 (800)
root@dev:~# [  259.502655] 0,0 0x9e900000 0x00000000
[  259.989013] OOM killer enabled.
[  259.989044] Restarting tasks ... done.
[  260.006652] random: crng reseeded on system resumption
[  260.008331] PM: suspend exit

 

File: drivers/gpu/drm/omapdrm/omap_overlay.c
Function: omap_overlay_release()

void omap_overlay_release(struct drm_atomic_state *s, struct omap_hw_overlay *overlay)
{
	/* Get the global state of the current atomic transaction */
	struct omap_global_state *state = omap_get_global_state(s);
	struct drm_plane **overlay_map = state->hwoverlay_to_plane;

	if (!overlay)
		return;

	if (WARN_ON(!overlay_map[overlay->idx]))
		return;

	DBG("%s: release from plane %s", overlay->name, overlay_map[overlay->idx]->name);

	overlay_map[overlay->idx] = NULL; /* suspected cause: once cleared, the WARN_ON above fires on the second suspend/resume */
}

The issue appears to be related to atomic state management during overlay cleanup. omap_overlay_release() is called during the atomic_check phase (the backtrace shows the warning inside omap_plane_atomic_check(), at omap_overlay.c:131), and it sets the overlay mapping to NULL (overlay_map[overlay->idx] = NULL).

The problem occurs on the second suspend/resume. By then the overlay mapping is already NULL, apparently because it was not reassigned during the previous resume, so the check "if (WARN_ON(!overlay_map[overlay->idx]))" evaluates to true and the warning fires.
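
To make the suspected sequence concrete, here is a small user-space model of it. This is only a sketch of my hypothesis, not driver code: model_overlay_assign(), model_overlay_release() and model_suspend_cycles() are made-up names, and the single array stands in for hwoverlay_to_plane without modelling how the real driver duplicates global state per atomic transaction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_NUM_OVERLAYS 1

/* Hypothetical stand-ins for struct drm_plane and struct omap_hw_overlay. */
struct model_plane { const char *name; };
struct model_overlay { const char *name; unsigned int idx; };

/* Record that an overlay is owned by a plane (what I expect to happen on modeset/resume). */
void model_overlay_assign(struct model_plane **map,
			  const struct model_overlay *ovl,
			  struct model_plane *plane)
{
	map[ovl->idx] = plane;
}

/* Mirrors the checks in omap_overlay_release(); returns false where WARN_ON would fire. */
bool model_overlay_release(struct model_plane **map,
			   const struct model_overlay *ovl)
{
	if (!ovl)
		return true;

	if (!map[ovl->idx]) {
		fprintf(stderr, "WARN: %s has no plane mapped\n", ovl->name);
		return false;
	}

	map[ovl->idx] = NULL;
	return true;
}

/*
 * Run a number of suspend/resume cycles and count how many of them hit the
 * warning path. reassign_on_resume selects whether the resume step restores
 * the mapping (assumption: this is what appears to be missing on my board).
 */
int model_suspend_cycles(int cycles, bool reassign_on_resume)
{
	struct model_plane *map[MODEL_NUM_OVERLAYS] = { NULL };
	struct model_plane plane0 = { "plane-0" };
	struct model_overlay gfx = { "gfx", 0 };
	int warnings = 0;

	model_overlay_assign(map, &gfx, &plane0);	/* initial modeset */

	for (int i = 0; i < cycles; i++) {
		if (!model_overlay_release(map, &gfx))	/* suspend */
			warnings++;
		if (reassign_on_resume)			/* resume */
			model_overlay_assign(map, &gfx, &plane0);
	}

	return warnings;
}
```

With three cycles and no reassignment on resume, the model warns on cycles 2 and 3, which matches what I see in the logs; with reassignment it never warns.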

As a temporary workaround, I found that commenting out "overlay_map[overlay->idx] = NULL" makes the warning go away, but this doesn't seem right to me.
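
One reason the workaround looks unsafe to me: if the driver treats a non-NULL map entry as "overlay in use" (my assumption about how free overlays are found), then never clearing the entry means a released overlay can never be handed out again. A minimal sketch of that concern, with hypothetical names (sketch_find_free_overlay(), sketch_release(), sketch_first_free_after_release()) that do not exist in the kernel:

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_OVERLAYS 2

/* Hypothetical stand-in for struct drm_plane. */
struct sketch_plane { const char *name; };

/* Assumed allocator rule: an overlay is free only if its map entry is NULL. */
int sketch_find_free_overlay(struct sketch_plane **map, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!map[i])
			return (int)i;
	}
	return -1;	/* nothing free */
}

/* clear_on_release == false models the workaround (assignment commented out). */
void sketch_release(struct sketch_plane **map, size_t idx, bool clear_on_release)
{
	if (clear_on_release)
		map[idx] = NULL;
}

/* Release every overlay, then report the first free index (or -1 if none). */
int sketch_first_free_after_release(bool clear_on_release)
{
	struct sketch_plane p0 = { "plane-0" };
	struct sketch_plane p1 = { "plane-1" };
	struct sketch_plane *map[SKETCH_NUM_OVERLAYS] = { &p0, &p1 };

	for (size_t i = 0; i < SKETCH_NUM_OVERLAYS; i++)
		sketch_release(map, i, clear_on_release);

	return sketch_find_free_overlay(map, SKETCH_NUM_OVERLAYS);
}
```

In this model, releasing with the NULL assignment leaves overlay 0 free again, while the workaround leaves every entry stale and the lookup finds nothing. On my single-plane setup that would go unnoticed, which may be why the workaround only appears to work.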

So here are my questions:

  1. Has anyone else encountered this issue on OMAP3/DM3730 platforms?
  2. Is this a known regression in the OMAP DRM atomic framework implementation?
  3. What's the correct approach for overlay cleanup in the atomic model? Should this happen elsewhere in the pipeline?
  4. Are there any existing patches or workarounds for this specific issue?