AM625: PMIC IRQ causes kernel oops

Part Number: AM625

Hello,

We're having an issue on our AM625-based product where, when the PMIC asserts its IRQ, the kernel generates an oops related to improper handling of the IRQ.

Our Yocto OS (Scarthgap) is based on TI SDK 11 with kernel 6.12.43. The PMIC is a TI TPS6521903. The PMIC nINT pin (pin 11) is connected to the AM625 EXTINTn pin (D16), as per the TI reference designs.

The oops yields the following:

Unable to handle kernel NULL pointer dereference at virtual address 00000000000000dc
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[00000000000000dc] user address but active_mm is swapper
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 UID: 0 PID: 63 Comm: irq/26-tps65219 Not tainted 6.12.43-ti-gccfe8fee8026-dirty #1
Hardware name: <removed>
pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : regulator_notifier_call_chain+0x24/0x98
lr : tps65219_regulator_irq_handler+0x34/0x80
sp : ffff80008177bcc0
x29: ffff80008177bcc0 x28: ffff8000800bccec x27: ffff800081039ed0
x26: ffff000000684e80 x25: ffff000001a7e4ac x24: ffff000001a7e600
x23: 000000000000002d x22: 000000000000000d x21: 0000000000000000
x20: ffff000003bc5258 x19: 0000000000000004 x18: 0000000000000000
x17: 00000000ff6c8d26 x16: 00000000b58a5f26 x15: ffff000037d6eb00
x14: 0000000000000000 x13: 000000000000006e x12: 000000000000036d
x11: 00000000000000c0 x10: ffff800080b7fe60 x9 : 1fffe000007b81c1
x8 : 0000000000000001 x7 : ffff000003dc0e00 x6 : ffff000003dc0e08
x5 : 0000000000000021 x4 : ffff7fffb6eb1000 x3 : 0000000000000001
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000003bc5258
Call trace:
regulator_notifier_call_chain+0x24/0x98
tps65219_regulator_irq_handler+0x34/0x80
handle_nested_irq+0xbc/0x13c
regmap_irq_thread+0x2dc/0x718
irq_thread_fn+0x2c/0xa8
irq_thread+0x16c/0x2f4
kthread+0x110/0x114
ret_from_fork+0x10/0x20
Code: aa0003f4 f90013f5 aa0203f5 f941d801 (7941b820)
---[ end trace 0000000000000000 ]---
genirq: exiting task "irq/26-tps65219" (63) is an active IRQ thread (irq 26)
irq 26: nobody cared (try booting with the "irqpoll" option)
CPU: 0 UID: 0 PID: 63 Comm: irq/26-tps65219 Tainted: G      D            6.12.43-ti-gccfe8fee8026-dirty #1
Tainted: [D]=DIE
Hardware name: <removed>
Call trace:
dump_backtrace+0x90/0xe8
show_stack+0x18/0x24
dump_stack_lvl+0x74/0x8c
dump_stack+0x18/0x24
__report_bad_irq+0x38/0xe4
note_interrupt+0x308/0x350
handle_irq_event+0x9c/0xa8
handle_fasteoi_irq+0xa4/0x230
handle_irq_desc+0x40/0x58
generic_handle_domain_irq+0x1c/0x28
gic_handle_irq+0x54/0xd0
do_interrupt_handler+0x50/0x8c
el1_interrupt+0x34/0x68
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
handle_softirqs+0xa8/0x248
__do_softirq+0x14/0x20
____do_softirq+0x10/0x1c
call_on_irq_stack+0x30/0x64
do_softirq_own_stack+0x1c/0x28
__irq_exit_rcu+0xd8/0x110
irq_exit_rcu+0x10/0x1c
el1_interrupt+0x38/0x68
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
_raw_spin_unlock_irq+0x10/0x44
irq_thread_dtor+0x88/0xdc
task_work_run+0x74/0xd0
do_exit+0x2c8/0x8f4
make_task_dead+0x84/0x17c
arm64_force_sig_fault+0x0/0x70
die_kernel_fault+0x1bc/0x3a4
__do_kernel_fault+0x12c/0x16c
do_page_fault+0x1bc/0x4e4
do_translation_fault+0x9c/0xa8
do_mem_abort+0x44/0x94
el1_abort+0x40/0x64
el1h_64_sync_handler+0xa4/0xe4
el1h_64_sync+0x64/0x68
regulator_notifier_call_chain+0x24/0x98
tps65219_regulator_irq_handler+0x34/0x80
handle_nested_irq+0xbc/0x13c
regmap_irq_thread+0x2dc/0x718
irq_thread_fn+0x2c/0xa8
irq_thread+0x16c/0x2f4
kthread+0x110/0x114
ret_from_fork+0x10/0x20
handlers:
[<00000000a8558464>] irq_default_primary_handler threaded [<000000004f168193>] regmap_irq_thread
Disabling IRQ #26
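
As a cross-check on the "Mem abort info" lines in the log above, the ESR value can be decoded by hand. The sketch below is illustrative only: `decode_esr` is a made-up helper name, not a kernel API. It assumes the AArch64 ESR_ELx layout, with EC in bits [31:26], IL in bit 25, and, for data aborts, the fault status code in ISS[5:0] and WnR in ISS[6].

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into the fields the oops prints.

    Hypothetical helper for illustration; only data aborts get FSC/WnR.
    """
    ec = (esr >> 26) & 0x3F   # Exception Class, bits [31:26]
    il = (esr >> 25) & 0x1    # Instruction Length, bit [25] (1 = 32-bit)
    iss = esr & 0x1FFFFFF     # Instruction Specific Syndrome, bits [24:0]
    fields = {"EC": ec, "IL": il, "ISS": iss}
    if ec in (0x24, 0x25):    # data abort from lower EL / current EL
        fields["FSC"] = iss & 0x3F         # fault status code, ISS[5:0]
        fields["WnR"] = (iss >> 6) & 0x1   # 1 = write access, 0 = read access
    return fields


info = decode_esr(0x96000004)
print({name: hex(value) for name, value in info.items()})
```

For `0x96000004` this gives EC = 0x25, IL = 1, FSC = 0x04 and WnR = 0, which agrees with the log: a read at the current EL that hit a level 0 translation fault. With x1 = 0 in the register dump, the faulting address 0xdc looks like a field offset applied to a NULL pointer.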

I believe the failure ultimately stems from

irq 26: nobody cared (try booting with the "irqpoll" option)

meaning that no handler serviced the interrupt, so the kernel then disables it to prevent a recurrence:

Disabling IRQ #26

I haven't tried adding the `irqpoll` option. I expect it would resolve the oops, but it is of course not the correct solution.

`/proc/interrupts` shows the following (only the entries relevant to the PMIC):

 26:          0          0          0          0     GICv3 256 Level     tps65219_irq
 27:          0          0          0          0  tps65219_irq   0 Edge      LDO3_SCG
 28:          0          0          0          0  tps65219_irq   1 Edge      LDO3_OC
 29:          0          0          0          0  tps65219_irq   2 Edge      LDO3_UV
 30:          0          0          0          0  tps65219_irq   3 Edge      LDO4_SCG
 31:          0          0          0          0  tps65219_irq   4 Edge      LDO4_OC
 32:          0          0          0          0  tps65219_irq   5 Edge      LDO4_UV
 33:          0          0          0          0  tps65219_irq  12 Edge      LDO1_SCG
 34:          0          0          0          0  tps65219_irq  13 Edge      LDO1_OC
 35:          0          0          0          0  tps65219_irq  14 Edge      LDO1_UV
 36:          0          0          0          0  tps65219_irq  15 Edge      LDO2_SCG
 37:          0          0          0          0  tps65219_irq  16 Edge      LDO2_OC
 38:          0          0          0          0  tps65219_irq  17 Edge      LDO2_UV
 39:          0          0          0          0  tps65219_irq  18 Edge      BUCK3_SCG
 40:          0          0          0          0  tps65219_irq  19 Edge      BUCK3_OC
 41:          0          0          0          0  tps65219_irq  20 Edge      BUCK3_NEG_OC
 42:          0          0          0          0  tps65219_irq  21 Edge      BUCK3_UV
 43:          0          0          0          0  tps65219_irq  22 Edge      BUCK1_SCG
 44:          0          0          0          0  tps65219_irq  23 Edge      BUCK1_OC
 45:          0          0          0          0  tps65219_irq  24 Edge      BUCK1_NEG_OC
 46:          0          0          0          0  tps65219_irq  25 Edge      BUCK1_UV
 47:          0          0          0          0  tps65219_irq  26 Edge      BUCK2_SCG
 48:          0          0          0          0  tps65219_irq  27 Edge      BUCK2_OC
 49:          0          0          0          0  tps65219_irq  28 Edge      BUCK2_NEG_OC
 50:          0          0          0          0  tps65219_irq  29 Edge      BUCK2_UV
 51:          0          0          0          0  tps65219_irq  38 Edge      BUCK1_RV
 52:          0          0          0          0  tps65219_irq  39 Edge      BUCK2_RV
 53:          0          0          0          0  tps65219_irq  40 Edge      BUCK3_RV
 54:          0          0          0          0  tps65219_irq  41 Edge      LDO1_RV
 55:          0          0          0          0  tps65219_irq  42 Edge      LDO2_RV
 56:          0          0          0          0  tps65219_irq  45 Edge      LDO3_RV
 57:          0          0          0          0  tps65219_irq  46 Edge      LDO4_RV
 58:          0          0          0          0  tps65219_irq  47 Edge      BUCK1_RV_SD
 59:          0          0          0          0  tps65219_irq  48 Edge      BUCK2_RV_SD
 60:          0          0          0          0  tps65219_irq  49 Edge      BUCK3_RV_SD
 61:          0          0          0          0  tps65219_irq  50 Edge      LDO1_RV_SD
 62:          0          0          0          0  tps65219_irq  53 Edge      LDO2_RV_SD
 63:          0          0          0          0  tps65219_irq  54 Edge      LDO3_RV_SD
 64:          0          0          0          0  tps65219_irq  55 Edge      LDO4_RV_SD
 65:          0          0          0          0  tps65219_irq  56 Edge      TIMEOUT
 66:          0          0          0          0  tps65219_irq  30 Edge      SENSOR_3_WARM
 67:          0          0          0          0  tps65219_irq  31 Edge      SENSOR_2_WARM
 68:          0          0          0          0  tps65219_irq  32 Edge      SENSOR_1_WARM
 69:          0          0          0          0  tps65219_irq  33 Edge      SENSOR_0_WARM
 70:          0          0          0          0  tps65219_irq  34 Edge      SENSOR_3_HOT
 71:          0          0          0          0  tps65219_irq  35 Edge      SENSOR_2_HOT
 72:          0          0          0          0  tps65219_irq  36 Edge      SENSOR_1_HOT
 73:          0          0          0          0  tps65219_irq  37 Edge      SENSOR_0_HOT
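
A quick way to check whether any PMIC interrupt has been counted is to sum the per-CPU columns of the rows above. The following is a minimal sketch, assuming every row has the `IRQ: counts... chip hwirq trigger name` shape seen in this listing (other `/proc/interrupts` rows, such as IPIs, do not). `parse_interrupt_line` is a hypothetical helper, not an existing tool.

```python
def parse_interrupt_line(line: str) -> dict:
    """Parse one /proc/interrupts row of the shape shown in the listing."""
    tokens = line.split()
    irq = int(tokens[0].rstrip(":"))
    # The per-CPU counters are the run of integers right after the IRQ number.
    i = 1
    counts = []
    while i < len(tokens) and tokens[i].isdigit():
        counts.append(int(tokens[i]))
        i += 1
    # Assumed layout for the remaining columns: chip, hwirq, trigger, name.
    return {
        "irq": irq,
        "counts": counts,
        "chip": tokens[i],
        "hwirq": int(tokens[i + 1]),
        "trigger": tokens[i + 2],
        "name": " ".join(tokens[i + 3:]),
    }


# Two rows copied from the listing above.
sample = [
    " 26:          0          0          0          0     GICv3 256 Level     tps65219_irq",
    " 34:          0          0          0          0  tps65219_irq  13 Edge      LDO1_OC",
]
rows = [parse_interrupt_line(line) for line in sample]
total = sum(sum(row["counts"]) for row in rows)
print(f"{len(rows)} rows parsed, {total} interrupts counted")
```

Running the same parser over the full listing before and after asserting nINT would show which row, if any, increments.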

My device tree configuration for the PMIC is as follows:

&main_i2c0 {
	status = "okay";
	pinctrl-names = "default";
	pinctrl-0 = <&i2ccore_pins_default>;
	clock-frequency = <400000>;

	tps65219: pmic@30 {
		compatible = "ti,tps65219";
		reg = <0x30>;
		buck1-supply = <&p5v0>;
		buck2-supply = <&p5v0>;
		buck3-supply = <&p5v0>;
		ldo1-supply = <&p3v3>;
		ldo2-supply = <&p1v8>;
		ldo3-supply = <&p3v3>;
		ldo4-supply = <&p3v3>;

		pinctrl-names = "default";
		pinctrl-0 = <&pmic_irq_pins_default>;
		interrupt-parent = <&gic500>;
		interrupts = <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>;
		interrupt-controller;
		#interrupt-cells = <1>;

		system-power-controller;

		regulators {
			pmic_buck1: buck1 {
				regulator-name = "pmic_buck1";
				regulator-min-microvolt = <750000>;
				regulator-max-microvolt = <750000>;
				regulator-boot-on;
				regulator-always-on;
				system-critical-regulator;
			};

			pmic_buck2: buck2 {
				regulator-name = "pmic_buck2";
				regulator-min-microvolt = <1800000>;
				regulator-max-microvolt = <1800000>;
				regulator-boot-on;
				regulator-always-on;
				system-critical-regulator;
			};

			pmic_buck3: buck3 {
				regulator-name = "pmic_buck3";
				regulator-min-microvolt = <1200000>;
				regulator-max-microvolt = <1200000>;
				regulator-boot-on;
				regulator-always-on;
				system-critical-regulator;
			};

			pmic_ldo1: ldo1 {
				regulator-name = "pmic_ldo1";
				regulator-min-microvolt = <3300000>;
				regulator-max-microvolt = <3300000>;
				regulator-allow-bypass;
				regulator-boot-on;
				regulator-always-on;
			};

			pmic_ldo2: ldo2 {
				regulator-name = "pmic_ldo2";
				regulator-min-microvolt = <850000>;
				regulator-max-microvolt = <850000>;
				regulator-boot-on;
				regulator-always-on;
			};

			pmic_ldo3: ldo3 {
				regulator-name = "pmic_ldo3";
				regulator-min-microvolt = <1800000>;
				regulator-max-microvolt = <1800000>;
				regulator-boot-on;
				regulator-always-on;
			};

			pmic_ldo4: ldo4 {
				regulator-name = "pmic_ldo4";
				regulator-min-microvolt = <2500000>;
				regulator-max-microvolt = <2500000>;
				regulator-boot-on;
				regulator-always-on;
			};
		};
	};
};

This is essentially the same as the configuration used on the TI evaluation boards.
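
The `interrupts = <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>` entry lines up with the `GICv3 256 Level` row in `/proc/interrupts`: the GIC device tree binding numbers SPIs from 0, while the hardware interrupt IDs for SPIs start at 32. A trivial sketch of that arithmetic (the helper name is illustrative, not a kernel function):

```python
GIC_SPI_BASE = 32  # GIC interrupt IDs 0-31 are SGIs and PPIs; SPIs start at 32


def gic_spi_to_hwirq(spi: int) -> int:
    """Map a device tree GIC_SPI number to the hwirq shown in /proc/interrupts."""
    return spi + GIC_SPI_BASE


print(gic_spi_to_hwirq(224))  # 256, the hwirq listed for IRQ 26
```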

There's a similar thread, but with no answer from TI: "TPS65219: IRQ issue" in the Power management forum on the TI E2E support forums.

FYI, I put this request in the processors forum rather than the power management forum because I don't believe this issue ultimately has anything to do with the PMIC itself.

Kind regards,
Ben

  • Hi Ben,

    Thanks for the details. Can you share some steps so I can replicate the issue on my end?

    Best Regards,

    Anshu

  • Hi Anshu,

    You just need to assert the PMIC interrupt nINT (pin 11). The easiest way I think would be to just short the line to GND so that EXTINTn (D16) on the AM625 is activated.

    Thanks,
    Ben

  • Hi Ben,

    Let me replicate this and get back to you.

    Thanks,

    Anshu

  • Hi Ben,

    I tested this with the EXTINTn signal. I grounded TP25, first to the RPi header ground and then to the board's ground. The signal is at 3.3 V by default, and I confirmed that it went to 0 V.

    Unlike what you described, I didn't observe any errors.

    EXTINTn is a software-programmable pin, so the SoC takes no action by itself when the signal on the pin changes. Because of the device tree entry, the change registers an interrupt at the GIC, but software is still required to service it in an ISR.

    Are there any other steps or software changes you've made to do this?

    Best Regards,

    Anshu