PROCESSOR-SDK-AM62X: Kernel IRQ of 485c0100.dma-controller crash issue

Part Number: PROCESSOR-SDK-AM62X

Tool/software:

Hi, TI expert

When running arecord, we observed that the system encountered the following kernel IRQ crash error message:

[ 5637.696595] ti-udma 485c0100.dma-controller: chan2 teardown timeout!
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
[ 5660.061570] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 5660.061607] rcu: 0-....: (5457 ticks this GP) idle=7acc/1/0x4000000000000000 softirq=0/0 fqs=2574 rcuc=5455 jiffies(starved)
[ 5660.061623] (t=5251 jiffies g=35081 q=283 ncpus=4)
[ 5660.061644] CPU: 0 PID: 901 Comm: irq/95-485c0100 Tainted: G O 6.1.33-rt11-g685e771524 #1
[ 5660.061654] Hardware name: Texas Instruments AM625 SK (DT)
[ 5660.061661] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 5660.061669] pc : _raw_spin_unlock_irq+0x18/0x70
[ 5660.061716] lr : irq_finalize_oneshot.part.0+0x68/0x110
[ 5660.061755] sp : ffff800009b5bd90
[ 5660.061760] x29: ffff800009b5bd90 x28: 0000000000000000 x27: 0000000000000000
[ 5660.061792] x26: ffff8000080ac380 x25: ffff8000080ac5d0 x24: ffff000004397980
[ 5660.061805] x23: ffff000001344c00 x22: ffff000001344c60 x21: ffff000001344cdc
[ 5660.061816] x20: ffff000004397980 x19: ffff000001344c00 x18: 0000000000000000
[ 5660.061827] x17: 0000000000000001 x16: 0000000000000001 x15: 0000b6762e2321a2
[ 5660.061838] x14: 02da0938abd6dca0 x13: 000058fa38cd37d2 x12: 01641952b170bc26
[ 5660.061850] x11: 00000000000002ef x10: 000000000000b67e x9 : 0000000000000001
[ 5660.061861] x8 : ffff00000719cc58 x7 : ffff00000719cc68 x6 : ffffffffffffffe0
[ 5660.061871] x5 : ffff000001372680 x4 : 0000000000000000 x3 : ffffffffffffffe0
[ 5660.061883] x2 : ffff800009f00000 x1 : ffff00000719c880 x0 : 0000000100000001
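The delay between the DMA teardown timeout and the RCU stall can be read off the timestamps in the log. The following is a minimal sketch, not part of the original test setup: `stall_gap` is a hypothetical helper name, and it assumes the kernel log uses the bracketed timestamp format shown above.

```shell
#!/bin/bash
# Hypothetical helper: read kernel log text on stdin and print the number of
# seconds between the first "teardown timeout!" line and the first
# "self-detected stall" line that follows it.
stall_gap() {
    awk '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets and convert the timestamp to a number.
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (!t0 && /teardown timeout!/) t0 = ts
            if (t0 && !t1 && /self-detected stall/) t1 = ts
        }
        END { if (t0 && t1) printf "%.3f\n", t1 - t0 }
    '
}

# Example (not executed here): dmesg | stall_gap
```

In the excerpt above the two timestamps are 5637.696595 and 5660.061570, roughly 22.4 seconds apart.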

Operating Environment: SK-AM62x EVM
SDK Version: ti-processor-sdk-linux-rt-am62xx-evm-09.00.00.03.tgz
Boot Method: Using tisdk-default-image-am62xx-evm.wic.xz from the SDK to create an SD Card for booting on the EVM

Steps to reproduce kernel crash:

1. To simplify testing conditions, we first disabled the following services after booting from the SD Card and then rebooted.

systemctl stop irqbalanced.service
systemctl disable irqbalanced.service
systemctl stop weston.service
systemctl disable weston.service
systemctl stop docker.service
systemctl disable docker.service
systemctl stop startwlanap.service
systemctl stop startwlansta.service
systemctl stop strongswan-starter.service
systemctl disable strongswan-starter.service
systemctl disable startwlansta.service
systemctl disable startwlanap.service
systemctl stop atd.service
systemctl disable atd.service
systemctl stop bluetooth.service
systemctl stop bt-enable.service
systemctl disable bluetooth.service
systemctl disable bt-enable.service
systemctl stop ti-apps-launcher.service
systemctl disable ti-apps-launcher.service

2. Created an audio test file, audio.sh, with the following content:

#!/bin/bash
# Restart the 4-channel, 48 kHz, 16-bit capture whenever arecord exits.
while :
do
    arecord -f S16_LE -r 48000 -c 4 > /dev/null
done

3. Created a crash test file, test.sh, with the following content:

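#!/bin/bash
# Unload the display driver to rule out the known tidss IRQ issues.
modprobe -r tidss
# Start the continuous capture loop in the background.
./audio.sh &
# Start five memtester instances, 3 seconds apart, to add memory load.
sleep 3
memtester 256M > /dev/null &
sleep 3
memtester 256M > /dev/null &
sleep 3
memtester 256M > /dev/null &
sleep 3
memtester 256M > /dev/null &
sleep 3
memtester 256M > /dev/null &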

4. Executed the following commands:

chmod 777 test.sh audio.sh
./test.sh &

5. A crash occurs after approximately three hours.
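The service shutdown in step 1 could also be scripted as a loop. This is a minimal sketch, assuming the same unit names as listed in that step; `quiesce_services` and the `DRY_RUN` switch are hypothetical names introduced here for illustration. `systemctl disable --now` stops and disables a unit in one call.

```shell
#!/bin/bash
# Unit names taken from step 1, without the ".service" suffix.
SERVICES="irqbalanced weston docker startwlanap startwlansta
strongswan-starter atd bluetooth bt-enable ti-apps-launcher"

# Stop and disable every listed service.
# With DRY_RUN=1 the commands are only printed, not executed.
quiesce_services() {
    for svc in $SERVICES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "systemctl disable --now ${svc}.service"
        else
            systemctl disable --now "${svc}.service"
        fi
    done
}

# Example (not executed here): quiesce_services && reboot
```

Running it once with `DRY_RUN=1` first shows the exact commands before anything is changed on the board.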

Full log:

am62x_evm_kernel_irq_crash.log

We run five instances of memtester to make the issue reproduce faster, and we remove the tidss driver to rule out the following known issues, which could otherwise complicate the analysis:

AM625: Issue about tidss rcu_preempt self-detected stall on CPU - Processors forum - Processors - TI E2E support forums

AM625: rcu_preempt self-detected stall on CPU error during runtime - Processors forum - Processors - TI E2E support forums

We noticed that recent patches appear to address the above-mentioned tidss IRQ flood issue:

https://lore.kernel.org/lkml/20241021-tidss-irq-fix-v1-4-82ddaec94e4a@ideasonboard.com/T/#mfadbc7283ea4db24ee390b2322c39df34faba7b5

However, we are not using any tidss-related functions, and the tidss driver is not loaded, yet we are seeing a similar issue.

We have verified that this IRQ crash does not occur when arecord is not running.

Additionally, when the crash occurs and we still manage to run cat /proc/interrupts, the interrupt count for the crash-related IRQ has increased significantly.
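One way to quantify that growth is to sample the count twice and print the difference. This is a minimal sketch under stated assumptions: `irq_total` and `irq_delta` are hypothetical helper names, the default of 4 CPUs matches `ncpus=4` in the log above, and every /proc/interrupts line whose text contains the given name is summed together.

```shell
#!/bin/bash
# Hypothetical helper: read a /proc/interrupts-style table on stdin and print
# the sum of the per-CPU counts for all lines containing the given name.
# $1 = name to look for, $2 = number of CPU columns (assumed 4 by default).
irq_total() {
    awk -v name="$1" -v n="${2:-4}" '
        index($0, name) {
            # Field 1 is the IRQ number; fields 2..n+1 are per-CPU counts.
            for (i = 2; i <= n + 1; i++) total += $i
        }
        END { print total + 0 }
    '
}

# Hypothetical helper: report how much the count grew over an interval.
# $1 = name to look for, $2 = interval in seconds (assumed 10 by default).
irq_delta() {
    before=$(irq_total "$1" < /proc/interrupts)
    sleep "${2:-10}"
    after=$(irq_total "$1" < /proc/interrupts)
    echo "$1: +$((after - before)) interrupts in ${2:-10}s"
}

# Example (not executed here): irq_delta 485c0100.dma-controller 10
```

Comparing the delta during normal recording with the delta just before the stall would show whether the interrupt rate itself changes.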

Could you please help check the cause of this crash? Thank you.