Hi, we are using an Intel 82599ES PCIe card on our board. The card is attached to the two lanes of SERDES2.
The 82599ES card is stable when our board runs SDK 6.2, but the A72 reports an abort when the board runs SDK 7.0.
On the EVM, SERDES2 is used for the M.2 slot, so I am not sure whether the EVM's SERDES2 has this issue too. However, the 82599ES card works well on SDK 7.0 when it is plugged into the EVM's PCIe x2 slot.
The UART crash log:
ERROR: Unhandled External Abort received on 0x80000001 from S-EL1
ERROR: exception reason=0 syndrome=0xbf000000
Unhandled Exception from EL1
x0 = 0x0000000000000000
x1 = 0x0000000000010148
x2 = 0xffff000840073e05
x3 = 0x0000000000000000
x4 = 0x0000000000000027
x5 = 0x0000006562677869
x6 = 0xffff00084ed3276c
x7 = 0x0000000000000018
x8 = 0xfefefefefefefeff
x9 = 0xffff00084ed3276c
x10 = 0x00000000000009e0
x11 = 0x0000000000000000
x12 = 0x0000000000000001
x13 = 0x0000000000000000
x14 = 0x0000000000000000
x15 = 0x0000000000000000
x16 = 0x0000000000000000
x17 = 0x0000000000000000
x18 = 0x0000000000000000
x19 = 0x0000000000010148
x20 = 0x0000000000000000
x21 = 0xffff000846301980
x22 = 0xffff800015300000
x23 = 0xffff000840073e00
x24 = 0xffff0008463027b0
x25 = 0x0000000000000000
x26 = 0xffff80001a62fce8
x27 = 0xffff0008463007a8
x28 = 0xffffffffffffe098
x29 = 0xffff80001a6efc80
x30 = 0xffff800010626970
scr_el3 = 0x000000000000073d
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000002c0
mair_el3 = 0x00000000004404ff
spsr_el3 = 0x0000000060000005
elr_el3 = 0xffff800010624ed0
ttbr0_el3 = 0x0000000070010b00
esr_el3 = 0x00000000bf000000
far_el3 = 0x0000000000000000
spsr_el1 = 0x0000000040000005
elr_el1 = 0xffff800010086aa8
spsr_abt = 0x0000000000000000
spsr_und = 0x0000000000000000
spsr_irq = 0x0000000000000000
spsr_fiq = 0x0000000000000000
sctlr_el1 = 0x0000000034d4d91d
actlr_el1 = 0x0000000000000000
cpacr_el1 = 0x0000000000300000
csselr_el1 = 0x0000000000000000
sp_el1 = 0xffff80001a6efc80
esr_el1 = 0x0000000056000000
ttbr0_el1 = 0x00000008d51d1c00
ttbr1_el1 = 0x059e000080bc0000
mair_el1 = 0x0000bbff440c0400
amair_el1 = 0x0000000000000000
tcr_el1 = 0x00000034f5507510
tpidr_el1 = 0xffff80086ee50000
tpidr_el0 = 0x0000000000000000
tpidrro_el0 = 0x0000000000000000
par_el1 = 0x0000000000000000
mpidr_el1 = 0x0000000080000001
afsr0_el1 = 0x0000000000000000
afsr1_el1 = 0x0000000000000000
contextidr_el1 = 0x0000000000000000
vbar_el1 = 0xffff800010081800
cntp_ctl_el0 = 0x0000000000000005
cntp_cval_el0 = 0x000000f833b4db18
cntv_ctl_el0 = 0x0000000000000000
cntv_cval_el0 = 0x0000000000000000
cntkctl_el1 = 0x00000000000000e6
sp_el0 = 0x000000007000abd0
isr_el1 = 0x0000000000000040
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000001b00000040
cpumerrsr_el1 = 0x0000000000000000
l2merrsr_el1 = 0x0000000000000000
[ 5349.102182] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 5349.108267] rcu: 1-...0: (195 ticks this GP) idle=116/1/0x4000000000000000 softirq=48546/48548 fqs=2625
[ 5349.117809] (detected by 0, t=5252 jiffies, g=89765, q=883)
[ 5349.123449] Task dump for CPU 1:
[ 5349.126663] kworker/u4:1 R running task 0 2058 2 0x0000002a
[ 5349.133703] Workqueue: ixgbe ixgbe_service_task
[ 5349.138218] Call trace:
[ 5349.140655] __switch_to+0x104/0x170
[ 5349.144216] 0xffff0008412fe000
[ 5412.122183] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 5412.128269] rcu: 1-...0: (195 ticks this GP) idle=116/1/0x4000000000000000 softirq=48546/48548 fqs=10497
[ 5412.137897] (detected by 0, t=21007 jiffies, g=89765, q=1239)
[ 5412.143711] Task dump for CPU 1:
[ 5412.146924] kworker/u4:1 R running task 0 2058 2 0x0000002a
[ 5412.153965] Workqueue: ixgbe ixgbe_service_task
[ 5412.158480] Call trace:
[ 5412.160918] __switch_to+0x104/0x170
[ 5412.164479] 0xffff0008412fe000
[ 5475.142183] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 5475.148270] rcu: 1-...0: (195 ticks this GP) idle=116/1/0x4000000000000000 softirq=48546/48548 fqs=18369
[ 5475.157898] (detected by 0, t=36762 jiffies, g=89765, q=2380)
[ 5475.163711] Task dump for CPU 1:
[ 5475.166924] kworker/u4:1 R running task 0 2058 2 0x0000002a
[ 5475.173966] Workqueue: ixgbe ixgbe_service_task
[ 5475.178480] Call trace:
[ 5475.180918] __switch_to+0x104/0x170
[ 5475.184480] 0xffff0008412fe000
[ 5504.050190] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 5524 jiffies s: 85 root: 0x2/.
[ 5504.060708] rcu: blocking rcu_node structures:
[ 5504.065520] Task dump for CPU 1:
[ 5504.068800] kworker/u4:1 R running task 0 2058 2 0x0000002a
[ 5504.075893] Workqueue: ixgbe ixgbe_service_task
[ 5504.080503] Call trace:
[ 5504.082995] __switch_to+0x104/0x170
[ 5504.086615] 0xffff0008412fe000
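To help read the syndrome values in the dump above, here is a minimal sketch that splits an ESR value into its fields. It assumes the standard Armv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]). The `decode_esr` helper and the `EC_NAMES` table are illustrative names of my own, and the table lists only a few exception classes.

```python
# Sketch: split an AArch64 ESR_ELx value into its EC, IL, and ISS fields.
# Assumed layout (Armv8-A): EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# EC_NAMES is a deliberately partial table, not a complete list of classes.

EC_NAMES = {
    0x15: "SVC instruction execution in AArch64 state",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
    0x2F: "SError interrupt",
}


def decode_esr(esr):
    """Return the EC, IL, and ISS fields of a 32-bit ESR_ELx value."""
    ec = (esr >> 26) & 0x3F   # exception class
    il = (esr >> 25) & 0x1    # instruction length bit
    iss = esr & 0x1FFFFFF     # instruction specific syndrome
    return {
        "ec": ec,
        "il": il,
        "iss": iss,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
    }


if __name__ == "__main__":
    # Values copied from the register dump in this post.
    for name, value in (("esr_el3", 0xBF000000), ("esr_el1", 0x56000000)):
        fields = decode_esr(value)
        print(f"{name}: EC={fields['ec']:#x} ({fields['ec_name']}), "
              f"IL={fields['il']}, ISS={fields['iss']:#x}")
```

With this layout, esr_el3 = 0xbf000000 has EC = 0x2f (SError interrupt), which seems consistent with the "Unhandled External Abort" message. ISS bit 24 is set, which as far as I understand means the rest of the syndrome is implementation defined, so it does not identify the faulting address by itself. The esr_el1 value has EC = 0x15 (SVC) and is probably left over from an earlier exception.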
Could you please tell me how to debug this issue?
Also, were there any significant PCIe changes between SDK 6.2 and SDK 7.0?
Thanks