TDA4VH-Q1: Kernel panic - not syncing: Asynchronous SError Interrupt - Wave5 driver

Part Number: TDA4VH-Q1

Hi,

Please point me in the right direction for debugging this System Error. I have a custom board with a TDA4VH-Q1 SoC. This asynchronous error consistently occurs in the wave5_dec_clr_disp_flag() call while starting up a GStreamer pipeline with the v4l2h264dec decoder. Thanks for the help.

[ 2833.600901] SError Interrupt on CPU5, code 0x00000000bf000000 -- SError
[ 2833.600920] CPU: 5 UID: 0 PID: 4129 Comm: queue1:src Not tainted 6.15.8-dirty #2 PREEMPT(voluntary)
[ 2833.600926] Hardware name: ---
[ 2833.600929] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2833.600933] pc : wave5_dec_clr_disp_flag+0x40/0x80 [wave5]
[ 2833.600959] lr : wave5_dec_clr_disp_flag+0x40/0x80 [wave5]
[ 2833.600964] sp : ffff800095ebba30
[ 2833.600965] x29: ffff800095ebba30 x28: ffff0008021acd30 x27: 0000000000000000
[ 2833.600972] x26: ffff000805894010 x25: ffff800079a02e98 x24: ffff000807f1ba00
[ 2833.600977] x23: ffff800095ebbcc8 x22: ffff0008021acd50 x21: ffff0008065b8000
[ 2833.600981] x20: ffff000805894000 x19: ffff000805894000 x18: 0000000000000000
[ 2833.600986] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff80006278
[ 2833.600990] x14: 0000000100000000 x13: 0000000000000000 x12: 0000000000000000
[ 2833.600995] x11: ffffffffffffffff x10: ffffffffffffffff x9 : 0000000000000000
[ 2833.600999] x8 : ffff000807f1baa0 x7 : 0000000000000000 x6 : 0000000000000001
[ 2833.601003] x5 : 0000000000000001 x4 : ffff000807f1b8b0 x3 : 0000000000000000
[ 2833.601007] x2 : 0000000000000000 x1 : ffff800082ea0118 x0 : ffff800082ea0000
[ 2833.601014] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 2833.601016] CPU: 5 UID: 0 PID: 4129 Comm: queue1:src Not tainted 6.15.8-dirty #2 PREEMPT(voluntary)
[ 2833.601020] Hardware name: ---
[ 2833.601022] Call trace:
[ 2833.601025] show_stack+0x18/0x30 (C)
[ 2833.601039] dump_stack_lvl+0x60/0x80
[ 2833.601046] dump_stack+0x18/0x24
[ 2833.601050] panic+0x168/0x360
[ 2833.601054] nmi_panic+0x88/0x90
[ 2833.601059] arm64_serror_panic+0x64/0x80
[ 2833.601064] do_serror+0x3c/0x70
[ 2833.601068] el1h_64_error_handler+0x30/0x50
[ 2833.601076] el1h_64_error+0x6c/0x70
[ 2833.601079] wave5_dec_clr_disp_flag+0x40/0x80 [wave5] (P)
[ 2833.601085] wave5_vpu_dec_clr_disp_flag+0x54/0x80 [wave5]
[ 2833.601090] wave5_vpu_dec_buf_queue+0x148/0x150 [wave5]
[ 2833.601095] __enqueue_in_driver+0x3c/0x80 [videobuf2_common]
[ 2833.601100] vb2_core_qbuf+0x438/0x5b0 [videobuf2_common]
[ 2833.601104] vb2_qbuf+0xac/0x190 [videobuf2_v4l2]
[ 2833.601111] v4l2_m2m_qbuf+0x6c/0x240 [v4l2_mem2mem]
[ 2833.601119] v4l2_m2m_ioctl_qbuf+0x18/0x490 [v4l2_mem2mem]
[ 2833.601123] v4l_qbuf+0x48/0x70 [videodev]
[ 2833.601136] __video_do_ioctl+0x3f4/0x470 [videodev]
[ 2833.601144] video_usercopy+0x1e4/0x690 [videodev]
[ 2833.601151] video_ioctl2+0x18/0x30 [videodev]
[ 2833.601159] v4l2_ioctl+0x40/0x60 [videodev]
[ 2833.601167] __arm64_sys_ioctl+0xac/0xe0
[ 2833.601176] invoke_syscall+0x48/0x110
[ 2833.601182] el0_svc_common.constprop.0+0x40/0xe0
[ 2833.601186] do_el0_svc+0x1c/0x30
[ 2833.601189] el0_svc+0x30/0xd0
[ 2833.601193] el0t_64_sync_handler+0x10c/0x140
[ 2833.601197] el0t_64_sync+0x198/0x19c
[ 2833.601201] SMP: stopping secondary CPUs
[ 2833.601215] Kernel Offset: disabled
[ 2833.601217] CPU features: 0x0400,00040050,01000400,8200421b
[ 2833.601221] Memory Limit: none
[ 2833.880806] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---

  • Hi, 

    What HLOS SDK version are you using? Is there a particular use-case that you are seeing this happen with? I have not noticed this kernel panic when starting a decoder pipeline yet.

    Thanks,
    Sarabesh S.

  • Hi,

    I am currently on the 6.15.8 Linux kernel. The panic occurs when I start a single GStreamer decode pipeline with the v4l2h264dec plugin. The panic seems to always occur in wave5_dec_clr_disp_flag(). I have been decoding a live H.264 video stream. Today, I sourced video from an MP4 file and the panic occurred at the same point. The panic occurs suspiciously close to wave5_vpu_dec_start_streaming(), but I am not sure how to isolate the source of the async error.

    Thanks,

    Jeff

  • Could you share the GStreamer pipeline?

    Thanks,
    Sarabesh S.

  • Hi Sarabesh,

    Below is a pipeline which induces the SError every time. The stack trace always indicates the issue is encountered when wave5_dec_clr_disp_flag() is executing. I have built a debug kernel and hit the breakpoint in wave5_dec_clr_disp_flag() using KGDB. As I step into the code, the SError is triggered immediately. I wanted to read the CFSR register at 0xE000ED28, but the address was not valid (I think because there are multiple cores). I am not sure of the next steps for diagnosing the cause of the fault.

    Note that $1 is just the path to an MP4 file.

    /opt/GStreamer/bin/gst-launch-1.0 \
    filesrc location=${1} \
    ! qtdemux name=demux demux.video_0 \
    ! h264parse config-interval=0 \
    ! v4l2h264dec capture-io-mode=4 \
    ! videoconvert \
    ! videorate ! video/x-raw,max-rate=30/1 \
    ! queue \
    ! rtpvrawpay \
    ! 'application/x-rtp, media=(string)video, encoding-name=(string)RAW' \
    ! udpsink host=127.0.0.1 port=5000 sync=false async=false

    Thanks again,
    Jeff

  • Hi Jeff, 

    Thanks for the information. I'll review this and get back to you. 

    Regards,
    Sarabesh S.

  • Thanks Sarabesh. 

    I did a bit more digging. I see that the ESR is 0x00000000bf000000. This decodes to an SError with IDS==1. The Arm A-profile Architecture Reference Manual indicates that in this case, ESR_EL1[23:0] are IMPLEMENTATION DEFINED. I am not sure where to find the implementation-specific definition for ISS[23:0]==24'b0. Please let me know what you find. I am stuck.

    Thanks, Jeff.

  • Thanks Jeff, 

    Currently discussing with the team. I'll follow up soon.

    Regards,
    Sarabesh S.
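
For reference, the ESR decoding Jeff describes can be reproduced with a few bit operations. This is only a sketch: the helper name `decode_serror_esr` is hypothetical, and the field positions (EC in bits [31:26], IL in bit [25], IDS in bit [24], remaining syndrome in bits [23:0]) are taken from the Arm A-profile Architecture Reference Manual layout for SError exceptions. It does not interpret the IMPLEMENTATION DEFINED bits, whose meaning would have to come from the core or SoC documentation.

```python
# Sketch: split an AArch64 ESR_ELx value into its top-level SError fields.
# decode_serror_esr is a hypothetical helper name, not a kernel or TI API.

EC_SERROR = 0x2F  # Exception Class value for an SError interrupt


def decode_serror_esr(esr: int) -> dict:
    """Return the EC, IL, IDS and ISS[23:0] fields of an ESR value."""
    return {
        "ec": (esr >> 26) & 0x3F,   # Exception Class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # Instruction Length, bit [25]
        "ids": (esr >> 24) & 0x1,   # 1 = ISS[23:0] is IMPLEMENTATION DEFINED
        "iss_low": esr & 0xFFFFFF,  # ISS[23:0]
    }


# Value reported in the panic log: "code 0x00000000bf000000 -- SError"
fields = decode_serror_esr(0x00000000BF000000)

print(f"EC        = {fields['ec']:#04x} (SError: {fields['ec'] == EC_SERROR})")
print(f"IL        = {fields['il']}")
print(f"IDS       = {fields['ids']}")
print(f"ISS[23:0] = {fields['iss_low']:#08x}")
```

For the value in the log this yields EC 0x2f, IL 1, IDS 1 and ISS[23:0] equal to zero, which matches the reading in the post: the syndrome format is implementation defined and the implementation-defined bits carry no additional information here.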