Other Parts Discussed in Thread: AM67A
Hi,
I wanted to share an issue we've been facing with our Fusion board setup that involves four D3RCM-IMX390-953 cameras.
Restarting the GStreamer pipeline causes a hard lockup of the board, accompanied by kernel errors. This appears to happen only when multiple cameras are used at the same time. The pipeline we run is:
gst-launch-1.0 -v v4l2src device=/dev/video-imx390-cam0 ! video/x-bayer, width=1936, height=1100, framerate=30/1, format=rggb12 ! queue leaky=2 ! tiovxisp target=1 sink_0::device=/dev/v4l-imx390-subdev0 sensor-name=SENSOR_SONY_IMX390_UB953_D3 dcc-isp-file=/opt/imaging/imx390/linear/dcc_viss.bin sink_0::dcc-2a-file=/opt/imaging/imx390/linear/dcc_2a.bin format-msb=11 ! tiovxldc target=1 dcc-file=/opt/imaging/imx390/linear/dcc_ldc.bin sensor-name=SENSOR_SONY_IMX390_UB953_D3 sink_0::pool-size=8 c::pool-size=8 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! v4l2h265enc ! fakesink \
v4l2src device=/dev/video-imx390-cam1 ! video/x-bayer, width=1936, height=1100, framerate=30/1, format=rggb12 ! queue leaky=2 ! tiovxisp target=1 sink_0::device=/dev/v4l-imx390-subdev1 sensor-name=SENSOR_SONY_IMX390_UB953_D3 dcc-isp-file=/opt/imaging/imx390/linear/dcc_viss.bin sink_0::dcc-2a-file=/opt/imaging/imx390/linear/dcc_2a.bin format-msb=11 ! tiovxldc target=1 dcc-file=/opt/imaging/imx390/linear/dcc_ldc.bin sensor-name=SENSOR_SONY_IMX390_UB953_D3 sink_0::pool-size=8 c::pool-size=8 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! v4l2h265enc ! fakesink \
v4l2src device=/dev/video-imx390-cam2 ! video/x-bayer, width=1936, height=1100, framerate=30/1, format=rggb12 ! queue leaky=2 ! tiovxisp target=1 sink_0::device=/dev/v4l-imx390-subdev2 sensor-name=SENSOR_SONY_IMX390_UB953_D3 dcc-isp-file=/opt/imaging/imx390/linear/dcc_viss.bin sink_0::dcc-2a-file=/opt/imaging/imx390/linear/dcc_2a.bin format-msb=11 ! tiovxldc target=1 dcc-file=/opt/imaging/imx390/linear/dcc_ldc.bin sensor-name=SENSOR_SONY_IMX390_UB953_D3 sink_0::pool-size=8 c::pool-size=8 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! v4l2h265enc ! fakesink \
v4l2src device=/dev/video-imx390-cam3 ! video/x-bayer, width=1936, height=1100, framerate=30/1, format=rggb12 ! queue leaky=2 ! tiovxisp target=1 sink_0::device=/dev/v4l-imx390-subdev3 sensor-name=SENSOR_SONY_IMX390_UB953_D3 dcc-isp-file=/opt/imaging/imx390/linear/dcc_viss.bin sink_0::dcc-2a-file=/opt/imaging/imx390/linear/dcc_2a.bin format-msb=11 ! tiovxldc target=1 dcc-file=/opt/imaging/imx390/linear/dcc_ldc.bin sensor-name=SENSOR_SONY_IMX390_UB953_D3 sink_0::pool-size=8 c::pool-size=8 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! v4l2h265enc ! fakesink
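A start/stop cycle like the restart described above can be scripted roughly as follows. This is only an illustrative sketch, not the exact procedure used here: `PIPELINE_CMD`, `RUN_SECONDS`, and `CYCLES` are hypothetical names, the run time and cycle count are arbitrary, and it assumes the coreutils `timeout` utility is available on the target and that stopping the pipeline with SIGINT (as Ctrl-C would) is representative.

```shell
#!/bin/sh
# Sketch: run a pipeline command repeatedly, stopping it with SIGINT each time.
# All three variables are illustrative assumptions, not values from our setup.
PIPELINE_CMD=${PIPELINE_CMD:-true}   # placeholder no-op; substitute the gst-launch-1.0 command
RUN_SECONDS=${RUN_SECONDS:-20}       # how long each run lasts before it is interrupted
CYCLES=${CYCLES:-3}                  # number of start/stop cycles

run_cycles() {
    i=1
    while [ "$i" -le "$CYCLES" ]; do
        echo "cycle $i: starting pipeline"
        # timeout sends SIGINT to the command after RUN_SECONDS
        timeout -s INT "$RUN_SECONDS" sh -c "$PIPELINE_CMD"
        echo "cycle $i: pipeline exited with status $?"
        i=$((i + 1))
    done
}

run_cycles
```

If the board locks up, the last "starting pipeline" line printed shows which cycle triggered it.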
The first run works fine, but on a subsequent run (usually the second) the board hangs. When this happens, the kernel log shows one of the following sets of errors:
[ 929.808516] omap_i2c 2050000.i2c: controller timed out
[ 929.808528] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 930.768499] ti-sci 44083000.system-controller: Mbox timedout in resp(caller: 0xffff8000087a1674)
[ 930.768506] ti-sci 44083000.system-controller: Mbox send fail -110
[ 931.343487] omap_i2c 2050000.i2c: controller timed out
[ 931.343497] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 932.878460] omap_i2c 2050000.i2c: controller timed out
[ 932.878469] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 934.418436] omap_i2c 2050000.i2c: controller timed out
[ 934.418444] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 935.953408] omap_i2c 2050000.i2c: controller timed out
[ 935.953417] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 937.488383] omap_i2c 2050000.i2c: controller timed out
[ 937.488390] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 938.958359] omap_i2c 2040000.i2c: controller timed out
[ 939.023358] omap_i2c 2050000.i2c: controller timed out
[ 939.023365] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 939.983343] omap_i2c 2040000.i2c: controller timed out
[ 940.558332] omap_i2c 2050000.i2c: controller timed out
[ 940.558339] ds90ub960 3-003d: ub960_read: cannot read register 0x24 (-110)!
[ 941.008330] omap_i2c 2040000.i2c: controller timed out
Or:
[ 92.146908] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 92.146918] rcu: 4-...!: (0 ticks this GP) idle=3dbc/0/0x1 softirq=0/0 fqs=0 rcuc=21003 jiffies(starved)
[ 92.146925] rcu: 5-...!: (0 ticks this GP) idle=ce04/1/0x4000000000000000 softirq=0/0 fqs=0 rcuc=21004 jiffies(starved)
[ 92.146931] rcu: 7-...!: (11 GPs behind) idle=d4e8/0/0x0 softirq=0/0 fqs=0 rcuc=21103 jiffies(starved) (false positive?)
[ 92.146936] (detected by 1, t=21006 jiffies, g=21097, q=830 ncpus=8)
[ 92.146940] Task dump for CPU 4:
[ 92.146942] task:swapper/4 state:R running task stack:0 pid:0 ppid:1 flags:0x0000000a
[ 92.146949] Call trace:
[ 92.146951] 0xffff8000089fdfb8
[ 92.146954] 0xffff0008bc30fe20
[ 92.146955] Task dump for CPU 5:
[ 92.146957] task:v4l2src3:src state:R running task stack:0 pid:3750 ppid:2366 flags:0x00000286
[ 92.146962] Call trace:
[ 92.146963] 0xffff8000089fdfb8
[ 92.146965] 0xffff0008dc4234b0
[ 92.146966] Task dump for CPU 7:
[ 92.146967] task:swapper/7 state:R running task stack:0 pid:0 ppid:1 flags:0x0000000a
[ 92.146972] Call trace:
[ 92.146973] 0xffff8000089fdfb8
[ 92.146974] 0xffff0008bc31be20
[ 92.146977] rcu: rcu_preempt kthread starved for 21006 jiffies! g21097 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=4
[ 92.146981] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 92.146982] rcu: RCU grace-period kthread stack dump:
[ 92.146984] task:rcu_preempt state:R running task stack:0 pid:16 ppid:2 flags:0x00000008
[ 92.146988] Call trace:
[ 92.146989] 0xffff8000089fdfb8
[ 92.146991] 0xffff8000089fe320
[ 92.146992] 0xffff8000089fe618
[ 92.146993] 0xffff800008a05f8c
[ 92.146994] 0xffff8000080a0928
[ 92.146996] 0xffff8000080a3f48
[ 92.146997] 0xffff80000805ce54
[ 92.146998] 0xffff800008014cb0
[ 92.147000] rcu: Stack dump where RCU GP kthread last ran:
[ 92.147001] Task dump for CPU 4:
[ 92.147002] task:swapper/4 state:R running task stack:0 pid:0 ppid:1 flags:0x0000000a
[ 92.147005] Call trace:
[ 92.147006] 0xffff8000089fdfb8
[ 92.147007] 0xffff0008bc30fe20
[ 122.341916] mmc1: Timeout waiting for hardware interrupt.
[ 122.341920] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 122.341923] mmc1: sdhci: Sys addr: 0x00000018 | Version: 0x00001004
[ 122.341927] mmc1: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000000
[ 122.341930] mmc1: sdhci: Argument: 0x008854e8 | Trn mode: 0x0000002b
[ 122.341932] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x0000001f
[ 122.341935] mmc1: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 122.341938] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
[ 122.341941] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000003
[ 122.341943] mmc1: sdhci: Int enab: 0x03ff008b | Sig enab: 0x03ff008b
[ 122.341946] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 122.341948] mmc1: sdhci: Caps: 0x3de8c801 | Caps_1: 0x18002407
[ 122.341951] mmc1: sdhci: Cmd: 0x0000193a | Max curr: 0x00000000
[ 122.341954] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x01dbd37f
[ 122.341956] mmc1: sdhci: Resp[2]: 0x325b5900 | Resp[3]: 0x00000900
[ 122.341959] mmc1: sdhci: Host ctl2: 0x0000000b
[ 122.341961] mmc1: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x000000093f28b224
[ 122.341963] mmc1: sdhci: ============================================
[ 132.581914] mmc1: Timeout waiting for hardware interrupt.
[ 132.581919] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 132.581922] mmc1: sdhci: Sys addr: 0x00000018 | Version: 0x00001004
[ 132.581925] mmc1: sdhci: Blk size: 0x00007200 | Blk cnt: 0x00000000
[ 132.581928] mmc1: sdhci: Argument: 0x00000000 | Trn mode: 0x00000023
[ 132.581931] mmc1: sdhci: Present: 0x01f70000 | Host ctl: 0x0000001f
[ 132.581934] mmc1: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 132.581937] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
[ 132.581939] mmc1: sdhci: Timeout: 0x0000000e | Int stat: 0x00018002
[ 132.581942] mmc1: sdhci: Int enab: 0x03ff008b | Sig enab: 0x03ff008b
[ 132.581945] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 132.581948] mmc1: sdhci: Caps: 0x3de8c801 | Caps_1: 0x18002407
[ 132.581950] mmc1: sdhci: Cmd: 0x00000c1b | Max curr: 0x00000000
[ 132.581953] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x01dbd37f
[ 132.581956] mmc1: sdhci: Resp[2]: 0x325b5900 | Resp[3]: 0x00000900
[ 132.581958] mmc1: sdhci: Host ctl2: 0x0000000b
Do you know what could cause this? Or have any suggestions on how to resolve it?