AM62A7: SDK 09.00.01.08: EDGEAI Multichannel demo crashed with a kernel paging request

Part Number: AM62A7
Other Parts Discussed in Thread: TFP410, TLV320AIC26, SK-AM62A-LP

I left the Multichannel demo running overnight and found that it had crashed by morning.

root@mitysom-am62ax:/opt/edgeai-gst-apps# [21026.592781] Unable to handle kernel paging request at virtual address dead000000000100
[21026.601648] Mem abort info:
[21026.604795]   ESR = 0x0000000096000004
[21026.611177]   EC = 0x25: DABT (current EL), IL = 32 bits
[21026.617857]   SET = 0, FnV = 0
[21026.622117]   EA = 0, S1PTW = 0
[21026.625357]   FSC = 0x04: level 0 translation fault
[21026.630343] Data abort info:
[21026.633335]   ISV = 0, ISS = 0x00000004
[21026.637235]   CM = 0, WnR = 0
[21026.640245] [dead000000000100] address between user and kernel address ranges
[21026.647464] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[21026.653751] Modules linked in: overlay cfg80211 bluetooth ecdh_generic ecc rfkill xhci_plat_hcd rpmsg_ctrl rpmsg_char dwc3 cdns_csi2rx v4l2_fwnode snd_soc_tlv320aic26 crct10dif_ce snd_soc_simple_card snd_soc_simple_card_utils e5010_jpeg_enc dwc3_am62 ti_k3_r5_remoteproc wave5 j721e_csi2rx ti_k3_dsp_remoteproc videobuf2_dma_contig virtio_rpmsg_bus v4l2_mem2mem videobuf2_memops rpmsg_ns videobuf2_v4l2 videobuf2_common ti_k3_common v4l2_async sa2ul videodev tidss mc drm_dma_helper cdns_dphy_rx ti_tfp410 display_connector drm_kms_helper ltc2945 cfbfillrect syscopyarea cfbimgblt at24 pwm_tiehrpwm spi_omap2_mcspi sysfillrect sysimgblt optee_rng fb_sys_fops cfbcopyarea pwm_omap_dmtimer rng_core cryptodev(O) fuse drm drm_panel_orientation_quirks ipv6
[21026.719742] CPU: 2 PID: 1234 Comm: multifilesrc9:s Tainted: G           O       6.1.46-g74c66da4a2 #1
[21026.728952] Hardware name: Critical Link MitySOM-AM62A (DT)
[21026.734513] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[21026.741464] pc : wave5_handle_bitstream_buffer+0xc8/0x1ec [wave5]
[21026.747579] lr : wave5_handle_bitstream_buffer+0x100/0x1ec [wave5]
[21026.753758] sp : ffff800014b339f0
[21026.757060] x29: ffff800014b339f0 x28: ffff00001074a010 x27: ffff000005d44520
[21026.764189] x26: 0000000000000000 x25: 00000000c058560f x24: ffff00001074bcb8
[21026.771313] x23: 0000000000000326 x22: deacfffffffffd58 x21: ffff800015245000
[21026.778438] x20: ffff00001074a000 x19: ffff0000041d5c00 x18: 0000000000000000
[21026.785564] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff042bf9a8
[21026.792688] x14: 0000000100067abf x13: 0000000000000000 x12: 0000000000000000
[21026.799813] x11: 0000000000000000 x10: 0000000000067ac3 x9 : ffff80000105d138
[21026.806938] x8 : ffff00000a3fc658 x7 : 0000000000000030 x6 : 00000000fffffffd
[21026.814062] x5 : ffff0000050fa4f0 x4 : 0000000000000000 x3 : 0000000000000000
[21026.821186] x2 : dead000000000100 x1 : 0000000000000001 x0 : ffff0000050fa000
[21026.828313] Call trace:
[21026.830751]  wave5_handle_bitstream_buffer+0xc8/0x1ec [wave5]
[21026.836499]  wave5_vpu_dec_buf_queue+0x9c/0x424 [wave5]
[21026.841725]  __enqueue_in_driver+0x54/0xf0 [videobuf2_common]
[21026.847486]  vb2_core_qbuf+0x414/0x634 [videobuf2_common]
[21026.852891]  vb2_qbuf+0x9c/0xf0 [videobuf2_v4l2]
[21026.857521]  v4l2_m2m_qbuf+0x7c/0x240 [v4l2_mem2mem]
[21026.862502]  v4l2_m2m_ioctl_qbuf+0x20/0x320 [v4l2_mem2mem]
[21026.867992]  v4l_qbuf+0x50/0x6c [videodev]
[21026.872179]  __video_do_ioctl+0x190/0x3e0 [videodev]
[21026.877195]  video_usercopy+0x21c/0x7e0 [videodev]
[21026.882038]  video_ioctl2+0x20/0x3c [videodev]
[21026.886534]  v4l2_ioctl+0x48/0x6c [videodev]
[21026.890856]  __arm64_sys_ioctl+0xb0/0xf4
[21026.894777]  invoke_syscall+0x50/0x120
[21026.898522]  el0_svc_common.constprop.0+0xdc/0x100
[21026.903306]  do_el0_svc+0x38/0xe0
[21026.906615]  el0_svc+0x2c/0x84
[21026.909666]  el0t_64_sync_handler+0xbc/0x140
[21026.913928]  el0t_64_sync+0x18c/0x190
[21026.917591] Code: 52800020 390ee260 f9404e80 910ea2c2 (f941d6c1) 
[21026.923671] ---[ end trace 0000000000000000 ]---
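
For reference, the `ESR` value in the log above can be split into the individual fields the kernel prints beneath it. The following is a minimal sketch, assuming the Armv8-A `ESR_EL1` layout for data aborts; the helper name `decode_esr` is illustrative and not part of any SDK.

```python
def decode_esr(esr: int) -> dict:
    """Split an arm64 ESR_EL1 value into the fields printed for a data abort."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,
        "FnV": (iss >> 10) & 0x1,
        "EA": (iss >> 9) & 0x1,
        "CM": (iss >> 8) & 0x1,
        "S1PTW": (iss >> 7) & 0x1,
        "WnR": (iss >> 6) & 0x1,       # 0 = read, 1 = write
        "DFSC": iss & 0x3F,            # 0x04 = translation fault, level 0
    }


fields = decode_esr(0x96000004)
print({name: hex(value) for name, value in fields.items()})
```

With the value from this log, `EC` decodes to 0x25 (data abort at the current exception level), `DFSC` to 0x04 and `WnR` to 0, which matches the printed lines and indicates that the faulting access was a read.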

  • Hi John,

    Based on the 'wave5', 'vpu', and 'dec' symbols in the call trace, this error most likely came from the video decoder (VPU) driver. The faulting access used an odd memory address (dead000000000100), which appears to be what triggered the error.

    How many streams were you running and at approximately what pixel rate? Mainly curious about the max MP/s going through that IP. Any details about your configuration would be helpful, including the resolution and compression type (H264 or H265) being used here.

    If you have a setup that allows it, I would like to know if this error can be reproduced on a single stream (this might require a longer runtime).

    I will also find the right expert for this question, as it's deeper into the VPU stack than I'm familiar with.

    Best regards,
    Reese

  • Thanks Reese, I'm not sure of the answers to these questions.  I just booted the default edgeai sdcard image and clicked the multichannel demo.

  • Hi Jonathan,

    I understand -- I wrongly assumed you were launching this from the command line. I'll look for the configuration used under the hood for that option.

    -Reese

  • I tested the multichannel demo on the EVM as well, and it did eventually crash, though it took over a week. It seems to take only a few hours on our board.

    Apr 28 17:42:27 am62axx-evm kernel: Booting Linux on physical CPU 0x0000000000 [0x410fd034]
    Apr 28 17:42:27 am62axx-evm kernel: Linux version 6.1.46-gf8110d9ce8 (oe-user@oe-host) (aarch64-oe-l>
    Apr 28 17:42:27 am62axx-evm kernel: Machine model: Texas Instruments AM62A7 SK
    ...
    
    Dec 08 00:00:31 am62axx-evm systemd[1]: Rotate log files was skipped because of a failed condition c>
    Dec 08 16:02:40 am62axx-evm kernel: hrtimer: interrupt took 1109044 ns
    Dec 08 17:12:31 am62axx-evm kernel: Unable to handle kernel paging request at virtual address dead00>
    Dec 08 17:12:31 am62axx-evm kernel: Mem abort info:
    Dec 08 17:12:31 am62axx-evm kernel:   ESR = 0x0000000096000004
    Dec 08 17:12:31 am62axx-evm kernel:   EC = 0x25: DABT (current EL), IL = 32 bits
    Dec 08 17:12:31 am62axx-evm kernel:   SET = 0, FnV = 0
    Dec 08 17:12:31 am62axx-evm kernel:   EA = 0, S1PTW = 0
    Dec 08 17:12:31 am62axx-evm kernel:   FSC = 0x04: level 0 translation fault
    Dec 08 17:12:31 am62axx-evm kernel: Data abort info:
    Dec 08 17:12:31 am62axx-evm kernel:   ISV = 0, ISS = 0x00000004
    Dec 08 17:12:31 am62axx-evm kernel:   CM = 0, WnR = 0
    Dec 08 17:12:31 am62axx-evm kernel: [dead000000000100] address between user and kernel address ranges
    Dec 08 17:12:31 am62axx-evm kernel: Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    Dec 08 17:12:32 am62axx-evm kernel: Modules linked in: overlay cfg80211 xhci_plat_hcd rpmsg_ctrl snd>
    Dec 08 17:12:32 am62axx-evm kernel: CPU: 1 PID: 948 Comm: multifilesrc1:s Tainted: G           O    >
    Dec 08 17:12:32 am62axx-evm kernel: Hardware name: Texas Instruments AM62A7 SK (DT)
    Dec 08 17:12:32 am62axx-evm kernel: pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    Dec 08 17:12:32 am62axx-evm kernel: pc : wave5_handle_bitstream_buffer+0xc0/0x1e0 [wave5]
    Dec 08 17:12:32 am62axx-evm kernel: lr : wave5_handle_bitstream_buffer+0xf8/0x1e0 [wave5]
    Dec 08 17:12:32 am62axx-evm kernel: sp : ffff80000add3a00
    Dec 08 17:12:32 am62axx-evm kernel: x29: ffff80000add3a00 x28: ffff000809734010 x27: ffff00080180d120
    Dec 08 17:12:32 am62axx-evm kernel: x26: 0000000000000000 x25: 00000000c058560f x24: ffff000809735cb8
    Dec 08 17:12:32 am62axx-evm kernel: x23: 000000000000312e x22: deacfffffffffd58 x21: ffff80000f9b9000
    Dec 08 17:12:32 am62axx-evm kernel: x20: ffff000809734000 x19: ffff000803adc400 x18: 0000000000000000
    Dec 08 17:12:32 am62axx-evm kernel: x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff442c9b88
    Dec 08 17:12:32 am62axx-evm kernel: x14: 00000001007c185e x13: 0000000000000000 x12: 0000000000000000
    Dec 08 17:12:32 am62axx-evm kernel: x11: 0000000000000000 x10: 00000000007c1862 x9 : 0000000000000001
    Dec 08 17:12:32 am62axx-evm kernel: x8 : ffff000802374658 x7 : 0000000000000030 x6 : 00000000fffffffd
    Dec 08 17:12:32 am62axx-evm kernel: x5 : ffff000809745cf0 x4 : 0000000000000000 x3 : 0000000000000000
    Dec 08 17:12:32 am62axx-evm kernel: x2 : dead000000000100 x1 : 0000000000000001 x0 : ffff000809745800
    Dec 08 17:12:32 am62axx-evm kernel: Call trace:
    Dec 08 17:12:32 am62axx-evm kernel:  wave5_handle_bitstream_buffer+0xc0/0x1e0 [wave5]
    Dec 08 17:12:32 am62axx-evm kernel:  wave5_vpu_dec_buf_queue+0x94/0x420 [wave5]
    Dec 08 17:12:32 am62axx-evm kernel:  __enqueue_in_driver+0x3c/0x7c [videobuf2_common]
    Dec 08 17:12:32 am62axx-evm kernel:  vb2_core_qbuf+0x45c/0x5ac [videobuf2_common]
    Dec 08 17:12:32 am62axx-evm kernel:  vb2_qbuf+0x94/0xf0 [videobuf2_v4l2]
    Dec 08 17:12:32 am62axx-evm kernel:  v4l2_m2m_qbuf+0x74/0x23c [v4l2_mem2mem]
    Dec 08 17:12:32 am62axx-evm kernel:  v4l2_m2m_ioctl_qbuf+0x18/0x4f0 [v4l2_mem2mem]
    Dec 08 17:12:32 am62axx-evm kernel:  v4l_qbuf+0x48/0x60 [videodev]
    Dec 08 17:12:32 am62axx-evm kernel:  __video_do_ioctl+0x184/0x3d0 [videodev]
    Dec 08 17:12:32 am62axx-evm kernel:  video_usercopy+0x214/0x6c4 [videodev]
    Dec 08 17:12:32 am62axx-evm kernel:  video_ioctl2+0x18/0x2c [videodev]
    Dec 08 17:12:32 am62axx-evm kernel:  v4l2_ioctl+0x40/0x60 [videodev]
    Dec 08 17:12:32 am62axx-evm kernel:  __arm64_sys_ioctl+0xa8/0xf0
    Dec 08 17:12:32 am62axx-evm kernel:  invoke_syscall+0x48/0x114
    Dec 08 17:12:32 am62axx-evm kernel:  el0_svc_common.constprop.0+0xd4/0xfc
    Dec 08 17:12:32 am62axx-evm kernel:  do_el0_svc+0x30/0xd0
    Dec 08 17:12:32 am62axx-evm kernel:  el0_svc+0x2c/0x84
    Dec 08 17:12:32 am62axx-evm kernel:  el0t_64_sync_handler+0xbc/0x140
    Dec 08 17:12:32 am62axx-evm kernel:  el0t_64_sync+0x18c/0x190
    Dec 08 17:12:32 am62axx-evm kernel: Code: 52800020 390ee260 f9404e80 910ea2c2 (f941d6c1) 
    Dec 08 17:12:32 am62axx-evm kernel: ---[ end trace 0000000000000000 ]---
    

  • Hi Jonathan,

    Thanks for running the stress test on both platforms; that is helpful. I suspect there is a latent bug in the driver, especially since we're seeing it on both your board and our EVM. If you try to rerun the application after this failure, does it run or fail immediately?

    I have a suspicion that the driver faulted and produced a bogus address starting with '0xdead' to be used as a base pointer. I'll track down more info about this driver.

    -Reese
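
One possible reading of the '0xdead' address, offered as an inference rather than something confirmed in this thread: on arm64 the kernel's list poison value `LIST_POISON1` is `0xdead000000000000 + 0x100`, and `list_del()` writes it into the `next` pointer of a removed entry. The sketch below checks the faulting address against that constant and decodes the faulting instruction from the `Code:` line to show how the address was formed from `x22`. The `0xdead000000000000` base is the arm64 default for `CONFIG_ILLEGAL_POINTER_VALUE` and is assumed here; the function name is illustrative.

```python
MASK64 = (1 << 64) - 1

# Assumption: arm64 default CONFIG_ILLEGAL_POINTER_VALUE (see include/linux/poison.h).
POISON_POINTER_DELTA = 0xDEAD_0000_0000_0000
LIST_POISON1 = POISON_POINTER_DELTA + 0x100   # stored in entry->next by list_del()


def decode_ldr_x_uimm(insn: int):
    """Decode A64 'LDR Xt, [Xn, #imm]' (64-bit, unsigned offset) into (rt, rn, imm)."""
    if (insn >> 22) != 0x3E5:
        raise ValueError("not a 64-bit LDR with unsigned immediate offset")
    return insn & 0x1F, (insn >> 5) & 0x1F, ((insn >> 10) & 0xFFF) * 8


fault_addr = 0xDEAD_0000_0000_0100   # from the "Unable to handle" line
x22 = 0xDEAC_FFFF_FFFF_FD58          # from the register dump
rt, rn, imm = decode_ldr_x_uimm(0xF941D6C1)   # word in parentheses on the Code: line

print(f"ldr x{rt}, [x{rn}, #{imm:#x}]")
print("fault address equals LIST_POISON1:", fault_addr == LIST_POISON1)
print("x22 + imm equals fault address:", (x22 + imm) & MASK64 == fault_addr)
```

If this reading is right, the decoder driver would be following a list entry that had already been removed from its list, which would point toward a race or lifetime problem rather than a random pointer. That remains a hypothesis until it is checked against the wave5 source.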

  • Just adding in here, the source for the demo application is located on git.ti. Here is the file for AM62A pipelines: https://git.ti.com/cgit/apps/edgeai-gui-app/tree/gst_pipelines/am62a_pipelines.h

    I will run this GStreamer command standalone on my EVM to see if I can reproduce the error. One possibility is a small memory leak, which would explain why both boards fail but yours fails first (based on my recollection that your board has 2 GB of DDR).

    -Reese

  • I ran the object detection demo overnight on the Starter Kit and it hit the same error. If I try to click the red button to select a different demo, the mouse pointer gets stuck. However, the console remains responsive.

    root@am62axx-evm:/opt/edgeai-gst-apps# [26596.316407] Unable to handle kernel paging request at virtual address dead000000000100
    [26596.324485] Mem abort info:
    [26596.327316]   ESR = 0x0000000096000004
    [26596.331327]   EC = 0x25: DABT (current EL), IL = 32 bits
    [26596.336988]   SET = 0, FnV = 0
    [26596.340257]   EA = 0, S1PTW = 0
    [26596.343596]   FSC = 0x04: level 0 translation fault
    [26596.348579] Data abort info:
    [26596.351493]   ISV = 0, ISS = 0x00000004
    [26596.355390]   CM = 0, WnR = 0
    [26596.358428] [dead000000000100] address between user and kernel address ranges
    [26596.365646] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [26596.371939] Modules linked in: overlay cfg80211 xhci_plat_hcd snd_soc_hdmi_codec rpmsg_ctrl cdns_csi2rx bluetooth dwc3 rpmsg_char v4l2_fwnode ecdh_generic ecc rfkill snd_soc_simple_card crct10dif_ce snd_soc_simple_card_utils e5010_jpeg_enc display_connector dwc3_am62 snd_soc_tlv320aic3x_i2c tps6598x typec snd_soc_tlv320aic3x ti_k3_r5_remoteproc sii902x wave5 j721e_csi2rx tidss videobuf2_dma_contig drm_dma_helper v4l2_mem2mem videobuf2_memops ti_k3_dsp_remoteproc drm_kms_helper cfbfillrect virtio_rpmsg_bus syscopyarea cfbimgblt sysfillrect videobuf2_v4l2 rpmsg_ns sysimgblt v4l2_async ti_k3_common videobuf2_common fb_sys_fops sa2ul snd_soc_davinci_mcasp videodev optee_rng snd_soc_ti_udma cfbcopyarea mc snd_soc_ti_edma snd_soc_ti_sdma cdns_dphy_rx rng_core cryptodev(O) fuse drm drm_panel_orientation_quirks ipv6
    [26596.443906] CPU: 1 PID: 934 Comm: multifilesrc0:s Tainted: G           O       6.1.46-gf8110d9ce8 #1
    [26596.453053] Hardware name: Texas Instruments AM62A7 SK (DT)
    [26596.458627] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [26596.465598] pc : wave5_handle_bitstream_buffer+0xc0/0x1e0 [wave5]
    [26596.471755] lr : wave5_handle_bitstream_buffer+0xf8/0x1e0 [wave5]
    [26596.477892] sp : ffff80000acdba00
    [26596.481208] x29: ffff80000acdba00 x28: ffff00080a49a010 x27: ffff0008041f4920
    [26596.488371] x26: 0000000000000000 x25: 00000000c058560f x24: ffff00080a49bcb8
    [26596.495526] x23: 00000000000001aa x22: deacfffffffffd58 x21: ffff80000b5ae000
    [26596.502681] x20: ffff00080a49a000 x19: ffff000808f3f800 x18: 0000000000000000
    [26596.509836] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff6c2c86e8
    [26596.516989] x14: 00000001000c198a x13: 0000000000000000 x12: 0000000000000000
    [26596.524142] x11: 0000000000000000 x10: 00000000000c198e x9 : 0000000000000001
    [26596.531294] x8 : ffff000808f3c658 x7 : 0000000000000030 x6 : 00000000fffffffd
    [26596.538447] x5 : ffff00080a4bacf0 x4 : 0000000000000000 x3 : 0000000000000000
    [26596.545600] x2 : dead000000000100 x1 : 0000000000000001 x0 : ffff00080a4ba800
    [26596.552754] Call trace:
    [26596.555203]  wave5_handle_bitstream_buffer+0xc0/0x1e0 [wave5]
    [26596.561004]  wave5_vpu_dec_buf_queue+0x94/0x420 [wave5]
    [26596.566275]  __enqueue_in_driver+0x3c/0x7c [videobuf2_common]
    [26596.572074]  vb2_core_qbuf+0x45c/0x5ac [videobuf2_common]
    [26596.577522]  vb2_qbuf+0x94/0xf0 [videobuf2_v4l2]
    [26596.582193]  v4l2_m2m_qbuf+0x74/0x23c [v4l2_mem2mem]
    [26596.587225]  v4l2_m2m_ioctl_qbuf+0x18/0x4f0 [v4l2_mem2mem]
    [26596.592768]  v4l_qbuf+0x48/0x60 [videodev]
    [26596.597065]  __video_do_ioctl+0x184/0x3d0 [videodev]
    [26596.602208]  video_usercopy+0x214/0x6c4 [videodev]
    [26596.607179]  video_ioctl2+0x18/0x2c [videodev]
    [26596.611802]  v4l2_ioctl+0x40/0x60 [videodev]
    [26596.616252]  __arm64_sys_ioctl+0xa8/0xf0
    [26596.620190]  invoke_syscall+0x48/0x114
    [26596.623958]  el0_svc_common.constprop.0+0xd4/0xfc
    [26596.628677]  do_el0_svc+0x30/0xd0
    [26596.632005]  el0_svc+0x2c/0x84
    [26596.635071]  el0t_64_sync_handler+0xbc/0x140
    [26596.639350]  el0t_64_sync+0x18c/0x190
    [26596.643033] Code: 52800020 390ee260 f9404e80 910ea2c2 (f941d6c1) 
    [26596.649129] ---[ end trace 0000000000000000 ]---
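
Since time to failure varies a lot between runs in this thread, it can help to pull the uptime, faulting address and `pc` symbol out of each captured log automatically. The following is a minimal sketch that assumes dmesg-style lines with `[seconds.micros]` timestamps, as in the log above; it does not handle the `journalctl` format shown earlier, and the function name is illustrative.

```python
import re

FAULT_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+Unable to handle kernel paging request "
    r"at virtual address (?P<addr>[0-9a-f]+)"
)
PC_RE = re.compile(r"pc : (?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text: str):
    """Return uptime (hours), faulting address and pc symbol of the first oops, or None."""
    fault = FAULT_RE.search(log_text)
    if fault is None:
        return None
    pc = PC_RE.search(log_text, fault.end())
    return {
        "uptime_hours": float(fault["uptime"]) / 3600.0,
        "address": int(fault["addr"], 16),
        "pc_symbol": pc["symbol"] if pc else None,
    }


sample = """\
[26596.316407] Unable to handle kernel paging request at virtual address dead000000000100
[26596.465598] pc : wave5_handle_bitstream_buffer+0xc0/0x1e0 [wave5]
"""
print(summarize_oops(sample))
```

Running the same function over logs from several boards gives a quick table of how long each run survived and whether every failure lands in the same function.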
    

  • Hmm, so reproduced on our board as well, and in shorter time. I ran a similar gstreamer pipeline for several hours yesterday and didn't see any creeping RAM usage that suggests a memory leak. If we were running out of memory, then I would've expected Linux's OOM killer to run/print (and probably kill that entire application). I'm not surprised that the out-of-box demo was unresponsive after a failure.

    Since I was trying to replicate this in standalone GStreamer, I'll now try doing so from the OOB demo instead and leave it running as long as possible. If you are okay with it, I would also ask that you try running the following pipeline from the command line. I've put it into a shell script, but I pulled it directly from the source code used in the OOB demo.

    gst-oob-objdet.sh

    If it fails, I'm curious to know if it will rerun without a reboot. I'm trying to understand the implications of the failure, and how closely related it is to the state of the wave5 VPU/codec.

    -Reese
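
The memory-leak check described above can be made repeatable by logging `MemAvailable` from `/proc/meminfo` at a fixed interval while the pipeline runs. This is a minimal sketch, not the method used in the thread; the function names, the sampling interval and the "steadily shrinking" criterion are all assumptions chosen for illustration.

```python
import time


def mem_available_kb(meminfo_text: str) -> int:
    """Extract the MemAvailable value (in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def sample_mem_available(samples: int, interval_s: float, path: str = "/proc/meminfo"):
    """Read MemAvailable `samples` times, sleeping `interval_s` seconds between reads."""
    readings = []
    for i in range(samples):
        with open(path) as f:
            readings.append(mem_available_kb(f.read()))
        if i + 1 < samples:
            time.sleep(interval_s)
    return readings


def shrinking_steadily(readings, min_drop_kb: int = 0) -> bool:
    """True when every reading is lower than the previous one by more than min_drop_kb."""
    return len(readings) > 1 and all(
        later < earlier - min_drop_kb for earlier, later in zip(readings, readings[1:])
    )
```

On the target, something like `shrinking_steadily(sample_mem_available(60, 60.0), min_drop_kb=1024)` would flag an hour in which available memory fell by more than 1 MB every minute. A flat or noisy series is consistent with the observation above that no creeping RAM usage was seen.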

  • I tried running the pipeline while the UI was still crashed. gst-launch-1.0 printed nothing at all, but I was able to kill it with Ctrl-C.

    root@am62axx-evm:/opt/edgeai-gst-apps# bash -x ~/gst-oob-objdet.sh                                 
    + GST_DEBUG=2                                                                                        
    + gst-launch-1.0 multifilesrc location=/opt/oob-demo-assets/oob-gui-video1.h264 loop=true '!' h264parse '!' v4l2h264dec capture-io-mode=5 '!' tiovxmemalloc '!' video/x-raw,format=NV12 '!' tiovxmultiscaler name=split_01 src_0::roi-startx=0 src_0::roi-starty=0 src_0::roi-width=1280 src_0::roi-height=768 target=0 split_01. '!' queue '!' video/x-raw, width=320, height=320 '!' tiovxdlpreproc model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 out-pool-size=4 '!' application/x-tensor-tiovx '!' tidlinferer target=1 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 '!' post_0.tensor split_01. '!' queue '!' video/x-raw, width=1280, height=720 '!' post_0.sink tidlpostproc name=post_0 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 alpha=0.400000 viz-threshold=0.600000 top-N=5 display-model=true '!' queue '!' mosaic_0. tiovxmosaic name=mosaic_0 target=1 src::pool-size=4 'sink_0::startx=<320>' 'sink_0::starty=<150>' 'sink_0::widths=<1280>' 'sink_0::heights=<720>' '!' video/x-raw,format=NV12, width=1920, height=1080 '!' queue '!' tiperfoverlay main-title=null '!' queue max-size-buffers=1 '!' kmssink driver-name=tidss sync=false force-modesetting=true            
    ^C                                                                                                   
    root@am62axx-evm:/opt/edgeai-gst-apps#

    After rebooting, it launched without issue:

    root@am62axx-evm:/opt/edgeai-gst-apps# bash -x ~/gst-oob-objdet.sh
    + GST_DEBUG=2
    + gst-launch-1.0 multifilesrc location=/opt/oob-demo-assets/oob-gui-video1.h264 loop=true '!' h264parse '!' v4l2h264dec capture-io-mode=5 '!' tiovxmemalloc '!' video/x-raw,format=NV12 '!' tiovxmultiscaler name=split_01 src_0::roi-startx=0 src_0::roi-starty=0 src_0::roi-width=1280 src_0::roi-height=768 target=0 split_01. '!' queue '!' video/x-raw, width=320, height=320 '!' tiovxdlpreproc model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 out-pool-size=4 '!' application/x-tensor-tiovx '!' tidlinferer target=1 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 '!' post_0.tensor split_01. '!' queue '!' video/x-raw, width=1280, height=720 '!' post_0.sink tidlpostproc name=post_0 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 alpha=0.400000 viz-threshold=0.600000 top-N=5 display-model=true '!' queue '!' mosaic_0. tiovxmosaic name=mosaic_0 target=1 src::pool-size=4 'sink_0::startx=<320>' 'sink_0::starty=<150>' 'sink_0::widths=<1280>' 'sink_0::heights=<720>' '!' video/x-raw,format=NV12, width=1920, height=1080 '!' queue '!' tiperfoverlay main-title=null '!' queue max-size-buffers=1 '!' kmssink driver-name=tidss sync=false force-modesetting=true
    0:00:00.249446060   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x120000: 'AVR (Audio Visual Research)' is not mapped
    0:00:00.249540360   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x180000: 'CAF (Apple Core Audio File)' is not mapped
    0:00:00.249570245   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x100000: 'HTK (HMM Tool Kit)' is not mapped
    0:00:00.249607035   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0xc0000: 'MAT4 (GNU Octave 2.0 / Matlab 4.2)' is not mapped
    0:00:00.249633470   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0xd0000: 'MAT5 (GNU Octave 2.1 / Matlab 5.0)' is not mapped
    0:00:00.249658710   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x210000: 'MPC (Akai MPC 2k)' is not mapped
    0:00:00.249687925   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0xe0000: 'PVF (Portable Voice Format)' is not mapped
    0:00:00.249717475   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x160000: 'SD2 (Sound Designer II)' is not mapped
    0:00:00.249768580   999      0x6cb7760 WARN                 default gstsfelement.c:97:gst_sf_create_audio_template_caps: format 0x190000: 'WVE (Psion Series 3)' is not mapped
    APP: Init ... !!!
    MEM: Init ... !!!
    MEM: Initialized DMA HEAP (fd=6) !!!
    MEM: Init ... Done !!!
    IPC: Init ... !!!
    IPC: Init ... Done !!!
    ...
    Pipeline is PREROLLED ...
    Setting pipeline to PLAYING ...
    Redistribute latency...
    0:00:03.671104723   998     0x1147c6a0 WARN            kmsallocator gstkmsallocator.c:553:gst_kms_allocator_dmabuf_import:<KMSMemory::allocator> Failed to close GEM handle: Invalid argument 22
    New clock: GstSystemClock
    0:00:03.688304620   998     0x1147c6a0 WARN            kmsallocator gstkmsallocator.c:553:gst_kms_allocator_dmabuf_import:<KMSMemory::allocator> Failed to close GEM handle: Invalid argument 22
    0:00:03.704537262   998     0x1147c6a0 WARN            kmsallocator gstkmsallocator.c:553:gst_kms_allocator_dmabuf_import:<KMSMemory::allocator> Failed to close GEM handle: Invalid argument 22
    0:00:09.8 / 99:99:99.

  • Hi Jonathan,

    This supports the idea that the video decoder IP is getting into an erroneous state and not recovering until reboot. Please allow this to continue running if your hardware/development setup allows. Any messages at WARNING level or higher will print in addition to the kernel's printk output. You can increase the GST_DEBUG value further to generate more logs, but they become very verbose at 4 (INFO) and above.

    We're working to reproduce on our side as well so we can root-cause and fix the issue. The owner of this driver is aware of the issue and looking into it as well. I appreciate your patience here.

    -Reese
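
For long runs it may be easier to send the GStreamer debug output to a file than to the console. The sketch below builds an environment for that, using the standard `GST_DEBUG`, `GST_DEBUG_FILE` and `GST_DEBUG_NO_COLOR` variables and GStreamer's numeric levels; the helper name and the log path in the usage note are illustrative.

```python
import os

# GStreamer debug levels accepted by the GST_DEBUG environment variable.
GST_LEVELS = {
    0: "NONE", 1: "ERROR", 2: "WARNING", 3: "FIXME", 4: "INFO",
    5: "DEBUG", 6: "LOG", 7: "TRACE", 9: "MEMDUMP",
}


def gst_debug_env(level: int, log_path: str, base_env=None) -> dict:
    """Return a copy of the environment with GStreamer debug logging sent to a file."""
    if level not in GST_LEVELS:
        raise ValueError(f"unknown GST_DEBUG level: {level}")
    env = dict(os.environ if base_env is None else base_env)
    env["GST_DEBUG"] = str(level)
    env["GST_DEBUG_FILE"] = log_path      # write debug output here instead of stderr
    env["GST_DEBUG_NO_COLOR"] = "1"       # keep the log file free of colour codes
    return env
```

The result can be passed as `env=` to `subprocess.run` when launching the pipeline, for example with `gst_debug_env(2, "/tmp/gst-debug.log")`. Note that the `bash -x` trace in this thread shows the script assigning `GST_DEBUG=2` itself, so a different level set this way only takes effect if that assignment is removed or the pipeline is launched directly.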

  • "If it fails, I'm curious to know if it will rerun without a reboot. I'm trying to understand the implications of the failure, and how closely related it is to the state of the wave5 VPU/codec."

    It does not allow me to rerun it without a reboot. It also doesn't close cleanly on Ctrl-C; I had to kill -9 it.

    43:38:00. / 99:99:99.
    ^Chandling interrupt.
    Interrupt: Stopping pipeline ...
    Execution ended after 48:26:35.687579148
    Setting pipeline to NULL ...
    
    
    
    ^C^C^C^C
    ^Z
    [1]+  Stopped(SIGTSTP)        bash -x ~/gst-oob-objdet.sh
    root@am62axx-evm:/opt/edgeai-gst-apps# kill -9 %1 
    
    [1]+  Stopped(SIGTSTP)        bash -x ~/gst-oob-objdet.sh
    root@am62axx-evm:/opt/edgeai-gst-apps# jobs
    [1]+  Killed                  bash -x ~/gst-oob-objdet.sh
    root@am62axx-evm:/opt/edgeai-gst-apps# fg
    -sh: fg: current: no such job
    root@am62axx-evm:/opt/edgeai-gst-apps# jobs
    root@am62axx-evm:/opt/edgeai-gst-apps# bash -x ~/gst-oob-objdet.sh 
    + GST_DEBUG=2
    + gst-launch-1.0 multifilesrc location=/opt/oob-demo-assets/oob-gui-video1.h264 loop=true '!' h264parse '!' v4l2h264dec capture-io-mode=5 '!' tiovxmemalloc '!' video/x-raw,format=NV12 '!' tiovxmultiscaler name=split_01 src_0::roi-startx=0 src_0::roi-starty=0 src_0::roi-width=1280 src_0::roi-height=768 target=0 split_01. '!' queue '!' video/x-raw, width=320, height=320 '!' tiovxdlpreproc model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 out-pool-size=4 '!' application/x-tensor-tiovx '!' tidlinferer target=1 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 '!' post_0.tensor split_01. '!' queue '!' video/x-raw, width=1280, height=720 '!' post_0.sink tidlpostproc name=post_0 model=/opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320 alpha=0.400000 viz-threshold=0.600000 top-N=5 display-model=true '!' queue '!' mosaic_0. tiovxmosaic name=mosaic_0 target=1 src::pool-size=4 'sink_0::startx=<320>' 'sink_0::starty=<150>' 'sink_0::widths=<1280>' 'sink_0::heights=<720>' '!' video/x-raw,format=NV12, width=1920, height=1080 '!' queue '!' tiperfoverlay main-title=null '!' queue max-size-buffers=1 '!' kmssink driver-name=tidss sync=false force-modesetting=true
    
    ^C
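
For unattended soak tests, the manual Ctrl-C followed by `kill -9` shown above can be scripted. The following is a minimal sketch of that escalation for a pipeline started through Python's `subprocess` module; the function name and the grace period are assumptions, and a process blocked inside the kernel may not exit even after `SIGKILL`, which the sketch reports as `"STUCK"`.

```python
import signal
import subprocess


def stop_process(proc: subprocess.Popen, grace_s: float = 10.0) -> str:
    """Send SIGINT, wait up to grace_s seconds, then fall back to SIGKILL.

    Returns "SIGINT" or "SIGKILL" for the signal that ended the process,
    or "STUCK" if it is still running after SIGKILL.
    """
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=grace_s)
        return "SIGINT"
    except subprocess.TimeoutExpired:
        proc.kill()
    try:
        proc.wait(timeout=grace_s)
        return "SIGKILL"
    except subprocess.TimeoutExpired:
        return "STUCK"
```

A test harness could record the return value next to the oops summary, so that runs where the pipeline refused to exit after the decoder fault are easy to tell apart from clean shutdowns.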
    

  • Hi Jonathan,

    That is interesting, but not unexpected. I'm still running the same demo application in an attempt to reproduce and isolate the issue; it has been up for ~2.5 days now. It has yet to produce the error you are seeing, but I will leave it running over the weekend and continue monitoring. I see nothing that suggests a memory leak.

    If there is any setting or configuration that seems to make the error show more quickly/consistently, please let us know.

    FYI, I will remain active throughout most of the holidays, but our development team may not be.

    -Reese

  • On the EVM, I am running the unmodified SDK 09 sdcard image from the TI site. So far it has hung a few times after about a day each, and once it took almost a week.

  • Board: SK-AM62A-LP

  • The issue finally showed itself, looks exactly the same as your logs. I am also using the 09.00.01.08 SDK on the Starter Kit EVM. I will provide updates / progress as we learn more about the cause and fix.

    -Reese

  • Hi Jonathan,

    I have unlocked this thread. Reese is currently out of the office and will be back next week.

  • I was able to reproduce this error once several weeks ago but have not been able to since, partly due to the length of the tests required. This bug is being tracked as LCPD-37373.

  • Are these bug IDs internal only?

  • These bug IDs are internal markers, but the SDK release includes a set of known issues / fixed issues in the release notes, e.g.:

    Foundational SDK known issues: https://software-dl.ti.com/processor-sdk-linux-rt/esd/AM62AX/09_01_00/exports/docs/devices/AM62AX/linux/Release_Specific_Release_Notes.html#issues-tracker

    Edge AI SDK fixed issues: https://software-dl.ti.com/processor-sdk-linux-rt/esd/AM62AX/09_01_00/exports/edgeai-docs/devices/AM62AX/linux/release_notes_09_01.html#fixed-in-this-release

    This also helps us know which issue to look up for follow-up inquiries.

  • Great, thanks for the info.