This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Linux/AM3352: USB musb driver issues

Part Number: AM3352

Tool/software: Linux

We are using kernel version 4.14.40 (ti processor sdk am335x-05.00.00.15) on our custom AM335x board.Now there is a random musb problem and It cannot be artificially reproduced.

Following is what we are observing:

[ 2140.701953] Internal error: Oops - undefined instruction: 0 [#1] PREEMPT ARM
[ 2140.701959] Modules linked in: 8821cu algif_hash algif_skcipher af_alg rtk_btusb ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip_tables x_tables bluetooth musb_dsps musb_hdrc udc_core phy_am335x phy_generic phy_am335x_control cfg80211 pm33xx wkup_m3_ipc wkup_m3_rproc remoteproc omap_aes_driver crypto_engine omap_crypto omap_sham ti_emif_sram pruss_soc_bus rtc_omap musb_am335x omap_wdt sch_fq_codel pps_ldisc pps_ktimer pps_gpio cryptodev(O) [last unloaded: 8821cu]
[ 2140.702039] CPU: 0 PID: 8 Comm: ktimersoftd/0 Tainted: G           O    4.14.40-necro_1_0-g22a6273 NIIC
[ 2140.702042] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 2140.702048] task: ee857100 task.stack: ee86e000
[ 2140.702143] PC is at cppi41_dma_channel_program+0x290/0x294 [musb_hdrc]
[ 2140.702168] LR is at musb_tx_dma_program+0x84/0xec [musb_hdrc]
[ 2140.702173] pc : [<bf408230>]    lr : [<bf3fee5c>]    psr: 600b0013
[ 2140.702176] sp : ee86fc10  ip : ee86fc60  fp : ee86fc5c
[ 2140.702179] r10: ed1f84f0  r9 : ad2b8000  r8 : edd36048
[ 2140.702182] r7 : 00000200  r6 : 0000008a  r5 : edd36048  r4 : edd36048
[ 2140.702186] r3 : 00000000  r2 : 00000000  r1 : 00000002  r0 : edd36048
[ 2140.702192] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 2140.702197] Control: 10c5387d  Table: adb9c019  DAC: 00000051
[ 2140.702201] Process ktimersoftd/0 (pid: 8, stack limit = 0xee86e208)
[ 2140.702205] Stack: (0xee86fc10 to 0xee870000)
[ 2140.702212] fc00:                                     ed1f7100 c0d12b60 00000000 eeee3800
[ 2140.702220] fc20: ee86fc84 ee86fc30 c0843620 c014aa00 ee86fc54 bf407fa0 ed1f84f0 edb25a00
[ 2140.702229] fc40: 00000000 edd36048 00000200 ed1f84f0 ee86fc8c ee86fc60 bf3fee5c bf407fac
[ 2140.702237] fc60: 0000008a ed1f8010 ed1f8010 ed07e600 ed07e600 0000008a ed2af180 bf40c484
[ 2140.702244] fc80: ee86fd24 ee86fc90 bf3ff2bc bf3fede4 00000000 0000008a 00000004 00000008
[ 2140.702252] fca0: bf40a104 00000000 00000000 0000008a ed2afe00 00000000 f10b7c00 00000000
[ 2140.702261] fcc0: edd36048 0000ffff f10b7c00 00000000 00000200 edb25a00 ed2b8000 f10b7c10
[ 2140.702269] fce0: 0000008a f10b7c00 ed1f9010 bf40c48c ed1f84f0 00000002 ee86fd24 ed07e600
[ 2140.702277] fd00: ed2afe00 ed1f8010 00000000 ed1f84f0 ed1f84f0 00000000 ee86fd64 ee86fd28
[ 2140.702284] fd20: bf3ffdd8 bf3feed0 ed2af180 00002000 00000000 00000001 ed1f8010 ed07e600
[ 2140.702293] fd40: 00002000 00000c76 edd36048 ed1f8010 ed2afe14 00000000 ee86fde4 ee86fd68
[ 2140.702301] fd60: bf401768 bf3ffc5c bf40a41c bf1c8384 ffffeffe 000001c0 ed1f8330 fffffbfe
[ 2140.702309] fd80: bf40c48c bf40c480 f10b7c00 00000000 00000002 bf40a4f0 bf40c484 ed1f84f0
[ 2140.702317] fda0: 00000003 ed2afe00 00000000 00000c76 f10b7c10 00000000 00000000 edd36048
[ 2140.702325] fdc0: ed1f84f0 edd36048 ed1f8010 edb25a28 f10b7c10 200b0013 ee86fdf4 ee86fde8
[ 2140.702334] fde0: bf3fbfa0 bf401410 ee86fe44 ee86fdf8 bf4077ec bf3fbf58 ee86fe44 ee86fe08
[ 2140.702342] fe00: ee86fe44 ee86fe10 6afffd71 000001f2 ee86fe2c ee86fe20 bf3fa064 edd36084
[ 2140.702350] fe20: edb25a1c edd36048 edb25a58 edb25a28 ed1f8010 200b0013 ee86fe74 ee86fe48
[ 2140.702358] fe40: bf407b28 bf407658 ee86fe74 ee86fe58 c0d198f8 edb25a28 6affe7f5 000001f2
[ 2140.702366] fe60: ee86e000 00000000 ee86fecc ee86fe78 c017dd5c bf407aa8 ffffe000 c0aaa7ec
[ 2140.702374] fe80: 6affe7f5 000001f2 c0d19800 ee86fe9c c0d551d0 c0d198a0 5f6bf951 00000060
[ 2140.702382] fea0: 00000000 00000000 00000000 5fe24aaf 15b223bf c0d19800 200b0013 00000008
[ 2140.702390] fec0: ee86fef4 ee86fed0 c017df70 c017dbf4 ee86e000 00000020 c0d10628 00000100
[ 2140.702398] fee0: 04208140 00000000 ee86ff34 ee86fef8 c012b9bc c017df04 6bc4f064 000001f2
[ 2140.702406] ff00: 00000100 c0d56000 00000002 ffffe000 ee81e640 00000001 c0d105f4 ffffe000
[ 2140.702414] ff20: 00000000 ee853e04 ee86ff4c ee86ff38 c012bc18 c012b834 ee86e000 ee81e640
[ 2140.702422] ff40: ee86ff74 ee86ff50 c01488a0 c012bbf0 ee81e700 00000000 ee81e6c0 ee86e000
[ 2140.702431] ff60: ee81e640 ee81e718 ee86ffac ee86ff78 c0144bb0 c0148750 c0144a80 c0148744
[ 2140.702439] ff80: 00000000 ee81e6c0 c0144a80 00000000 00000000 00000000 00000000 00000000
[ 2140.702446] ffa0: 00000000 ee86ffb0 c0107bb0 c0144a8c 00000000 00000000 00000000 00000000
[ 2140.702454] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 2140.702461] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 55555555 55555555
[ 2140.702463] Backtrace:
[ 2140.702520] [<bf407fa0>] (cppi41_dma_channel_program [musb_hdrc]) from [<bf3fee5c>] (musb_tx_dma_program+0x84/0xec [musb_hdrc])
[ 2140.702530]  r10:ed1f84f0 r9:00000200 r8:edd36048 r7:00000000 r6:edb25a00 r5:ed1f84f0
[ 2140.702533]  r4:bf407fa0
[ 2140.702581] [<bf3fedd8>] (musb_tx_dma_program [musb_hdrc]) from [<bf3ff2bc>] (musb_start_urb+0x3f8/0xd8c [musb_hdrc])
[ 2140.702589]  r9:bf40c484 r8:ed2af180 r7:0000008a r6:ed07e600 r5:ed07e600 r4:ed1f8010
[ 2140.702637] [<bf3feec4>] (musb_start_urb [musb_hdrc]) from [<bf3ffdd8>] (musb_advance_schedule+0x188/0x324 [musb_hdrc])
[ 2140.702646]  r10:00000000 r9:ed1f84f0 r8:ed1f84f0 r7:00000000 r6:ed1f8010 r5:ed2afe00
[ 2140.702648]  r4:ed07e600
[ 2140.702696] [<bf3ffc50>] (musb_advance_schedule [musb_hdrc]) from [<bf401768>] (musb_host_tx+0x364/0x794 [musb_hdrc])
[ 2140.702703]  r10:00000000 r9:ed2afe14 r8:ed1f8010 r7:edd36048 r6:00000c76 r5:00002000
[ 2140.702706]  r4:ed07e600
[ 2140.702754] [<bf401404>] (musb_host_tx [musb_hdrc]) from [<bf3fbfa0>] (musb_dma_completion+0x54/0x80 [musb_hdrc])
[ 2140.702761]  r10:200b0013 r9:f10b7c10 r8:edb25a28 r7:ed1f8010 r6:edd36048 r5:ed1f84f0
[ 2140.702764]  r4:edd36048
[ 2140.702812] [<bf3fbf4c>] (musb_dma_completion [musb_hdrc]) from [<bf4077ec>] (cppi41_trans_done+0x1a0/0x1ec [musb_hdrc])
[ 2140.702860] [<bf40764c>] (cppi41_trans_done [musb_hdrc]) from [<bf407b28>] (cppi41_recheck_tx_req+0x8c/0xf0 [musb_hdrc])
[ 2140.702868]  r10:200b0013 r9:ed1f8010 r8:edb25a28 r7:edb25a58 r6:edd36048 r5:edb25a1c
[ 2140.702870]  r4:edd36084
[ 2140.702918] [<bf407a9c>] (cppi41_recheck_tx_req [musb_hdrc]) from [<c017dd5c>] (__hrtimer_run_queues.constprop.4+0x174/0x1f8)
[ 2140.702925]  r9:00000000 r8:ee86e000 r7:000001f2 r6:6affe7f5 r5:edb25a28 r4:c0d198f8
[ 2140.702935] [<c017dbe8>] (__hrtimer_run_queues.constprop.4) from [<c017df70>] (hrtimer_run_softirq+0x78/0x110)
[ 2140.702942]  r10:00000008 r9:200b0013 r8:c0d19800 r7:15b223bf r6:5fe24aaf r5:00000000
[ 2140.702944]  r4:00000000
[ 2140.702961] [<c017def8>] (hrtimer_run_softirq) from [<c012b9bc>] (do_current_softirqs+0x194/0x23c)
[ 2140.702968]  r9:00000000 r8:04208140 r7:00000100 r6:c0d10628 r5:00000020 r4:ee86e000
[ 2140.702977] [<c012b828>] (do_current_softirqs) from [<c012bc18>] (run_ksoftirqd+0x34/0x74)
[ 2140.702984]  r10:ee853e04 r9:00000000 r8:ffffe000 r7:c0d105f4 r6:00000001 r5:ee81e640
[ 2140.702987]  r4:ffffe000
[ 2140.703002] [<c012bbe4>] (run_ksoftirqd) from [<c01488a0>] (smpboot_thread_fn+0x15c/0x288)
[ 2140.703005]  r5:ee81e640 r4:ee86e000
[ 2140.703021] [<c0148744>] (smpboot_thread_fn) from [<c0144bb0>] (kthread+0x130/0x160)
[ 2140.703028]  r9:ee81e718 r8:ee81e640 r7:ee86e000 r6:ee81e6c0 r5:00000000 r4:ee81e700
[ 2140.703043] [<c0144a80>] (kthread) from [<c0107bb0>] (ret_from_fork+0x14/0x24)
[ 2140.703050]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0144a80
[ 2140.703053]  r4:ee81e6c0
[ 2140.703062] Code: e3a01000 e1a00004 ebfffbf4 eaffff92 (e7f001f2)

  • Hi,

    Does this kernel panic happen randomly or when using other application (or multiple applications) on the AM335x board?

    Best Regards,
    Yordan

  • Thank you Reply.

    This kernel panic happens randomly,I can't reproduce it yet,I'm not sure other programs have any effect on it

  • Hi,

    How long does it take to show the issue generally?

    What is the USB device attached to the USB port? What application do you run to use this USB device?

  • It takes anywhere from a few dozen minutes to an hour

    We attached the realtek 8821cu to USB port,and  Set realtek 8821cu as AP mode.

    There are several computers connected.Use these computers to access some web pages

  • Hi,

    This issue seems to be caused by a kernel bug which is not captured in my test. Are you willing to run some debug tests on your setup and capture logs? I can provide a kernel debug patch for you to generate a debug log which I am looking for. Maybe the log will give some clue.

    Meanwhile you can disable MUSB CPPI DMA in kernel config to workaround this issue.

  • How should I do for the debug tests?

  • Thanks for offering the test. I will create a kernel debug patch for you and provide detailed instructions to run the test in the next few days.

  • Hi,

    Please apply the following debug patch to your kernel, then enable the kernel config option "Kernel hacking --> Tracers" in kernel menuconfig, and re-build the kernel and modules.

    From ee381e3ded8fd9a9de78ee78a012d62a52858013 Mon Sep 17 00:00:00 2001
    From: Bin Liu <b-liu@ti.com>
    Date: Thu, 6 Oct 2016 14:19:51 -0500
    Subject: [PATCH] usb: musb: add cppi41 channel status tracing
    
    Signed-off-by: Bin Liu <b-liu@ti.com>
    ---
     drivers/usb/musb/musb_cppi41.c | 25 ++++++++++++++++++++++++-
     drivers/usb/musb/musb_host.c   | 12 ++++++++++++
     drivers/usb/musb/musb_trace.h  | 30 ++++++++++++++++++++++++++++++
     3 files changed, 66 insertions(+), 1 deletion(-)
    
    diff --git a/drivers/usb/musb/musb_cppi41.c b/drivers/usb/musb/musb_cppi41.c
    index 1ec0a4947b6b..25ed0e74ec4f 100644
    --- a/drivers/usb/musb/musb_cppi41.c
    +++ b/drivers/usb/musb/musb_cppi41.c
    @@ -126,6 +126,9 @@ static void cppi41_trans_done(struct cppi41_dma_channel *cppi41_channel)
     		/* done, complete */
     		cppi41_channel->channel.actual_len =
     			cppi41_channel->transferred;
    +
    +		trace_musb_cppi41_status(__builtin_return_address(0),
    +				cppi41_channel, MUSB_DMA_STATUS_FREE);
     		cppi41_channel->channel.status = MUSB_DMA_STATUS_FREE;
     		cppi41_channel->channel.rx_packet_done = true;
     
    @@ -514,6 +517,8 @@ static void cppi41_dma_channel_release(struct dma_channel *channel)
     	trace_musb_cppi41_free(cppi41_channel);
     	if (cppi41_channel->is_allocated) {
     		cppi41_channel->is_allocated = 0;
    +		trace_musb_cppi41_status(__builtin_return_address(0),
    +				cppi41_channel, MUSB_DMA_STATUS_FREE);
     		channel->status = MUSB_DMA_STATUS_FREE;
     		channel->actual_len = 0;
     	}
    @@ -527,6 +532,14 @@ static int cppi41_dma_channel_program(struct dma_channel *channel,
     	struct cppi41_dma_channel *cppi41_channel = channel->private_data;
     	int hb_mult = 0;
     
    +	if (channel->status == MUSB_DMA_STATUS_UNKNOWN ||
    +		channel->status == MUSB_DMA_STATUS_BUSY)
    +		trace_printk(" BUG on hwep%d ch%d%s, status %d\n",
    +				cppi41_channel->hw_ep->epnum,
    +				cppi41_channel->port_num,
    +				cppi41_channel->is_tx ? "tx" : "rx",
    +				channel->status);
    +
     	BUG_ON(channel->status == MUSB_DMA_STATUS_UNKNOWN ||
     		channel->status == MUSB_DMA_STATUS_BUSY);
     
    @@ -537,15 +550,21 @@ static int cppi41_dma_channel_program(struct dma_channel *channel,
     			hb_mult = cppi41_channel->hw_ep->in_qh->hb_mult;
     	}
     
    +	trace_musb_cppi41_status(__builtin_return_address(0),
    +			cppi41_channel, MUSB_DMA_STATUS_BUSY);
     	channel->status = MUSB_DMA_STATUS_BUSY;
     	channel->actual_len = 0;
     
    +
     	if (hb_mult)
     		packet_sz = hb_mult * (packet_sz & 0x7FF);
     
     	ret = cppi41_configure_channel(channel, packet_sz, mode, dma_addr, len);
    -	if (!ret)
    +	if (!ret) {
    +		trace_musb_cppi41_status(__builtin_return_address(0),
    +				cppi41_channel, MUSB_DMA_STATUS_FREE);
     		channel->status = MUSB_DMA_STATUS_FREE;
    +	}
     
     	return ret;
     }
    @@ -637,6 +656,8 @@ static int cppi41_dma_channel_abort(struct dma_channel *channel)
     		}
     	}
     
    +	trace_musb_cppi41_status(__builtin_return_address(0),
    +			cppi41_channel, MUSB_DMA_STATUS_FREE);
     	cppi41_channel->channel.status = MUSB_DMA_STATUS_FREE;
     	return 0;
     }
    @@ -712,6 +733,8 @@ static int cppi41_dma_controller_start(struct cppi41_dma_controller *controller)
     
     		musb_dma = &cppi41_channel->channel;
     		musb_dma->private_data = cppi41_channel;
    +		trace_musb_cppi41_status(__builtin_return_address(0),
    +				cppi41_channel, MUSB_DMA_STATUS_FREE);
     		musb_dma->status = MUSB_DMA_STATUS_FREE;
     		musb_dma->max_len = SZ_4M;
     
    diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
    index 5a6dca01a1d0..950849638e19 100644
    --- a/drivers/usb/musb/musb_host.c
    +++ b/drivers/usb/musb/musb_host.c
    @@ -997,6 +997,9 @@ static void musb_bulk_nak_timeout(struct musb *musb, struct musb_hw_ep *ep,
     	if (cur_qh) {
     		urb = next_urb(cur_qh);
     		if (dma_channel_status(dma) == MUSB_DMA_STATUS_BUSY) {
    +			trace_musb_cppi41_status(__builtin_return_address(0),
    +					dma->private_data,
    +					MUSB_DMA_STATUS_CORE_ABORT);
     			dma->status = MUSB_DMA_STATUS_CORE_ABORT;
     			musb->dma_controller->channel_abort(dma);
     			urb->actual_length += dma->actual_len;
    @@ -1321,6 +1324,9 @@ void musb_host_tx(struct musb *musb, u8 epnum)
     done:
     	if (status) {
     		if (dma_channel_status(dma) == MUSB_DMA_STATUS_BUSY) {
    +			trace_musb_cppi41_status(__builtin_return_address(0),
    +					dma->private_data,
    +					MUSB_DMA_STATUS_CORE_ABORT);
     			dma->status = MUSB_DMA_STATUS_CORE_ABORT;
     			musb->dma_controller->channel_abort(dma);
     		}
    @@ -1892,6 +1898,9 @@ void musb_host_rx(struct musb *musb, u8 epnum)
     	if (status) {
     		/* clean up dma and collect transfer count */
     		if (dma_channel_status(dma) == MUSB_DMA_STATUS_BUSY) {
    +			trace_musb_cppi41_status(__builtin_return_address(0),
    +					dma->private_data,
    +					MUSB_DMA_STATUS_CORE_ABORT);
     			dma->status = MUSB_DMA_STATUS_CORE_ABORT;
     			musb->dma_controller->channel_abort(dma);
     			xfer_len = dma->actual_len;
    @@ -1922,6 +1931,9 @@ void musb_host_rx(struct musb *musb, u8 epnum)
     		 * shouldn't this be the "half full" double buffer case?
     		 */
     		if (dma_channel_status(dma) == MUSB_DMA_STATUS_BUSY) {
    +			trace_musb_cppi41_status(__builtin_return_address(0),
    +					dma->private_data,
    +					MUSB_DMA_STATUS_CORE_ABORT);
     			dma->status = MUSB_DMA_STATUS_CORE_ABORT;
     			musb->dma_controller->channel_abort(dma);
     			xfer_len = dma->actual_len;
    diff --git a/drivers/usb/musb/musb_trace.h b/drivers/usb/musb/musb_trace.h
    index f031c9e74322..e541c3ecb182 100644
    --- a/drivers/usb/musb/musb_trace.h
    +++ b/drivers/usb/musb/musb_trace.h
    @@ -356,6 +356,36 @@ DEFINE_EVENT(musb_cppi41, musb_cppi41_free,
     	TP_PROTO(struct cppi41_dma_channel *ch),
     	TP_ARGS(ch)
     );
    +
    +TRACE_EVENT(musb_cppi41_status,
    +	TP_PROTO(void *caller,
    +		struct cppi41_dma_channel *ch,
    +		enum dma_channel_status new_st),
    +	TP_ARGS(caller, ch, new_st),
    +	TP_STRUCT__entry(
    +		__field(void *, caller)
    +		__field(struct cppi41_dma_channel *, ch)
    +		__field(u8, hwep)
    +		__field(u8, port)
    +		__field(u8, is_tx)
    +		__field(enum dma_channel_status, status)
    +		__field(enum dma_channel_status, new_st)
    +	),
    +	TP_fast_assign(
    +		__entry->caller = caller;
    +		__entry->ch = ch;
    +		__entry->hwep = ch->hw_ep->epnum;
    +		__entry->port = ch->port_num;
    +		__entry->is_tx = ch->is_tx;
    +		__entry->status = ch->channel.status;
    +		__entry->new_st = new_st;
    +	),
    +	TP_printk("%pS: %p, hwep%d ch%d%s, status %d -> %d",
    +			__entry->caller, __entry->ch, __entry->hwep,
    +			__entry->port, __entry->is_tx ? "tx" : "rx",
    +			__entry->status, __entry->new_st
    +	)
    +);
     #endif /* CONFIG_USB_TI_CPPI41_DMA */
     
     #endif /* __MUSB_TRACE_H */
    -- 
    2.17.1
    
    

    After booted your board with the new kernel image and modules, please run the following commands on the board to capture logs.

    # cd /sys/kernel/debug/tracing
    # echo 20000 > buffer_size_kb
    # echo 1 > events/musb/musb_cppi41_status/enable

    Then run your test until the kernel crash happens, immediately run the following commands to save the logs to a file.

    # echo 0 > events/musb/musb_cppi41_status/enable
    # cat trace > ~/musb_trace.log

    Then please attached the trace file musb_trace.log.

  • Hi,

    I haven't heard back from you. I'm assuming you were able to resolve your issue, or you don't get time to run the test I mentioned above. If latter, you can still disable the CPPI DMA as a workaround.

    I am closing this thread, you can still reply if you have any update (or create a new thread if this thread has locked due to time-out). Thanks.