Linux/PROCESSOR-SDK-AM335X: USB modem driver issue

Part Number: PROCESSOR-SDK-AM335X


Tool/software: Linux

Hi all,

I am using a BeagleBone Black with the latest TI SDK, v04.03.

I have an MC7455 modem connected to the USB port. This modem works with the GobiNet drivers provided by Sierra Wireless.

After I compile and load GobiNet.ko, I see a kernel panic, which is attached to this ticket (kernel_panic.txt).

Please let me know a possible solution.

I have also attached the Gobi drivers.

Regards

0027.S2.31N2.50.tar.gz

[ 2071.523305] BUG: scheduling while atomic: systemd-journal/106/0x00000002
[ 2071.523329] Modules linked in: GobiNet(O) usbnet ctr ccm pru_rproc pruss_intc pruss usb_f_acm u_serial usb_f_ecm musb_dsps phy_am335x musb_hdrc phy_am335x_control phy_generic ti_am335x_tsc ti_am335x_adc g_multi usb_f_mass_storage usb_f_rndis u_ether libcomposite udc_core xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo bluetooth pm33xx wkup_m3_ipc wkup_m3_rproc remoteproc omap_aes_driver crypto_engine omap_sham ti_emif_sram pruss_soc_bus c_can_platform c_can can_dev spidev rtc_omap musb_am335x omap_wdt ti_am335x_tscadc timer_counter(O) sch_fq_codel ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat_ftp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_state nf_conntrack_ftp nf_conntrack ip_tables x_tables arc4 ath9k_htc ath9k_common
[ 2071.523564]  ath9k_hw ath mac80211 cfg80211 usbcore usb_common
[ 2071.523598] CPU: 0 PID: 106 Comm: systemd-journal Tainted: G        W  O    4.9.69-g9ce43c71ae #4
[ 2071.523610] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 2071.523620] Backtrace:
[ 2071.523671] [<c010b790>] (dump_backtrace) from [<c010ba4c>] (show_stack+0x18/0x1c)
[ 2071.523688]  r7:c07d98bc r6:c0c0f358 r5:00000000 r4:00000000
[ 2071.523710] [<c010ba34>] (show_stack) from [<c03c35f4>] (dump_stack+0x24/0x28)
[ 2071.523738] [<c03c35d0>] (dump_stack) from [<c014d7c0>] (__schedule_bug+0x68/0x84)
[ 2071.523771] [<c014d758>] (__schedule_bug) from [<c07d97a4>] (__schedule+0x510/0x5d4)
[ 2071.523783]  r5:00000000 r4:dc5e9b80
[ 2071.523801] [<c07d9294>] (__schedule) from [<c07d98bc>] (schedule+0x54/0xb8)
[ 2071.523820]  r10:00000000 r9:dc674000 r8:c0107e44 r7:0000005b r6:dc675fb0 r5:c0107e44
[ 2071.523829]  r4:dc674000
[ 2071.523849] [<c07d9868>] (schedule) from [<c010b350>] (do_work_pending+0x28/0xc8)
[ 2071.523860]  r5:c0107e44 r4:dc674000
[ 2071.523879] [<c010b328>] (do_work_pending) from [<c0107ce0>] (slow_work_pending+0xc/0x20)
[ 2071.523894]  r7:0000005b r6:00101000 r5:00000000 r4:b54a9000
[ 2071.629060] creating qcqmi1
[ 2071.629354] BUG: scheduling while atomic: probe0-1-1.4:1./7057/0x00000003
[ 2071.629366] Modules linked in: GobiNet(O) usbnet ctr ccm pru_rproc pruss_intc pruss usb_f_acm u_serial usb_f_ecm musb_dsps phy_am335x musb_hdrc phy_am335x_control phy_generic ti_am335x_tsc ti_am335x_adc g_multi usb_f_mass_storage usb_f_rndis u_ether libcomposite udc_core xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo bluetooth pm33xx wkup_m3_ipc wkup_m3_rproc remoteproc omap_aes_driver crypto_engine omap_sham ti_emif_sram pruss_soc_bus c_can_platform c_can can_dev spidev rtc_omap musb_am335x omap_wdt ti_am335x_tscadc timer_counter(O) sch_fq_codel ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_nat_ftp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_state nf_conntrack_ftp nf_conntrack ip_tables x_tables arc4 ath9k_htc ath9k_common
[ 2071.629601]  ath9k_hw ath mac80211 cfg80211 usbcore usb_common
[ 2071.629637] CPU: 0 PID: 7057 Comm: probe0-1-1.4:1. Tainted: G        W  O    4.9.69-g9ce43c71ae #4
[ 2071.629648] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 2071.629826] Workqueue: probe0-1-1.4:1.8 gobi_work_handler [GobiNet]
[ 2071.629836] Backtrace:
[ 2071.629881] [<c010b790>] (dump_backtrace) from [<c010ba4c>] (show_stack+0x18/0x1c)
[ 2071.629898]  r7:c07d98bc r6:c0c0f358 r5:00000000 r4:00000000
[ 2071.629920] [<c010ba34>] (show_stack) from [<c03c35f4>] (dump_stack+0x24/0x28)
[ 2071.629949] [<c03c35d0>] (dump_stack) from [<c014d7c0>] (__schedule_bug+0x68/0x84)
[ 2071.629982] [<c014d758>] (__schedule_bug) from [<c07d97a4>] (__schedule+0x510/0x5d4)
[ 2071.629994]  r5:00000000 r4:dc6b8580
[ 2071.630013] [<c07d9294>] (__schedule) from [<c07d98bc>] (schedule+0x54/0xb8)
[ 2071.630031]  r10:00000000 r9:00000000 r8:dc0317c0 r7:d8341cc4 r6:00000000 r5:00000002
[ 2071.630041]  r4:d8340000
[ 2071.630061] [<c07d9868>] (schedule) from [<c07dcb5c>] (schedule_timeout+0x1c0/0x270)
[ 2071.630072]  r5:00000002 r4:7fffffff
[ 2071.630090] [<c07dc99c>] (schedule_timeout) from [<c07da374>] (wait_for_common+0xe0/0x1c4)
[ 2071.630107]  r8:dc0317c0 r7:d8341cc4 r6:00000000 r5:00000002 r4:d8340000
[ 2071.630128] [<c07da294>] (wait_for_common) from [<c07da470>] (wait_for_completion+0x18/0x1c)
[ 2071.630143]  r7:00000000 r6:db5fc400 r5:d8341cc0 r4:c0c82570
[ 2071.630167] [<c07da458>] (wait_for_completion) from [<c05267e4>] (devtmpfs_create_node+0xf8/0x12c)
[ 2071.630186] [<c05266ec>] (devtmpfs_create_node) from [<c051d458>] (device_add+0x568/0x588)
[ 2071.630197]  r5:d8086220 r4:db5fc408
[ 2071.630213] [<c051cef0>] (device_add) from [<c051d62c>] (device_create_groups_vargs+0xb4/0xc4)
[ 2071.630232]  r10:00000001 r9:db5fc408 r8:0f300000 r7:d8086220 r6:00000000 r5:db5fc400
[ 2071.630241]  r4:00000000
[ 2071.630257] [<c051d578>] (device_create_groups_vargs) from [<c051d698>] (device_create+0x2c/0x34)
[ 2071.630275]  r9:d8269000 r8:bf4c608c r7:bf4cae80 r6:bf4c8560 r5:00000000 r4:00000000
[ 2071.630351] [<c051d66c>] (device_create) from [<bf4c2088>] (RegisterQMIDevice+0x650/0x6b4 [GobiNet])
[ 2071.630361]  r4:d8268000
[ 2071.630485] [<bf4c1a38>] (RegisterQMIDevice [GobiNet]) from [<bf4b3368>] (work_function+0x204/0x640 [GobiNet])
[ 2071.630503]  r10:bf4cae88 r9:10624dd3 r8:bf4cae80 r7:d8268000 r6:d8269000 r5:ffffe000
[ 2071.630513]  r4:00000000
[ 2071.630633] [<bf4b3164>] (work_function [GobiNet]) from [<bf4b3848>] (gobi_work_handler+0xa4/0x118 [GobiNet])
[ 2071.630652]  r10:db390c00 r9:00000000 r8:c0c0e728 r7:dcb40900 r6:db0bf804 r5:d8268000
[ 2071.630661]  r4:d82682f8
[ 2071.630735] [<bf4b37a4>] (gobi_work_handler [GobiNet]) from [<c01425cc>] (process_one_work+0x1f8/0x424)
[ 2071.630749]  r6:00000000 r5:db390c00 r4:d82692f8
[ 2071.630767] [<c01423d4>] (process_one_work) from [<c0143078>] (rescuer_thread+0x248/0x458)
[ 2071.630785]  r10:db390c00 r9:db390c18 r8:d8340000 r7:dcb40968 r6:c0c0e728 r5:dcb40900
[ 2071.630795]  r4:c0c0e73c
[ 2071.630818] [<c0142e30>] (rescuer_thread) from [<c01484b4>] (kthread+0xf8/0x110)
[ 2071.630836]  r10:00000000 r9:00000000 r8:c0142e30 r7:db390c00 r6:d8340000 r5:d83d44c0
[ 2071.630845]  r4:00000000
[ 2071.630865] [<c01483bc>] (kthread) from [<c0107d50>] (ret_from_fork+0x14/0x24)
[ 2071.630881]  r8:00000000 r7:00000000 r6:00000000 r5:c01483bc r4:d83d44c0
[ 2071.631159] RawIP mode
root@scada:/#
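
The last field of each "BUG: scheduling while atomic" line is the raw preempt count at the time of the check, so it can hint at what kind of atomic context the kernel believed it was in. The following Python sketch decodes that field. The bit layout is the one I understand include/linux/preempt.h to use in 4.9-era kernels and should be treated as an assumption; the helper name decode_scheduling_bug is hypothetical.

```python
import re

# Assumed preempt_count layout (include/linux/preempt.h, 4.9-era kernels).
PREEMPT_MASK = 0x000000FF   # nesting depth of preempt_disable()
SOFTIRQ_MASK = 0x0000FF00   # softirq / bottom-half nesting
HARDIRQ_MASK = 0x000F0000   # hard interrupt nesting
NMI_MASK = 0x00100000       # non-maskable interrupt flag

BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/(?P<count>0x[0-9a-fA-F]+)"
)


def decode_scheduling_bug(line):
    """Split the preempt count of one BUG line into its fields.

    Returns None if the line is not a 'scheduling while atomic' report.
    """
    m = BUG_RE.search(line)
    if m is None:
        return None
    count = int(m.group("count"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "preempt_depth": count & PREEMPT_MASK,
        "softirq_count": (count & SOFTIRQ_MASK) >> 8,
        "hardirq_count": (count & HARDIRQ_MASK) >> 16,
        "in_nmi": bool(count & NMI_MASK),
    }


if __name__ == "__main__":
    sample = ("[ 2071.629354] BUG: scheduling while atomic: "
              "probe0-1-1.4:1./7057/0x00000003")
    print(decode_scheduling_bug(sample))
```

For the two reports in the log above (0x00000002 and 0x00000003), only the preempt-depth field is non-zero. If the assumed layout holds, that would be consistent with preemption having been disabled, for example by a held spinlock, rather than with interrupt context.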

  • Let me test this on my side and I will update.

    Best Regards,
    Yordan
  • Thanks, Yordan.

    In the file GobiUSBNet.c I enabled two debug flags:

    // Debug flag

    int debug=1;

    int qos_debug=1;

    I also added a few more log statements to the function RegisterQMIDevice in QMIDevice.c.

    I am attaching the log file and the modified file:

    QMIDevice-2.c

    9196403693377440360.txt

  • Hi Nick,

    I checked the log you provided. The whole kernel panic comes from within this GobiNet driver:
    [ 168.404446] [<bf011684>] (usb_control_msg [usbcore]) from [<bf406460>] (Gobi_usb_control_msg+0x84/0xa0 [GobiNet])

    As I understand from your description, this is a third-party driver from Sierra Wireless. Have they verified that their driver works on AM335x Yocto Linux (that is, TI Processor SDK Linux)? The driver appears to cause scheduling issues in the kernel, which points to a bug in the driver, not in the kernel.

    Best Regards,
    Yordan

  • Hi Yordan,

    The same GobiNet driver works fine with the older TI SDK. However, the older TI SDK had "Intermittent USB Interface Failures", and your support engineer Bin Liu asked me to update to the latest TI SDK. With the latest TI SDK, I get this kernel exception when using the GobiNet driver.

    By the way, the usb_control_msg API is not called from interrupt context or from a bottom half.


    Regards

  • Nick,

    The kernel log shows there is probably a bug in the gobi driver. You might want to get support from the driver provider.

    [  168.401497] BUG: scheduling while atomic: kworker/0:1/13/0x00000002
    ...
    [  168.402054] CPU: 0 PID: 13 Comm: kworker/0:1 Tainted: G           O    4.9.69-g9ce43c71ae #4
    [  168.402076] Hardware name: Generic AM33XX (Flattened Device Tree)
    [  168.402356] Workqueue: probe0-1-1.4:1.8 gobi_work_handler [GobiNet]
    [  168.402376] Backtrace:
    [  168.402450] [] (dump_backtrace) from [] (show_stack+0x18/0x1c)
    [  168.402483]  r7:c07d98bc r6:c0c0f358 r5:00000000 r4:00000000
    [  168.402524] [] (show_stack) from [] (dump_stack+0x24/0x28)
    [  168.402573] [] (dump_stack) from [] (__schedule_bug+0x68/0x84)
    [  168.402629] [] (__schedule_bug) from [] (__schedule+0x510/0x5d4)
    [  168.402652]  r5:00000000 r4:dc0e8580
    [  168.402689] [] (__schedule) from [] (schedule+0x54/0xb8)
    [  168.402725]  r10:00000022 r9:00000001 r8:c0c12d00 r7:c0c13e00 r6:dc111c58 r5:c0c12d00
    [  168.402744]  r4:dc110000
    [  168.402782] [] (schedule) from [] (schedule_timeout+0x160/0x270)
    [  168.402805]  r5:c0c12d00 r4:ffffcca2
    [  168.402841] [] (schedule_timeout) from [] (wait_for_common+0xe0/0x1c4)
    [  168.402874]  r8:00000064 r7:dc111ce8 r6:00000000 r5:00000002 r4:dc110000
    [  168.402914] [] (wait_for_common) from [] (wait_for_completion_timeout+0x14/0x18)
    [  168.402944]  r7:dc111d2c r6:00000000 r5:dc111ce8 r4:d8100b00
    [  168.403471] [] (wait_for_completion_timeout) from [] (usb_start_wait_urb+0x70/0xc4 [usbcore])
    [  168.403998] [] (usb_start_wait_urb [usbcore]) from [] (usb_control_msg+0xac/0xd8 [usbcore])
    [  168.404033]  r8:00000008 r7:db3af000 r6:00000000 r5:00000000 r4:d80af580
    [  168.404446] [] (usb_control_msg [usbcore]) from [] (Gobi_usb_control_msg+0x84/0xa0 [GobiNet])
    [  168.404484]  r10:00000021 r9:00000000 r8:00000022 r7:80000400 r6:00000008 r5:00000001
    [  168.404503]  r4:db3af000
    [  168.404756] [] (Gobi_usb_control_msg [GobiNet]) from [] (RegisterQMIDevice+0x174/0x7d4 [GobiNet])
    [  168.404793]  r10:bf417f84 r9:10624dd3 r8:00000001 r7:00000000 r6:00000001 r5:00000000
    [  168.404813]  r4:d8278000
    [  168.405056] [] (RegisterQMIDevice [GobiNet]) from [] (work_function+0x220/0x65c [GobiNet])
    [  168.405094]  r10:bf417f84 r9:10624dd3 r8:d8279000 r7:d8278000 r6:dc110000 r5:bf417d1c
    [  168.405113]  r4:00000000
    [  168.405352] [] (work_function [GobiNet]) from [] (gobi_work_handler+0xa4/0x118 [GobiNet])
    [  168.405390]  r10:c0c0e73c r9:00000000 r8:c0c0e728 r7:dcb3ff00 r6:db3af004 r5:d8278000
    [  168.405409]  r4:d82782f8
    [  168.405554] [] (gobi_work_handler [GobiNet]) from [] (process_one_work+0x1f8/0x424)
    [  168.405613]  r6:00000000 r5:dc0ff600 r4:d82792f8
    [  168.405649] [] (process_one_work) from [] (worker_thread+0x6c/0x638)
    [  168.405685]  r10:c0c0e73c r9:c0c13e00 r8:00000008 r7:dc110000 r6:dc0ff618 r5:c0c0e728
    [  168.405703]  r4:dc0ff600
    [  168.405746] [] (worker_thread) from [] (kthread+0xf8/0x110)
    [  168.405781]  r10:00000000 r9:00000000 r8:c01427f8 r7:dc0ff600 r6:dc110000 r5:dc107c40
    [  168.405800]  r4:00000000
    [  168.405840] [] (kthread) from [] (ret_from_fork+0x14/0x24)
    [  168.405872]  r8:00000000 r7:00000000 r6:00000000 r5:c01483bc r4:dc107c40
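
To see which frames of such a backtrace belong to the out-of-tree module, the "(callee) from [addr] (caller+off/size [module])" lines can be reduced to a plain call chain. The following Python sketch does that; the function names call_chain and frames_in_module are hypothetical, and the pattern is written to accept both the full "[<c010b790>]" form and the "[]" form in which the addresses are missing, as in the log above.

```python
import re

# One ARM backtrace line: "(callee [mod]) from [<addr>] (caller+0x../0x.. [mod])".
# The address inside the brackets is optional because it is absent above.
FRAME_RE = re.compile(
    r"\((?P<callee>\w+)(?: \[(?P<callee_mod>\w+)\])?\) from "
    r"\[(?:<[0-9a-fA-F]+>)?\] "
    r"\((?P<caller>\w+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
    r"(?: \[(?P<caller_mod>\w+)\])?\)"
)


def call_chain(log_text):
    """Return [(function, module_or_None), ...], innermost frame first."""
    chain = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # register dump lines and other messages
        if not chain:
            # The first matching line also names the innermost callee.
            chain.append((m.group("callee"), m.group("callee_mod")))
        chain.append((m.group("caller"), m.group("caller_mod")))
    return chain


def frames_in_module(chain, module):
    """Names of the functions in the chain that live in the given module."""
    return [fn for fn, mod in chain if mod == module]


if __name__ == "__main__":
    sample = (
        "[  168.403998] [] (usb_start_wait_urb [usbcore]) from [] "
        "(usb_control_msg+0xac/0xd8 [usbcore])\n"
        "[  168.404446] [] (usb_control_msg [usbcore]) from [] "
        "(Gobi_usb_control_msg+0x84/0xa0 [GobiNet])\n"
        "[  168.404756] [] (Gobi_usb_control_msg [GobiNet]) from [] "
        "(RegisterQMIDevice+0x174/0x7d4 [GobiNet])\n"
    )
    print(frames_in_module(call_chain(sample), "GobiNet"))
```

Applied to this trace and to the one in the original post, both chains reach a sleeping call (usb_control_msg here, device_create there) from RegisterQMIDevice, which may suggest looking at what RegisterQMIDevice or its callers in the driver hold at that point. That is only a reading of the two logs, not a confirmed root cause.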
    
  • Do you have any solution, Liu? The same driver works fine with the older TI SDK and its older kernel, but the older TI SDK has the intermittent USB failures.
  • Nick,

    Sorry, I don't. I never worked on the gobi driver.
  • Hello Bin Liu, the Gobi driver is not in interrupt context, nor in atomic context, nor in a top half. Something is wrong in your kernel.
    Please check.
  • Nick,

    This is the community kernel, not mine or TI's.

    The crash log seems to indicate that the USB control packet is either not sent to the USB dongle or the response from the dongle is not received, which causes the wait to time out and then triggers the scheduling bug.

    I checked the musb driver changes between SDK v4.0 and v4.3, but was unable to identify any that might be related to this crash.

    However, I found the following patch in the SDK v4.3 kernel, which might have fixed the intermittent failures you had with the v4.0 kernel. Would you mind applying the following patch to the SDK v4.0 kernel to see if it fixes the intermittent failures?

    Author: Bin Liu <b-liu@ti.com>
    Date:   Thu May 25 13:42:39 2017 -0500
    
        usb: musb: dsps: keep VBUS on for host-only mode
        
        commit b3addcf0d1f04f53fcc302577d5a5e964c18531a upstream.
        
        Currently VBUS is turned off while a usb device is detached, and turned
        on again by the polling routine. This short period VBUS loss prevents
        usb modem to switch mode.
        
        VBUS should be constantly on for host-only mode, so this changes the
        driver to not turn off VBUS for host-only mode.
        
        Fixes: 2f3fd2c5bde1 ("usb: musb: Prepare dsps glue layer for PM runtime support")
        Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
        Acked-by: Tony Lindgren <tony@atomide.com>
        Signed-off-by: Bin Liu <b-liu@ti.com>
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    
    diff --git a/drivers/usb/musb/musb_dsps.c b/drivers/usb/musb/musb_dsps.c
    index 9f125e179acd..39666fb911b0 100644
    --- a/drivers/usb/musb/musb_dsps.c
    +++ b/drivers/usb/musb/musb_dsps.c
    @@ -213,6 +213,12 @@ static int dsps_check_status(struct musb *musb, void *unused)
                                    msecs_to_jiffies(wrp->poll_timeout));
                    break;
            case OTG_STATE_A_WAIT_BCON:
    +               /* keep VBUS on for host-only mode */
    +               if (musb->port_mode == MUSB_PORT_MODE_HOST) {
    +                       mod_timer(&glue->timer, jiffies +
    +                                       msecs_to_jiffies(wrp->poll_timeout));
    +                       break;
    +               }
                    musb_writeb(musb->mregs, MUSB_DEVCTL, 0);
                    skip_session = 1;
                    /* fall */
    
    
  • Thanks, Bin Liu.

    I do not want to go back to the old kernel, because if I raise a ticket for another issue you will ask me to switch to the new kernel again. So I would rather stay with the latest TI SDK, v4.3.

    Regarding the USB packet not being sent or received: it is sent and it is received. In fact, the USB modem is working fine and we are able to ping, etc. But although it works, dmesg shows a huge number of kernel panics.

    The same driver works fine with the older version of your TI SDK. I think in that case the kernel API call was not made in atomic context. In the latest SDK, v4.3, your kernel drivers are calling RegisterQMIDevice in atomic context. Why?

    Regards
  • Nick,

    To be clear, I didn't ask you "to switch to the new kernel" without knowing the root cause; I asked you "to test with the new kernel to see if the issue also happens". This is a typical diagnostic step that gives a clue as to whether something was fixed in the new kernel. It is completely fine for you to stay with SDK v4.3.

    The function RegisterQMIDevice() is called from the gobi driver, not from the community kernel. We are unable to debug out-of-tree kernel drivers, so you would have to debug the gobi driver (or get support from the gobi driver provider) to understand why the USB control message call doesn't get a response, which leads to the kernel crash.
  • Hi Liu,

    By the way, I still see the intermittent USB failure even after applying your patch (the OTG_STATE_A_WAIT_BCON case). Can you please help debug this further?

    Regards
  • Nick,

    You applied the patch on the SDK v4.0 kernel, right? That tells us the patch is not relevant.

    As I mentioned in my post e2e.ti.com/.../2532146, please apply the debug patch mentioned there to the SDK v4.0 kernel and send me the kernel dmesg log when the issue happens. It might tell us why the USB controller has left host mode.
  • Yes, I applied it on v4.0. I will send you the logs shortly.
  • WITH USB ISSUE:
    ===================
    root@am335x-evm:/# cat dmesg_usb.log | grep musb
    [ 16.663643] musb-hdrc musb-hdrc.0.auto: musb_init_controller failed with status -517
    [ 16.663910] musb-hdrc musb-hdrc.0.auto: musb_init_controller failed with status -517
    [ 16.696954] musb-hdrc musb-hdrc.0.auto: musb_init_controller failed with status -517
    [ 16.699483] musb-hdrc musb-hdrc.1.auto: musb_init_controller failed with status -517
    [ 16.699733] musb-hdrc musb-hdrc.0.auto: musb_init_controller failed with status -517
    [ 16.699860] musb-hdrc musb-hdrc.1.auto: musb_init_controller failed with status -517
    [ 16.758237] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [ 16.758244] musb-hdrc: MHDRC RTL version 2.0
    [ 16.758249] musb-hdrc: setup fifo_mode 4
    [ 16.758266] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [ 16.868598] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [ 16.868607] musb-hdrc: MHDRC RTL version 2.0
    [ 16.868612] musb-hdrc: setup fifo_mode 4
    [ 16.868629] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [ 16.868784] musb-hdrc musb-hdrc.1.auto: MUSB HDRC host driver
    [ 16.868813] musb-hdrc musb-hdrc.1.auto: new USB bus registered, assigned bus number 1


    WITHOUT USB ISSUE:
    ===================
    root@am335x-evm:/# cat dmesg_usb.log | grep musb
    [ 16.609896] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [ 16.609925] musb-hdrc: MHDRC RTL version 2.0
    [ 16.609934] musb-hdrc: setup fifo_mode 4
    [ 16.609952] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [ 16.691327] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [ 16.691356] musb-hdrc: MHDRC RTL version 2.0
    [ 16.691364] musb-hdrc: setup fifo_mode 4
    [ 16.691382] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [ 16.691525] musb-hdrc musb-hdrc.1.auto: MUSB HDRC host driver
    [ 16.802281] musb-hdrc musb-hdrc.1.auto: new USB bus registered, assigned bus number 1
    [ 17.255202] musb-hdrc musb-hdrc.1.auto: <== DevCtl=5d, int_usb=0x10, state a_wait_bcon
    [ 17.497918] usb 1-1: new high-speed USB device number 2 using musb-hdrc
    [ 18.062115] usb 1-1.2: new high-speed USB device number 3 using musb-hdrc
    [ 25.902301] usb 1-1.1: new high-speed USB device number 4 using musb-hdrc
    root@am335x-evm:/#
    root@am335x-evm:/#

    In the failing case we do not get the "<== DevCtl=5d, int_usb=0x10, state a_wait_bcon" print.
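
The comparison between the two captures can be automated so that many boots can be checked quickly. The Python sketch below counts the "failed with status -517" lines (517 is the kernel's EPROBE_DEFER errno, meaning a probe was deferred and will be retried), checks for the "<== DevCtl=..., state a_wait_bcon" debug line, and lists the enumerated devices. The function name summarize_musb_log and the returned keys are hypothetical; the matched strings are taken from the logs above.

```python
import re

EPROBE_DEFER = 517  # kernel errno: probe deferred, will be retried later

NEW_DEVICE_RE = re.compile(r"usb (\S+): new \S+ USB device number (\d+)")


def summarize_musb_log(log_text):
    """Summarize the musb-related lines of one dmesg capture."""
    deferred_probes = 0
    connect_seen = False
    enumerated = []
    for line in log_text.splitlines():
        if f"failed with status -{EPROBE_DEFER}" in line:
            deferred_probes += 1
        # Debug line that appears only in the working capture above.
        if "<== DevCtl=" in line and "a_wait_bcon" in line:
            connect_seen = True
        m = NEW_DEVICE_RE.search(line)
        if m:
            enumerated.append((m.group(1), int(m.group(2))))
    return {
        "deferred_probes": deferred_probes,
        "connect_seen": connect_seen,
        "enumerated": enumerated,
    }


if __name__ == "__main__":
    with open("dmesg_usb.log") as f:  # file name as used in the post above
        print(summarize_musb_log(f.read()))
```

In the two captures above, the failing boot would report six deferred probes, no connect line, and no enumerated devices, while the working boot reports the connect line and three devices. Whether the deferred probes are related to the failure is not established by these logs.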
  • The log shows the connect interrupt is not generated when the issue happens.

    Please run the following test:

    After Linux has booted, each time before you attach the USB device, run the following command
    # grep -i devctl /sys/kernel/debug/musb-hdrc.1.auto/regdump
    and ensure the register reads 0x19.

    After the device is attached and enumerated, the console should print the following log:
    "musb-hdrc musb-hdrc.1.auto: <== DevCtl=5d, int_usb=0x10, state a_wait_bcon"

    After the device is detached,
    # grep -i devctl /sys/kernel/debug/musb-hdrc.1.auto/regdump
    should again show that the register is 0x19.

    Please let me know which step doesn't match your test in the failure case.

    Since you have an on-board hub, please ensure that the power rail to the hub can supply at least 2 A to drive all the downstream USB devices.
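
To make the DevCtl values in these steps easier to read, the following Python sketch splits a DevCtl byte into its bit fields and checks a regdump line against the expected idle value. The masks follow the MUSB_DEVCTL_* definitions in drivers/usb/musb/musb_regs.h as I understand them. The regdump line format ("DevCtl      : 19", hex without a 0x prefix) is an assumption, and the helper names are hypothetical.

```python
import re

# DEVCTL bit masks (drivers/usb/musb/musb_regs.h).
MUSB_DEVCTL_BDEVICE = 0x80  # set: B-device, clear: A-device
MUSB_DEVCTL_FSDEV = 0x40    # full-/high-speed device detected
MUSB_DEVCTL_LSDEV = 0x20    # low-speed device detected
MUSB_DEVCTL_VBUS = 0x18     # two-bit VBUS level field
MUSB_DEVCTL_VBUS_SHIFT = 3
MUSB_DEVCTL_HM = 0x04       # controller is in host mode
MUSB_DEVCTL_HR = 0x02       # host request
MUSB_DEVCTL_SESSION = 0x01  # session active


def decode_devctl(value):
    """Split a DevCtl byte into named fields."""
    return {
        "b_device": bool(value & MUSB_DEVCTL_BDEVICE),
        "fsdev": bool(value & MUSB_DEVCTL_FSDEV),
        "lsdev": bool(value & MUSB_DEVCTL_LSDEV),
        "vbus_level": (value & MUSB_DEVCTL_VBUS) >> MUSB_DEVCTL_VBUS_SHIFT,
        "host_mode": bool(value & MUSB_DEVCTL_HM),
        "host_req": bool(value & MUSB_DEVCTL_HR),
        "session": bool(value & MUSB_DEVCTL_SESSION),
    }


def parse_devctl(regdump_text):
    """Extract DevCtl from regdump text; assumed format 'DevCtl : 19'."""
    m = re.search(
        r"^\s*devctl\s*:\s*(?:0x)?([0-9a-f]{1,2})\s*$",
        regdump_text,
        re.IGNORECASE | re.MULTILINE,
    )
    return int(m.group(1), 16) if m else None


def idle_host_ok(regdump_text, expected=0x19):
    """True if DevCtl matches the idle value requested in the steps above."""
    return parse_devctl(regdump_text) == expected


if __name__ == "__main__":
    print(decode_devctl(0x19))
    print(decode_devctl(0x5D))
```

Under these definitions, 0x19 decodes to an A-device with a session active and the VBUS field at its highest level but not yet in host mode, and 0x5d adds the host-mode and FSDev bits, which matches the transition the steps above expect when a device is attached.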