Linux/WL1831MOD: Kernel crash with WL1831MOD on Variscite VAR-SOM-MX6 module

Part Number: WL1831MOD
Other Parts Discussed in Thread: WL1271, WL1831

Tool/software: Linux

Hello,

One of our customers is experiencing a kernel crash on the VAR-SOM-MX6 module (see https://www.variscite.com/products/system-on-module-som/cortex-a9/var-som-mx6-cpu-freescale-imx6/). The kernel log is below:

Jul 29 00:58:59 var-som-mx6 kernel: wlan0: Limiting TX power to 14 dBm as advertised by 04:fe:7f:93:93:91
Jul 29 01:29:09 var-som-mx6 kernel: wlan0: deauthenticated from 04:fe:7f:93:93:91 (Reason: 2=PREV_AUTH_NOT_VALID)
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: authenticate with 08:d0:9f:b5:2f:a1
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: send auth to 08:d0:9f:b5:2f:a1 (try 1/3)
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: authenticated
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: associate with 08:d0:9f:b5:2f:a1 (try 1/3)
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: RX AssocResp from 08:d0:9f:b5:2f:a1 (capab=0x431 status=0 aid=26)
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: associated
Jul 29 01:29:13 var-som-mx6 kernel: wlcore: Association completed.
Jul 29 01:29:13 var-som-mx6 kernel: wlan0: Limiting TX power to 14 dBm as advertised by 08:d0:9f:b5:2f:a1
Jul 29 01:32:41 var-som-mx6 kernel: ------------[ cut here ]------------
Jul 29 01:32:41 var-som-mx6 kernel: WARNING: CPU: 0 PID: 383 at drivers/net/wireless/ti/wlcore/sdio.c:145 wl12xx_sdio_raw_write+0xb0/0x13c [wlcore_sdio]
Jul 29 01:32:41 var-som-mx6 kernel: Modules linked in: tun binfmt_misc wl18xx wlcore mxc_v4l2_capture ipu_bg_overlay_sdc ipu_still ipu_prp_enc ipu_csi_enc ipu_fg_overlay_sdc ov5640_camera_mipi_int v4l2_int_device wlcore_sdio mxc_dcic ip_tables
Jul 29 01:32:41 var-som-mx6 kernel: CPU: 0 PID: 383 Comm: NetworkManager Not tainted 4.9.11-greyscan #1
Jul 29 01:32:41 var-som-mx6 kernel: Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Jul 29 01:32:41 var-som-mx6 kernel: [<8010f608>] (unwind_backtrace) from [<8010b2b0>] (show_stack+0x10/0x14)
Jul 29 01:32:41 var-som-mx6 kernel: [<8010b2b0>] (show_stack) from [<803b6bf8>] (dump_stack+0x78/0x8c)
Jul 29 01:32:41 var-som-mx6 kernel: [<803b6bf8>] (dump_stack) from [<8012a4e0>] (__warn+0xe8/0x100)
Jul 29 01:32:41 var-som-mx6 kernel: [<8012a4e0>] (__warn) from [<8012a5a8>] (warn_slowpath_null+0x20/0x28)
Jul 29 01:32:41 var-som-mx6 kernel: [<8012a5a8>] (warn_slowpath_null) from [<7f01125c>] (wl12xx_sdio_raw_write+0xb0/0x13c [wlcore_sdio])
Jul 29 01:32:41 var-som-mx6 kernel: [<7f01125c>] (wl12xx_sdio_raw_write [wlcore_sdio]) from [<7f05c014>] (wl1271_ps_elp_wakeup+0x128/0x1dc [wlcore])
Jul 29 01:32:41 var-som-mx6 kernel: [<7f05c014>] (wl1271_ps_elp_wakeup [wlcore]) from [<7f04d868>] (wlcore_op_sta_statistics+0x4c/0xbc [wlcore])
Jul 29 01:32:41 var-som-mx6 kernel: [<7f04d868>] (wlcore_op_sta_statistics [wlcore]) from [<808f5b3c>] (sta_set_sinfo+0x90/0x860)
Jul 29 01:32:41 var-som-mx6 kernel: [<808f5b3c>] (sta_set_sinfo) from [<80909710>] (ieee80211_get_station+0x44/0x5c)
Jul 29 01:32:41 var-som-mx6 kernel: [<80909710>] (ieee80211_get_station) from [<808e297c>] (nl80211_get_station+0x58/0x108)
Jul 29 01:32:41 var-som-mx6 kernel: [<808e297c>] (nl80211_get_station) from [<807a6788>] (genl_rcv_msg+0x258/0x404)
Jul 29 01:32:41 var-som-mx6 kernel: [<807a6788>] (genl_rcv_msg) from [<807a5acc>] (netlink_rcv_skb+0xb4/0xd8)
Jul 29 01:32:41 var-som-mx6 kernel: [<807a5acc>] (netlink_rcv_skb) from [<807a6520>] (genl_rcv+0x24/0x34)
Jul 29 01:32:41 var-som-mx6 kernel: [<807a6520>] (genl_rcv) from [<807a5440>] (netlink_unicast+0x170/0x220)
Jul 29 01:32:41 var-som-mx6 kernel: [<807a5440>] (netlink_unicast) from [<807a5834>] (netlink_sendmsg+0x27c/0x334)
Jul 29 01:32:41 var-som-mx6 kernel: [<807a5834>] (netlink_sendmsg) from [<807590fc>] (sock_sendmsg+0x14/0x24)
Jul 29 01:32:41 var-som-mx6 kernel: [<807590fc>] (sock_sendmsg) from [<8075976c>] (___sys_sendmsg+0x1ec/0x1fc)
Jul 29 01:32:41 var-som-mx6 kernel: [<8075976c>] (___sys_sendmsg) from [<8075a4a4>] (__sys_sendmsg+0x40/0x6c)
Jul 29 01:32:41 var-som-mx6 kernel: [<8075a4a4>] (__sys_sendmsg) from [<80107780>] (ret_fast_syscall+0x0/0x3c)
Jul 29 01:32:41 var-som-mx6 kernel: ---[ end trace 649f2817975dff49 ]---
Jul 29 01:32:41 var-som-mx6 kernel: wl1271_sdio mmc2:0001:2: sdio write failed (-110)
Jul 29 01:32:41 var-som-mx6 kernel: ------------[ cut here ]------------
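
For anyone triaging a similar log: the -110 in the "sdio write failed" message is -ETIMEDOUT on Linux, and the backtrace shows the write being issued from wl1271_ps_elp_wakeup. Below is a minimal Python sketch for pulling those two facts out of a saved kernel log. The function name `summarize` and both regular expressions are illustrative assumptions, not part of the wlcore driver or any TI tool.

```python
import re

# Matches lines such as:
#   "wl1271_sdio mmc2:0001:2: sdio write failed (-110)"
# On Linux (ARM included), errno 110 is ETIMEDOUT.
SDIO_FAIL_RE = re.compile(r"sdio (read|write) failed \((-?\d+)\)")

# Matches the first frame of an ARM backtrace line such as:
#   "[<7f05c014>] (wl1271_ps_elp_wakeup [wlcore]) from [<...>] (...)"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] \((\w+)")


def summarize(log_lines):
    """Return (frames, failures) found in an iterable of kernel log lines.

    frames   -- function names in the order they appear (innermost first)
    failures -- (direction, errno) tuples, e.g. ("write", -110)
    """
    frames, failures = [], []
    for line in log_lines:
        frame = FRAME_RE.search(line)
        if frame:
            frames.append(frame.group(1))
        fail = SDIO_FAIL_RE.search(line)
        if fail:
            failures.append((fail.group(1), int(fail.group(2))))
    return frames, failures
```

Running it over the saved log (for example `summarize(open("kern.log"))`, where the file name is hypothetical) makes it easy to check whether every failure is preceded by the same ELP wakeup path.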

The SDIO parameters of WL1831 are below:

root@var-som-mx6:~# cat /sys/kernel/debug/mmc2/ios
clock:          50000000 Hz
actual clock:   49500000 Hz
vdd:            7 (1.65 - 1.95 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      2 (4 bits)
timing spec:    2 (sd high-speed)
signal voltage: 0 (3.30 V)
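
Since the SDIO clock is changed several times during this investigation, it can help to record the ios values in a comparable form after each change. The following is a minimal Python sketch that parses the text shown above; `parse_ios` and `clock_hz` are hypothetical helper names, and the code assumes only the "key: value" layout visible in this output.

```python
def parse_ios(text):
    """Parse the text of /sys/kernel/debug/mmcN/ios into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and anything without a "key:" prefix.
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def clock_hz(fields, key="clock"):
    """Return the integer Hz value of a field such as '50000000 Hz'."""
    return int(fields[key].split()[0])
```

For example, `clock_hz(parse_ios(open("/sys/kernel/debug/mmc2/ios").read()), "actual clock")` reports the rate the host controller actually settled on, which can differ from the requested one.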

Setting the SDIO clock to 25 MHz does not fix the crash. The VAR-SOM-MX6 is used by numerous customers, but this is the first time we have encountered this problem.

Please advise.

Thanks a lot.

  • The WL18xx firmware version is 8.9.0.0.78. The crash also occurs with the earlier versions R8.7_SP3 and R8.7_SP2. The kernel version is 4.9.11; the wlcore driver code comes from the kernel sources and is synchronized with the latest 4.9.x stable kernel.
  • Hi Felix,

    Do you still see this issue with ELP turned off?

    processors.wiki.ti.com/.../WL18xx_Driver_Debug

    With the latest firmware (I would also try 8.9.0.0.79, which just came out), it should already work OK.
    Can you also share the full log, including what follows the messages you attached here?

    In addition, how often are you seeing this? Are you seeing it on just one board? Do they have another identical board that they can try it on?

    Best Regards,
    Eyal
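
As a rough illustration of what "turning ELP off" can look like at runtime, here is a minimal Python sketch. It rests on two assumptions that should be checked against the WL18xx Driver Debug page linked above and your kernel sources: that wlcore exposes a `sleep_auth` debugfs entry at the path below, and that writing 0 selects the always-active mode (no ELP). The helper name `set_sleep_auth` is hypothetical.

```python
from pathlib import Path

# ASSUMPTION: debugfs is mounted at /sys/kernel/debug, the radio is phy0, and
# wlcore provides a "sleep_auth" entry there. Verify on the target first.
DEFAULT_SLEEP_AUTH = Path("/sys/kernel/debug/ieee80211/phy0/wlcore/sleep_auth")


def set_sleep_auth(value, path=DEFAULT_SLEEP_AUTH):
    """Write a sleep authorization value and return what the file reads back.

    ASSUMPTION: 0 means "no ELP"; other values are driver-specific.
    """
    path = Path(path)
    path.write_text(f"{value}\n")
    return path.read_text().strip()
```

Called as `set_sleep_auth(0)` with root privileges, this would apply only until the next reboot or driver reload; a persistent change would normally go through the driver configuration instead.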
  • Hi Eyal,

    Thanks a lot for the prompt reply.

    The problem is reproducible on several boards within a few hours. I've asked our customer to try disabling ELP and testing new firmware.

    Attached please find the full log.

    Felix.

    kern20180730.debian_stretch.log

  • Just to make sure we are on the same page.

    The following log is still with ELP enabled, is that correct?
    Any results with ELP turned off?
    ...
    Jul 29 01:32:41 var-som-mx6 kernel: ------------[ cut here ]------------
    Jul 29 01:32:41 var-som-mx6 kernel: WARNING: CPU: 0 PID: 383 at drivers/net/wireless/ti/wlcore/sdio.c:145 wl12xx_sdio_raw_write+0xb0/0x13c [wlcore_sdio]
    Jul 29 01:32:41 var-som-mx6 kernel: Modules linked in: tun binfmt_misc wl18xx wlcore mxc_v4l2_capture ipu_bg_overlay_sdc ipu_still ipu_prp_enc ipu_csi_enc ipu_fg_overlay_sdc ov5640_camera_mipi_int v4l2_int_device wlcore_sdio mxc_dcic ip_tables
    Jul 29 01:32:41 var-som-mx6 kernel: CPU: 0 PID: 383 Comm: NetworkManager Not tainted 4.9.11-greyscan #1
    Jul 29 01:32:41 var-som-mx6 kernel: Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8010f608>] (unwind_backtrace) from [<8010b2b0>] (show_stack+0x10/0x14)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8010b2b0>] (show_stack) from [<803b6bf8>] (dump_stack+0x78/0x8c)
    Jul 29 01:32:41 var-som-mx6 kernel: [<803b6bf8>] (dump_stack) from [<8012a4e0>] (__warn+0xe8/0x100)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8012a4e0>] (__warn) from [<8012a5a8>] (warn_slowpath_null+0x20/0x28)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8012a5a8>] (warn_slowpath_null) from [<7f01125c>] (wl12xx_sdio_raw_write+0xb0/0x13c [wlcore_sdio])
    Jul 29 01:32:41 var-som-mx6 kernel: [<7f01125c>] (wl12xx_sdio_raw_write [wlcore_sdio]) from [<7f05c014>] (wl1271_ps_elp_wakeup+0x128/0x1dc [wlcore])
    Jul 29 01:32:41 var-som-mx6 kernel: [<7f05c014>] (wl1271_ps_elp_wakeup [wlcore]) from [<7f04d868>] (wlcore_op_sta_statistics+0x4c/0xbc [wlcore])
    Jul 29 01:32:41 var-som-mx6 kernel: [<7f04d868>] (wlcore_op_sta_statistics [wlcore]) from [<808f5b3c>] (sta_set_sinfo+0x90/0x860)
    Jul 29 01:32:41 var-som-mx6 kernel: [<808f5b3c>] (sta_set_sinfo) from [<80909710>] (ieee80211_get_station+0x44/0x5c)
    Jul 29 01:32:41 var-som-mx6 kernel: [<80909710>] (ieee80211_get_station) from [<808e297c>] (nl80211_get_station+0x58/0x108)
    Jul 29 01:32:41 var-som-mx6 kernel: [<808e297c>] (nl80211_get_station) from [<807a6788>] (genl_rcv_msg+0x258/0x404)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807a6788>] (genl_rcv_msg) from [<807a5acc>] (netlink_rcv_skb+0xb4/0xd8)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807a5acc>] (netlink_rcv_skb) from [<807a6520>] (genl_rcv+0x24/0x34)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807a6520>] (genl_rcv) from [<807a5440>] (netlink_unicast+0x170/0x220)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807a5440>] (netlink_unicast) from [<807a5834>] (netlink_sendmsg+0x27c/0x334)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807a5834>] (netlink_sendmsg) from [<807590fc>] (sock_sendmsg+0x14/0x24)
    Jul 29 01:32:41 var-som-mx6 kernel: [<807590fc>] (sock_sendmsg) from [<8075976c>] (___sys_sendmsg+0x1ec/0x1fc)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8075976c>] (___sys_sendmsg) from [<8075a4a4>] (__sys_sendmsg+0x40/0x6c)
    Jul 29 01:32:41 var-som-mx6 kernel: [<8075a4a4>] (__sys_sendmsg) from [<80107780>] (ret_fast_syscall+0x0/0x3c)
    Jul 29 01:32:41 var-som-mx6 kernel: ---[ end trace 649f2817975dff49 ]---

    BR,
    Eyal
  • Hi Eyal,

    You are correct, the log is with ELP enabled. I am still waiting for the customer to report the results with ELP turned off.

    Felix.

  • Hi Felix,
    Is there any update here?
    Or should I close the thread for now, so that you can re-open it when there is new data?
    Best Regards,
    Eyal
  • Hi Eyal,
    The customer has confirmed that disabling ELP fixes the problem. How do we proceed?
    Felix
  • Hi Felix,

    One strange thing is that when the crash is seen, the driver tries to perform a recovery but fails.
    This should not happen. It looks as if the SDIO bus has simply died and a reset is not helping.
    It could be related to the hardware as well. Have they tried an even lower SDIO clock (for example, 10 MHz)?

    Also, while everything is still working, even with ELP enabled, are they able to perform the following down/up sequence with no issues?
    ifconfig wlan0 down
    ifconfig wlan0 up

    Best Regards,
    Eyal
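
If it is useful to repeat the down/up check many times rather than once, a minimal Python sketch along these lines could drive it. The helper name `cycle_interface` and its `run` parameter are hypothetical; the only commands it issues are the two `ifconfig` calls quoted above.

```python
import subprocess


def cycle_interface(iface="wlan0", count=1, run=subprocess.run):
    """Bring iface down and up `count` times; return the commands issued.

    `run` defaults to subprocess.run and can be replaced for a dry run.
    """
    issued = []
    for _ in range(count):
        for state in ("down", "up"):
            cmd = ["ifconfig", iface, state]
            # check=True makes subprocess.run raise CalledProcessError
            # as soon as one of the commands fails.
            run(cmd, check=True)
            issued.append(cmd)
    return issued
```

Running `cycle_interface("wlan0", count=50)` as root stops at the first failing command, which gives a simple pass/fail answer to the question above.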
  • Hi Eyal,
    Sorry for the delay; I'm currently on vacation with only occasional access to the internet. This platform is used by hundreds of our customers, so a hardware problem is not very likely. Anyway, I've asked the customer to perform the tests you requested and will update you when they respond.
    Felix.
  • Hi Felix,

    If these are the same boards that other customers are using without seeing this issue, could it be one board that is misbehaving?
    Did you suggest replacing their board to see whether the same issue appears?

    Best regards,
    Eyal
  • Hi Eyal,

    The same boards are used by other customers. This particular customer has reproduced the problem on multiple boards. Since the crash goes away when ELP is disabled, this looks like a firmware problem to me. I can ask the customer to provide more details about their environment, i.e. the AP model used, the AP configuration, or anything else you may need to reproduce or debug the problem.
    Felix.
  • Hi Eyal,

    The customer has reproduced the problem on several boards. The problem is not reproducible with ELP enabled and a 10 MHz SDIO clock. Before the crash, "ifconfig wlan0 down" and "ifconfig wlan0 up" work fine.

    Felix.

  • Hi Felix,

    Are you positive they are using our latest firmware? (8.9.0.0.79)
    Can you share their boot logs showing the print of the firmware version?

    Best Regards,
    Eyal
  • Hi Eyal,

    Before I ask the customer to provide more logs, I'd like to understand whether disabling ELP is a final solution or a workaround.

    Thanks.

    Felix.