This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Linux/WL1831MOD: Stability issues on WIFI and Bluetooth

Part Number: WL1831MOD
Other Parts Discussed in Thread: WL1831, WL1271

Tool/software: Linux

Hello TI Support-Team,

we are having some stability issues when using the WL1831MOD. One issue relates to Bluetooth, the others to wlan.
-> Can you tell us, if those are known issues or if you have a solution for that, before we start to spend time on deeper investigation of this errors. <-

Some information of the system we are using:

- Atmel SoC
- Kernel 4.9
- WL-Drivers that are coming with mainline kernel 4.9
- BlueZ 5.47

Bluetooth:
We are seeing timeout messages while communicating with the Bluetooth Part of the WL1831.

Bluetooth: hci0 command 0x1003 tx timeout
Bluetooth: hci0 command 0x1001 tx timeout
Bluetooth: hci0 command 0x1009 tx timeout

Restarting the bluetooth fixes this problem, which means, that we are shutting down the bluetooth-part with the bluetooth enable pin of the module, restarting it and doing a new hciattach. After that, the module works again as expected.

WLAN:
The wireless lan part generated kernel panics. It should be easy to hunt those errors down, but only our customer is seeing this type of errors.

Unable to handle kernel NULL pointer dereference at virtual address 00000018
pgd = c0004000
[00000018] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 6 Comm: kworker/u2:0 Not tainted 4.9.51 #124
Hardware name: Atmel SAMA5
Workqueue: wl12xx_wq wl1271_netstack_work
task: ce4ad000 task.stack: ce4bc000
PC is at ieee80211_scan_rx+0x184/0x198
LR is at 0xcd8394a0
pc : [<c07404e4>]    lr : [<cd8394a0>]    psr: a0010013
sp : ce4bdd20  ip : 00007f00  fp : ce4bde04
r10: 00000000  r9 : 00000000  r8 : 00000000
r7 : ce3903c0  r6 : ca59b380  r5 : ca650480  r4 : c0b02848
r3 : 947ddfdc  r2 : 000058c8  r1 : 000000ed  r0 : b076871c
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c53c7d  Table: 2c40c059  DAC: 00000051
Process kworker/u2:0 (pid: 6, stack limit = 0xce4bc208)
Stack: (0xce4bdd20 to 0xce4be000)
dd20: 000fac01 0a002000 c0b15740 c0b02848 ca66e140 c0587034 ca66ecc0 ca66e140
dd40: 00000040 ca59b380 ce4bdd6c ce4bdd58 c0587034 c0586dc4 ca66e140 8dfa79d7
dd60: ce4bdd8c 8dfa79d7 ca66e140 c0b02848 ca59bc80 cd8394c8 cd8394a0 ce3907f4
dd80: 00000080 c0781b94 ce4bde94 ce4bdd98 c0781b94 c0795c7c 00000000 00000000
dda0: ca66e140 00000000 ce3903c0 00000001 ca59b390 cc7d9040 00000000 00000000
ddc0: cc7d9040 ce3903c0 ce4bddc8 c0782500 ce4bde94 8dfa79d7 c0782500 ce3903c0
dde0: c0b02848 00000000 ca650480 ca59b380 00000000 00000000 ce4bdea4 ce4bde08
de00: c075dec4 c074036c c01fdaf8 c01483f8 ca66ecc0 c0b492f0 00000000 00000000
de20: 00000000 00000000 ce4bde44 c0b0fff8 00000001 c01484a0 c0b0fff8 00000000
de40: ce4ad030 00000001 00000000 ca650480 ce3903c0 00000000 00000000 00000000
de60: 00000000 00000010 00000010 00000000 00000000 8dfa79d7 ce4bdea4 ce390c00
de80: ca650480 ce3903c0 ce390db0 c04b4da4 00000000 c0b6c298 ce4bdecc ce4bdea8
dea0: c04b4dc4 c075d62c ce390e74 ce390c00 ce4a9ac0 ce36f300 00000000 00000000
dec0: ce4bdee4 ce4bded0 c04b4e50 c04b4d8c ce390e74 ce4a3a80 ce4bdf1c ce4bdee8
dee0: c0133d48 c04b4e38 ce4a9ad4 c0b15740 ce4bc020 ce4a9ac0 ce4a3a98 ce4a9ad4
df00: c0b15740 ce4bc020 00000088 ce4a3a80 ce4bdf64 ce4bdf20 c01340d4 c0133c14
df20: c0b15740 c0b15740 ce4a9c1c ce4a9ad4 ce4bc020 c0b6bf14 00000000 00000000
df40: ce4a3b40 ce4a3a80 c0134078 00000000 00000000 00000000 ce4bdfac ce4bdf68
df60: c013a280 c0134084 00000000 c014185c ce4a3a80 00000000 ce4bdf78 ce4bdf78
df80: 00000000 ce4bdf84 ce4bdf84 8dfa79d7 ce4a3b40 c013a16c 00000000 00000000
dfa0: 00000000 ce4bdfb0 c0109438 c013a178 00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 08000004 00000000
[<c07404e4>] (ieee80211_scan_rx) from [<c075dec4>] (ieee80211_rx_napi+0x8a4/0x938)
[<c075dec4>] (ieee80211_rx_napi) from [<c04b4dc4>] (wl1271_flush_deferred_work+0x44/0xac)
[<c04b4dc4>] (wl1271_flush_deferred_work) from [<c04b4e50>] (wl1271_netstack_work+0x24/0x34)
[<c04b4e50>] (wl1271_netstack_work) from [<c0133d48>] (process_one_work+0x140/0x470)
[<c0133d48>] (process_one_work) from [<c01340d4>] (worker_thread+0x5c/0x5b0)
[<c01340d4>] (worker_thread) from [<c013a280>] (kthread+0x114/0x12c)
[<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
Code: e022200c e0233000 e1923003 0affffc3 (e5993018)
---[ end trace 203a866ff7595e8c ]---
Kernel panic - not syncing: Fatal exception in interrupt
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Another error are some strange messages from the WL18-driver.

Tue Oct 17 13:17:55 2017] wlan0: associated
[Tue Oct 17 13:17:56 2017] wlcore: Association completed.
[Tue Oct 17 13:18:04 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:18:10 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:18:13 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:18:45 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:19:47 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:20:54 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:21:59 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:23:07 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:24:11 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:25:16 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:26:22 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:27:30 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:28:34 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:29:40 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:30:47 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:31:54 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:32:59 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:34:07 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:35:12 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:36:17 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:37:24 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:38:31 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:39:38 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:40:42 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:41:48 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:42:53 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:45:04 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:47:14 2017] wlcore: WARNING no fw rx ba on tid 1
[Tue Oct 17 13:49:26 2017] wlcore: WARNING no fw rx ba on tid 1

Can you tell something about that warning? Has that impact on the functionallity of the module? Can that cause stability-issues?

Thanks in advance for your answers.

Best regards,
Frank Erdrich

  • Hi Frank,

    Acouple of questions regarding versions.

    1. Can you make sure you are using the latest wl18xx firmware:

    It should now be Fw 8.9.0.0.77

    2. Are you using the lastest init script for bluetooth?:

    Best Regards,

    Eyal

  • Hello Eyal,

    my colleague is out of office today and I have been requested to ask for the relase notes of the WiFi firmware update and the Erratas which have been fixed in it and which are Known. The wiki and product pages are not on outdated on that point.

    BR
    Jan
  • Hello Eyal,

    thanks for your answer.

    We are using the latest Firmware for the Bluetooth part. It's in our system July 2017. So there seems no newer Version than that from Jan. 2017, is that correct?

    The new WLAN firmware version 8.9.0.0.78 is under test now.

  • Hello Eyal,

    can you PLEASE send us the requested change notes for the new firmware versions 8.9.0.0.77 and 8.9.0.0.78!!

    The stability of the system seems not the best overall. I saw a kernel trace on three different systems.The strange thing is, that it happened on all three devices at the same time. It seems, that the access point is somehow involved into that, but I can't see any reason that could cause such a failure. Can you please check that and give your expertise?

    wlcore: ERROR SW watchdog interrupt received! starting recovery.
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 196 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    Modules linked in:
    CPU: 0 PID: 196 Comm: irq/93-wl18xx Not tainted 4.9.51 #124
    Hardware name: Atmel SAMA5
    [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    ---[ end trace 1fe0ede2d80e131a ]---

    After that, the behaviour of the systems is different, so no same events at the same time. I would conclude, that the access point is working as expected, but the systems with the WL18MOD don't. To be sure for that, we will test with some other access points.

    It happens on firmware version 8.9.0.0.75 as well as on 8.9.0.0.78.

  • Hi,

    Can you please share the full kernel log with the full trace?
    How often is this seen?
    In addition. Is this happening only with a specific AP? Which one is This?
    Can you try another one, just to make sure?

    Best Regards,
    Eyal
  • Hello Eyal,

    we see this error with different APs. But its the first time we are seeing it within our company, Before it was only seen at our customers office. We are trying to hunt it down to a specific environment, but thats not as easy as we thought. We are using a TP-Link TL-WR841N. It worked around a week without problems and then there was one event that caused all systems to collectively generate the kernel trace. Following is a mostly complete system log (I trimmed the system boot and some repeating messages). As you can see, the system ran quite a long time before it thougth it should do a crash. The logs on the other systems look almost the same (timing differences).

    [   37.840000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
    [   39.150000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [   39.400000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [   39.460000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
    [   42.150000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
    [   47.570000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [   47.600000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [   47.640000] wlan0: authenticated
    [   47.740000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [   47.740000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=1)
    [   47.810000] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
    [   47.810000] wlan0: associated
    [   48.270000] wlcore: Association completed.
    [   90.250000] atmel_usart f001c000.serial: using dma0chan5 for rx DMA transfers
    [   90.250000] atmel_usart f001c000.serial: using dma0chan6 for tx DMA transfers
    [77713.730000] wlan0: deauthenticated from 98:de:d0:fe:2a:94 (Reason: 3=DEAUTH_LEAVING)
    [77717.080000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [77717.110000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [77717.180000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 2/3)
    [77717.230000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 3/3)
    [77717.280000] wlan0: authentication with 98:de:d0:fe:2a:94 timed out
    [77729.670000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
    [77757.480000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [77757.500000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [77757.530000] wlan0: authenticated
    [77757.550000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [77757.650000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [77757.670000] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
    [77757.680000] wlan0: associated
    [77757.760000] wlcore: Association completed.
    [381468.690000] wlcore: ERROR SW watchdog interrupt received! starting recovery.
    [381468.700000] ------------[ cut here ]------------
    [381468.700000] WARNING: CPU: 0 PID: 213 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    [381468.710000] Modules linked in:
    [381468.710000] CPU: 0 PID: 213 Comm: irq/93-wl18xx Not tainted 4.9.51 #124
    [381468.720000] Hardware name: Atmel SAMA5
    [381468.720000] [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [381468.730000] [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [381468.730000] [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [381468.740000] [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [381468.750000] [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [381468.760000] [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [381468.760000] [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [381468.770000] [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [381468.780000] [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [381468.780000] [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    [381468.790000] ---[ end trace cc559f3d83d60088 ]---
    [381468.800000] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.78
    [381468.800000] wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 1
    [381468.820000] wlcore: down
    [381468.820000] wlcore: down
    [381468.820000] ieee80211 phy0: Hardware restart was requested
    [381469.320000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [381469.610000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [381469.660000] wlcore: Association completed.
    [381669.360000] wlan0: deauthenticated from 98:de:d0:fe:2a:94 (Reason: 2=PREV_AUTH_NOT_VALID)
    [381672.750000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [381672.790000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [381672.810000] wlan0: authenticated
    [381672.830000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [381672.850000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [381672.880000] wlan0: associated
    [381672.970000] wlcore: Association completed.
    [397484.390000] wlcore: ERROR SW watchdog interrupt received! starting recovery.
    [397484.390000] ------------[ cut here ]------------
    [397484.400000] WARNING: CPU: 0 PID: 213 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    [397484.410000] Modules linked in:
    [397484.410000] CPU: 0 PID: 213 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    [397484.420000] Hardware name: Atmel SAMA5
    [397484.420000] [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [397484.420000] [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [397484.430000] [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [397484.440000] [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [397484.440000] [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [397484.450000] [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [397484.460000] [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [397484.470000] [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [397484.470000] [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [397484.480000] [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    [397484.480000] ---[ end trace cc559f3d83d60089 ]---
    [397484.500000] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.78
    [397484.510000] wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 2
    [397484.520000] wlcore: down
    [397484.520000] wlcore: down
    [397484.520000] ieee80211 phy0: Hardware restart was requested
    [397485.040000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [397485.350000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [397485.390000] wlcore: Association completed.
    [397507.110000] wlcore: ERROR SW watchdog interrupt received! starting recovery.
    [397507.120000] ------------[ cut here ]------------
    [397507.120000] WARNING: CPU: 0 PID: 213 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    [397507.130000] Modules linked in:
    [397507.130000] CPU: 0 PID: 213 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    [397507.140000] Hardware name: Atmel SAMA5
    [397507.140000] [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [397507.150000] [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [397507.150000] [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [397507.160000] [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [397507.160000] [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [397507.170000] [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [397507.180000] [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [397507.190000] [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [397507.190000] [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [397507.200000] [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    [397507.210000] ---[ end trace cc559f3d83d6008a ]---
    [397507.220000] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.78
    [397507.220000] wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 3
    [397507.240000] wlcore: down
    [397507.240000] wlcore: down
    [397507.240000] ieee80211 phy0: Hardware restart was requested
    [397507.740000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [397508.050000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [397508.090000] wlcore: Association completed.
    [397518.470000] wlcore: ERROR SW watchdog interrupt received! starting recovery.
    [397518.480000] ------------[ cut here ]------------
    [397518.480000] WARNING: CPU: 0 PID: 213 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    [397518.490000] Modules linked in:
    [397518.490000] CPU: 0 PID: 213 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    [397518.500000] Hardware name: Atmel SAMA5
    [397518.500000] [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [397518.510000] [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [397518.510000] [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [397518.520000] [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [397518.520000] [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [397518.530000] [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [397518.540000] [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [397518.550000] [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [397518.550000] [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [397518.560000] [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    [397518.570000] ---[ end trace cc559f3d83d6008b ]---
    [397518.580000] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.78
    [397518.580000] wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 4
    [397518.600000] wlcore: down
    [397518.600000] wlcore: down
    [397518.600000] ieee80211 phy0: Hardware restart was requested
    [397519.100000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [397519.390000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [397519.440000] wlcore: Association completed.
    [439176.670000] wlan0: disassociated from 98:de:d0:fe:2a:94 (Reason: 34)
    [439180.260000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [439180.300000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [439180.330000] wlan0: authenticated
    [439180.350000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [439180.350000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    --
    [454765.570000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [454765.600000] wlan0: authenticated
    [454765.670000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [454765.670000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [454765.720000] wlan0: associated
    [454765.780000] wlcore: Association completed.
    [455361.340000] wlan0: disassociated from 98:de:d0:fe:2a:94 (Reason: 34)
    [455364.930000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [455364.990000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [455365.030000] wlan0: authenticated
    [455365.040000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [455365.050000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [455365.090000] wlan0: associated
    [455365.140000] wlcore: Association completed.
    [458838.050000] wlan0: disassociated from 98:de:d0:fe:2a:94 (Reason: 34)
    [458841.640000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [458841.670000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [458841.700000] wlan0: authenticated
    [458841.730000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [458841.740000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [458841.800000] wlan0: associated
    [458841.860000] wlcore: Association completed.
    [459557.380000] wlan0: disassociated from 98:de:d0:fe:2a:94 (Reason: 34)
    [459561.020000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [459561.040000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [459561.080000] wlan0: authenticated
    [459561.090000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [459561.200000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)
    [459561.230000] wlan0: associated
    [459561.310000] wlcore: Association completed.
    [459562.090000] wlcore: ERROR SW watchdog interrupt received! starting recovery.
    [459562.100000] ------------[ cut here ]------------
    [459562.100000] WARNING: CPU: 0 PID: 213 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    [459562.110000] Modules linked in:
    [459562.120000] CPU: 0 PID: 213 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    [459562.120000] Hardware name: Atmel SAMA5
    [459562.120000] [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [459562.130000] [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [459562.140000] [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [459562.140000] [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [459562.150000] [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [459562.160000] [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [459562.170000] [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [459562.170000] [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [459562.180000] [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [459562.180000] [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    [459562.190000] ---[ end trace cc559f3d83d6008c ]---
    [459562.200000] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.78
    [459562.200000] wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 5
    [459562.220000] wlcore: down
    [459562.220000] wlcore: down
    [459562.220000] ieee80211 phy0: Hardware restart was requested
    [459562.710000] wlcore: PHY firmware version: Rev 8.2.0.0.242
    [459563.020000] wlcore: firmware booted (Rev 8.9.0.0.78)
    [459563.060000] wlcore: Association completed.
    [460036.890000] wlan0: disassociated from 98:de:d0:fe:2a:94 (Reason: 34)
    [460040.480000] wlan0: authenticate with 98:de:d0:fe:2a:94
    [460040.510000] wlan0: send auth to 98:de:d0:fe:2a:94 (try 1/3)
    [460040.540000] wlan0: authenticated
    [460040.560000] wlan0: associate with 98:de:d0:fe:2a:94 (try 1/3)
    [460040.570000] wlan0: RX AssocResp from 98:de:d0:fe:2a:94 (capab=0x431 status=0 aid=3)

    At the moment we are setting up a testing system with three more APs, but I think that we aren't seeing an error within the first hours.

  • Hi,

    We haven't seen this specific error before:
    wlcore: pc: 0x18f7e, hint_sts: 0x00000000 count: 5

    Have you seen the exact same crash as well when you have used your previous firmware (8.9.0.0.75)?

    BR,
    Eyal
  • Hello Eyal,

    yep, we see that also with the older firmware. The strange thing is that three systems had thrown the same error at exactly the same time.

    wlcore: ERROR SW watchdog interrupt received! starting recovery.
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 196 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    Modules linked in:
    CPU: 0 PID: 196 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    Hardware name: Atmel SAMA5
    [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    ---[ end trace 1fe0ede2d80e131e ]---
    wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.75
    wlcore: pc: 0x18e62, hint_sts: 0x00000000 count: 5
    wlcore: down
    wlcore: down
    ieee80211 phy0: Hardware restart was requested
    Feb 17 21:06:59 kernel: wlcore: ERROR SW watchdog interrupt received! starting recovery.
    Feb 17 21:06:59 kernel: ------------[ cut here ]------------
    Feb 17 21:06:59 kernel: WARNING: CPU: 0 PID: 196 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    Feb 17 21:06:59 kernel: Modules linked in:
    Feb 17 21:06:59 kernel: CPU: 0 PID: 196 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    Feb 17 21:06:59 kernel: Hardware name: Atmel SAMA5
    Feb 17 21:06:59 kernel: [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    Feb 17 21:06:59 kernel: [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    Feb 17 21:06:59 kernel: [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    Feb 17 21:06:59 kernel: [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    Feb 17 21:06:59 kernel: [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    Feb 17 21:06:59 kernel: [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    Feb 17 21:06:59 kernel: [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    Feb 17 21:06:59 kernel: [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    Feb 17 21:06:59 kernel: [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    wlcore: PHY firmware version: Rev 8.2.0.0.240
    Feb 17 21:06:59 kernel: [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    Feb 17 21:06:59 kernel: ---[ end trace 1fe0ede2d80e131e ]---
    Feb 17 21:06:59 kernel: wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.75
    Feb 17 21:06:59 kernel: wlcore: pc: 0x18e62, hint_sts: 0x00000000 count: 5
    Feb 17 21:06:59 kernel: wlcore: down
    Feb 17 21:06:59 kernel: wlcore: down
    Feb 17 21:06:59 kernel: ieee80211 phy0: Hardware restart was requested
    wlcore: firmware booted (Rev 8.9.0.0.75)
    Feb 17 21:07:00 kernel: wlcore: PHY firmware version: Rev 8.2.0.0.240
    Feb 17 21:07:00 kernel: wlcore: firmware booted (Rev 8.9.0.0.75)
    wlcore: Association completed.
    
    wlcore: ERROR SW watchdog interrupt received! starting recovery.
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 196 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    Modules linked in:
    CPU: 0 PID: 196 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    Hardware name: Atmel SAMA5
    [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    ---[ end trace 1fe0ede2d80e131f ]---
    wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.75
    wlcore: pc: 0x18e62, hint_sts: 0x00000000 count: 6
    wlcore: down
    wlcore: down
    ieee80211 phy0: Hardware restart was requested
    Feb 19 05:22:56 kernel: wlcore: ERROR SW watchdog interrupt received! starting recovery.
    Feb 19 05:22:56 kernel: ------------[ cut here ]------------
    Feb 19 05:22:56 kernel: WARNING: CPU: 0 PID: 196 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work.part.10+0x74/0x78
    Feb 19 05:22:56 kernel: Modules linked in:
    Feb 19 05:22:56 wlcore: PHY firmware version: Rev 8.2.0.0.240
    kernel: CPU: 0 PID: 196 Comm: irq/93-wl18xx Tainted: G        W       4.9.51 #124
    Feb 19 05:22:56 kernel: Hardware name: Atmel SAMA5
    Feb 19 05:22:56 kernel: [<c0110570>] (unwind_backtrace) from [<c010d488>] (show_stack+0x20/0x24)
    Feb 19 05:22:56 kernel: [<c010d488>] (show_stack) from [<c0390c7c>] (dump_stack+0x24/0x28)
    Feb 19 05:22:56 kernel: [<c0390c7c>] (dump_stack) from [<c011a90c>] (__warn+0xf4/0x10c)
    Feb 19 05:22:56 kernel: [<c011a90c>] (__warn) from [<c011aa38>] (warn_slowpath_null+0x30/0x38)
    Feb 19 05:22:56 kernel: [<c011aa38>] (warn_slowpath_null) from [<c04baa5c>] (wl12xx_queue_recovery_work.part.10+0x74/0x78)
    Feb 19 05:22:56 kernel: [<c04baa5c>] (wl12xx_queue_recovery_work.part.10) from [<c04bae08>] (wlcore_irq+0x220/0x250)
    Feb 19 05:22:56 kernel: [<c04bae08>] (wlcore_irq) from [<c0159814>] (irq_thread_fn+0x2c/0x44)
    Feb 19 05:22:56 kernel: [<c0159814>] (irq_thread_fn) from [<c0159ae0>] (irq_thread+0x144/0x1fc)
    Feb 19 05:22:56 kernel: [<c0159ae0>] (irq_thread) from [<c013a280>] (kthread+0x114/0x12c)
    Feb 19 05:22:56 kernel: [<c013a280>] (kthread) from [<c0109438>] (ret_from_fork+0x14/0x3c)
    Feb 19 05:22:56 kernel: ---[ end trace 1fe0ede2d80e131f ]---
    Feb 19 05:22:56 kernel: wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.75
    Feb 19 05:22:56 kernel: wlcore: pc: 0x18e62, hint_sts: 0x00000000 count: 6
    Feb 19 05:22:56 kernel: wlcore: down
    Feb 19 05:22:56 kernel: wlcore: down
    Feb 19 05:22:56 kernel: ieee80211 phy0: Hardware restart was requested
    wlcore: firmware booted (Rev 8.9.0.0.75)
    Feb 19 05:22:56 kernel: wlcore: PHY firmware version: Rev 8.2.0.0.240
    Feb 19 05:22:57 kernel: wlcore: firmware booted (Rev 8.9.0.0.75)

  • Hi,

    We have confirmed with our R&D and this issue is not seen on any of our platforms so there is no quick solution here.
    Unless I am mistaken the issue takes very long time to reproduce( you mentioned a week) and then it seems the system immediately recovers.
    So I am not sure how critical it really is.

    Are you able to find a test scenario that allows reproducing this issue? Without seeing it on our test setups, there is not much we can do right now.

    BR,
    Eyal
  • Yep, the error is very hard to reproduce. I'm running around 8 systems in parallel and none of them has shown the error for at least 9 or 10 days. We are also using different access points to check the error is triggered by some faulty implementation of a wireless stack. But nothing to see until now. The one event on a friday evening seemed to be a one time event and hasn't shown up since then.

    The other error with the NULL-pointer is tracked down and fixed. It's a problem where the ieee80211_sub_if_data struct has a value, but the corresponding sched_scan_req is NULL. The fix seems to be available since 4.15rc1 although we have Implemented our own solution (we found the mainline fix a little bit too late...).

    But what about the changelog of the wlan firmware? We have asked several times for that and didn't get an answer until now.


    Best Regards,
    Frank Erdrich

  • Hi Frank,

    Thank you for the update.

    Can you point me to the patch that is part of 4.15RC1 that fixed this issue?

    I want to verify whether it is part of our latest internal release or not...

    Regarding the changelog, there is no specific change log for the wlan firmware (wl18xx-fw-4.bin) if that is what you are asking for.

    All the relevant information about the fixes that wer made to the version as whole is part of the release notes.

    R8.7-SP2:
    software-dl.ti.com/.../release_notes_R8_7_SP2.html
    R8.7_SP3:
    software-dl.ti.com/.../release_notes_R8_7_SP3.html



    What specifically are you looking for?

    Best Regards,
    Eyal

  • Hello Eyal,

    the NULL pointer was fixed here:
    git.kernel.org/.../scan.c

    That fix was not explicitely made for a NULL pointer problem, but it fixes it.

    Your Link to SP3 contains firmware version 8.9.0.0.76, but at the moment we are at 8.9.0.0.78. So there must be a changelog somewhere about what happened between 8.9.0.0.76 and 8.9.0.0.78. If you can't post that into this forum due to NDA stuff, we can exchange that information directly.
  • Hi Frank,

    Thank you for the pointer to this fix. Will look into it.

    8.9.0.0.78 is not released yet as an official release hence there is no documentation for it yet.
    In general the fix in this version is related to entering ELP (low power) during idle connection/slow ping.
    After a long time (could be hours/days) a firmware recovery was seen.
    The issue was not seen if ELP was disabled.

    Best Regards,
    Eyal