Linux/WL1835MOD: Wl18xx linux packet data rate slow

Part Number: WL1835MOD
Other Parts Discussed in Thread: WL1271

Tool/software: Linux

Hi, We have a wl1805 on our own board. The wifi link is up with good quality:

~ # iw wlan0 info
Interface wlan0
        ifindex 3
        wdev 0x1
        addr 50:65:83:e1:6c:c6
        type managed
        wiphy 0
        channel 4 (2427 MHz), width: 40 MHz, center1: 2437 MHz
~ #
~ # iw wlan0 link
Connected to f8:d1:11:4c:70:fa (on wlan0)
        SSID: DemoAP
        freq: 2427
        RX: 2302 bytes (38 packets)
        TX: 1422 bytes (14 packets)
        signal: -45 dBm
        tx bitrate: 108.0 MBit/s MCS 5 40MHz

        bss flags:      short-preamble short-slot-time
        dtim period:    1
        beacon int:     100

However, when I mounted an NFS volume and copied files, the packet data transfer rate stayed below 1 Mbit/s. The rate is measured as the size of the files transferred over a period of time: it takes about 2 seconds to transfer a file of 200k bytes, and 170 seconds for a directory of 20M bytes of files. For comparison, we tested a USB Wi-Fi dongle, which achieves around 10 to 20 Mbit/s.

How can this issue be debugged or troubleshot?
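
As a rough check of the figures above, the arithmetic can be sketched in Python. This assumes decimal sizes (200k = 200,000 bytes, 20M = 20,000,000 bytes), which the post does not state; `mbit_per_s` is a hypothetical helper name.

```python
def mbit_per_s(num_bytes: int, seconds: float) -> float:
    """Average rate in megabits per second (1 Mbit = 1,000,000 bits)."""
    return num_bytes * 8 / seconds / 1_000_000

# Assumption: sizes are decimal, as explained in the lead-in.
single_file = mbit_per_s(200_000, 2)       # one 200k-byte file in about 2 s
directory = mbit_per_s(20_000_000, 170)    # 20M bytes of files in about 170 s

print(f"single file: {single_file:.2f} Mbit/s")
print(f"directory:   {directory:.2f} Mbit/s")
```

Both cases come out a little under 1 Mbit/s, which is consistent with the "below 1M bits per second" estimate.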

The firmware version is:

$ strings wl18xx-fw-4.bin | grep -i rev
FRev 8.9.0.0.69
Rev 8.2.0.0.236
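
If `strings` is not available on the target, the same check can be approximated in Python. This is only a sketch: it looks for runs of at least four printable ASCII characters (the usual `strings` default) that contain "rev", and `find_rev_strings` is a hypothetical helper name, not part of any TI tool.

```python
import re
import sys

# Runs of 4 or more printable ASCII bytes, roughly what `strings` reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")

def find_rev_strings(data: bytes) -> list:
    """Return printable strings in data that contain 'rev' (case-insensitive)."""
    found = []
    for match in PRINTABLE_RUN.finditer(data):
        text = match.group().decode("ascii")
        if "rev" in text.lower():
            found.append(text)
    return found

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 fwrev.py wl18xx-fw-4.bin
    with open(sys.argv[1], "rb") as fw:
        for line in find_rev_strings(fw.read()):
            print(line)
```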

The driver is ported from Debian 9.2 for BeagleBoneBlack Wireless, with kernel 4.9.37-ti-r47.

  • A file copy is not the way to measure Wi-Fi data throughput, as it involves other parts of the system that are not directly related.
    What numbers do you get when running iperf UDP/TCP transfers between your platform and another platform (a PC?) connected to the same AP?

    Best Regards,
    Eyalk
  • Eyalk,

    The iperf testing is not done yet. I'll update you once we get the numbers. 

    Before getting to iperf, are there configurations or parameters that can be tuned to affect performance?

    Thanks.

  • Not really.
    What usually impacts performance is the Wi-Fi environment (a noisy channel, etc.), and this would show up in UDP iperf testing as well.

    Best Regards,
    Eyal
  • Eyal, Thanks for the great insight! It will take some time to build the user space. Once done I'll post the iperf result.
  • Hi, Eyalk,

    The test was run from the target, through a Wi-Fi router, to a PC connected to a copper port of the router.

    Here is the tcp test result on the target side:

    -----------------------------------------------------------
    Server listening on 5201
    -----------------------------------------------------------
    Accepted connection from 192.168.7.101, port 46840
    [  5] local 192.168.7.199 port 5201 connected to 192.168.7.101 port 46842
    [ ID] Interval           Transfer     Bitrate
    [  5]   0.00-1.00   sec   762 KBytes  6.24 Mbits/sec
    [  5]   1.00-2.00   sec   178 KBytes  1.46 Mbits/sec
    [  5]   2.00-3.00   sec  69.3 KBytes   568 Kbits/sec
    [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
    [  5]   4.00-5.00   sec  12.7 KBytes   104 Kbits/sec
    [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
    [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
    [  5]   7.00-8.00   sec  29.7 KBytes   243 Kbits/sec
    [  5]   8.00-9.00   sec  7.07 KBytes  57.9 Kbits/sec
    [  5]   9.00-10.00  sec  12.7 KBytes   104 Kbits/sec
    [  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec
    [  5]  11.00-11.51  sec  0.00 Bytes  0.00 bits/sec
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate
    [  5]   0.00-11.51  sec  1.05 MBytes   763 Kbits/sec                  receiver
    -----------------------------------------------------------

    And the udp result:

    -----------------------------------------------------------
    Server listening on 5201
    -----------------------------------------------------------
    Accepted connection from 192.168.7.101, port 46848
    [  5] local 192.168.7.199 port 5201 connected to 192.168.7.101 port 41607
    [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
    [  5]   0.00-1.00   sec   122 KBytes   996 Kbits/sec  397191704.485 ms  0/86 (0%)
    [  5]   1.00-2.00   sec   129 KBytes  1.05 Mbits/sec  1117865.517 ms  0/91 (0%)
    [  5]   2.00-3.00   sec   129 KBytes  1.05 Mbits/sec  3148.219 ms  0/91 (0%)
    [  5]   3.00-4.00   sec   127 KBytes  1.04 Mbits/sec  12.769 ms  0/90 (0%)
    [  5]   4.00-5.00   sec   129 KBytes  1.05 Mbits/sec  3.998 ms  0/91 (0%)
    [  5]   5.00-6.00   sec   129 KBytes  1.05 Mbits/sec  5.753 ms  0/91 (0%)
    [  5]   6.00-7.00   sec   127 KBytes  1.04 Mbits/sec  3.557 ms  0/90 (0%)
    [  5]   7.00-8.00   sec   129 KBytes  1.05 Mbits/sec  3.292 ms  0/91 (0%)
    [  5]   8.00-9.00   sec   127 KBytes  1.04 Mbits/sec  3.881 ms  0/90 (0%)
    [  5]   9.00-10.00  sec   126 KBytes  1.03 Mbits/sec  1.680 ms  0/89 (0%)
    [  5]  10.00-10.05  sec  8.48 KBytes  1.38 Mbits/sec  3.902 ms  0/6 (0%)
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
    [  5]   0.00-10.05  sec  1.25 MBytes  1.04 Mbits/sec  3.902 ms  0/906 (0%)  receiver
    -----------------------------------------------------------

    Do you see any obvious problems?

    Thanks.
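
    One way to read the TCP numbers above is to total the per-second intervals and count how many moved no data at all. The sketch below does that for the lines copied from the TCP receiver output; it assumes iperf3's convention that 1 KByte is 1024 bytes and 1 Kbit is 1000 bits, and `parse` is a hypothetical helper name.

```python
import re

# Per-second lines copied from the TCP receiver output above.
TCP_LINES = """\
[  5]   0.00-1.00   sec   762 KBytes  6.24 Mbits/sec
[  5]   1.00-2.00   sec   178 KBytes  1.46 Mbits/sec
[  5]   2.00-3.00   sec  69.3 KBytes   568 Kbits/sec
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   4.00-5.00   sec  12.7 KBytes   104 Kbits/sec
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
[  5]   7.00-8.00   sec  29.7 KBytes   243 Kbits/sec
[  5]   8.00-9.00   sec  7.07 KBytes  57.9 Kbits/sec
[  5]   9.00-10.00  sec  12.7 KBytes   104 Kbits/sec
[  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec
[  5]  11.00-11.51  sec  0.00 Bytes  0.00 bits/sec
"""

UNIT = {"Bytes": 1, "KBytes": 1024, "MBytes": 1024 ** 2}
LINE = re.compile(
    r"(\d+\.\d+)-(\d+\.\d+)\s+sec\s+([\d.]+)\s+(Bytes|KBytes|MBytes)"
)

def parse(text):
    """Return a list of (start_s, end_s, bytes) for each interval line."""
    rows = []
    for m in LINE.finditer(text):
        start, end, amount, unit = m.groups()
        rows.append((float(start), float(end), float(amount) * UNIT[unit]))
    return rows

rows = parse(TCP_LINES)
total_bytes = sum(b for _, _, b in rows)
duration = rows[-1][1] - rows[0][0]
stalled = sum(1 for _, _, b in rows if b == 0)

print(f"average: {total_bytes * 8 / duration / 1000:.0f} Kbit/s")
print(f"intervals with no data: {stalled} of {len(rows)}")
```

    This reproduces the 763 Kbits/sec summary and shows that 5 of the 12 intervals carried no data, so the TCP stream is stalling rather than running at a steady low rate. Separately, the steady 1.04 Mbits/sec in the UDP run matches iperf3's default UDP target bitrate of 1 Mbit/s when no -b option is given, so that figure may reflect the sender's cap rather than the link.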

  • Another problem: when running the TCP test, it sometimes panics with this backtrace:

    [ 816.852335] [<ffffff80007fb1b0>] wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [ 816.859700] [<ffffff80006e5934>] wl1271_ps_elp_wakeup+0x134/0x1f8 [wlcore]

    What could be the cause? An unstable power supply on the wl18xx link? Or does the kernel driver just need to handle it more gracefully? Probably both need work?

    Thanks.

  • Hello,

    For the ELP-related issue you have seen above, please upgrade your firmware file (wl18xx-fw-4.bin) to the latest one from here:

    git.kernel.org/.../ti-connectivity

    The latest is now Fw rev 8.9.0.0.79. You are still using FRev 8.9.0.0.69.

    With regard to UDP testing, what command did you use on the client side?
    Something like this?
    iperf -c 192.168.7.199 -b100M -t6000 -i2

    If you did, the results don't look good.
    It looks as if your wl18xx-conf.bin file is not set correctly for the module used, the number of antennas connected, etc.

    Have you used wlconf to tune your wl18xx-conf.bin for the hardware module you are using?
    www.ti.com/.../swra489.pdf

    See section 2.

    Best Regards,
    Eyal
  • Hi,

    Do you have any update on this issue?
    Should we keep it open for now?

    BR,
    Eyal
  • Hi, Eyal, The system did not send a notification two days ago when you asked about updates. I just came back to ask more questions and noticed this.

    As to the ELP issue: I did the firmware update to 8.9.0.0.79 and it is more stable now.

    But we are still seeing the same dump once in a while. What could be a cause of the ELP issue? Could noise on the SDIO bus cause it?

    I'm currently disabling ELP with two hacks in ps.c:

    int wl1271_ps_set_mode(struct wl1271 *wl, struct wl12xx_vif *wlvif,
    		       enum wl1271_cmd_ps_mode mode)
    {
    	int ret;
    	u16 timeout = wl->conf.conn.dynamic_ps_timeout;
    
    	switch (mode) {
    	case STATION_AUTO_PS_MODE:
    	case STATION_POWER_SAVE_MODE:
    #if 1
    		if (wl18xx_get_fw_elp_dis()) {
    			/* same as default: case */
    			wl1271_warning("trying to set ps to unsupported mode %d enforced elp dis", mode);
    			ret = -EINVAL;
    			break;
    		} else {
    #endif
    			.../*original code of this switch case goes here*/
    #if 1
    		}
    #endif
    

    and

    void wl1271_ps_elp_sleep(struct wl1271 *wl)
    {
    	struct wl12xx_vif *wlvif;
    	u32 timeout;
    
    	/* We do not enter elp sleep in PLT mode */
    	if (wl->plt)
    		return;
    
    	if (wl->sleep_auth != WL1271_PSM_ELP)
    		return;
    
    #if 1
    	if (wl18xx_get_fw_elp_dis())
    		return;
    #endif
    

    Would this be a decent work-around? Can you share some detail about how the updated firmware improves here?

    From the source code, the dump shows up when the kernel wakes the wl18xx from sleep, and the backtrace is printed as a warning. Is it only a harmless warning, or does it have to be resolved in our design/development?

    Now we are seeing another issue: after running for a longer time, e.g. 1 hour or a few days, the Linux console starts to print messages about corrupted packets in RX, like:

     [ 3448.375861] wlcore: WARNING corrupted packet in RX: status: 0x1 len: 183
     [ 3452.757950] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 60
     [ 3455.214661] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 60
     [ 3458.906306] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 241
    

    Do you think the two issues have the same cause?

    I have not done the wlconf part. Could that be a cause of the above two issues?

    Looking forward to your further guidance!

    Thank you for the help!

  • A little more explanation of my code: the wl18xx_get_fw_elp_dis() function would always return 1 in my code. I set up a module parameter so I can switch it on and off to test the different logic paths.
  • Can you share the dump that you still see once in a while? How often do you see it?

    If you want to disable ELP for testing, it is better to do it this way:
    processors.wiki.ti.com/.../WL18xx_Driver_Debug

    No need for changing the code itself.

    The corrupted packet messages that you are seeing are only warnings, indicating a packet that was not received correctly and is being discarded. It could even be noise or some other interference. It is not a fatal error.

    Best Regards,
    Eyal
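
    To judge how often these warnings occur, the kernel log lines can be tallied with a short script. The sketch below uses the four lines posted earlier in the thread as sample input; `summarize` is a hypothetical helper name, and the meaning of the individual status bits is not covered here.

```python
import re
from collections import Counter

# Sample input: the warning lines posted earlier in this thread.
LOG = """\
[ 3448.375861] wlcore: WARNING corrupted packet in RX: status: 0x1 len: 183
[ 3452.757950] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 60
[ 3455.214661] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 60
[ 3458.906306] wlcore: WARNING corrupted packet in RX: status: 0x3 len: 241
"""

PATTERN = re.compile(
    r"\[\s*(\d+\.\d+)\] wlcore: WARNING corrupted packet in RX: "
    r"status: (0x[0-9a-fA-F]+) len: (\d+)"
)

def summarize(text):
    """Return (events, count per status value, seconds between first and last)."""
    events = [(float(t), int(s, 16), int(n)) for t, s, n in PATTERN.findall(text)]
    by_status = Counter(status for _, status, _ in events)
    span = events[-1][0] - events[0][0] if len(events) > 1 else 0.0
    return events, by_status, span

events, by_status, span = summarize(LOG)
print(f"{len(events)} corrupted packets in {span:.1f} s")
for status, count in sorted(by_status.items()):
    print(f"  status {status:#x}: {count}")
```

    Feeding it the full `dmesg` output instead of the sample would give a rate that can be compared across routers or antenna changes.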
  • Hi, Eyal, So the corrupted packets are on the air, not on the sdio bus. Correct?

    I'll upload a full dump shortly. If ELP is enabled we usually see a dump in a few minutes, but sometimes in a few hours. When ELP is disabled, usually we don't see the dump, but once saw it after running over a whole weekend.

    By the way, FYI, the UDP and TCP rates can now reach up to 36 Mbps, depending on which router it is connected to. We believe this is mostly due to the board traces and antenna connection, where we made a lot of improvements. We appreciate your help on this thread; it helped us a lot.

    Thank you.
  • A full dump screen copy:

    ~ # ping 192.168.7.100
    PING 192.168.7.100 (192.168.7.100): 56 data bytes
    64 bytes from 192.168.7.100: seq=0 ttl=64 time=115.624 ms
    64 bytes from 192.168.7.100: seq=1 ttl=64 time=9.353 ms
    64 bytes from 192.168.7.100: seq=2 ttl=64 time=7.203 ms
    64 bytes from 192.168.7.100: seq=3 ttl=64 time=6.283 ms
    64 bytes from 192.168.7.100: seq=4 ttl=64 time=30.143 ms
    64 bytes from 192.168.7.100: seq=5 ttl=64 time=25.253 ms
    64 bytes from 192.168.7.100: seq=6 ttl=64 time=9.446 ms
    64 bytes from 192.168.7.100: seq=7 ttl=64 time=8.553 ms
    [  261.260782] ------------[ cut here ]------------
    [  261.265424] WARNING: CPU: 0 PID: 14 at drivers/net/wireless/ti/wlcore/sdio.c:160 wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  261.276885] Modules linked in: wlcore_sdio wl18xx wlcore wlcore_sdio_cond
    [  261.283683]
    [  261.285170] CPU: 0 PID: 14 Comm: kworker/u2:1 Not tainted 4.9.37 #32
    [  261.291511] Hardware name: DEMO Board (DT)
    [  261.297578] Workqueue: phy0 wl1271_tx_work [wlcore]
    [  261.302448] task: ffffffc012492400 task.stack: ffffffc012558000
    [  261.308359] PC is at wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  261.314617] LR is at wl12xx_sdio_raw_write+0x7c/0x168 [wlcore_sdio]
    [  261.320872] pc : [] lr : [] pstate: 00000005
    [  261.328254] sp : ffffffc01255bc90
    [  261.331557] x29: ffffffc01255bc90 x28: 0000000000000000
    [  261.336864] x27: 0000000000000000 x26: 0000000000000000
    [  261.342171] x25: ffffff8000867880 x24: 0000000000000000
    [  261.347478] x23: 0000000000000004 x22: ffffffc011d0b080
    [  261.352785] x21: ffffffc01196f810 x20: 000000000001fffc
    [  261.358092] x19: ffffffc011d16800 x18: 0000000000000007
    [  261.363399] x17: 0000000000000001 x16: 0000000000000019
    [  261.368706] x15: 0000000000000033 x14: 000000000000004c
    [  261.374013] x13: ffffff8008917518 x12: 0000000000000000
    [  261.379320] x11: ffffffc013bbff80 x10: 0000000000000820
    [  261.384626] x9 : ffffffc012558000 x8 : ffffffc012492c80
    [  261.389933] x7 : 0000003cd45a4ea0 x6 : 0000000000000000
    [  261.395240] x5 : ffffffc013bbff80 x4 : 0000000000000000
    [  261.400547] x3 : 0000000000000000 x2 : 00000000ffffffac
    [  261.405854] x1 : 0000000000000000 x0 : 00000000fffffff3
    [  261.411160]
    [  261.412642] ---[ end trace 1d540c2d21d6a5f7 ]---
    [  261.417249] Call trace:
    [  261.419687] Exception stack(0xffffffc01255bab0 to 0xffffffc01255bbe0)
    [  261.426117] baa0:                                   ffffffc011d16800 0000007fffffffff
    [  261.433935] bac0: ffffffc01255bc90 ffffff80008b21b0 0000000000000005 000000000000003d
    [  261.441753] bae0: ffffffc01255bb78 ffffff80087d69d0 ffffff80089d5f20 ffffffc011f42000
    [  261.449571] bb00: ffffffc01255bbb8 0000000000000000 ffffffc01255bb50 ffffff80084f9450
    [  261.457389] bb20: ffffffc01255bc20 0000000000000000 ffffffc011f42000 ffffffc011d0b080
    [  261.465207] bb40: 0000000000000004 ffffff80083f11a0 ffffffc01255bbf0 ffffff8008504c84
    [  261.473025] bb60: 00000000fffffff3 0000000000000000 00000000ffffffac 0000000000000000
    [  261.480843] bb80: 0000000000000000 ffffffc013bbff80 0000000000000000 0000003cd45a4ea0
    [  261.488661] bba0: ffffffc012492c80 ffffffc012558000 0000000000000820 ffffffc013bbff80
    [  261.496479] bbc0: 0000000000000000 ffffff8008917518 000000000000004c 0000000000000033
    [  261.504299] [] wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  261.511658] [] wl1271_ps_elp_wakeup+0x134/0x1f8 [wlcore]
    [  261.518580] [] wl1271_tx_work+0x24/0x70 [wlcore]
    [  261.524757] [] process_one_work+0x1d0/0x390
    [  261.530493] [] worker_thread+0x48/0x4b0
    [  261.535881] [] kthread+0xd0/0xe8
    [  261.540664] [] ret_from_fork+0x10/0x30
    [  261.546438] wl1271_sdio mmc2:0001:2: sdio write failed (-84)
    [  261.552156] ------------[ cut here ]------------
    [  261.556830] WARNING: CPU: 0 PID: 14 at drivers/net/wireless/ti/wlcore/main.c:877 wl12xx_queue_recovery_work.part.24+0x58/0x60 [wlcore]
    [  261.568899] Modules linked in: wlcore_sdio wl18xx wlcore wlcore_sdio_cond
    [  261.575695]
    [  261.577182] CPU: 0 PID: 14 Comm: kworker/u2:1 Tainted: G        W       4.9.37 #32
    [  261.584738] Hardware name: DEMO Board (DT)
    [  261.590795] Workqueue: phy0 wl1271_tx_work [wlcore]
    [  261.595665] task: ffffffc012492400 task.stack: ffffffc012558000
    [  261.601631] PC is at wl12xx_queue_recovery_work.part.24+0x58/0x60 [wlcore]
    [  261.608552] LR is at wl12xx_queue_recovery_work+0x1c/0x28 [wlcore]
    [  261.614721] pc : [] lr : [] pstate: 60000005
    [  261.622102] sp : ffffffc01255bcb0
    [  261.625406] x29: ffffffc01255bcb0 x28: 0000000000000000
    [  261.630713] x27: 00000000ffffffac x26: 0000000000000000
    [  261.636020] x25: ffffff8000867880 x24: ffffffc011d79480
    [  261.641328] x23: 00000000ffff66bc x22: ffffff8008916a80
    [  261.646635] x21: ffffffc011d794c0 x20: 00000000ffffffac
    [  261.651942] x19: ffffffc011d79440 x18: ffffff8088972d77
    [  261.657250] x17: 0000000000000001 x16: 0000000000000019
    [  261.662556] x15: 00000000fffffffe x14: 0000000000000005
    [  261.667863] x13: ffffff8008972d85 x12: 0000000000000007
    [  261.673170] x11: 0000000000000002 x10: 000000000000021c
    [  261.678476] x9 : 0000000000000040 x8 : 282064656c696166
    [  261.683783] x7 : 206574697277206f x6 : ffffff8008972db7
    [  261.689090] x5 : 000000000000000a x4 : 0000000000000000
    [  261.694397] x3 : 0000000000004000 x2 : 0000000000004009
    [  261.699704] x1 : 0000000000000002 x0 : 0000000000004009
    [  261.705010]
    [  261.706492] ---[ end trace 1d540c2d21d6a5f8 ]---
    [  261.711098] Call trace:
    [  261.713536] Exception stack(0xffffffc01255bad0 to 0xffffffc01255bc00)
    [  261.719966] bac0:                                   ffffffc011d79440 0000007fffffffff
    [  261.727784] bae0: ffffffc01255bcb0 ffffff80006d9bf8 0000000060000005 000000000000003d
    [  261.735602] bb00: ffffffc01255bbe0 00000000ffffffd8 4554535953425553 44006f6964733d4d
    [  261.743420] bb20: 732b3d4543495645 32636d6d3a6f6964 00323a313030303a ffffffc011d0b080
    [  261.751238] bb40: 0000000000000004 ffffff80083f11a0 ffffffc01255bbf0 ffffff8008504c84
    [  261.759056] bb60: 00000000fffffff3 0000000000000000 00000000ffffffac 0000000000000000
    [  261.766874] bb80: 0000000000004009 0000000000000002 0000000000004009 0000000000004000
    [  261.774692] bba0: 0000000000000000 000000000000000a ffffff8008972db7 206574697277206f
    [  261.782510] bbc0: 282064656c696166 0000000000000040 000000000000021c 0000000000000002
    [  261.790327] bbe0: 0000000000000007 ffffff8008972d85 0000000000000005 00000000fffffffe
    [  261.798203] [] wl12xx_queue_recovery_work.part.24+0x58/0x60 [wlcore]
    [  261.806165] [] wl12xx_queue_recovery_work+0x1c/0x28 [wlcore]
    [  261.813433] [] wl1271_ps_elp_wakeup+0xe0/0x1f8 [wlcore]
    [  261.820268] [] wl1271_tx_work+0x24/0x70 [wlcore]
    [  261.826444] [] process_one_work+0x1d0/0x390
    [  261.832179] [] worker_thread+0x48/0x4b0
    [  261.837567] [] kthread+0xd0/0xe8
    [  261.842349] [] ret_from_fork+0x10/0x30
    [  261.848238] wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.79
    [  261.855066] wlcore: down
    [  261.857609] wlcore: down
    
    [  261.860172]   sdiodbg wlcore io.h   224 wl1271_power_off power OFF
    [  261.866528]   sdiodbg wlcore sdio.c   219 wl12xx_sdio_set_power power OFF
    
    [  261.886613] ieee80211 phy0: Hardware restart was requested
    
    [  261.913587]   sdiodbg wlcore io.h   241 wl1271_power_on power   ON
    [  261.919856]   sdiodbg wlcore sdio.c   219 wl12xx_sdio_set_power power   ON
    
    [  262.331545] wlcore: using inverted interrupt logic: 2
    
    ^C
    --- 192.168.7.100 ping statistics ---
    11 packets transmitted, 8 packets received, 27% packet loss
    round-trip min/avg/max = 6.283/26.482/115.624 ms
    ~ # 
    
    [  272.848592] mmc2: Timeout waiting for hardware interrupt.
    [  272.853994] sdhci: =========== REGISTER DUMP (mmc2)===========
    [  272.859817] sdhci: Sys addr: 0x0000003b | Version:  0x00000005
    [  272.865638] sdhci: Blk size: 0x00007100 | Blk cnt:  0x00000000
    [  272.871458] sdhci: Argument: 0xac000040 | Trn mode: 0x00000023
    [  272.877279] sdhci: Present:  0x03e201f6 | Host ctl: 0x0000001b
    [  272.883100] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
    [  272.888920] sdhci: Wake-up:  0x00000000 | Clock:    0x0000000f
    [  272.894741] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
    [  272.900562] sdhci: Int enab: 0x03ff000b | Sig enab: 0x03ff000b
    [  272.906383] sdhci: ACMD err: 0x00000000 | Slot int: 0x00000000
    [  272.912204] sdhci: Caps:     0x3f6ec881 | Caps_1:   0x08000077
    [  272.918024] sdhci: Cmd:      0x0000353a | Max curr: 0x00000000
    [  272.923844] sdhci: Host ctl2: 0x00000080
    [  272.927757] sdhci: ADMA Err: 0x00000030 | ADMA Ptr: 0x0000000053c7720c
    [  272.934271] sdhci: ===========================================
    [  272.940280] ------------[ cut here ]------------
    [  272.944908] WARNING: CPU: 0 PID: 842 at drivers/net/wireless/ti/wlcore/sdio.c:160 wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  272.956456] Modules linked in: wlcore_sdio wl18xx wlcore wlcore_sdio_cond
    [  272.963255]
    [  272.964742] CPU: 0 PID: 842 Comm: kworker/0:2 Tainted: G        W       4.9.37 #32
    [  272.972298] Hardware name: DEMO Board (DT)
    [  272.978305] Workqueue: events_freezable ieee80211_restart_work
    [  272.984130] task: ffffffc011f83000 task.stack: ffffffc011f6c000
    [  272.990041] PC is at wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  272.996299] LR is at wl12xx_sdio_raw_write+0x7c/0x168 [wlcore_sdio]
    [  273.002554] pc : [] lr : [] pstate: 00000005
    [  273.009936] sp : ffffffc011f6fab0
    [  273.013239] x29: ffffffc011f6fab0 x28: ffffff800a05240c
    [  273.018547] x27: 0000000000000003 x26: ffffff8000867880
    [  273.023854] x25: ffffff800a05240c x24: 0000000000000000
    [  273.029161] x23: 0000000000004000 x22: ffffffc011b3c000
    [  273.034468] x21: ffffffc01196f810 x20: 0000000000000000
    [  273.039775] x19: ffffffc011d16800 x18: 0000000000000000
    [  273.045082] x17: 0000007fa0473138 x16: ffffff800818b190
    [  273.050388] x15: 0000000000000010 x14: 0000000000000001
    [  273.055694] x13: ffffff8008917518 x12: 0000000000000000
    [  273.061002] x11: ffffff80088ea000 x10: 0000000000000820
    [  273.066308] x9 : ffffffc011f6c000 x8 : ffffffc011f83880
    [  273.071615] x7 : 0000003f8c815044 x6 : 0000000000000000
    [  273.076922] x5 : ffffffc013bbff80 x4 : 0000000000000000
    [  273.082229] x3 : 0000000000000000 x2 : 00000000ffffff92
    [  273.087535] x1 : 0000000000000000 x0 : 00000000fffffff3
    [  273.092842]
    [  273.094324] ---[ end trace 1d540c2d21d6a5f9 ]---
    [  273.098930] Call trace:
    [  273.101369] Exception stack(0xffffffc011f6f8d0 to 0xffffffc011f6fa00)
    [  273.107799] f8c0:                                   ffffffc011d16800 0000007fffffffff
    [  273.115618] f8e0: ffffffc011f6fab0 ffffff80008b21b0 0000000000000005 000000000000003d
    [  273.123436] f900: ffffff800a05240c ffffffc011b3c000 a7ffb00400000035 0000000000001000
    [  273.131254] f920: ffffffbf0046cf02 0000400000000000 0000000051b3c000 0000000000004000
    [  273.139071] f940: ac00004000000035 0000000000001000 0000000000000000 00000000000001b5
    [  273.146889] f960: 0000000000000000 0000000000000000 ffffffc011f6f980 ffffffc011f6f9c8
    [  273.154707] f980: 00000000fffffff3 0000000000000000 00000000ffffff92 0000000000000000
    [  273.162524] f9a0: 0000000000000000 ffffffc013bbff80 0000000000000000 0000003f8c815044
    [  273.170342] f9c0: ffffffc011f83880 ffffffc011f6c000 0000000000000820 ffffff80088ea000
    [  273.178160] f9e0: 0000000000000000 ffffff8008917518 0000000000000001 0000000000000010
    [  273.185980] [] wl12xx_sdio_raw_write+0xb8/0x168 [wlcore_sdio]
    [  273.193347] [] wlcore_boot_upload_firmware+0x1c4/0x438 [wlcore]
    [  273.200855] [] wl18xx_boot+0x7a0/0xb58 [wl18xx]
    [  273.206996] [] wl1271_op_add_interface+0x638/0x8d0 [wlcore]
    [  273.214122] [] drv_add_interface+0x34/0x78
    [  273.219774] [] ieee80211_reconfig+0x3dc/0xad8
    [  273.225684] [] ieee80211_restart_work+0x80/0xb8
    [  273.231772] [] process_one_work+0x1d0/0x390
    [  273.237507] [] worker_thread+0x48/0x4b0
    [  273.242896] [] kthread+0xd0/0xe8
    [  273.247679] [] ret_from_fork+0x10/0x30
    [  273.253429] wl1271_sdio mmc2:0001:2: sdio write failed (-110)
    
    [  273.259188]   sdiodbg wlcore io.h   224 wl1271_power_off power OFF
    [  273.265543]   sdiodbg wlcore sdio.c   219 wl12xx_sdio_set_power power OFF
    [  274.803587]   sdiodbg wlcore io.h   241 wl1271_power_on power   ON
    [  274.809854]   sdiodbg wlcore sdio.c   219 wl12xx_sdio_set_power power   ON
    [  280.935292]   sdiodbg wlcore io.h   247 wl1271_power_on power   ON  failed
    [  280.942168]   sdiodbg wlcore io.h   224 wl1271_power_off power OFF
    [  280.969585]   sdiodbg wlcore io.h   241 wl1271_power_on power   ON
    [  280.975852]   sdiodbg wlcore sdio.c   219 wl12xx_sdio_set_power power   ON
    

    The code around line 160 in sdio.c:

       128  static int __must_check wl12xx_sdio_raw_write(struct device *child, int addr,
       129                                                void *buf, size_t len, bool fixed)
       130  {
       131          int ret;
       132          struct wl12xx_sdio_glue *glue = dev_get_drvdata(child->parent);
       133          struct sdio_func *func = dev_to_sdio_func(glue->dev);
       134
       135          sdio_claim_host(func);
       136
       137          if (unlikely(dump)) {
       138                  printk(KERN_DEBUG "wlcore_sdio: WRITE to 0x%04x\n", addr);
       139                  print_hex_dump(KERN_DEBUG, "wlcore_sdio: WRITE ",
       140                                  DUMP_PREFIX_OFFSET, 16, 1,
       141                                  buf, len, false);
       142          }
       143
       144          if (unlikely(addr == HW_ACCESS_ELP_CTRL_REG)) {
       145                  sdio_f0_writeb(func, ((u8 *)buf)[0], addr, &ret);
       146                  dev_dbg(child->parent, "sdio write 52 addr 0x%x, byte 0x%02x\n",
       147                          addr, ((u8 *)buf)[0]);
       148          } else {
       149                  dev_dbg(child->parent, "sdio write 53 addr 0x%x, %zu bytes\n",
       150                          addr, len);
       151
       152                  if (fixed)
       153                          ret = sdio_writesb(func, addr, buf, len);
       154                  else
       155                          ret = sdio_memcpy_toio(func, addr, buf, len);
       156          }
       157
       158          sdio_release_host(func);
       159
       160          if (WARN_ON(ret))
       161                  dev_err(child->parent, "sdio write failed (%d)\n", ret);
       162
       163          return ret;
       164  }
    

    The code around line 877 in main.c:

       872  void wl12xx_queue_recovery_work(struct wl1271 *wl)
       873  {
       874          /* Avoid a recursive recovery */
       875          if (wl->state == WLCORE_STATE_ON) {
       876                  WARN_ON(!test_bit(WL1271_FLAG_INTENDED_FW_RECOVERY,
       877                                    &wl->flags));
       878
       879                  wl->state = WLCORE_STATE_RESTARTING;
       880                  set_bit(WL1271_FLAG_RECOVERY_IN_PROGRESS, &wl->flags);
       881                  wl1271_ps_elp_wakeup(wl);
       882                  wlcore_disable_interrupts_nosync(wl);
       883                  ieee80211_queue_work(wl->hw, &wl->recovery_work);
       884          }
       885  }
    

    I guess now it becomes two questions:

    [1] Why did it fail at line 160 in sdio.c when writing raw SDIO?

    [2] And how to recover from it. As to the recovery part, I've not hooked up the function wl1271_power_off() to the WL_EN pin. Do we need to do that? Is that required for the recovery to work properly? We currently created a separate kernel module wlcore_sdio_cond that is inserted at the beginning to enable WL_EN pin once and keep it high.
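
    As a side note on question [1], the two "sdio write failed" codes in the dump can be decoded with a small sketch. The errno names are the standard Linux values, hardcoded here so the result does not depend on the host OS. The 32-bit values 0xffffffac and 0xffffff92 are the x2 register contents from the first and third register dumps, which appear to match the reported return codes. `as_s32` is a hypothetical helper name.

```python
# Linux errno numbers, hardcoded so the sketch gives the same answer on any host.
LINUX_ERRNO = {84: "EILSEQ", 110: "ETIMEDOUT"}

def as_s32(value: int) -> int:
    """Interpret the low 32 bits of a register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

for reg in (0xFFFFFFAC, 0xFFFFFF92):
    code = as_s32(reg)
    name = LINUX_ERRNO.get(-code, "unknown")
    print(f"{reg:#010x} -> {code} ({name})")
```

    In the Linux MMC layer, -EILSEQ is generally used for CRC or data-format errors on the bus and -ETIMEDOUT for a command or data timeout, so the first failure would be consistent with a signal integrity problem on the SDIO lines, though this alone does not prove it.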

  • Hi,

    Ok, in your dump I see an SDIO access error which leads to a recovery. The crash itself could be related to the SDIO bus.
    However, the main issue is that the recovery is not successful, as it seems the full reset of the wl18xx was not done correctly.

    A couple of things to try:
    1. When everything is working ok, can you try bringing the interface down and then back up:
    ifconfig wlan0 down
    ifconfig wlan0 up
    This will verify that your wlan_enable GPIO is configured and controlled correctly from the driver.

    2. Can you try limiting the SDIO bus speed, let's say to 20 MHz, and see if it helps?
    To limit the SDIO clock speed, you can use the following in your .dts file:
    elixir.bootlin.com/.../mmc.txt

    Add:
    max-frequency = <20000000>;
    to the mmc node in your board device tree file.
    You can of course try lower values as well instead of the 20 MHz above.

    3. Lastly, if this doesn't help, can you try to disable ELP and see if the crash is still seen?
    processors.wiki.ti.com/.../WL18xx_Driver_Debug
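
    After changing max-frequency, it can be useful to confirm which clock the host actually negotiated. The sketch below reads the MMC debugfs `ios` file. It assumes debugfs is mounted at /sys/kernel/debug, that the wl18xx sits on host mmc2 (as the log above suggests), and that the script runs as root. `parse_clock_hz` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Assumptions: debugfs is mounted here and the WLAN module is on host mmc2.
IOS_PATH = Path("/sys/kernel/debug/mmc2/ios")

def parse_clock_hz(ios_text: str):
    """Return the 'clock:' value in Hz from an MMC debugfs ios dump, or None."""
    m = re.search(r"^clock:\s*(\d+)\s*Hz", ios_text, re.MULTILINE)
    return int(m.group(1)) if m else None

try:
    text = IOS_PATH.read_text()
except OSError as exc:
    # Typical causes: debugfs not mounted, not running as root, other host index.
    print(f"cannot read {IOS_PATH}: {exc}")
else:
    print(f"mmc2 clock: {parse_clock_hz(text)} Hz")
```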

    Best Regards,
    Eyal
  • Hi, Eyal, I hooked up the power on-off control. And the recovery part works now. As to the three things suggested to try in your last message:

    1. Yes, now it can recover properly. Since we do not have ifup/ifdown set up, I used "ip link set up dev wlan0" and "ip link set down dev wlan0" instead.

    2. No, we cannot change the clock to 20 MHz due to a host hardware limitation, so we will skip this one.

    3. Yes, disabling ELP as suggested helps a lot.

    It can recover from the ELP dumps with a small glitch on the traffic path (a data disruption of a few seconds up to a minute). Given all the very helpful suggestions from you and the results that we are achieving, the only thing left for us to investigate further is improving the signal integrity on the SDIO bus so as to eliminate the raw_write errors. We are now seeing about two dumps per hour on average.
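
    To put the remaining dumps in perspective, the share of time lost to recovery can be estimated from the figures above. This is only a back-of-the-envelope sketch: "a few seconds" is taken as 3 s, which is an assumption, and `downtime_fraction` is a hypothetical helper name.

```python
DUMPS_PER_HOUR = 2         # from the post: about two dumps per hour
OUTAGE_SECONDS = (3, 60)   # assumption: "a few seconds" taken as 3 s; upper bound 60 s

def downtime_fraction(dumps_per_hour: float, outage_s: float) -> float:
    """Fraction of each hour during which traffic is disrupted by recovery."""
    return dumps_per_hour * outage_s / 3600

best, worst = (downtime_fraction(DUMPS_PER_HOUR, s) for s in OUTAGE_SECONDS)
print(f"traffic disrupted {best:.2%} to {worst:.2%} of the time")
```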

    Thanks for the help!