
Linux/66AK2H12: IP stack freeze when trying to use NFS

Part Number: 66AK2H12

Tool/software: Linux

Hi,  

I'm working on a system that has a 66AK2H12 device connected over 10GbE.
My system is based on TI's kernel and uses BuildRoot to create the filesystem.

I'm witnessing an issue when trying to use NFS from the Hawking (mounting an NFS shared folder that resides on a remote computer).

Mounting the NFS share seems to work fine, but at some point (after sending a few megabytes of data) we lose all TCP/IP connectivity with the ARM (even ping doesn't work).

Looking at the process list, we see that the writing process is stuck in 'D' (uninterruptible sleep) state.

Once that happens, netstat shows that the Send-Q holds a huge amount of data.

In addition, when NFS is stuck, trying to shut down the NIC produces Packet DMA warnings in the kernel log.

All of the info can be seen below.

 

I tried the following, and none of them helped:

  • Move to an MTU of 1500 on the ARM (usually 9014)
  • Change the NFS configuration to rsize=wsize=1024
    • This seems to improve the situation a bit: the test doesn't always fail on the first shot, but it still fails quickly.
  • Use NFSv3, NFSv2
  • Use the NFS nolock configuration
  • Put the NFS server on another, similar device instead of a Windows computer
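For reference, the mitigations above map to commands roughly like the following (the interface name, server address, and share path are taken from elsewhere in this thread; treat this as a sketch, not the exact commands that were run):

```shell
# Drop the ARM-side MTU from the usual 9014 to 1500 (interface name assumed to be eth0):
ifconfig eth0 mtu 1500

# Remount with small transfer sizes; per the thread this delayed the hang but did not prevent it:
umount /mnt/NFSShare
mount -t nfs -o rsize=1024,wsize=1024 15.40.1.192:/NfsShared /mnt/NFSShare

# Force NFSv3 (or nfsvers=2) and disable locking:
mount -t nfs -o nfsvers=3,nolock 15.40.1.192:/NfsShared /mnt/NFSShare
```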

 

To reproduce the problem, I’m running the following command:
head -c  10m /dev/urandom > /mnt/NFSShare/file

The only thing that seems to help is writing in small chunks, for example:
for i in `seq 10000`; do head -c  1k /dev/urandom > /mnt/NFSShare/file; done
This way, I was able to write more than 100M.
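When the hang occurs, the symptoms described above (growing Send-Q, writer in 'D' state) can be checked from a second shell. These are standard net-tools/procps commands; exact option support depends on the BuildRoot rootfs:

```shell
# Show TCP connections with their Send-Q backlog; the stuck NFS connection
# shows a large, never-draining Send-Q (netstat may be absent on minimal rootfs):
command -v netstat >/dev/null && netstat -tn || true

# List processes stuck in uninterruptible sleep ('D' state):
ps -eo pid,stat,comm | awk '$2 ~ /^D/'

# For a stuck PID, the kernel stack shows where the writer is blocked:
# cat /proc/<pid>/stack
```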

  

 

 

==========================================  INFO ===============================================

 

Netstat – after getting stuck

======================

Active Internet connections (w/o servers)

Proto Recv-Q Send-Q Local Address           Foreign Address         State

tcp        0  26064 (null):931              (null):nfsd             ESTABLISHED

 

 

mount information

===============

15.40.1.192:/NfsShared on /home/root/SharedForArm type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=15.40.1.192,mountvers=3,mountproto=tcp,local_lock=none,addr=15.40.1.192)

 

 

Console output when doing "ifconfig eth0 down" after getting stuck

=====================================================

[420002.029780] ------------[ cut here ]------------

[420002.034508] WARNING: at drivers/dma/keystone-pktdma.c:1094 chan_destroy+0x2c4/0x2d0()

[420002.042461] chan_stop: pool 0 deficit 2047 != depth 2048

[420002.047881] Modules linked in: uio_module_drv(O) mcd(O) cmemk(O)

[420002.054025] CPU: 3 PID: 1298 Comm: ifconfig Tainted: G           O 3.10.72 #1

[420002.061289] [<c0015014>] (unwind_backtrace+0x0/0xec) from [<c00117ac>] (show_stack+0x10/0x14)

[420002.069954] [<c00117ac>] (show_stack+0x10/0x14) from [<c0021200>] (warn_slowpath_common+0x54/0x6c)

[420002.079034] [<c0021200>] (warn_slowpath_common+0x54/0x6c) from [<c0021248>] (warn_slowpath_fmt+0x30/0x40)

[420002.088733] [<c0021248>] (warn_slowpath_fmt+0x30/0x40) from [<c029df58>] (chan_destroy+0x2c4/0x2d0)

[420002.098075] [<c029df58>] (chan_destroy+0x2c4/0x2d0) from [<c029b354>] (dma_release_channel+0x24/0x94)

[420002.107425] [<c029b354>] (dma_release_channel+0x24/0x94) from [<c031eb2c>] (netcp_ndo_stop+0x128/0x1e0)

[420002.116959] [<c031eb2c>] (netcp_ndo_stop+0x128/0x1e0) from [<c0394d50>] (__dev_close_many+0x80/0xc4)

[420002.126223] [<c0394d50>] (__dev_close_many+0x80/0xc4) from [<c0394db8>] (__dev_close+0x24/0x38)

[420002.135055] [<c0394db8>] (__dev_close+0x24/0x38) from [<c039a51c>] (__dev_change_flags+0x94/0x128)

[420002.144144] [<c039a51c>] (__dev_change_flags+0x94/0x128) from [<c039a61c>] (dev_change_flags+0x10/0x48)

[420002.153675] [<c039a61c>] (dev_change_flags+0x10/0x48) from [<c03ebd10>] (devinet_ioctl+0x65c/0x734)

[420002.162854] [<c03ebd10>] (devinet_ioctl+0x65c/0x734) from [<c03856c0>] (sock_ioctl+0x1c0/0x294)

[420002.171684] [<c03856c0>] (sock_ioctl+0x1c0/0x294) from [<c00e0058>] (do_vfs_ioctl+0x3fc/0x5bc)

[420002.180427] [<c00e0058>] (do_vfs_ioctl+0x3fc/0x5bc) from [<c00e0250>] (SyS_ioctl+0x38/0x60)

[420002.188900] [<c00e0250>] (SyS_ioctl+0x38/0x60) from [<c000d920>] (ret_fast_syscall+0x0/0x30)

[420002.197459] ---[ end trace d5dca14823c2b4f6 ]---

[420002.202962] dma dma3chan2: xgerx0 leaked descriptor 701

[420003.489991] dma dma3chan0: xgetx0 leaked descriptor 473

[420003.495320] dma dma3chan0: xgetx0 leaked descriptor 483

  • Hi Tom,

    This appears to be some kind of memory corruption, possibly from trying to access memory regions not visible to the NetCP. This can be seen from the warning and the out-of-tree modules linked in:
    [420002.034508] WARNING: at drivers/dma/keystone-pktdma.c:1094 chan_destroy+0x2c4/0x2d0()

    [420002.042461] chan_stop: pool 0 deficit 2047 != depth 2048

    [420002.047881] Modules linked in: uio_module_drv(O) mcd(O) cmemk(O)

    However, I am not sure why this is happening. Can you elaborate on the following questions:

    1. Are you using K2H EVM or is this a custom board? If custom board, how is it different from the K2H EVM?
    2. Which Linux SDK is this? Could you list the modifications if any?
    3. Could you provide the exact steps to try to reproduce your issue on the K2H EVM?

    Best Regards,
    Yordan
  • Hi,

    1) We don't use the EVM; we are using a custom board that has a few 66AK2H12 devices on it and a 10G Ethernet switch connecting them.
    2) The SDK is a proprietary build, based on TI's kernel, with a filesystem based on BuildRoot.
    I don't think we have any kernel modifications relative to the original TI kernel; I'll verify that again.
    3) To reproduce the issue, you should:

    1. Build the kernel with NFS support
    2. Build the user-space NFS support  (nfs-utils)
    3. Connect the EVM over Ethernet to a system that has an NFS shared directory
    4. On the EVM ARM, mount the shared directory (mount -t nfs 15.40.1.192:/NfsShared /mnt/NFSShare)
    5. Run the following command: "head -c  10m /dev/urandom > /mnt/NFSShare/file"
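Put together, steps 3-5 look roughly like this (the server address and share name are the ones from this thread; the mount point is assumed to be the directory /mnt/NFSShare):

```shell
#!/bin/sh
# Reproduction sketch: mount the share and issue one large sequential write.
SERVER=15.40.1.192
mkdir -p /mnt/NFSShare
mount -t nfs "$SERVER:/NfsShared" /mnt/NFSShare
# This single large write is what triggers the hang:
head -c 10m /dev/urandom > /mnt/NFSShare/file
```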

  • Hi,

    Some more information about the kernel:

    The kernel is based on http://git.ti.com/keystone-linux/linux/commits/v3.10.72/master with http://git.ti.com/keystone-linux/linux/commits/rio-dev-dio merged into it for RapidIO support.

    The DTS configuration was modified to improve 10GbE performance. The modification is based on a suggestion we got from TI (Eric Ding). The patch with the modifications is attached.

    10gbe_patch.txt
    commit 4e142870256c15b83733bd285fa8779468a435e2
    Author: Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com>
    Date:   Tue Apr 12 11:19:47 2016 +0200
    
        Keystone configs to achieve 2+ Gbps on 10gbe
        
        unofficial TI patch: e777edebbbfd274420fd8cd3fcd7e8999bf5c929
    
    diff --git a/arch/arm/boot/dts/k2hk.dtsi b/arch/arm/boot/dts/k2hk.dtsi
    index f2634457a9e..26f69c9e4d0 100644
    --- a/arch/arm/boot/dts/k2hk.dtsi
    +++ b/arch/arm/boot/dts/k2hk.dtsi
    @@ -289,8 +289,8 @@
     						<0 51 0x804>,
     						<0 52 0x204>,
     						<0 53 0xf04>,
    -						<0 54 0xf04>,
    -						<0 55 0xf04>,
    +						<0 54 0x104>,	/* xgerx0 */
    +						<0 55 0x104>,	/* xgetx1 */
     						<0 56 0xf04>,
     						<0 57 0xf04>,
     						<0 58 0xf04>,
    @@ -350,7 +350,7 @@
     				};
     				region-14 {
     					id = <14>;
    -					values	= <2048 128>;	/* num_desc desc_size */
    +					values	= <8192 128>;	/* num_desc desc_size */
     					link-index = <0x7800>;
     				};
     				region-15 {
    @@ -1217,7 +1217,7 @@
     					transmit;
     					label		= "xgetx1";
     					pool		= "pool-xge";
    -					submit-queue	= <8753>;
    +					submit-queue	= <8752>;
     					/* complete-queue = <8715>; */
     					/* debug; */
     					/* channel = <0>; */
    diff --git a/arch/arm/boot/dts/keystone.dtsi b/arch/arm/boot/dts/keystone.dtsi
    index 1290bddf3db..01c90b0b23b 100644
    --- a/arch/arm/boot/dts/keystone.dtsi
    +++ b/arch/arm/boot/dts/keystone.dtsi
    @@ -558,7 +558,7 @@
     					region-id = <15>;
     				};
     				pool-xge {
    -					values = <1024 128>;
    +					values = <8192 128>;
     					region-id = <14>;
     				};
     				pool-crypto {
    diff --git a/arch/arm/boot/dts/pdpu-net.dtsi b/arch/arm/boot/dts/pdpu-net.dtsi
    index 7fb98a53112..440451920d0 100644
    --- a/arch/arm/boot/dts/pdpu-net.dtsi
    +++ b/arch/arm/boot/dts/pdpu-net.dtsi
    @@ -30,8 +30,8 @@ netcpx: netcp@2f00000 {
     	interfaces {
     		xinterface0: interface-0 {
     			rx-channel = "xgerx0";
    -			rx-queue-depth = <128 128 0 0>;
    -			rx-buffer-size = <1500 4096 0 0>;
    +			rx-queue-depth = <2048 128 0 0>;
    +			rx-buffer-size = <1536 4096 0 0>;
     			local-mac-address = [02 18 31 7e 3e 5e];
     		};
     	};
    @@ -65,7 +65,7 @@ netcpx: netcp@2f00000 {
     			interface-0 {
     				slave_port = <0>;
     				tx-channel = "xgetx0";
    -				tx_queue_depth = <128>;
    +				tx_queue_depth = <512>;
     			};
     		};
     
    

  • Hi, Tom,

    I'm just trying to exclude NFS from the scenario. Does the issue happen when running outbound TCP iperf? iperf should generate similar traffic (if not more) to the server through the XGE port. Do you agree?


    Rex
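Rex's iperf comparison could be run like this (iperf2 syntax; the server address is the one from this thread, and the durations are arbitrary illustration values):

```shell
# On the receiving host (the machine acting as the NFS server):
iperf -s

# On the K2H, generate sustained outbound TCP traffic through the XGE port
# (60-second run, reports every 5 seconds):
iperf -c 15.40.1.192 -t 60 -i 5
```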
  • Hi Rex, we have been using the Ethernet on our board extensively for the last year and a half and haven't seen such behavior. We've also run a lot of iperf tests without hitting this problem. I have a feeling the problem is related to NFS itself.
  • Hi, Tom,

    I believe an MTU of 1500 still causes fragmentation; could you try an MTU of 1450 to see if less fragmentation helps? Could you also try increasing the 1K size in the loop to see at what size it breaks?

    Our experience with 10GbE on a PC is that it is slower than the K2H. We suspect the receiving side cannot handle that traffic, causing the backlog on the K2H. To narrow down the issue, we'd like to find the minimal size in the loop scenario that causes NFS to fail, then try the same size between two K2Hs to see if it breaks.

    Rex
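Rex's suggestion of sweeping the chunk size can be scripted roughly as follows (POSIX sh; the size range, iteration count, and the `sweep` helper name are arbitrary choices for illustration):

```shell
#!/bin/sh
# Double the write size each round until a write hangs or fails.
# $1 is the directory (NFS mount point) to test against.
sweep() {
    dir=$1
    size=1024                           # start at 1 KiB
    while [ "$size" -le 1048576 ]; do   # stop after 1 MiB
        echo "testing chunk size $size bytes"
        i=0
        while [ "$i" -lt 20 ]; do
            head -c "$size" /dev/urandom > "$dir/file" || return 1
            i=$((i + 1))
        done
        size=$((size * 2))
    done
}

# Against the mount from the thread:
# sweep /mnt/NFSShare
```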
  • Hi Rex,

    We've tried with an MTU of 1450 and got the same behavior.
    We also found that it fails after a while even at 1K; it just takes longer.

    The problem isn't related to the PC, since we got the same results on a K2H<->K2H setup (one as server, the other as client).

    Can you try to reproduce the issue?

  • Tom,

    Let me set up my KS2 with XGE and give it a try.

    Rex
  • Tom,

    I was able to mount NFS on XGE, but when doing "ls", the prompt doesn't come back. After a while, it prints "Server is not responding, still trying". Ping still goes through XGE. Have you encountered this before? Any way to get past it? I verified the NFS setup through the GbE port and was able to create a file on it, so the NFS setup is correct.

    Rex
  • Hi Rex,

    We didn't have any problem mounting the NFS.

    Are the 2 interfaces (GbE and XGE) connected at the same time?  Maybe there's some problem with the IP/Subnet configuration of them?

  • Hi, Tom,

    They are different networks. GbE is on corporate, and XGE on private network. I didn't apply the patch from Eric. That shouldn't be the cause. I'll take a closer look to see what may be causing it.

    Rex

  • Tom,

    I tried a different approach, using another KS2 platform (K2E) and a different release, Processor SDK 4.0.0.4. I was able to mount and even create a file with a size of 10M. However, when moving back to K2H with Processor SDK 4.0, I am still having issues with K2H and the setup isn't stable enough to test NFS over XGE. I'll check with the MCSDK on K2E and see how I can get the setup running.

    Rex

  • Tom,

    We can't find the XGE port on the blade. How would it work using the blade we have?

    Another thought on the issue, and possibly the reason I can't reproduce it: I just noticed that I am not getting the max throughput out of XGE with MTU 9000. I am checking to see what the problem could be. Once I get the max throughput, I'll retry on the KS2 platform.

    Rex
  • Tom,

    We received the RTM board, but want to run the setup by you to be sure it is connected correctly. In the lab setup, we have a power extension board connected to the board. I assume the RTM should be connected to the power board, with the blue 16-pin connector aligned with the blue connector on the power board. Is that correct?

    Rex
  • The RTM should be connected back-to-back with the 6PU board that you have.

    There is a gray connector with many pins that connects to a similar connector on the 6PU.

    If it's still not clear how to connect them, please send me an email and I'll reply with images showing how to connect.
