Linux/PROCESSOR-SDK-AM335X: Ethernet low performance issue

Part Number: PROCESSOR-SDK-AM335X

Tool/software: Linux

Hello,

I'm using the AR8035 GigE PHY with the AM335x on kernel v4.4. Network communication works fine, but in iperf testing I'm nowhere near the advertised speed: http://processors.wiki.ti.com/index.php/AM335x-PSP_04.06.00.08_Features_and_Performance_Guide#Ethernet_Driver

In fact, my issue is basically the same as this one: https://e2e.ti.com/support/arm/sitara_arm/f/791/t/441787

Using an older kernel (3.2), I can reach the advertised speed, so HW is obviously OK.

Any suggestions on how to fix this?

Thanks!

-Keith

UPDATE:

Here's the link to the new SDK's performance data (again, my results are nowhere close to these numbers or the previous ones):

processors.wiki.ti.com/.../Processor_SDK_Linux_Kernel_Performance_Guide

Speed @ 4.4: 86 Mb/s up, 156 Mb/s down

Speed @ 3.2: 198 Mb/s up, 276 Mb/s down

  • Hi,

    The wiki link you point to is for an obsolete SDK. Current SDK performance can be found here: processors.wiki.ti.com/.../Processor_SDK_Linux_Kernel_Performance_Guide
  • Thank you for the link to the new performance data. However, it doesn't really resolve the issue.
  • I have notified the Ethernet experts. They will respond here.
  • Hi Biser,

    Awesome, thank you!

  • You mentioned you are using a 4.4 kernel. Are you using a TI SDK kernel and configuration?

    The performance could be a result of link errors or kernel configuration.

    The link below describes how to check for hardware errors and lists some commands to run; please post the results.

    processors.wiki.ti.com/.../5x_CPSW

  • Hi Schuyler,

    Thanks for your reply.

    We're using 4.4.12 (srcrev = 3639bea54a4a1e1c572a1bde78facc4e37839c12) from TI but the build system is uclinux. The kernel config is also from TI, but has been modified.

    I've attached the outputs from ethtool, ifconfig, dmesg as well as the kernel config.

    Thanks!

    defconfig.txt

    dmesg_d1.txt
    D1-UNKNOWN-IMEI:~ # dmesg
    [    0.000000] Booting Linux on physical CPU 0x0
    [    0.000000] Linux version 4.4.12 (kmo_oxy@dev-kmo) (gcc version 4.8.3 (GCC) ) #94 Tue Feb 28 18:33:20 PST 2017
    [    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
    [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
    [    0.000000] Machine model: TI AM335x Dev1
    [    0.000000] cma: Reserved 16 MiB at 0x8f000000
    [    0.000000] Memory policy: Data cache writeback
    [    0.000000] On node 0 totalpages: 65536
    [    0.000000] free_area_init_node: node 0, pgdat c05b6664, node_mem_map cedf1000
    [    0.000000]   Normal zone: 512 pages used for memmap
    [    0.000000]   Normal zone: 0 pages reserved
    [    0.000000]   Normal zone: 65536 pages, LIFO batch:15
    [    0.000000] CPU: All CPU(s) started in SVC mode.
    [    0.000000] AM335X ES2.1 (neon )
    [    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
    [    0.000000] pcpu-alloc: [0] 0
    [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
    [    0.000000] Kernel command line: console=ttyO0,115200n8 debug rfkill.default_state=1 root=ubi0:d1-rootfs rw ubi.mtd=8,2048 rootfstype=ubifs rootwait=1 eth0=50:65:83:5f:4d:41
    [    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
    [    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
    [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
    [    0.000000] Memory: 229288K/262144K available (4129K kernel code, 214K rwdata, 1268K rodata, 208K init, 8099K bss, 16472K reserved, 16384K cma-reserved, 0K highmem)
    [    0.000000] Virtual kernel memory layout:
    [    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    [    0.000000]     fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    [    0.000000]     vmalloc : 0xd0800000 - 0xff800000   ( 752 MB)
    [    0.000000]     lowmem  : 0xc0000000 - 0xd0000000   ( 256 MB)
    [    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    [    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
    [    0.000000]       .text : 0xc0008000 - 0xc054d960   (5399 kB)
    [    0.000000]       .init : 0xc054e000 - 0xc0582000   ( 208 kB)
    [    0.000000]       .data : 0xc0582000 - 0xc05b7a00   ( 215 kB)
    [    0.000000]        .bss : 0xc05b9000 - 0xc0da1d1c   (8100 kB)
    [    0.000000] Running RCU self tests
    [    0.000000] NR_IRQS:16 nr_irqs:16 16
    [    0.000000] IRQ: Found an INTC at 0xfa200000 (revision 5.0) with 128 interrupts
    [    0.000000] OMAP clockevent source: timer2 at 24000000 Hz
    [    0.000020] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
    [    0.000056] clocksource: timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
    [    0.000120] OMAP clocksource: timer1 at 24000000 Hz
    [    0.000679] clocksource_probe: no matching clocksources found
    [    0.001757] Console: colour dummy device 80x30
    [    0.001827] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
    [    0.001839] ... MAX_LOCKDEP_SUBCLASSES:  8
    [    0.001850] ... MAX_LOCK_DEPTH:          48
    [    0.001860] ... MAX_LOCKDEP_KEYS:        8191
    [    0.001870] ... CLASSHASH_SIZE:          4096
    [    0.001881] ... MAX_LOCKDEP_ENTRIES:     32768
    [    0.001890] ... MAX_LOCKDEP_CHAINS:      65536
    [    0.001900] ... CHAINHASH_SIZE:          32768
    [    0.001910]  memory used by lock dependency info: 5167 kB
    [    0.001921]  per task-struct memory footprint: 1536 bytes
    [    0.001956] Calibrating delay loop... 545.99 BogoMIPS (lpj=2729984)
    [    0.108038] pid_max: default: 32768 minimum: 301
    [    0.108279] Security Framework initialized
    [    0.108402] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
    [    0.108421] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
    [    0.112203] CPU: Testing write buffer coherency: ok
    [    0.114020] Setting up static identity map for 0x80008200 - 0x80008258
    [    0.122119] devtmpfs: initialized
    [    0.171968] VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 3
    [    0.237448] omap_hwmod: debugss: _wait_target_disable failed
    [    0.295795] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
    [    0.298804] pinctrl core: initialized pinctrl subsystem
    [    0.305818] NET: Registered protocol family 16
    [    0.313916] DMA: preallocated 256 KiB pool for atomic coherent allocations
    [    0.330885] GPIO line 20 (ddr vtt enable) hogged as output/high
    [    0.330955] GPIO line 12 (uart1 enable) hogged as output/high
    [    0.331683] gpiochip_add: registered GPIOs 0 to 31 on device: gpio
    [    0.333144] OMAP GPIO hardware version 0.1
    [    0.334831] gpiochip_add: registered GPIOs 32 to 63 on device: gpio
    [    0.337672] GPIO line 89 (Battery Charge Enable) hogged as output/high
    [    0.338173] gpiochip_add: registered GPIOs 64 to 95 on device: gpio
    [    0.340727] GPIO line 104 (eth0 phy enable) hogged as output/high
    [    0.341132] gpiochip_add: registered GPIOs 96 to 127 on device: gpio
    [    0.359042] omap-gpmc 50000000.gpmc: could not find pctldev for node /ocp/l4_wkup@44c00000/scm@210000/pinmux@800/pinmux_nandflash, deferring probe
    [    0.363000] hw-breakpoint: debug architecture 0x4 unsupported.
    [    0.406623] edma 49000000.edma: TI EDMA DMA engine driver
    [    0.407501] of_get_named_gpiod_flags: can't parse 'gpio' property of node '/fixedregulator@0[0]'
    [    0.410652] omap_i2c 44e0b000.i2c: could not find pctldev for node /ocp/l4_wkup@44c00000/scm@210000/pinmux@800/pinmux_i2c0_pins, deferring probe
    [    0.417306] clocksource: Switched to clocksource timer1
    [    0.456923] NET: Registered protocol family 2
    [    0.459665] TCP established hash table entries: 2048 (order: 1, 8192 bytes)
    [    0.459761] TCP bind hash table entries: 2048 (order: 4, 73728 bytes)
    [    0.460941] TCP: Hash tables configured (established 2048 bind 2048)
    [    0.461196] UDP hash table entries: 256 (order: 2, 20480 bytes)
    [    0.461527] UDP-Lite hash table entries: 256 (order: 2, 20480 bytes)
    [    0.462950] NET: Registered protocol family 1
    [    0.468877] futex hash table entries: 256 (order: 1, 11264 bytes)
    [    0.491800] io scheduler noop registered (default)
    [    0.495320] pinctrl-single 44e10800.pinmux: 142 pins at pa f9e10800 size 568
    [    0.498275] omap_uart 44e09000.serial: no wakeirq for uart0
    [    0.498312] of_get_named_gpiod_flags: can't parse 'rts-gpio' property of node '/ocp/serial@44e09000[0]'
    [    0.498991] 44e09000.serial: ttyO0 at MMIO 0x44e09000 (irq = 158, base_baud = 3000000) is a OMAP UART0
    [    1.104116] console [ttyO0] enabled
    [    1.110332] omap_uart 48022000.serial: no wakeirq for uart1
    [    1.116188] of_get_named_gpiod_flags: can't parse 'rts-gpio' property of node '/ocp/serial@48022000[0]'
    [    1.126264] 48022000.serial: ttyO1 at MMIO 0x48022000 (irq = 159, base_baud = 3000000) is a OMAP UART1
    [    1.139945] Oxygen-3 LED and GPIO manager active
    [    1.146914] omap_rng 48310000.rng: OMAP Random Number Generator ver. 20
    [    1.195719] brd: module loaded
    [    1.224774] loop: module loaded
    [    1.232084] mtdoops: mtd device (mtddev=name/number) must be supplied
    [    1.246814] m25p80 spi1.0: s25fl208k (1024 Kbytes)
    [    1.252173] 7 ofpart partitions found on MTD device spi1.0
    [    1.257947] Creating 7 MTD partitions on "spi1.0":
    [    1.262999] 0x000000000000-0x00000001a000 : "SPL"
    [    1.275593] 0x00000001a000-0x00000007e000 : "U-Boot"
    [    1.285244] 0x00000007e000-0x00000007f000 : "U-Boot Env"
    [    1.295357] 0x00000007f000-0x000000080000 : "D1 U-Boot Env"
    [    1.305651] 0x000000080000-0x0000000a0000 : "config"
    [    1.315250] 0x0000000a0000-0x0000000c0000 : "config2"
    [    1.324939] 0x0000000c0000-0x000000100000 : "store"
    [    1.337098] tun: Universal TUN/TAP device driver, 1.6
    [    1.342501] tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
    [    1.417400] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6
    [    1.423797] davinci_mdio 4a101000.mdio: detected phy mask ffffff7f
    [    1.434835] Atheros 8035 ethernet 4a101000.mdio:07: GPIO lookup for consumer reset
    [    1.442865] Atheros 8035 ethernet 4a101000.mdio:07: using lookup tables for GPIO lookup
    [    1.451345] Atheros 8035 ethernet 4a101000.mdio:07: lookup for GPIO reset failed
    [    1.459323] libphy: 4a101000.mdio: probed
    [    1.463527] davinci_mdio 4a101000.mdio: phy[7]: device 4a101000.mdio:07, driver Atheros 8035 ethernet
    [    1.474360] cpsw 4a100000.ethernet: No slave[0] phy_id, phy-handle, or fixed-link property
    [    1.483230] cpsw 4a100000.ethernet: Missing dual_emac_res_vlan in DT.
    [    1.490022] cpsw 4a100000.ethernet: Using 1 as Reserved VLAN for 0 slave
    [    1.497245] cpsw 4a100000.ethernet: Missing dual_emac_res_vlan in DT.
    [    1.504015] cpsw 4a100000.ethernet: Using 2 as Reserved VLAN for 1 slave
    [    1.511059] cpsw 4a100000.ethernet: Detected MACID = 50:65:83:5f:4d:41
    [    1.518735] renaming cpsw switch port 0
    [    1.525181] cpsw 4a100000.ethernet: cpsw: Detected MACID = 50:65:83:5f:4d:43
    [    1.534996] PPP generic driver version 2.4.2
    [    1.540909] PPP BSD Compression module registered
    [    1.545860] PPP Deflate Compression module registered
    [    1.551238] NET: Registered protocol family 24
    [    1.559434] omap_rtc 44e3e000.rtc: rtc core: registered 44e3e000.rtc as rtc0
    [    1.567234] i2c /dev entries driver
    [    1.574437] omap_wdt: OMAP Watchdog Timer Rev 0x01: initial timeout 60 sec
    [    1.586760] initialized TI-am335x-adc driver
    [    1.591893] Netfilter messages via NETLINK v0.30.
    [    1.596930] nf_conntrack version 0.5.0 (3838 buckets, 15352 max)
    [    1.604050] ctnetlink v0.93: registering with nfnetlink.
    [    1.611048] ipip: IPv4 over IPv4 tunneling driver
    [    1.618500] gre: GRE over IPv4 demultiplexor driver
    [    1.623623] ip_gre: GRE over IPv4 tunneling driver
    [    1.632874] IPv4 over IPsec tunneling driver
    [    1.640992] ip_tables: (C) 2000-2006 Netfilter Core Team
    [    1.647167] arp_tables: (C) 2002 David S. Miller
    [    1.652508] Initializing XFRM netlink socket
    [    1.657180] NET: Registered protocol family 17
    [    1.661990] NET: Registered protocol family 15
    [    1.666968] bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.
    [    1.680353] Bridge firewalling registered
    [    1.684669] l2tp_core: L2TP core driver, V2.0
    [    1.689326] l2tp_ppp: PPPoL2TP kernel driver, V2.0
    [    1.694347] l2tp_ip: L2TP IP encapsulation support (L2TPv3)
    [    1.700269] l2tp_netlink: L2TP netlink interface
    [    1.705302] l2tp_eth: L2TP ethernet pseudowire support (L2TPv3)
    [    1.711545] 8021q: 802.1Q VLAN Support v1.8
    [    1.716491] omap_voltage_late_init: Voltage driver support not added
    [    1.726758] ThumbEE CPU extension supported.
    [    1.738517] omap-gpmc 50000000.gpmc: GPMC revision 6.0
    [    1.744153] gpmc_mem_init: disabling cs 0 mapped at 0x0-0x1000000
    [    1.750616] gpiochip_find_base: found new base at 510
    [    1.756658] gpiochip_add: registered GPIOs 510 to 511 on device: omap-gpmc
    [    1.765891] omap2-nand 8000000.nand: GPIO lookup for consumer rb
    [    1.772255] omap2-nand 8000000.nand: using device tree for GPIO lookup
    [    1.779137] of_get_named_gpiod_flags: can't parse 'rb-gpios' property of node '/ocp/gpmc@50000000/nand@0,0[0]'
    [    1.789621] of_get_named_gpiod_flags: can't parse 'rb-gpio' property of node '/ocp/gpmc@50000000/nand@0,0[0]'
    [    1.800018] omap2-nand 8000000.nand: using lookup tables for GPIO lookup
    [    1.807046] omap2-nand 8000000.nand: lookup for GPIO rb failed
    [    1.813598] nand: device found, Manufacturer ID: 0x01, Chip ID: 0xa1
    [    1.820276] nand: AMD/Spansion S34MS01G2
    [    1.824379] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
    [    1.832403] nand: using OMAP_ECC_BCH8_CODE_HW ECC scheme
    [    1.838278] 8 ofpart partitions found on MTD device 8000000.nand
    [    1.844557] Creating 8 MTD partitions on "8000000.nand":
    [    1.850148] 0x000000000000-0x000000400000 : "Kernel 1"
    [    1.865727] 0x000000400000-0x000003000000 : "File System 1"
    [    1.914835] 0x000003000000-0x000003400000 : "Kernel 2"
    [    1.928395] 0x000003400000-0x000006000000 : "File System 2"
    [    1.977146] 0x000006000000-0x000008000000 : "Data"
    [    2.015007] 0x000000000000-0x000003000000 : "Part 1"
    [    2.067770] 0x000003000000-0x000006000000 : "Part 2"
    [    2.119461] 0x000000000000-0x000008000000 : "All"
    [    2.258839] tps65910 0-002d: No interrupt support, no core IRQ
    [    2.315129] omap_i2c 44e0b000.i2c: bus 0 rev0.11 at 400 kHz
    [    2.321864] vdd_mpu: supplied by vmain
    [    2.333382] cpufreq: cpufreq_online: CPU0: Running at unlisted freq: 550000 KHz
    [    2.341622] cpufreq: cpufreq_online: CPU0: Unlisted initial frequency changed to: 600000 KHz
    [    2.353343] ubi0: attaching mtd8
    [    2.574722] ubi0: scanning is finished
    [    2.585739] ubi0 warning: print_rsvd_warning: cannot reserve enough PEBs for bad PEB handling, reserved 1, need 20
    [    2.599774] ubi0: attached mtd8 (name "File System 1", size 44 MiB)
    [    2.606349] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
    [    2.613604] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 512
    [    2.620649] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
    [    2.627952] ubi0: good PEBs: 352, bad PEBs: 0, corrupted PEBs: 0
    [    2.634237] ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
    [    2.641819] ubi0: max/mean erase counter: 2/0, WL threshold: 4096, image sequence number: 1176465072
    [    2.651389] ubi0: available PEBs: 0, total reserved PEBs: 352, PEBs reserved for bad PEB handling: 1
    [    2.661359] omap_rtc 44e3e000.rtc: setting system clock to 2000-01-01 00:00:01 UTC (946684801)
    [    2.689748] ubi0: background thread "ubi_bgt0d" started, PID 92
    [    2.718194] UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 93
    [    2.742532] UBIFS (ubi0:0): recovery needed
    [    2.823630] UBIFS (ubi0:0): recovery completed
    [    2.828919] UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "d1-rootfs"
    [    2.836943] UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
    [    2.847368] UBIFS (ubi0:0): FS size: 42790912 bytes (40 MiB, 337 LEBs), journal size 6221824 bytes (5 MiB, 49 LEBs)
    [    2.858307] UBIFS (ubi0:0): reserved for root: 0 bytes (0 KiB)
    [    2.864417] UBIFS (ubi0:0): media format: w4/r0 (latest is w4/r0), UUID 570D42D0-7F50-4FAB-8628-0226DBA0868A, small LPT model
    [    2.878396] VFS: Mounted root (ubifs filesystem) on device 0:14.
    [    2.886580] devtmpfs: mounted
    [    2.890172] Freeing unused kernel memory: 208K (c054e000 - c0582000)
    [    2.896825] This architecture does not have kernel memory protection.
    [    3.081244] ubi1: attaching mtd11
    [    3.243252] ubi1: scanning is finished
    [    3.253899] ubi1 warning: print_rsvd_warning: cannot reserve enough PEBs for bad PEB handling, reserved 4, need 20
    [    3.266888] ubi1: attached mtd11 (name "Data", size 32 MiB)
    [    3.272826] ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
    [    3.280039] ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 512
    [    3.287051] ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
    [    3.294357] ubi1: good PEBs: 256, bad PEBs: 0, corrupted PEBs: 0
    [    3.300655] ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
    [    3.308235] ubi1: max/mean erase counter: 10/4, WL threshold: 4096, image sequence number: 2136948180
    [    3.317898] ubi1: available PEBs: 0, total reserved PEBs: 256, PEBs reserved for bad PEB handling: 4
    [    3.327659] ubi1: background thread "ubi_bgt1d" started, PID 100
    [    3.362974] UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 102
    [    3.387392] UBIFS (ubi1:0): recovery needed
    [    3.492177] UBIFS (ubi1:0): recovery completed
    [    3.497241] UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "data"
    [    3.504861] UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
    [    3.515272] UBIFS (ubi1:0): FS size: 30347264 bytes (28 MiB, 239 LEBs), journal size 1523712 bytes (1 MiB, 12 LEBs)
    [    3.526212] UBIFS (ubi1:0): reserved for root: 1433376 bytes (1399 KiB)
    [    3.533161] UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID 01A3F3C0-4CAA-43F9-9C4C-FAC653A106EA, small LPT model
    [    3.793934] usbcore: registered new interface driver usbfs
    [    3.800329] usbcore: registered new interface driver hub
    [    3.806119] usbcore: registered new device driver usb
    [    3.826255] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
    [    3.867704] am335x-phy-driver 47401300.usb-phy: GPIO lookup for consumer reset
    [    3.875294] am335x-phy-driver 47401300.usb-phy: using device tree for GPIO lookup
    [    3.883169] of_get_named_gpiod_flags: can't parse 'reset-gpios' property of node '/ocp/usb@47400000/usb-phy@47401300[0]'
    [    3.894575] of_get_named_gpiod_flags: can't parse 'reset-gpio' property of node '/ocp/usb@47400000/usb-phy@47401300[0]'
    [    3.905878] am335x-phy-driver 47401300.usb-phy: using lookup tables for GPIO lookup
    [    3.913927] am335x-phy-driver 47401300.usb-phy: lookup for GPIO reset failed
    [    3.921324] am335x-phy-driver 47401300.usb-phy: GPIO lookup for consumer vbus-detect
    [    3.929450] am335x-phy-driver 47401300.usb-phy: using device tree for GPIO lookup
    [    3.937327] of_get_named_gpiod_flags: can't parse 'vbus-detect-gpios' property of node '/ocp/usb@47400000/usb-phy@47401300[0]'
    [    3.949274] of_get_named_gpiod_flags: can't parse 'vbus-detect-gpio' property of node '/ocp/usb@47400000/usb-phy@47401300[0]'
    [    3.961120] am335x-phy-driver 47401300.usb-phy: using lookup tables for GPIO lookup
    [    3.969158] am335x-phy-driver 47401300.usb-phy: lookup for GPIO vbus-detect failed
    [    3.977211] 47401300.usb-phy supply vcc not found, using dummy regulator
    [    3.988765] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [    3.999841] musb-hdrc: MHDRC RTL version 2.0
    [    4.004401] musb-hdrc: setup fifo_mode 4
    [    4.008543] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [    4.014467] musb-hdrc musb-hdrc.0.auto: MUSB HDRC host driver
    [    4.021444] musb-hdrc musb-hdrc.0.auto: new USB bus registered, assigned bus number 1
    [    4.031766] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
    [    4.038934] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    [    4.046495] usb usb1: Product: MUSB HDRC host driver
    [    4.051742] usb usb1: Manufacturer: Linux 4.4.12 musb-hcd
    [    4.057409] usb usb1: SerialNumber: musb-hdrc.0.auto
    [    4.067555] hub 1-0:1.0: USB hub found
    [    4.071985] hub 1-0:1.0: 1 port detected
    [    4.082083] am335x-phy-driver 47401b00.usb-phy: GPIO lookup for consumer reset
    [    4.089770] am335x-phy-driver 47401b00.usb-phy: using device tree for GPIO lookup
    [    4.097640] of_get_named_gpiod_flags: can't parse 'reset-gpios' property of node '/ocp/usb@47400000/usb-phy@47401b00[0]'
    [    4.109045] of_get_named_gpiod_flags: can't parse 'reset-gpio' property of node '/ocp/usb@47400000/usb-phy@47401b00[0]'
    [    4.120347] am335x-phy-driver 47401b00.usb-phy: using lookup tables for GPIO lookup
    [    4.128391] am335x-phy-driver 47401b00.usb-phy: lookup for GPIO reset failed
    [    4.135777] am335x-phy-driver 47401b00.usb-phy: GPIO lookup for consumer vbus-detect
    [    4.143903] am335x-phy-driver 47401b00.usb-phy: using device tree for GPIO lookup
    [    4.151760] of_get_named_gpiod_flags: can't parse 'vbus-detect-gpios' property of node '/ocp/usb@47400000/usb-phy@47401b00[0]'
    [    4.163701] of_get_named_gpiod_flags: can't parse 'vbus-detect-gpio' property of node '/ocp/usb@47400000/usb-phy@47401b00[0]'
    [    4.175553] am335x-phy-driver 47401b00.usb-phy: using lookup tables for GPIO lookup
    [    4.183604] am335x-phy-driver 47401b00.usb-phy: lookup for GPIO vbus-detect failed
    [    4.191657] 47401b00.usb-phy supply vcc not found, using dummy regulator
    [    4.201802] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
    [    4.212909] musb-hdrc: MHDRC RTL version 2.0
    [    4.217487] musb-hdrc: setup fifo_mode 4
    [    4.221599] musb-hdrc: 28/31 max ep, 16384/16384 memory
    [    4.227174] musb-hdrc musb-hdrc.1.auto: MUSB HDRC host driver
    [    4.233294] musb-hdrc musb-hdrc.1.auto: new USB bus registered, assigned bus number 2
    [    4.244152] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
    [    4.251342] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    [    4.258930] usb usb2: Product: MUSB HDRC host driver
    [    4.264128] usb usb2: Manufacturer: Linux 4.4.12 musb-hcd
    [    4.269799] usb usb2: SerialNumber: musb-hdrc.1.auto
    [    4.277055] hub 2-0:1.0: USB hub found
    [    4.281170] hub 2-0:1.0: 1 port detected
    [    4.316001] usbcore: registered new interface driver cdc_wdm
    [    4.333047] usbcore: registered new interface driver cdc_acm
    [    4.339105] cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
    [    4.370916] usbcore: registered new interface driver qmi_wwan
    [    4.390663] usbcore: registered new interface driver usbserial
    [    4.415238] usbcore: registered new interface driver qcserial
    [    4.422220] usbserial: USB Serial support registered for Qualcomm USB modem
    [    4.444496] usbcore: registered new interface driver ftdi_sio
    [    4.450934] usbserial: USB Serial support registered for FTDI USB Serial Device
    [    4.645973] random: sed urandom read with 11 bits of entropy available
    [    5.154803] net eth0: initializing cpsw version 1.12 (0)
    [    5.154837] net eth9: initialized cpsw ale version 1.4
    [    5.238108] net eth0: phy found : id is : 0x4dd072
    [    5.238127] libphy: reading hw_rev...
    [    5.238136] libphy: hw_rev:02
    [    5.242908] 8021q: adding VLAN 0 to HW filter on device eth0
    [    6.053306] udevd[304]: starting version 3.2
    [    6.381959] udevd[304]: starting eudev-3.2
    [    7.238156] libphy: hw_rev:02
    [    9.238263] libphy: hw_rev:02
    [    9.238430] cpsw 4a100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
    [  118.007932] random: nonblocking pool is initialized
    D1-UNKNOWN-IMEI:~ #
    outputs.txt
    D1-UNKNOWN-IMEI:~ # ethtool eth0
    Settings for eth0:
            Supported ports: [ TP MII ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supported pause frame use: No
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised pause frame use: No
            Advertised auto-negotiation: Yes
            Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                                 100baseT/Half 100baseT/Full
                                                 1000baseT/Half 1000baseT/Full
            Link partner advertised pause frame use: Symmetric Receive-only
            Link partner advertised auto-negotiation: Yes
            Speed: 1000Mb/s
            Duplex: Full
            Port: MII
            PHYAD: 7
            Transceiver: external
            Auto-negotiation: on
            Supports Wake-on: g
            Wake-on: d
            Current message level: 0x00000000 (0)
    
            Link detected: yes
    D1-UNKNOWN-IMEI:~ # ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 1543
         Broadcast Rx Frames: 384
         Multicast Rx Frames: 1159
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 205029
         Good Tx Frames: 0
         Broadcast Tx Frames: 0
         Multicast Tx Frames: 0
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 0
         Rx + Tx 64 Octet Frames: 944
         Rx + Tx 65-127 Octet Frames: 261
         Rx + Tx 128-255 Octet Frames: 238
         Rx + Tx 256-511 Octet Frames: 11
         Rx + Tx 512-1023 Octet Frames: 73
         Rx + Tx 1024-Up Octet Frames: 16
         Net Octets: 205029
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 447
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 0
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 736
         Rx DMA chan: good_dequeue: 384
         Rx DMA chan: requeue: 0
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 0
         Tx DMA chan: tail_enqueue: 0
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 0
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 0
         Tx DMA chan: empty_dequeue: 0
         Tx DMA chan: busy_dequeue: 0
         Tx DMA chan: good_dequeue: 0
         Tx DMA chan: requeue: 0
         Tx DMA chan: teardown_dequeue: 0
    D1-UNKNOWN-IMEI:~ # ifconfig
    eth0      Link encap:Ethernet  HWaddr 50:65:83:5F:4D:41
              inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:394 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:61920 (60.4 KiB)  TX bytes:0 (0.0 B)
    
    eth0:0    Link encap:Ethernet  HWaddr 50:65:83:5F:4D:41
              inet addr:169.254.0.1  Bcast:169.254.0.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:65536  Metric:1
              RX packets:11910 errors:0 dropped:0 overruns:0 frame:0
              TX packets:11910 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1
              RX bytes:595500 (581.5 KiB)  TX bytes:595500 (581.5 KiB)
    D1-UNKNOWN-IMEI:~ # uname -a
    Linux D1-UNKNOWN-IMEI 4.4.12 #94 Tue Feb 28 18:33:20 PST 2017 armv7l GNU/Linux
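
The `ethtool -S eth0` output above can be scanned for nonzero error counters rather than read by eye. The following is a minimal sketch, not a TI tool: the function name `nonzero_error_counters` and the `ERROR_HINTS` keyword list are assumptions chosen for illustration, and the sample text is a shortened copy of the statistics posted above.

```python
# Substrings that mark a counter as an error/drop indicator.
# This list is an assumption made for illustration, not a TI-defined set.
ERROR_HINTS = ("error", "overrun", "underrun", "collision", "jabber",
               "fragment", "alloc_fail", "misqueued")


def nonzero_error_counters(ethtool_stats: str) -> dict:
    """Return {counter name: value} for error-like counters that are nonzero."""
    flagged = {}
    for line in ethtool_stats.splitlines():
        # Split on the LAST colon: names such as
        # "Rx DMA chan: misqueued" contain a colon themselves.
        name, sep, value = line.strip().rpartition(":")
        if not sep or not value.strip().isdigit():
            continue  # skips headers such as "NIC statistics:"
        count = int(value)
        if count and any(hint in name.lower() for hint in ERROR_HINTS):
            flagged[name] = count
    return flagged


# Shortened copy of the statistics posted in this thread.
sample = """NIC statistics:
     Good Rx Frames: 1543
     Rx CRC Errors: 0
     Rx Align/Code Errors: 0
     Rx DMA Overruns: 0
     Rx DMA chan: desc_alloc_fail: 0
     Rx DMA chan: busy_dequeue: 736
"""

print(nonzero_error_counters(sample))
```

An empty result means none of the counters matched by the keyword list has incremented. Running the check before and after an iperf session would show whether errors appear only under load.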

  • Thanks for the attached info. In terms of hw errors I didn't see any in the HW statistics.

    One thing not pointed out in the performance datasheet is that the device under test is running at 1GHz. The dmesg log indicates your processor is running in the 500-600MHz range. This difference might account for the drop in performance that you are seeing.

    Looking through the kernel config, there are several differences, which makes it hard to judge any one setting. One thing I did notice is that CONFIG_PREEMPT_NONE is set. This could have an impact, as I believe system calls and other user apps may not yield, so iperf may not be getting the chance to run as much.

    I do not know whether the uclinux build system may be having an impact.
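
Since the CPU clock at test time matters for these numbers, it can help to record the active governor and frequency alongside each iperf run. The sketch below reads the standard Linux cpufreq sysfs files; the helper name `read_cpufreq` is hypothetical, and it assumes the kernel was built with cpufreq support so that the directory exists.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for CPU 0 (present only when the
# kernel has cpufreq enabled; that is an assumption about the target).
CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(base: Path = CPUFREQ) -> dict:
    """Read the governor and frequencies (in kHz) from a cpufreq directory."""
    info = {}
    for name in ("scaling_governor", "scaling_cur_freq", "scaling_max_freq"):
        entry = base / name
        # A missing file is reported as None instead of raising.
        info[name] = entry.read_text().strip() if entry.exists() else None
    return info


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

Logging these values next to each measurement makes it clear whether a result was taken at 600MHz, 800MHz, or under a governor that changes frequency during the test.
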
  • Hi Schuyler,

    Thanks for taking a look.

    You're right that dmesg shows the CPU in the 500-600MHz range, but that was misleading on my part: the log I provided was captured with the ondemand governor. When I rebuilt the kernel with only the performance governor (our CPU maxes out at 800MHz), the speed I listed (86Mb/s up, 156Mb/s down) is what I get. You're also right that CPU frequency matters. When I originally tested at 600MHz, I got 62 up/113 down. Going from 600->800MHz (a 33% increase), I got a roughly corresponding increase (62->86, 113->156, about 38%). So extrapolating 800MHz->1GHz for comparison, my equivalent speed should be around 108Mb/s up, 195Mb/s down.

    I also tried rebuilding with CONFIG_PREEMPT instead of CONFIG_PREEMPT_NONE, and performance got worse (by around 10Mbit/s). So you were right that it could have an impact, but it looks like CONFIG_PREEMPT_NONE needs to stay set.

    I'm not sure whether this points in the right direction, but I noticed that enabling the interrupt pacing feature (using the command from http://processors.wiki.ti.com/index.php/AM335x-PSP_04.06.00.08_Features_and_Performance_Guide#Ethernet_Driver) improved my upload speed from 86 to 120Mb/s but decreased my download speed from 156 to 128Mb/s.

    If you have any ideas from these observations, please let me know, thanks!

  • Hi Schuyler and Biser,

    May I ask if there is any TI-approved hardware that uses the AM335x chip with the AR8035? This would allow us to validate the HW.

    Also, may I ask what exact iperf commands were used for the listed benchmarks? This would allow us to ensure we're running the same tests.

    Thanks!
  • Hi Biser and Schuyler,

    We took a step back and used your AM335x EVM and AM335x Starter Kit with prebuilt images from your SDK v03.02.00.05.

    Both the EVM and the Starter Kit use the AR8031, which is close to our AR8035. Both also run at 720MHz and used the same SD card (we only had to use a different device tree).
    With this test setup, everything is the same other than the device tree.

    Interestingly, the starter kit also showed the same low ethernet performance that we saw on our board. However, the EVM was at least twice as fast.

    Starter kit: 71 mbit/s up, 114 mbit/s down.
    EVM: 176 mbit/s up, 301 mbit/s down.

    We're now curious what the main difference between the two boards is, since the cause of this speed discrepancy could resolve our issue as well.

    [EDIT]

    I trimmed down both the EVM and starter kit device trees so they're basically identical. The only remaining difference is that the starter kit has the vtt regulator for DDR3; the eth nodes now match each other.

    Unfortunately, I'm still getting the same speeds (~70 mbit/s for starter kit, ~180 mbit/s for EVM).

    That means the device tree isn't the cause, so now we're wondering what the difference in ethernet HW is between the EVM and the starter kit.
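
One way to confirm that two trimmed device trees differ only where expected is a line-level diff. The sketch below uses Python's standard `difflib`; the helper name `changed_lines` is hypothetical, and the two inline snippets are shortened stand-ins for the pinmux nodes of the attached files, not the full sources.

```python
import difflib

def changed_lines(a_text, b_text):
    """Return only the removed (-) and added (+) lines between two sources.

    Blank lines and indentation are ignored so that formatting changes
    do not show up as differences.
    """
    a = [ln.strip() for ln in a_text.splitlines() if ln.strip()]
    b = [ln.strip() for ln in b_text.splitlines() if ln.strip()]
    diff = list(difflib.unified_diff(a, b, lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then drop the '@@' hunk markers.
    return [ln for ln in diff[2:] if ln.startswith(("+", "-"))]

evm = """
&am33xx_pinmux {
	pinctrl-names = "default";
	pinctrl-0 = <&clkout2_pin>;
};
"""

sk = """
&am33xx_pinmux {
	pinctrl-names = "default";
	pinctrl-0 = <&clkout2_pin &ddr3_vtt_toggle>;
};
"""

result = changed_lines(evm, sk)
for line in result:
    print(line)
```

For the real files, the two strings would be read from the attached am335x-evm.dts.txt and am335x-evmsk.dts.txt instead of being written inline.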

    am335x-evm.dts.txt
    /*
     * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
     *
     * This program is free software; you can redistribute it and/or modify
     * it under the terms of the GNU General Public License version 2 as
     * published by the Free Software Foundation.
     */
    /dts-v1/;
    
    #include "am33xx.dtsi"
    #include <dt-bindings/interrupt-controller/irq.h>
    
    / {
    	model = "TI AM335x EVM v3";
    	compatible = "ti,am335x-evm", "ti,am33xx";
    
    	cpus {
    		cpu@0 {
    			cpu0-supply = <&vdd1_reg>;
    		};
    	};
    
    	memory {
    		device_type = "memory";
    		reg = <0x80000000 0x10000000>; /* 256 MB */
    	};
    
    	vbat: fixedregulator@0 {
    		compatible = "regulator-fixed";
    		regulator-name = "vbat";
    		regulator-min-microvolt = <5000000>;
    		regulator-max-microvolt = <5000000>;
    		regulator-boot-on;
    	};
    
    	lis3_reg: fixedregulator@1 {
    		compatible = "regulator-fixed";
    		regulator-name = "lis3_reg";
    		regulator-boot-on;
    	};
    };
    
    &am33xx_pinmux {
    	pinctrl-names = "default";
    	pinctrl-0 = <&clkout2_pin>;
    
    	i2c0_pins: pinmux_i2c0_pins {
    		pinctrl-single,pins = <
    			0x188 (PIN_INPUT_PULLUP | MUX_MODE0)	/* i2c0_sda.i2c0_sda */
    			0x18c (PIN_INPUT_PULLUP | MUX_MODE0)	/* i2c0_scl.i2c0_scl */
    		>;
    	};
    
    	uart0_pins: pinmux_uart0_pins {
    		pinctrl-single,pins = <
    			0x170 (PIN_INPUT_PULLUP | MUX_MODE0)	/* uart0_rxd.uart0_rxd */
    			0x174 (PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* uart0_txd.uart0_txd */
    		>;
    	};
    
    	clkout2_pin: pinmux_clkout2_pin {
    		pinctrl-single,pins = <
    			0x1b4 (PIN_OUTPUT_PULLDOWN | MUX_MODE3)	/* xdma_event_intr1.clkout2 */
    		>;
    	};
    
    	cpsw_default: cpsw_default {
    		pinctrl-single,pins = <
    			/* Slave 1 */
    			0x114 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txen.rgmii1_tctl */
    			0x118 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxdv.rgmii1_rctl */
    			0x11c (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd3.rgmii1_td3 */
    			0x120 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd2.rgmii1_td2 */
    			0x124 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd1.rgmii1_td1 */
    			0x128 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd0.rgmii1_td0 */
    			0x12c (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txclk.rgmii1_tclk */
    			0x130 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxclk.rgmii1_rclk */
    			0x134 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd3.rgmii1_rd3 */
    			0x138 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd2.rgmii1_rd2 */
    			0x13c (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd1.rgmii1_rd1 */
    			0x140 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd0.rgmii1_rd0 */
    		>;
    	};
    
    	davinci_mdio_default: davinci_mdio_default {
    		pinctrl-single,pins = <
    			/* MDIO */
    			0x148 (PIN_INPUT_PULLUP | SLEWCTRL_FAST | MUX_MODE0)	/* mdio_data.mdio_data */
    			0x14c (PIN_OUTPUT_PULLUP | MUX_MODE0)			/* mdio_clk.mdio_clk */
    		>;
    	};
    
    	mmc1_pins: pinmux_mmc1_pins {
    		pinctrl-single,pins = <
    			0x160 (PIN_INPUT | MUX_MODE7) /* spi0_cs1.gpio0_6 */
    		>;
    	};
    
    	mcasp1_pins: mcasp1_pins {
    		pinctrl-single,pins = <
    			0x10c (PIN_INPUT_PULLDOWN | MUX_MODE4) /* mii1_crs.mcasp1_aclkx */
    			0x110 (PIN_INPUT_PULLDOWN | MUX_MODE4) /* mii1_rxerr.mcasp1_fsx */
    			0x108 (PIN_OUTPUT_PULLDOWN | MUX_MODE4) /* mii1_col.mcasp1_axr2 */
    			0x144 (PIN_INPUT_PULLDOWN | MUX_MODE4) /* rmii1_ref_clk.mcasp1_axr3 */
    		>;
    	};
    };
    
    &uart0 {
    	pinctrl-names = "default";
    	pinctrl-0 = <&uart0_pins>;
    
    	status = "okay";
    };
    
    &usb {
    	status = "okay";
    };
    
    &usb_ctrl_mod {
    	status = "okay";
    };
    
    &usb0_phy {
    	status = "okay";
    };
    
    &usb1_phy {
    	status = "okay";
    };
    
    &usb0 {
    	status = "okay";
    };
    
    &usb1 {
    	status = "okay";
    	dr_mode = "host";
    };
    
    &cppi41dma  {
    	status = "okay";
    };
    
    &i2c0 {
    	pinctrl-names = "default";
    	pinctrl-0 = <&i2c0_pins>;
    
    	status = "okay";
    	clock-frequency = <400000>;
    
    	tps: tps@2d {
    		reg = <0x2d>;
    	};
    };
    
    #include "tps65910.dtsi"
    
    &mcasp1 {
    	#sound-dai-cells = <0>;
    	pinctrl-names = "default";
    	pinctrl-0 = <&mcasp1_pins>;
    
    	status = "okay";
    
    	op-mode = <0>;          /* MCASP_IIS_MODE */
    	tdm-slots = <2>;
    	/* 4 serializers */
    	serial-dir = <  /* 0: INACTIVE, 1: TX, 2: RX */
    		0 0 1 2
    	>;
    	tx-num-evt = <32>;
    	rx-num-evt = <32>;
    };
    
    &tps {
    	vcc1-supply = <&vbat>;
    	vcc2-supply = <&vbat>;
    	vcc3-supply = <&vbat>;
    	vcc4-supply = <&vbat>;
    	vcc5-supply = <&vbat>;
    	vcc6-supply = <&vbat>;
    	vcc7-supply = <&vbat>;
    	vccio-supply = <&vbat>;
    
    	regulators {
    		vrtc_reg: regulator@0 {
    			regulator-always-on;
    		};
    
    		vio_reg: regulator@1 {
    			regulator-always-on;
    		};
    
    		vdd1_reg: regulator@2 {
    			/* VDD_MPU voltage limits 0.95V - 1.26V with +/-4% tolerance */
    			regulator-name = "vdd_mpu";
    			regulator-min-microvolt = <912500>;
    			regulator-max-microvolt = <1351500>;
    			regulator-boot-on;
    			regulator-always-on;
    		};
    
    		vdd2_reg: regulator@3 {
    			/* VDD_CORE voltage limits 0.95V - 1.1V with +/-4% tolerance */
    			regulator-name = "vdd_core";
    			regulator-min-microvolt = <912500>;
    			regulator-max-microvolt = <1150000>;
    			regulator-boot-on;
    			regulator-always-on;
    		};
    
    		vdd3_reg: regulator@4 {
    			regulator-always-on;
    		};
    
    		vdig1_reg: regulator@5 {
    			regulator-always-on;
    		};
    
    		vdig2_reg: regulator@6 {
    			regulator-always-on;
    		};
    
    		vpll_reg: regulator@7 {
    			regulator-always-on;
    		};
    
    		vdac_reg: regulator@8 {
    			regulator-always-on;
    		};
    
    		vaux1_reg: regulator@9 {
    			regulator-always-on;
    		};
    
    		vaux2_reg: regulator@10 {
    			regulator-always-on;
    		};
    
    		vaux33_reg: regulator@11 {
    			regulator-always-on;
    		};
    
    		vmmc_reg: regulator@12 {
    			regulator-min-microvolt = <1800000>;
    			regulator-max-microvolt = <3300000>;
    			regulator-always-on;
    		};
    	};
    };
    
    &mac {
    	pinctrl-names = "default";
    	pinctrl-0 = <&cpsw_default>;
    	status = "okay";
    };
    
    &davinci_mdio {
    	pinctrl-names = "default";
    	pinctrl-0 = <&davinci_mdio_default>;
    	status = "okay";
    };
    
    &cpsw_emac0 {
    	phy_id = <&davinci_mdio>, <0>;
    	phy-mode = "rgmii-txid";
    };
    
    &cpsw_emac1 {
    	phy_id = <&davinci_mdio>, <1>;
    	phy-mode = "rgmii-txid";
    };
    
    &mmc1 {
    	status = "okay";
    	vmmc-supply = <&vmmc_reg>;
    	bus-width = <4>;
    	pinctrl-names = "default";
    	pinctrl-0 = <&mmc1_pins>;
    	cd-gpios = <&gpio0 6 GPIO_ACTIVE_LOW>;
    };
    
    &sham {
    	status = "okay";
    };
    
    &aes {
    	status = "okay";
    };
    
    &rtc {
    	clocks = <&clk_32768_ck>, <&clkdiv32k_ick>;
    	clock-names = "ext-clk", "int-clk";
    };
    
    &sgx {
    	status = "okay";
    };
    
    am335x-evmsk.dts.txt
    /*
     * Copyright (C) 2012 Texas Instruments Incorporated - http://www.ti.com/
     *
     * This program is free software; you can redistribute it and/or modify
     * it under the terms of the GNU General Public License version 2 as
     * published by the Free Software Foundation.
     */
    
    /*
     * AM335x Starter Kit
     * http://www.ti.com/tool/tmdssk3358
     */
    
    /dts-v1/;
    
    #include "am33xx.dtsi"
    #include <dt-bindings/interrupt-controller/irq.h>
    
    / {
    	model = "TI AM335x EVM-SK v3.2";
    	compatible = "ti,am335x-evmsk", "ti,am33xx";
    
    	cpus {
    		cpu@0 {
    			cpu0-supply = <&vdd1_reg>;
    		};
    	};
    
    	memory {
    		device_type = "memory";
    		reg = <0x80000000 0x10000000>; /* 256 MB */
    	};
    
    	vbat: fixedregulator@0 {
    		compatible = "regulator-fixed";
    		regulator-name = "vbat";
    		regulator-min-microvolt = <5000000>;
    		regulator-max-microvolt = <5000000>;
    		regulator-boot-on;
    	};
    
    	lis3_reg: fixedregulator@1 {
    		compatible = "regulator-fixed";
    		regulator-name = "lis3_reg";
    		regulator-boot-on;
    	};
    
    	vtt_fixed: fixedregulator@3 {
    		compatible = "regulator-fixed";
    		regulator-name = "vtt";
    		regulator-min-microvolt = <1500000>;
    		regulator-max-microvolt = <1500000>;
    		gpio = <&gpio0 7 GPIO_ACTIVE_HIGH>;
    		regulator-always-on;
    		regulator-boot-on;
    		enable-active-high;
    	};
    };
    
    &am33xx_pinmux {
    	pinctrl-names = "default";
    	pinctrl-0 = <&clkout2_pin &ddr3_vtt_toggle>;
    
    	ddr3_vtt_toggle: ddr3_vtt_toggle {
    		pinctrl-single,pins = <
    			0x164 (PIN_OUTPUT | MUX_MODE7)	/* ecap0_in_pwm0_out.gpio0_7 */
    		>;
    	};
    
    	i2c0_pins: pinmux_i2c0_pins {
    		pinctrl-single,pins = <
    			0x188 (PIN_INPUT_PULLUP | MUX_MODE0)	/* i2c0_sda.i2c0_sda */
    			0x18c (PIN_INPUT_PULLUP | MUX_MODE0)	/* i2c0_scl.i2c0_scl */
    		>;
    	};
    
    	uart0_pins: pinmux_uart0_pins {
    		pinctrl-single,pins = <
    			0x170 (PIN_INPUT_PULLUP | MUX_MODE0)	/* uart0_rxd.uart0_rxd */
    			0x174 (PIN_OUTPUT_PULLDOWN | MUX_MODE0)	/* uart0_txd.uart0_txd */
    		>;
    	};
    
    	clkout2_pin: pinmux_clkout2_pin {
    		pinctrl-single,pins = <
    			0x1b4 (PIN_OUTPUT_PULLDOWN | MUX_MODE3)	/* xdma_event_intr1.clkout2 */
    		>;
    	};
    
    	cpsw_default: cpsw_default {
    		pinctrl-single,pins = <
    			/* Slave 1 */
    			0x114 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txen.rgmii1_tctl */
    			0x118 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxdv.rgmii1_rctl */
    			0x11c (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd3.rgmii1_td3 */
    			0x120 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd2.rgmii1_td2 */
    			0x124 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd1.rgmii1_td1 */
    			0x128 (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txd0.rgmii1_td0 */
    			0x12c (PIN_OUTPUT_PULLDOWN | MUX_MODE2)	/* mii1_txclk.rgmii1_tclk */
    			0x130 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxclk.rgmii1_rclk */
    			0x134 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd3.rgmii1_rd3 */
    			0x138 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd2.rgmii1_rd2 */
    			0x13c (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd1.rgmii1_rd1 */
    			0x140 (PIN_INPUT_PULLDOWN | MUX_MODE2)	/* mii1_rxd0.rgmii1_rd0 */
    		>;
    	};
    
    	davinci_mdio_default: davinci_mdio_default {
    		pinctrl-single,pins = <
    			/* MDIO */
    			0x148 (PIN_INPUT_PULLUP | SLEWCTRL_FAST | MUX_MODE0)	/* mdio_data.mdio_data */
    			0x14c (PIN_OUTPUT_PULLUP | MUX_MODE0)			/* mdio_clk.mdio_clk */
    		>;
    	};
    
    	mmc1_pins: pinmux_mmc1_pins {
    		pinctrl-single,pins = <
    			0x160 (PIN_INPUT | MUX_MODE7) /* spi0_cs1.gpio0_6 */
    		>;
    	};
    
    	mcasp1_pins: mcasp1_pins {
    		pinctrl-single,pins = <
    			0x10c (PIN_INPUT_PULLDOWN | MUX_MODE4) /* mii1_crs.mcasp1_aclkx */
    			0x110 (PIN_INPUT_PULLDOWN | MUX_MODE4) /* mii1_rxerr.mcasp1_fsx */
    			0x108 (PIN_OUTPUT_PULLDOWN | MUX_MODE4) /* mii1_col.mcasp1_axr2 */
    			0x144 (PIN_INPUT_PULLDOWN | MUX_MODE4) /* rmii1_ref_clk.mcasp1_axr3 */
    		>;
    	};
    };
    
    &uart0 {
    	pinctrl-names = "default";
    	pinctrl-0 = <&uart0_pins>;
    
    	status = "okay";
    };
    
    &usb {
    	status = "okay";
    };
    
    &usb_ctrl_mod {
    	status = "okay";
    };
    
    &usb0_phy {
    	status = "okay";
    };
    
    &usb1_phy {
    	status = "okay";
    };
    
    &usb0 {
    	status = "okay";
    };
    
    &usb1 {
    	status = "okay";
    	dr_mode = "host";
    };
    
    &cppi41dma  {
    	status = "okay";
    };
    
    &i2c0 {
    	pinctrl-names = "default";
    	pinctrl-0 = <&i2c0_pins>;
    
    	status = "okay";
    	clock-frequency = <400000>;
    
    	tps: tps@2d {
    		reg = <0x2d>;
    	};
    };
    
    #include "tps65910.dtsi"
    
    &mcasp1 {
    	#sound-dai-cells = <0>;
    	pinctrl-names = "default";
    	pinctrl-0 = <&mcasp1_pins>;
    
    	status = "okay";
    
    	op-mode = <0>;          /* MCASP_IIS_MODE */
    	tdm-slots = <2>;
    	/* 4 serializers */
    	serial-dir = <  /* 0: INACTIVE, 1: TX, 2: RX */
    		0 0 1 2
    	>;
    	tx-num-evt = <32>;
    	rx-num-evt = <32>;
    };
    
    &tps {
    	vcc1-supply = <&vbat>;
    	vcc2-supply = <&vbat>;
    	vcc3-supply = <&vbat>;
    	vcc4-supply = <&vbat>;
    	vcc5-supply = <&vbat>;
    	vcc6-supply = <&vbat>;
    	vcc7-supply = <&vbat>;
    	vccio-supply = <&vbat>;
    
    	regulators {
    		vrtc_reg: regulator@0 {
    			regulator-always-on;
    		};
    
    		vio_reg: regulator@1 {
    			regulator-always-on;
    		};
    
    		vdd1_reg: regulator@2 {
    			/* VDD_MPU voltage limits 0.95V - 1.26V with +/-4% tolerance */
    			regulator-name = "vdd_mpu";
    			regulator-min-microvolt = <912500>;
    			regulator-max-microvolt = <1351500>;
    			regulator-boot-on;
    			regulator-always-on;
    		};
    
    		vdd2_reg: regulator@3 {
    			/* VDD_CORE voltage limits 0.95V - 1.1V with +/-4% tolerance */
    			regulator-name = "vdd_core";
    			regulator-min-microvolt = <912500>;
    			regulator-max-microvolt = <1150000>;
    			regulator-boot-on;
    			regulator-always-on;
    		};
    
    		vdd3_reg: regulator@4 {
    			regulator-always-on;
    		};
    
    		vdig1_reg: regulator@5 {
    			regulator-always-on;
    		};
    
    		vdig2_reg: regulator@6 {
    			regulator-always-on;
    		};
    
    		vpll_reg: regulator@7 {
    			regulator-always-on;
    		};
    
    		vdac_reg: regulator@8 {
    			regulator-always-on;
    		};
    
    		vaux1_reg: regulator@9 {
    			regulator-always-on;
    		};
    
    		vaux2_reg: regulator@10 {
    			regulator-always-on;
    		};
    
    		vaux33_reg: regulator@11 {
    			regulator-always-on;
    		};
    
    		vmmc_reg: regulator@12 {
    			regulator-min-microvolt = <1800000>;
    			regulator-max-microvolt = <3300000>;
    			regulator-always-on;
    		};
    	};
    };
    
    &mac {
    	pinctrl-names = "default";
    	pinctrl-0 = <&cpsw_default>;
    	status = "okay";
    };
    
    &davinci_mdio {
    	pinctrl-names = "default";
    	pinctrl-0 = <&davinci_mdio_default>;
    	status = "okay";
    };
    
    &cpsw_emac0 {
    	phy_id = <&davinci_mdio>, <0>;
    	phy-mode = "rgmii-txid";
    };
    
    &cpsw_emac1 {
    	phy_id = <&davinci_mdio>, <1>;
    	phy-mode = "rgmii-txid";
    };
    
    &mmc1 {
    	status = "okay";
    	vmmc-supply = <&vmmc_reg>;
    	bus-width = <4>;
    	pinctrl-names = "default";
    	pinctrl-0 = <&mmc1_pins>;
    	cd-gpios = <&gpio0 6 GPIO_ACTIVE_LOW>;
    };
    
    &sham {
    	status = "okay";
    };
    
    &aes {
    	status = "okay";
    };
    
    &rtc {
    	clocks = <&clk_32768_ck>, <&clkdiv32k_ick>;
    	clock-names = "ext-clk", "int-clk";
    };
    
    &sgx {
    	status = "okay";
    };
    

    Thanks!

    iperf_evm.txt
    Upload:
    root@am335x-evm:~# iperf -c 192.168.1.20 -P 5 -t 60
    ------------------------------------------------------------
    Client connecting to 192.168.1.20, TCP port 5001
    TCP window size: 43.8 KByte (default)
    ------------------------------------------------------------
    [  7] local 192.168.1.1 port 57538 connected with 192.168.1.20 port 5001
    [  4] local 192.168.1.1 port 57530 connected with 192.168.1.20 port 5001
    [  3] local 192.168.1.1 port 57532 connected with 192.168.1.20 port 5001
    [  5] local 192.168.1.1 port 57534 connected with 192.168.1.20 port 5001
    [  6] local 192.168.1.1 port 57536 connected with 192.168.1.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  6]  0.0-60.0 sec   102 MBytes  14.3 Mbits/sec
    [  7]  0.0-60.0 sec   101 MBytes  14.1 Mbits/sec
    [  3]  0.0-60.0 sec   102 MBytes  14.2 Mbits/sec
    [  4]  0.0-60.1 sec   102 MBytes  14.3 Mbits/sec
    [  5]  0.0-60.1 sec   102 MBytes  14.2 Mbits/sec
    [SUM]  0.0-60.1 sec   508 MBytes  71.0 Mbits/sec
    
    Download:
    root@am335x-evm:~# iperf -s
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35632
    [  5] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35634
    [  8] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35640
    [  7] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35638
    [  6] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35636
    [ ID] Interval       Transfer     Bandwidth
    [  8]  0.0-60.3 sec   165 MBytes  23.0 Mbits/sec
    [  4]  0.0-60.4 sec   160 MBytes  22.2 Mbits/sec
    [  5]  0.0-60.4 sec   167 MBytes  23.2 Mbits/sec
    [  6]  0.0-60.5 sec   163 MBytes  22.6 Mbits/sec
    [  7]  0.0-60.5 sec   167 MBytes  23.2 Mbits/sec
    [SUM]  0.0-60.5 sec   822 MBytes   114 Mbits/sec
    
    iperf_sk.txt
    Upload:
    root@am335x-evm:~# iperf -c 192.168.1.20 -P 5 -t 60
    ------------------------------------------------------------
    Client connecting to 192.168.1.20, TCP port 5001
    TCP window size: 43.8 KByte (default)
    ------------------------------------------------------------
    [  7] local 192.168.1.1 port 44694 connected with 192.168.1.20 port 5001
    [  4] local 192.168.1.1 port 44688 connected with 192.168.1.20 port 5001
    [  6] local 192.168.1.1 port 44692 connected with 192.168.1.20 port 5001
    [  3] local 192.168.1.1 port 44686 connected with 192.168.1.20 port 5001
    [  5] local 192.168.1.1 port 44690 connected with 192.168.1.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  4]  0.0-60.0 sec   253 MBytes  35.3 Mbits/sec
    [  6]  0.0-60.0 sec   252 MBytes  35.2 Mbits/sec
    [  7]  0.0-60.0 sec   252 MBytes  35.3 Mbits/sec
    [  3]  0.0-60.0 sec   252 MBytes  35.3 Mbits/sec
    [  5]  0.0-60.0 sec   252 MBytes  35.3 Mbits/sec
    [SUM]  0.0-60.0 sec  1.23 GBytes   176 Mbits/sec
    
    Download:
    root@am335x-evm:~# iperf -s
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 85.3 KByte (default)
    ------------------------------------------------------------
    [  4] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35646
    [  5] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35648
    [  7] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35652
    [  6] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35650
    [  8] local 192.168.1.1 port 5001 connected with 192.168.1.20 port 35654
    [ ID] Interval       Transfer     Bandwidth
    [  7]  0.0-60.4 sec   437 MBytes  60.7 Mbits/sec
    [  4]  0.0-60.4 sec   439 MBytes  61.0 Mbits/sec
    [  8]  0.0-60.4 sec   432 MBytes  59.9 Mbits/sec
    [  5]  0.0-60.4 sec   429 MBytes  59.5 Mbits/sec
    [  6]  0.0-60.4 sec   434 MBytes  60.2 Mbits/sec
    [SUM]  0.0-60.4 sec  2.12 GBytes   301 Mbits/sec
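
When comparing several runs like the ones above, it can help to pull the per-stream and [SUM] rates out of the iperf text automatically. This is a rough sketch that assumes the iperf 2 report layout shown in these logs with rates in Mbits/sec; the names `ROW` and `parse_iperf` are hypothetical, and the sample is the first upload log above.

```python
import re

# Matches result rows such as "[  6]  0.0-60.0 sec   102 MBytes  14.3 Mbits/sec"
# and "[SUM]  0.0-60.1 sec   508 MBytes  71.0 Mbits/sec".
ROW = re.compile(r"^\[\s*(SUM|\d+)\]\s+\S+\s+sec\s+.*?([\d.]+)\s+Mbits/sec\s*$")

def parse_iperf(text):
    """Return (per-stream rates, summed rate) in Mbits/sec."""
    streams, total = [], None
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # banner, header, or "connected with" line
        rate = float(match.group(2))
        if match.group(1) == "SUM":
            total = rate
        else:
            streams.append(rate)
    return streams, total

SAMPLE = """
[  6] local 192.168.1.1 port 57536 connected with 192.168.1.20 port 5001
[ ID] Interval       Transfer     Bandwidth
[  6]  0.0-60.0 sec   102 MBytes  14.3 Mbits/sec
[  7]  0.0-60.0 sec   101 MBytes  14.1 Mbits/sec
[  3]  0.0-60.0 sec   102 MBytes  14.2 Mbits/sec
[  4]  0.0-60.1 sec   102 MBytes  14.3 Mbits/sec
[  5]  0.0-60.1 sec   102 MBytes  14.2 Mbits/sec
[SUM]  0.0-60.1 sec   508 MBytes  71.0 Mbits/sec
"""

streams, total = parse_iperf(SAMPLE)
print(streams, total)
```

The per-stream rates should add up to roughly the [SUM] value; a large mismatch would suggest the log was truncated or mixed with another run.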
    

  • Thanks for running the testing on a TI EVM.

    I am a little confused: your text mentions that you are seeing the low performance on the SK, but the attached test logs show the SK with good performance and the EVM with poor performance.

    I will stick with the SK for the moment. I agree the numbers that you are seeing on the starterkit are low. I ran iperf per your attached text file, and on my starterkit I see approximately the numbers that you are seeing on the EVM. I used an older SD card with the 3.0.0.4 SDK, since that was what I had set up at the moment. My setup is my development machine connected through a Gb switch to the starterkit board. So for the moment I cannot explain the low performance.

    Your earlier post of the ethtool statistics showed no problems on your board. Could you attach them for the SK board as well?

    Since the EVM is showing good numbers, is the starterkit hooked up to the same link partner and cable?

    I don't think the CPU freq governor settings will matter unless the governor is not on-demand or performance. Runs with both showed roughly the same numbers.

    Is the link partner a Linux PC by any chance? If so can you run wireshark on it and see if the iperf tcp traffic is showing errors?

  • Hi Schuyler,


    Thanks for looking into this some more.

    My apologies, I had the text filenames mixed up. The starterkit is the slow one.

    I've attached the ethtool stats before and after running iperf for both SK and EVM.
    When you said you're getting the numbers I'm seeing on the EVM, do you mean your SK is hitting ~200Mbit/sec upload rather than ~75Mbit/sec upload?
    Another strange thing I've noticed is that when SK uboot boots with "optargs debug" set (i.e. enable debug messages on boot), the SK actually reaches the EVM speed!
    Yes, you read that right, enabling debug made a huge difference...

    I have Wireshark captures for both SK and EVM (~25MB total compressed), but I'm not sure of the best way to send them to you (going through them, they seem OK; no errors stand out).

    My setup for capturing this set of logs is identical for both EVM and SK:
    PC (Windows+Wireshark) <--> eth0 of EVM/SK, using same SD card, same ethernet cable, no other HW (e.g. switches) in between. The only diff is EVM vs SK HW.

    My setup for testing in general is also identical for both EVM and SK, except there's no Wireshark capture and the tests are run from Linux:
    Laptop (Ubuntu 16.04.1 LTS) <--> eth0 of EVM/SK, using same SD card, same ethernet cable, no other HW (e.g. switches) in between. The only diff is EVM vs SK HW.

    I forget if I included the versions, but I'm using the prebuilt image:
    uname -a = "Linux am335x-evm 4.4.32-gadde2ca9f8 #1 PREEMPT Wed Dec 14 18:52:13 EST 2016 armv7l GNU/Linux"
    from ti-processor-sdk-linux-am335x-evm-03.02.00.05-Linux-x86-Install.bin

    Please let me know what other logs you may need, thanks!

    7416.iperf_evm.txt
    root@am335x-evm:~# uname -a
    Linux am335x-evm 4.4.32-gadde2ca9f8 #1 PREEMPT Wed Dec 14 18:52:13 EST 2016 armv7l GNU/Linux
    root@am335x-evm:~# ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 276
         Broadcast Rx Frames: 88
         Multicast Rx Frames: 184
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 56666
         Good Tx Frames: 76
         Broadcast Tx Frames: 2
         Multicast Tx Frames: 71
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 13402
         Rx + Tx 64 Octet Frames: 28
         Rx + Tx 65-127 Octet Frames: 217
         Rx + Tx 128-255 Octet Frames: 48
         Rx + Tx 256-511 Octet Frames: 26
         Rx + Tx 512-1023 Octet Frames: 17
         Rx + Tx 1024-Up Octet Frames: 16
         Net Octets: 70068
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 316
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 0
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 132
         Rx DMA chan: good_dequeue: 189
         Rx DMA chan: requeue: 0
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 76
         Tx DMA chan: tail_enqueue: 0
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 0
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 7
         Tx DMA chan: empty_dequeue: 76
         Tx DMA chan: busy_dequeue: 0
         Tx DMA chan: good_dequeue: 76
         Tx DMA chan: requeue: 0
         Tx DMA chan: teardown_dequeue: 0
    root@am335x-evm:~# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
    720000
    root@am335x-evm:~# iperf -c 192.168.1.20 -P 5 -t 60
    ------------------------------------------------------------
    Client connecting to 192.168.1.20, TCP port 5001
    TCP window size: 43.8 KByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.1.1 port 35562 connected with 192.168.1.20 port 5001
    [  5] local 192.168.1.1 port 35566 connected with 192.168.1.20 port 5001
    [  6] local 192.168.1.1 port 35568 connected with 192.168.1.20 port 5001
    [  4] local 192.168.1.1 port 35564 connected with 192.168.1.20 port 5001
    [  7] local 192.168.1.1 port 35570 connected with 192.168.1.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-60.0 sec   282 MBytes  39.3 Mbits/sec
    [  4]  0.0-60.0 sec   279 MBytes  39.0 Mbits/sec
    [  5]  0.0-60.0 sec   282 MBytes  39.4 Mbits/sec
    [  6]  0.0-60.0 sec   280 MBytes  39.1 Mbits/sec
    [  7]  0.0-60.0 sec   281 MBytes  39.3 Mbits/sec
    [SUM]  0.0-60.0 sec  1.37 GBytes   196 Mbits/sec
    root@am335x-evm:~# ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 456630
         Broadcast Rx Frames: 96
         Multicast Rx Frames: 240
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 32082719
         Good Tx Frames: 1022789
         Broadcast Tx Frames: 3
         Multicast Tx Frames: 73
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 1544757762
         Rx + Tx 64 Octet Frames: 55
         Rx + Tx 65-127 Octet Frames: 456575
         Rx + Tx 128-255 Octet Frames: 73
         Rx + Tx 256-511 Octet Frames: 28
         Rx + Tx 512-1023 Octet Frames: 10985
         Rx + Tx 1024-Up Octet Frames: 1011703
         Net Octets: 1576840481
         Rx Start of Frame Overruns: 248
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 248
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 456368
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 144
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 306077
         Rx DMA chan: good_dequeue: 456241
         Rx DMA chan: requeue: 1
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 993837
         Tx DMA chan: tail_enqueue: 28952
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 3711
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 8
         Tx DMA chan: empty_dequeue: 993837
         Tx DMA chan: busy_dequeue: 7000
         Tx DMA chan: good_dequeue: 1022789
         Tx DMA chan: requeue: 3218
         Tx DMA chan: teardown_dequeue: 0
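
Comparing the before and after `ethtool -S` dumps by eye is tedious, so a small script can list only the counters that changed during the run. This is a sketch that assumes the "name: value" layout shown above; `parse_ethtool_stats` and `nonzero_deltas` are hypothetical helper names, and the sample uses a handful of counters copied from the EVM capture above.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter name: integer value} dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so names like "Rx DMA chan: misqueued" survive.
        name, sep, value = line.rpartition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats

def nonzero_deltas(before, after):
    """Return {counter: after - before} for counters that changed."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }

BEFORE = """
NIC statistics:
     Good Tx Frames: 76
     Rx CRC Errors: 0
     Rx Start of Frame Overruns: 0
     Rx DMA Overruns: 0
     Rx DMA chan: misqueued: 0
"""

AFTER = """
NIC statistics:
     Good Tx Frames: 1022789
     Rx CRC Errors: 0
     Rx Start of Frame Overruns: 248
     Rx DMA Overruns: 248
     Rx DMA chan: misqueued: 144
"""

deltas = nonzero_deltas(parse_ethtool_stats(BEFORE), parse_ethtool_stats(AFTER))
for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```

Counters that stay at zero, such as the CRC errors here, are dropped from the output, which makes it easier to see which error counters actually moved during a test.
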
    0572.iperf_sk_quiet.txt
    root@am335x-evm:~# ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 302
         Broadcast Rx Frames: 92
         Multicast Rx Frames: 206
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 58737
         Good Tx Frames: 80
         Broadcast Tx Frames: 2
         Multicast Tx Frames: 77
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 13618
         Rx + Tx 64 Octet Frames: 35
         Rx + Tx 65-127 Octet Frames: 234
         Rx + Tx 128-255 Octet Frames: 58
         Rx + Tx 256-511 Octet Frames: 23
         Rx + Tx 512-1023 Octet Frames: 16
         Rx + Tx 1024-Up Octet Frames: 16
         Net Octets: 72355
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 321
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 0
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 142
         Rx DMA chan: good_dequeue: 194
         Rx DMA chan: requeue: 0
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 80
         Tx DMA chan: tail_enqueue: 0
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 0
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 8
         Tx DMA chan: empty_dequeue: 80
         Tx DMA chan: busy_dequeue: 0
         Tx DMA chan: good_dequeue: 80
         Tx DMA chan: requeue: 0
         Tx DMA chan: teardown_dequeue: 0
    root@am335x-evm:~# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
    720000
    root@am335x-evm:~# iperf -c 192.168.1.20 -P 5 -t 60
    ------------------------------------------------------------
    Client connecting to 192.168.1.20, TCP port 5001
    TCP window size: 43.8 KByte (default)
    ------------------------------------------------------------
    [  5] local 192.168.1.1 port 51104 connected with 192.168.1.20 port 5001
    [  3] local 192.168.1.1 port 51098 connected with 192.168.1.20 port 5001
    [  7] local 192.168.1.1 port 51106 connected with 192.168.1.20 port 5001
    [  4] local 192.168.1.1 port 51100 connected with 192.168.1.20 port 5001
    [  6] local 192.168.1.1 port 51102 connected with 192.168.1.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-60.0 sec   111 MBytes  15.5 Mbits/sec
    [  6]  0.0-60.0 sec   110 MBytes  15.4 Mbits/sec
    [  5]  0.0-60.1 sec   103 MBytes  14.4 Mbits/sec
    [  7]  0.0-60.1 sec   103 MBytes  14.4 Mbits/sec
    [  4]  0.0-60.1 sec   103 MBytes  14.4 Mbits/sec
    [SUM]  0.0-60.1 sec   529 MBytes  73.9 Mbits/sec
    root@am335x-evm:~# ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 227848
         Broadcast Rx Frames: 111
         Multicast Rx Frames: 249
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 15988664
         Good Tx Frames: 385310
         Broadcast Tx Frames: 3
         Multicast Tx Frames: 78
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 581807704
         Rx + Tx 64 Octet Frames: 60
         Rx + Tx 65-127 Octet Frames: 227775
         Rx + Tx 128-255 Octet Frames: 65
         Rx + Tx 256-511 Octet Frames: 24
         Rx + Tx 512-1023 Octet Frames: 4247
         Rx + Tx 1024-Up Octet Frames: 380987
         Net Octets: 597796368
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 227848
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 0
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 157299
         Rx DMA chan: good_dequeue: 227721
         Rx DMA chan: requeue: 0
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 360420
         Tx DMA chan: tail_enqueue: 24890
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 12405
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 9
         Tx DMA chan: empty_dequeue: 360420
         Tx DMA chan: busy_dequeue: 3497
         Tx DMA chan: good_dequeue: 385310
         Tx DMA chan: requeue: 3156
         Tx DMA chan: teardown_dequeue: 0
    
    
    7824.iperf_sk_debug.txt
    [   63.583547] systemd-journald[87]: Vacuuming...
    [   63.589549] systemd-journald[87]: Vacuuming done, freed 0B of archived journals on disk.
    [   65.101337] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    
     ____ _                  _ ____          _          _
    |  _  |___ ___ ___ __   |  _  |___ ___  |_|___ ___| |_
    |     |  _| .'| . | . |  |   __|  _| . | | | -_|  _|  _|
    |__|__|_| |__,|_  |___|  |__|  |_| |___|_| |___|___|_|
                  |___|                    |___|
    
    Arago Project http://arago-project.org am335x-evm ttyS0
    
    Arago 2016.10 am335x-evm ttyS0
    
    am335x-evm login: root
    [   73.889394] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [   75.597126] systemd[1]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.0 object=n/a interface=n/a memb
    er=n/a cookie=29 reply_cookie=731 error=n/a
    root@am335x-evm:~# [   75.863515] systemd-journald[87]: Data hash table of /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/system.
    journal has a fill level at 75.1 (2087 of 2780 items, 1601536 file size, 767 bytes per hash table item), suggesting rotation.
    [   75.884837] systemd-journald[87]: /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/system.journal: Journal header limits reached
     or header out-of-date, rotating.
    [   75.899835] systemd-journald[87]: Rotating...
    [   75.906296] systemd-journald[87]: Reserving 2780 entries in hash table.
    [   75.913972] systemd-journald[87]: Vacuuming...
    [   75.921000] systemd-journald[87]: Deleted archived journal /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/system@bf4897c2a6744
    67cb57c0e0e76db6bfe-0000000000000001-000543a6826c5385.journal (1.5M).
    [   75.939160] systemd-journald[87]: Vacuuming done, freed 1.5M of archived journals on disk.
    [   80.311818] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    
    root@am335x-evm:~# ifconfig eth0 192.168.1.1
    root@am335x-evm:~# [   95.595942] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [   97.474481] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 292
         Broadcast Rx Frames: 86
         Multicast Rx Frames: 202
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 58029
         Good Tx Frames: 76
         Broadcast Tx Frames: 2
         Multicast Tx Frames: 73
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 13149
         Rx + Tx 64 Octet Frames: 31
         Rx + Tx 65-127 Octet Frames: 232
         Rx + Tx 128-255 Octet Frames: 47
         Rx + Tx 256-511 Octet Frames: 25
         Rx + Tx 512-1023 Octet Frames: 17
         Rx + Tx 1024-Up Octet Frames: 16
         Net Octets: 71178
         Rx Start of Frame Overruns: 0
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 0
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 304
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 0
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 123
         Rx DMA chan: good_dequeue: 177
         Rx DMA chan: requeue: 0
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 76
         Tx DMA chan: tail_enqueue: 0
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 0
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 9
         Tx DMA chan: empty_dequeue: 76
         Tx DMA chan: busy_dequeue: 0
         Tx DMA chan: good_dequeue: 76
         Tx DMA chan: requeue: 0
         Tx DMA chan: teardown_dequeue: 0
    root@am335x-evm:~# [  105.693446] systemd-journald[87]: Sent WATCHDOG=1 notification.
    [  105.971874] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    echo performance > /sys/[  115.967503] systemd-journald[87]: Data hash table of /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/sy
    stem.journal has a fill level at 75.0 (2086 of 2780 items, 1601536 file size, 767 bytes per hash table item), suggesting rotation.
    [  116.028398] systemd-journald[87]: /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/system.journal: Journal header limits reached
     or header out-of-date, rotating.
    [  116.067159] systemd-journald[87]: Rotating...
    [  116.078338] systemd-journald[87]: Reserving 2780 entries in hash table.
    [  116.086326] systemd-journald[87]: Vacuuming...
    [  116.119013] systemd-journald[87]: Deleted archived journal /run/log/journal/ab441e048c874c4a9af4258f2ef969f5/system@bf4897c2a6744
    67cb57c0e0e76db6bfe-00000000000003f8-000543a68294b3eb.journal (1.5M).
    [  116.164190] systemd-journald[87]: Vacuuming done, freed 1.5M of archived journals on disk.
    [  116.255509] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    devices/system/cpu/cpu0/cpufreq/scaling_governor
    root@am335x-evm:~# [  126.432790] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
    720000
    root@am335x-evm:~# [  136.627290] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    iperf -c 192.168.1.20 -P 5 -t 60
    ------------------------------------------------------------
    Client connecting to 192.168.1.20, TCP port 5001
    TCP window size: 43.8 KByte (default)
    ------------------------------------------------------------
    [  6] local 192.168.1.1 port 34364 connected with 192.168.1.20 port 5001
    [  3] local 192.168.1.1 port 34358 connected with 192.168.1.20 port 5001
    [  4] local 192.168.1.1 port 34360 connected with 192.168.1.20 port 5001
    [  5] local 192.168.1.1 port 34362 connected with 192.168.1.20 port 5001
    [  7] local 192.168.1.1 port 34366 connected with 192.168.1.20 port 5001
    [  146.841025] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [  157.089661] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [  167.332451] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [  177.602864] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [  187.858900] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [  192.822833] systemd-journald[87]: Sent WATCHDOG=1 notification.
    [  198.075645] systemd-journald[87]: Successfully sent stream file descriptor to service manager.
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-60.0 sec   282 MBytes  39.4 Mbits/sec
    [  4]  0.0-60.0 sec   284 MBytes  39.7 Mbits/sec
    [  7]  0.0-60.0 sec   286 MBytes  40.0 Mbits/sec
    [  6]  0.0-60.0 sec   282 MBytes  39.4 Mbits/sec
    [  5]  0.0-60.0 sec   285 MBytes  39.8 Mbits/sec
    [SUM]  0.0-60.0 sec  1.39 GBytes   198 Mbits/sec
    root@am335x-evm:~# iperf -c 192.168.1.20 -P 5 -t 60[  208.224697] systemd-journald[87]: Successfully sent stream file descriptor to
    service manager.
    root@am335x-evm:~# ethtool -S eth0
    NIC statistics:
         Good Rx Frames: 483838
         Broadcast Rx Frames: 92
         Multicast Rx Frames: 225
         Pause Rx Frames: 0
         Rx CRC Errors: 0
         Rx Align/Code Errors: 0
         Oversize Rx Frames: 0
         Rx Jabbers: 0
         Undersize (Short) Rx Frames: 0
         Rx Fragments: 0
         Rx Octets: 33984401
         Good Tx Frames: 1034625
         Broadcast Tx Frames: 3
         Multicast Tx Frames: 75
         Pause Tx Frames: 0
         Deferred Tx Frames: 0
         Collisions: 0
         Single Collision Tx Frames: 0
         Multiple Collision Tx Frames: 0
         Excessive Collisions: 0
         Late Collisions: 0
         Tx Underrun: 0
         Carrier Sense Errors: 0
         Tx Octets: 1562684589
         Rx + Tx 64 Octet Frames: 61
         Rx + Tx 65-127 Octet Frames: 483796
         Rx + Tx 128-255 Octet Frames: 96
         Rx + Tx 256-511 Octet Frames: 28
         Rx + Tx 512-1023 Octet Frames: 10994
         Rx + Tx 1024-Up Octet Frames: 1023488
         Net Octets: 1596668990
         Rx Start of Frame Overruns: 498
         Rx Middle of Frame Overruns: 0
         Rx DMA Overruns: 498
         Rx DMA chan: head_enqueue: 1
         Rx DMA chan: tail_enqueue: 483329
         Rx DMA chan: pad_enqueue: 0
         Rx DMA chan: misqueued: 180
         Rx DMA chan: desc_alloc_fail: 0
         Rx DMA chan: pad_alloc_fail: 0
         Rx DMA chan: runt_receive_buf: 0
         Rx DMA chan: runt_transmit_buf: 0
         Rx DMA chan: empty_dequeue: 0
         Rx DMA chan: busy_dequeue: 272305
         Rx DMA chan: good_dequeue: 483202
         Rx DMA chan: requeue: 1
         Rx DMA chan: teardown_dequeue: 0
         Tx DMA chan: head_enqueue: 984045
         Tx DMA chan: tail_enqueue: 50580
         Tx DMA chan: pad_enqueue: 0
         Tx DMA chan: misqueued: 5406
         Tx DMA chan: desc_alloc_fail: 0
         Tx DMA chan: pad_alloc_fail: 0
         Tx DMA chan: runt_receive_buf: 0
         Tx DMA chan: runt_transmit_buf: 10
         Tx DMA chan: empty_dequeue: 984045
         Tx DMA chan: busy_dequeue: 12742
         Tx DMA chan: good_dequeue: 1034625
         Tx DMA chan: requeue: 5748
         Tx DMA chan: teardown_dequeue: 0                                                                                               
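
For anyone comparing the before and after `ethtool -S` dumps above by hand, a minimal Python sketch along these lines can list only the counters that changed across an iperf run. The function names `parse_ethtool_stats` and `diff_stats` are hypothetical helpers, not part of ethtool or the SDK, and the parser assumes each counter line ends in `: <integer>` as in the dumps above.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Counter names can contain colons ("Rx DMA chan: busy_dequeue"),
        # so split on the last colon only.
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # Lines without a trailing integer (the "NIC statistics:" header,
        # interleaved console messages) are skipped.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def diff_stats(before, after):
    """Return {counter: delta} for every counter whose value changed."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }
```

Feeding it the two dumps from the debug-boot run would, for example, surface `Rx DMA Overruns` going from 0 to 498 without having to scan every line.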

  • Wireshark capture for SK:

    iperf_sk_quiet.7z

  • Thanks for the statistics and thanks for the wireshark capture.

    From what I can see in the Wireshark capture, the lower bitrate might be related to the link partner sending TCP window adjustment messages that delay the traffic sent by the client. When I run the test on my setup, which is a Linux PC and the am335x-evmsk, I don't see these TCP window adjustment messages when filtering the Wireshark capture.

    The attached document contains some screen captures from the Wireshark analysis tools. There are a couple of flat areas that possibly indicate line latency. Zooming in on one of these portions and looking at the packet-to-packet delay shows that, after the server machine sent a TCP adjustment message, the client appears to hold off transmitting for a brief period.

    Running a Wireshark capture while connected to the EVM might be a good next step, and also one with Wireshark running on the Linux PC if possible.

    wireshark_analysis_20170314.pdf

  • Hi Schuyler,

    Thanks for the analysis, and I honestly never used that function in wireshark before, so it was interesting to play with it!

    Unfortunately, the EVM also had flat lines, and as you mentioned, the packet-to-packet delays seem to be on the order of microseconds for the EVM, whereas they were on the order of 10/100us for the SK. I apologize that I couldn't attach the EVM capture last time; the upload failed, so I thought the file was too big. I managed to reattach it here. I also recaptured the SK using Wireshark on Linux for your reference, and it had the flat lines as well.

    I'm wondering if it's possible for you to use the same SDK as me to verify? This is because we know that some other SDK versions are OK, so the quickest test is to use the one that we were using.

    Thanks!

    iperf_sk_linux.7z
    iperf_evm.7z

  • Hi Schuyler,

    I'm wondering if you were able to test with the new SDK?

    Thanks!
  • I am able to reproduce your results on the latest SDK. From a quick look at the Wireshark capture you posted and one I captured today, there appear to be gaps in the transmission lasting roughly 3 ms; these usually occur after an ACK message from the iperf server.

    I tried the previous SDK on the same board to confirm that this is not an issue on that version, and it is not. The benchmark results were run on the GPEVM, which I don't have at the moment, but I will try to get one to re-run the test.

    Setting the governor to performance does not help, and neither setting interrupt pacing nor running iperf in a single-thread configuration has any effect.
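
To locate gaps like the roughly 3 ms stalls described above without zooming around the Wireshark graphs, one option is to export the packet timestamps from the capture and scan them with a short script. This is only a sketch: `find_tx_gaps` is a hypothetical helper, the input is assumed to be a list of timestamps in seconds for the client's transmitted packets, and the 1 ms default threshold is an arbitrary choice, not a value from this thread.

```python
def find_tx_gaps(timestamps, threshold_s=0.001):
    """Return (start_time, gap_seconds) for each inter-packet gap above threshold_s.

    `timestamps` is a list of packet times in seconds, in capture order.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        if delta > threshold_s:
            # Record when the silence started and how long it lasted.
            gaps.append((prev, delta))
    return gaps
```

Comparing the number and length of the reported gaps between a good and a bad kernel on the same board would show whether the stalls alone account for the throughput difference.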

  • Hi Schuyler,

    Awesome, thank you for checking.

    I noticed an odd thing: when the SK boots with the kernel bootarg "debug" (set in U-Boot), it actually performs as well as the EVM.

    I'm hoping you can see it as well and it may point to a direction (something is configured/not configured when debug mode is enabled.)

  • I booted with debug and ran the test as you suggested, and I see a significant improvement, but not the same performance as the earlier version. Thanks for noticing that data point; it may narrow down a cause.
  • Just a quick update: I tested on the latest SDK release, which came out this week, and unfortunately I see the same performance drop. Also, the debug option that you pointed out does show some improvement, but not the expected value.

  • Hi Schuyler,

    Thank you for the update! You're right, the debug option doesn't fully restore the speed. If there's anything we can do to help you solve this speed discrepancy between the SK and EVM, please let us know!

  • Hi Schuyler,

    Sorry, we're also curious what's your plan for solving this?
    Thanks!
  • At this point I can only say that we have to find the cause of this performance drop before we can provide an accurate plan to fix it.

    As you have noticed, the same kernel behaves differently than we expect on the GPEVM and the SK. I also tried a BeagleBone Black, which is essentially another EVM, and it sees its normal bandwidth of over 90Mbps, which is 20Mbps higher than the SK even though it only has a 100Mbps PHY on it. Since the BeagleBone Black and GPEVM do not use dual MAC mode, I tried putting the SK into switch mode; this showed no improvement. Top is showing iperf running about 10-15% lower than expected. So what we will look at next is why iperf is running slower than expected.
  • Hi Schuyler,

    Thanks for your reply.

    May I ask how is the debugging going? This is now becoming more urgent for us.

    Thanks again!

  • Hi Schuyler,

    May I ask what's the status of this? This is now urgent for us.

    Thanks again!

  • Keith, sorry for the delay....we are moving this week and this may have impacted workflow...
  • No problem.

    We just found out that disabling the kernel options CONFIG_EXPERT and CONFIG_DEBUG_KERNEL significantly increases performance. We have tested this with kernel 4.4.32 from SDK 03.02.00.05 and with kernel 4.9.25 from ti-linux-kernel-next:

    4.4.32:

    Direction am335x to host: 225Mbit/s
    Direction host to am335x: 347Mbit/s

    4.9.25:

    Direction am335x to host: 316Mbit/s
    Direction host to am335x: 333Mbit/s

    All measurements are from our custom board with the CPU running at 1GHz.
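
To check whether a given kernel build still has these two options enabled, the `.config` text can be scanned with a few lines of Python. This is a rough sketch rather than an official tool: `enabled_options` is a hypothetical helper, and it assumes the standard `.config` layout where enabled options appear as `CONFIG_FOO=y` (or `=m`) and disabled ones as `# CONFIG_FOO is not set`.

```python
def enabled_options(config_text, options):
    """Return the subset of `options` that are set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Comment lines cover "# CONFIG_FOO is not set"; lines without "="
        # are blank or otherwise irrelevant.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name in options and value in ("y", "m"):
            enabled.add(name)
    return enabled
```

Calling it with `{"CONFIG_EXPERT", "CONFIG_DEBUG_KERNEL"}` on the contents of the build's `.config` should return an empty set if both options are really off.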