Why does Exosite not seem to have IEEE 802.3 pause frame flow control enabled in the server NIC driver?

Other Parts Discussed in Thread: EK-TM4C1294XL

The Exosite server seems unaware that it can at times overrun a client's limited 4096-byte ring buffer and its 8 RX/TX DMA descriptors.

1. The vanilla IOT client is only an example, yet it seems the server ignores TX pause frames being sent by a very smart IOT client.

Asserting IEEE 802.3 pause frames in high-traffic situations ensures the client can tell the host to hold off TXD until further notice.

2. The limited SRAM buffer (4096 bytes) in the EK-TM4C1294XL LaunchPad, and in any market-ready production client, requires IEEE 802.3 flow control to be enabled at the IOT server.

Windows clients usually don't have Ethernet frame buffering issues because the Microsoft network client has humongous DRAM and disk cache at its disposal. Effectively, today's MS client can easily RXD at 1 gigabit over copper.

More info on how, what, and why:

http://e2e.ti.com/support/microcontrollers/tiva_arm/f/908/t/442517

  • Pause control is typically not enabled by switches; it's not granular enough. I doubt any pause frames you send are even making it to your ISP. TCP/IP has its own flow control mechanisms.

    blog.ine.com/.../802-3x-flow-control

    Robert
  • BP101 said:
    clients limited 4096KB Ring Buffer

    Did you intend that "KB" suffix?   Could such be considered, "limited?"

  • Umm, IEEE 802.3 pause frames are the TCP flow control method. TCP uses ACK/NAK to confirm whether a frame was received; it is not used for packet flow control.
  • Check the link. Cisco disables responding to it by default and does not pass it along. Cisco will not send pause frames up the chain.

    Robert

  • Note that the PAUSE frame uses MAC LLC encapsulation (destined to the reserved multicast MAC address 01:80:C2:00:00:01).

    1. Mixing PAUSE functionality with per-class services was impossible (local network, i.e. VLANs).

    *The CCIE's point in Robert's link was mostly from a local network switch perspective, yet it did mention that flow control has to be enabled at the NIC driver. "The attached device could be a host station or another switch – it does not matter."

    It seems 802.3 P-T-P flow control becomes more of an issue at the network device layer in switches if enabled, as opposed to the multicast MAC LLC encapsulation sent client to host and host to client.

    It would appear that in full duplex, 802.3 MAC LLC encapsulation ensures the host receives the sender's multicast-encapsulated MAC regardless of the switches/routers. The device driver in the server decodes the MAC (01:80:C2:00:00:01) and retrieves the pause frames.
  • It would seem CTCP is enabled by default for peer networks of Windows 7/Vista clients, and it is seemingly not the best choice for servers supporting TM4C clients with a fixed advertised TCP window size of 4096 bytes.

    CTCP is enabled by default in computers running Windows Server 2008 and disabled by default in computers running Windows Vista. You can enable CTCP with the "netsh interface tcp set global congestionprovider=ctcp" command. You can disable CTCP with the "netsh interface tcp set global congestionprovider=none" command.

    technet.microsoft.com/.../bb878127.aspx
  • At first it reads like that could be the case from a CCIE perspective, which enables frame flow control at the switch level on each router hop. BTW, switches don't have hops, only routers do, so that info is a bit lacking in scope.

    As for Cisco switches specifically blocking the multicast LLC-encapsulated MAC at layer 2, that is not put into clear, concise words. A layer 3 router blocks network broadcasts and perhaps passes multicast MAC LLC frames on TCP for link layer control in the remote MLID. The question is whether that even matters, given that the 802.3 flow control time period is inserted into the transmit control frame according to the TM4C data sheet. How the transmit control frame in the EMAC becomes encapsulated into the LLC is not discussed. However, as shown in the linked forum post, EMAC register 7 enables and asserts the flow control bit times into the transmit control frames on demand.

  • A short overview of 802.3 and its interaction with TCP flow control
    virtualthreads.blogspot.com/.../beware-ethernet-flow-control.html

    Robert
  • A somewhat better article, but it still leaves out many details regarding LLC MAC encapsulation. The author contradicts the pause frame analysis in the example of two computers causing an escalation in TCP speed, but fails to explain why that would occur. From the TI perspective, an IEEE 802.3 pause frame seems to indicate that the DMA descriptor is owned by the DMA engine, not the core. Somehow they all missed the boat on offloading the TX pause frames on a per-client MAC LLC basis in the server's NIC device driver. This seems more a software issue at the server/client, not so much hardware. It appears Cisco tried to take over the RFC and fix it with hardware rather than send encapsulated MAC frames upstream, and NIC vendors dropped the ball on offloading the encapsulated MAC pause frame on a per-client basis. The 802.1Qbb RFC extends the 802.3 pause frame semantics to multiple CoSs; the dirty laundry is still in the chute.

    That contradiction suggests another TCP issue at large, such as the CTCP dynamic window growing exponentially large.

    That is reason enough to disable CTCP on Win2K8 servers with TM4C LaunchPad clients, especially since these are non-Windows clients attached to the Exosite server. The TCP buffer in the IOT client is only 4096 bytes and can easily be overrun by the server whenever there is heavy local traffic on the client-side network. That most often leads to crashing the client. Increasing the TM4C client buffer or heap more often leads to LWIP crashing the PBUF heap.

    802.1q Flow Control white_paper_c11-542809.pdf

  • The CCIE does sort of make that point, yet the Cisco 5000 switch software is preconfigured to receive pause frames on layer 2 in the local network.

    Almost all of these article sources, even the Cisco white paper, hijack the RFC specifically designed for pause frames on layer 3, specifically the part about hops between router paths and the PTP client. Example: TM4C IOT LaunchPad clients connected to a remote server located many hops upstream.

    802.3 Pause Frame:
    In a network path that normally consists of multiple hops between source and destination, lack of feedback between transmitters and receivers at each hop is one of the main causes of unreliability. Transmitters can send packets faster than receivers accept packets, and as the receivers run out of available buffer space to absorb incoming flows, they are forced to silently drop all traffic that exceeds their capacity. These semantics work fine at Layer 2, so long as upper-layer protocols handle drop-detection and retransmission logic.

    IEEE 802.3x PAUSE is defined in Annex 31B of the IEEE 802.3 specification. Simply put, a receiver can generate a MAC control frame and send a PAUSE request to a sender when it predicts the potential for buffer overflow. Upon receiving a PAUSE frame, the sender responds by stopping transmission of any new packets until the receiver is ready to accept them again.
    IEEE 802.3x PAUSE works as designed, but it suffers a basic disadvantage that limits its field of applicability: after a link is paused, a sender (TM4C client) cannot generate any more packets. As obvious as that seems, the consequence is that the application of IEEE 802.3x PAUSE makes an Ethernet segment unsuitable for carrying multiple traffic flows that might require different quality of service (QoS).
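For readers who want to see what the PAUSE request described above looks like on the wire, here is a minimal Python sketch. It assumes the standard Annex 31B layout: a MAC Control frame with EtherType 0x8808, opcode 0x0001, and a 16-bit pause time counted in quanta of 512 bit times. The function names and the example source MAC are hypothetical and not taken from the TM4C driver, and the 4-byte FCS is left for the MAC hardware to append.

```python
import struct

PAUSE_DST = bytes.fromhex("0180C2000001")  # reserved multicast address from the thread
MAC_CONTROL_ETHERTYPE = 0x8808             # MAC Control frames
PAUSE_OPCODE = 0x0001                      # PAUSE operation
QUANTUM_BITS = 512                         # one pause quantum = 512 bit times


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Return a 60-byte PAUSE frame (FCS not included)."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    header = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, quanta)
    # Pad with zeros to the 60-byte minimum frame size (64 bytes with FCS).
    return (header + payload).ljust(60, b"\x00")


def pause_seconds(quanta: int, link_bps: int) -> float:
    """How long the link partner is asked to stay quiet."""
    return quanta * QUANTUM_BITS / link_bps


# Hypothetical locally administered source MAC, maximum pause time.
frame = build_pause_frame(bytes.fromhex("020000000001"), 0xFFFF)
```

At 100 Mb/s the maximum request of 0xFFFF quanta works out to roughly a third of a second, and a pause time of 0 cancels an earlier request. Because the request is addressed only to the link partner, whether it has any effect beyond the first switch port depends on how that port is configured, which is the point being argued in this thread.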

    Time for a switch upgrade, TI? The Cisco Nexus 5000 with 802.1Qbb handles 8 CoS in that layer 2 VLAN configuration. Who really knows where the bottleneck is?

    Note: The Exosite software client server should be application dedicated, with nothing else except antivirus, backup, etc. running in the background.
  • BP101 said:
    Note: The Exosite software client server should be application dedicated, with nothing else except antivirus, backup, etc. running in the background.

    It won't be dedicated to your single board.

    Robert

  • Who knows, perhaps one day in the future, with deals like:
    Raspberry Pi 2 Model B, 900 MHz ARM Cortex-A7 quad core.
    1GB memory all for $35.00.
    4 USB ports, video core IV GPU, HDMI or composite video.
    Looks like it has an Ethernet port but not stated.
    The micro SD card has 6 different operating system choices.
    Also not stated what is the embedded language.

    http://www.mcmelectronics.com.
  • Standard TCP performance is limited
    ❙max window size (2^16-1 = 65,535 bytes)
    ❙max sequence numbers (2^32-1 ≅ 4 GB = 32 Gb)
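The two limits above can be checked with a few lines of Python. The sketch below also applies the usual rule of thumb that a sender can have at most one window in flight per round trip, so throughput is bounded by window / RTT. The 50 ms round-trip time is purely an illustrative assumption and not a measurement from this thread.

```python
MAX_WINDOW_BYTES = 2**16 - 1   # 65,535: largest window without window scaling
MAX_SEQUENCE = 2**32 - 1       # sequence space of about 4 GB = 32 Gb


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds


RTT = 0.05  # 50 ms round trip (assumed for illustration)
full_window_bps = max_throughput_bps(MAX_WINDOW_BYTES, RTT)
client_window_bps = max_throughput_bps(4096, RTT)
```

Under that assumed RTT, a full 16-bit window allows roughly 10.5 Mb/s, while the 4096-byte window discussed in this thread caps a single connection at about 0.66 Mb/s.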

    Oddly, a 65 KB TCP_WND is the default in Win7/Vista. I still don't believe it is necessary to have CTCP enabled server side, for reasons also brought out in this article. The server's TX frame size can grow to giant size as the server dynamically widens the window, more in response to the RX buffer memory of the router hops than to the actual TM4C client far across the wire. Disabling RX of giant frames (EMAC_DMAOPMODE_DGM, 2048) helped reduce client errors.

    The LWIP default window is set to 4096 bytes, and reducing the window advertised to the server (600-byte MSS x 4 = TCP_WND of 2400 bytes) produces fewer dropped frames at the TM4C netif.
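One possible reading of why the smaller window helps can be sketched as arithmetic. The Python below assumes that each TCP segment arrives as its own Ethernet frame carrying 54 bytes of Ethernet, IPv4, and TCP headers with no options, and that the headers land in the same 4096-byte receive buffer as the payload. Both are assumptions for illustration, as is the 1460-byte MSS used for the default case; the function name is hypothetical.

```python
import math

ETH_IP_TCP_HEADERS = 14 + 20 + 20  # Ethernet + IPv4 + TCP, no options (assumed)
RX_BUFFER_BYTES = 4096             # client receive buffer from the thread


def frame_bytes_for_window(window: int, mss: int) -> int:
    """Bytes of frame data needed to deliver one full window of payload."""
    segments = math.ceil(window / mss)
    return window + segments * ETH_IP_TCP_HEADERS


# Reduced window from the post: 4 segments of 600 bytes.
reduced = frame_bytes_for_window(4 * 600, 600)

# Default 4096-byte window with a 1460-byte MSS (assumed).
default = frame_bytes_for_window(4096, 1460)

reduced_fits = reduced <= RX_BUFFER_BYTES
default_fits = default <= RX_BUFFER_BYTES
```

Under these assumptions the reduced window needs 2616 bytes and leaves headroom, while a full default window needs 4258 bytes and slightly exceeds the buffer, which would be consistent with the fewer dropped frames reported above.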

    More importantly, a recent discovery: whenever the driver (tiva-tm4c1294.c) drops an RX data frame, it immediately frees that RX descriptor's Pbuf. Because that occurs at the hardware level, it does not notify the application that it has freed the Pbuf the app expected to write into the ring buffer next. In other words, we might end up with a looping RX HTTP status (0) under high (local network) congestion, followed by an unrecoverable and revolving frame error in the abstraction layer.

    I have witnessed this over and over after 16-20 hours online, and for a long time incorrectly deduced that it was caused by erroneous interrupt status codes in the EMACDMARIS register. After fixing the register status report to print only interrupt AIS codes that were validated by bit [8] being set in the Rx descriptor, the entire debug picture has changed again.