Tool/software: Linux
Hi all,
Our company has designed a custom board for an important customer. Its main function is to distribute time to other systems connected to it through two network interfaces.
The board is equipped with a TI AM3352 microprocessor and, for the networking section, two Microchip KSZ9031 PHYs.
The two figures below summarize the microprocessor and Ethernet section and the hardware connection of the PHYs to the CPU.
Figure 1 - Overview of the microprocessor and Ethernet section
Figure 2 - Hardware design of the signals to the KSZ9031 PHYs
On the software side, a Debian-based Linux distribution has been rebuilt and installed on the board (kernel version 3.14.26-g9b32ca2-dirty), and the two PHYs are handled by the CPSW driver, version 1.0.
In addition to the operating system, the board runs an embedded application developed in-house. It starts automatically at boot and implements all Ethernet management.
As requested by the customer, the two network channels are managed independently (whether or not they are in the same subnet), so data traffic is bound at the interface level. Specifically, eth0 is used exclusively to distribute time data, while eth1 is dedicated to service messages. In the embedded software, the ports are bound so that time data packets are transmitted through eth0 and service messages through eth1 (both Ethernet ports work fine).
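To make the interface-level binding concrete, here is a minimal sketch of one common way to do it on Linux, assuming the SO_BINDTODEVICE socket option. It is an illustration only and not the code of our application; the helper name open_bound_udp_socket and any port number passed to it are hypothetical.

```python
import socket

# Fallback value 25 is the usual Linux constant, used only if this Python
# build does not expose socket.SO_BINDTODEVICE.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def open_bound_udp_socket(ifname, port):
    """Return a UDP socket restricted to one interface (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Tie the socket to the named interface. This usually requires
        # root or CAP_NET_RAW, so it can fail with PermissionError.
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                        ifname.encode() + b"\0")
        # Listen on every local address, but only via that interface.
        sock.bind(("", port))
    except OSError:
        sock.close()
        raise
    return sock
```

With such a helper, one socket would be opened on "eth0" for the time data and another on "eth1" for the service messages. A socket bound this way only receives packets that arrive on its own interface.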
The anomaly occurs sporadically but still frequently. It was detected when the board answers an ARP request on both Ethernet ports, eth0 and eth1, and when the board is used as an NTP server (that is, when kernel services manage the Ethernet ports). Additionally, the problem occurs only when the two network interfaces are configured in the same subnet. When the ports are configured in different subnets, the anomaly does not occur.
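Since the anomaly appears when the kernel itself handles ARP, it may help to report the ARP-related kernel settings of each interface. The sketch below reads them from /proc/sys/net/ipv4/conf; the function name read_arp_settings is hypothetical. According to the kernel's ip-sysctl documentation, with arp_ignore at its default value of 0 the kernel replies for any local target IP address configured on any interface. That would be consistent with what we observe, but it is only an assumption and has not been verified on our board.

```python
from pathlib import Path

# Per-interface sysctl entries that influence ARP replies and source checks.
ARP_SETTINGS = ("arp_ignore", "arp_announce", "arp_filter", "rp_filter")


def read_arp_settings(ifnames, base="/proc/sys/net/ipv4/conf"):
    """Return {ifname: {setting: int or None}} for the given interfaces."""
    result = {}
    for ifname in ifnames:
        values = {}
        for name in ARP_SETTINGS:
            path = Path(base) / ifname / name
            try:
                values[name] = int(path.read_text().strip())
            except (OSError, ValueError):
                # Entry missing or not a plain integer: report it as unknown.
                values[name] = None
        result[ifname] = values
    return result
```

For example, read_arp_settings(["all", "eth0", "eth1"]) returns one dictionary per name, with None for any entry that cannot be read.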
As can be seen from the figure below, an ARP request sent by an external system for a specific network address is followed by two responses: one from the port that is assigned the IP address specified in the broadcast packet and one from the other Ethernet interface. The two IP addresses are different.
This behavior affects the system that sends UDP commands to the board: the ARP table of the external system reports the same MAC address for the two IP addresses previously queried. The figure below shows this.
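The symptom on the external system can also be checked automatically. The following is a minimal sketch, assuming the external system is a Linux host whose ARP table is available in the /proc/net/arp text format. The function name shared_macs and all addresses in SAMPLE are hypothetical and are not taken from our setup.

```python
from collections import defaultdict

# Hypothetical ARP table in /proc/net/arp format (addresses are made up).
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.10     0x1         0x2         02:00:00:00:00:01     *        eth0
192.168.1.11     0x1         0x2         02:00:00:00:00:01     *        eth0
192.168.1.1      0x1         0x2         02:00:00:00:00:fe     *        eth0
"""


def shared_macs(arp_text):
    """Map each MAC address to its IPs when more than one IP resolves to it."""
    by_mac = defaultdict(list)
    for line in arp_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete table row
        ip, mac = fields[0], fields[3].lower()
        if mac != "00:00:00:00:00:00":  # ignore unresolved entries
            by_mac[mac].append(ip)
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}
```

On the sample table, shared_macs(SAMPLE) reports that 192.168.1.10 and 192.168.1.11 both resolve to 02:00:00:00:00:01, which is the pattern described above.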
As a result of this erroneous ARP response handling, UDP messages sent to the eth1 address carry the eth0 MAC address together with the IP address assigned to eth1. Consequently, the embedded application software filters and discards these messages.
Best regards,
Domenico



