Hello,
We are using a DP83848 PHY in RMII mode with an STM32F407 (LwIP stack). The 50 MHz reference clock is generated by the STM32 PLL.
Here is the situation:
- 100M operation cannot be achieved. With autonegotiation enabled, the PHY goes to 100M full-duplex, but all frames on RXD0/1 are corrupted. For instance, most frames just consist of RXD0 staying high for 20 to 100 µs. 100M operation is not critical for our application, so further tests were conducted with the PHY forced to 10M. I am mentioning it for troubleshooting purposes.
- Two tests were conducted: ping commands and Modbus TCP requests. The ping response fails occasionally (around 5% of the time). When it does not fail, the response time is low (<10 ms). With Modbus TCP, most requests are answered quickly, but occasionally (every 20 seconds or so) the response time increases to 500 ms or 1 second. With the timeout set to 2 seconds, a timeout generally occurs after 10-20 minutes of operation.
- Debugging on the STM32 shows that the failed pings never reached the entry of the LwIP stack (i.e. the stack replies to every ping it receives, and every reply does reach the host).
- To test packet integrity, we set up a development board with the same parameters, connected it to the same switch, and compared the RXD0 signals. We spotted corruption on some frames.
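The integrity check in the last item can also be done on a single captured frame, without a second board, by recomputing its frame check sequence. Below is a minimal sketch in C, assuming the captured bytes run from the destination MAC address through the 4-byte FCS (preamble and SFD already stripped). The function names `eth_crc32` and `eth_frame_fcs_ok` are our own, not part of LwIP or any ST library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 as used for the Ethernet FCS
 * (reflected polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF). */
uint32_t eth_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            /* Shift right; XOR in the polynomial when the dropped bit was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns true when the last 4 bytes of `frame` match the CRC-32 of the
 * bytes before them. The FCS is stored least significant byte first. */
bool eth_frame_fcs_ok(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < 4) {
        return false; /* too short to even contain an FCS */
    }
    size_t n = len - 4;
    uint32_t fcs = (uint32_t)frame[n]
                 | ((uint32_t)frame[n + 1] << 8)
                 | ((uint32_t)frame[n + 2] << 16)
                 | ((uint32_t)frame[n + 3] << 24);
    return eth_crc32(frame, n) == fcs;
}
```

A frame that fails this check was damaged somewhere between the sender and the capture point. A frame that passes but still differs from the other board would point to a capture or decoding issue instead.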
Here are examples of regular frames. Except for the preamble length, the frames are identical on both boards:
Below is an example of a corrupted frame. The frame from the DP83848 (upper) differs from the one on the dev board.
Based on these results, it seems that the DP83848 corrupts a small fraction of Rx packets. In TCP operation, the sender re-sends the packet, so the corruption only delays the answer. The Modbus timeout probably occurs when the same packet is corrupted several times in a row.
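As a rough sanity check of that explanation, here is a small sketch of the arithmetic in C. It assumes each received frame is corrupted independently with probability `p` (about 0.05, taken from the ping failure rate) and a request rate of one per second; the request rate and the independence are assumptions for illustration only, and the function names are our own.

```c
#include <assert.h>

/* Probability that the same frame is corrupted `k` times in a row,
 * assuming every attempt fails independently with probability `p`. */
double consecutive_loss_probability(double p, unsigned k)
{
    assert(p >= 0.0 && p <= 1.0);
    double result = 1.0;
    for (unsigned i = 0; i < k; ++i) {
        result *= p;
    }
    return result;
}

/* Mean time in seconds between such events when `rate_hz` requests
 * are sent per second (hypothetical parameter, not a measured value). */
double mean_seconds_between_events(double p, unsigned k, double rate_hz)
{
    double prob = consecutive_loss_probability(p, k);
    assert(rate_hz > 0.0 && prob > 0.0);
    return 1.0 / (rate_hz * prob);
}
```

Under these assumptions, a single loss is expected about every 20 s, two losses in a row about every 400 s, and three in a row about every 8000 s. The observed 10-20 minutes between timeouts falls between the last two figures, which is at least compatible with a timeout needing two or three consecutive corruptions of the same packet. Since the corruption rate also appears to depend on load, the independence assumption is only approximate.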
It seems to be a HW problem, so we tried adding 100 ohm series resistors on the X1, RXD0, and RXD1 signals. This made the problem a lot worse: only a tiny fraction of the frames was transmitted without corruption. The load factor also seems to influence the phenomenon. When the load factor was reduced tenfold, the percentage of corrupted frames went down.
Here is the CLK signal, without the series resistor:
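To narrow down where the corruption comes from, the PHY's own status registers can be read over MDIO and decoded. The sketch below is in C and only decodes raw register values; how they are read is left out. The register addresses and bit positions (PHYSTS at 0x10, RBR at 0x17) are from our reading of the DP83848 datasheet and should be double-checked against it. The struct and function names are our own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed DP83848 register addresses (verify against the datasheet). */
#define DP83848_REG_PHYSTS   0x10u
#define DP83848_REG_RBR      0x17u

/* Assumed PHYSTS bit positions. */
#define PHYSTS_LINK          (1u << 0)   /* link established */
#define PHYSTS_SPEED_10M     (1u << 1)   /* 1 = 10 Mb/s, 0 = 100 Mb/s */
#define PHYSTS_FULL_DUPLEX   (1u << 2)   /* 1 = full duplex */
#define PHYSTS_RX_ERR_LATCH  (1u << 13)  /* receive error seen since last read */

/* Assumed RBR bit positions. */
#define RBR_RMII_MODE        (1u << 5)   /* 1 = RMII mode active */
#define RBR_RX_OVF_STS       (1u << 3)   /* RMII elasticity buffer overflow */
#define RBR_RX_UNF_STS       (1u << 2)   /* RMII elasticity buffer underflow */

typedef struct {
    bool link_up;
    bool speed_10m;
    bool full_duplex;
    bool rx_error_latched;
    bool rmii_mode;
    bool elastic_overflow;
    bool elastic_underflow;
} phy_diag_t;

/* Decode raw PHYSTS and RBR values into individual flags. */
void phy_decode(uint16_t physts, uint16_t rbr, phy_diag_t *out)
{
    assert(out != NULL);
    out->link_up           = (physts & PHYSTS_LINK) != 0;
    out->speed_10m         = (physts & PHYSTS_SPEED_10M) != 0;
    out->full_duplex       = (physts & PHYSTS_FULL_DUPLEX) != 0;
    out->rx_error_latched  = (physts & PHYSTS_RX_ERR_LATCH) != 0;
    out->rmii_mode         = (rbr & RBR_RMII_MODE) != 0;
    out->elastic_overflow  = (rbr & RBR_RX_OVF_STS) != 0;
    out->elastic_underflow = (rbr & RBR_RX_UNF_STS) != 0;
}
```

If the overflow or underflow flag is set after a corrupted frame, that would point toward the 50 MHz reference clock rather than the RXD lines. If only the receive error latch is set, the problem is more likely on the cable side of the PHY.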
Based on this information, which design changes should we make to at least get reliable 10M operation?
Here are the schematics. I should add that the RMII signal traces are quite short, as the component is only 3 cm away from the MAC. The differential signal traces are a bit longer (10 cm). All RMII signals go directly to the microcontroller.