Our system is designed to use ethernet boot with the Keystone2, but booting has not been reliable. We are able to successfully download u-boot only about 80% of the time. Our system is set up so that when u-boot fails to load within a set period of time, we toggle RESET on the Keystone2 to retry the boot process. The boot statistics we have recorded are:
loads u-boot over ethernet, with no retries – 53/67 – 79%
loads u-boot over ethernet, on 1st retry (u-boot doesn’t load on 1st attempt) – 11/67 – 16%
loads u-boot over ethernet, on 2nd retry (u-boot doesn’t load on 1st and 2nd attempt) – 2/67 – 3%
fails to load u-boot over ethernet, 3rd retry (u-boot doesn’t load on 1st, 2nd, 3rd attempt) – 1/67 – 1.5%
Given a larger set of data, it seems plausible that we would continue to see about an 80% chance of a successful boot on each attempt, retries included.
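As a rough check of that estimate, the per-attempt success rate can be worked out from the counts above. This is only a sketch: it assumes the last row means the first three attempts all failed, and it treats each reset as an independent attempt. The function name is illustrative.

```python
def per_attempt_rates(first_success_counts, unresolved):
    """first_success_counts[i] = boots that first succeeded on attempt i+1.
    unresolved = boots that failed every attempt listed.
    Returns ([(boots_tried, boots_ok, rate), ...], pooled_rate)."""
    remaining = sum(first_success_counts) + unresolved
    rates = []
    total_attempts = 0
    for ok in first_success_counts:
        rates.append((remaining, ok, ok / remaining))
        total_attempts += remaining
        remaining -= ok  # only boots that failed go on to the next attempt
    pooled = sum(first_success_counts) / total_attempts
    return rates, pooled


# Counts from the recorded statistics: 53, 11 and 2 successes, 1 still failing.
rates, pooled = per_attempt_rates([53, 11, 2], unresolved=1)
for attempt, (tried, ok, rate) in enumerate(rates, start=1):
    print(f"attempt {attempt}: {ok}/{tried} = {rate:.1%}")
print(f"pooled over all attempts: {pooled:.1%}")
# attempt 1: 53/67 = 79.1%
# attempt 2: 11/14 = 78.6%
# attempt 3: 2/3 = 66.7%
# pooled over all attempts: 78.6%
```

The third-attempt figure rests on only three boots, so it says little on its own. The pooled rate of 66 successes in 84 attempts is consistent with roughly 80% per attempt.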
Our ethernet interface passes through an FPGA before going to a switch, and in the FPGA we monitor traffic after converting the physical media to GMII. The concerning behavior is that whenever the boot fails, we see data errors flagged on the GMII interface in the BOOTP request packet. We also see the Keystone2 re-send the BOOTP request after a designated sequence of timeouts. Every time the boot process fails, regardless of power cycles, resets, or timeouts, the BOOTP request packets sent by the Keystone2 all have data errors in exactly the same byte positions, with the same bad data values. Although these errors are flagged by the FPGA transceivers as either disparity or “not in table” errors, the precise and consistent nature of the failure suggests that the Keystone2 is actually sending bad packets rather than there being a signal integrity issue.
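The repeatability argument can be made concrete by comparing each captured BOOTP request from a failed boot against a known-good one, byte by byte. The sketch below is hypothetical: the function names are illustrative and the byte values are made-up placeholders, not data from our captures. It only shows the kind of check that separates a fixed corruption pattern from random line noise.

```python
def diff_offsets(reference: bytes, captured: bytes) -> dict:
    """Map byte offset -> (expected, actual) for every byte that differs.
    Compares only up to the length of the shorter packet."""
    return {
        i: (ref, cap)
        for i, (ref, cap) in enumerate(zip(reference, captured))
        if ref != cap
    }


def same_corruption(reference: bytes, captures: list) -> bool:
    """True if every capture differs from the reference at the same
    offsets with the same bad values (and at least one byte differs)."""
    diffs = [diff_offsets(reference, c) for c in captures]
    return bool(diffs) and bool(diffs[0]) and all(d == diffs[0] for d in diffs)


# Placeholder packets (hypothetical values, for illustration only).
good = bytes([0x01, 0x01, 0x06, 0x00, 0x12, 0x34, 0x56, 0x78])
bad_a = bytes([0x01, 0x01, 0x06, 0x00, 0x12, 0xFF, 0x56, 0x00])
bad_b = bytes(bad_a)  # a second failed boot with identical corruption
noisy = bytes([0x01, 0x41, 0x06, 0x00, 0x12, 0x34, 0x56, 0x78])

print(diff_offsets(good, bad_a))               # {5: (52, 255), 7: (120, 0)}
print(same_corruption(good, [bad_a, bad_b]))   # True
print(same_corruption(good, [bad_a, noisy]))   # False
```

A marginal signal integrity problem would be expected to look like the last case, with errors landing at different offsets from one capture to the next. Identical offsets and values across power cycles look like the second case.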