Dear Sirs,
we are currently observing a severe issue on the Ethernet Controller (EMAC).
The following happens from the perspective of our software when we release a Packet Buffer Descriptor (PBD) to the EMAC:
- The software initializes all fields of the PBD, in particular
  - setting OWNERSHIP,
  - clearing all other flags,
  - setting packet length to 0,
  - setting buffer length to the size of the provided buffer,
  - setting the next pointer to 0;
- The software then sets the next pointer of the last PBD, currently being 0, to the address of the to-be-released PBD;
- The software executes a Data Synchronization Barrier (dsb) instruction;
- The software then checks whether the channel’s Completed PBD Pointer RXnCP points to a PBD (let’s call it PbdEnd) which has EndOfQueue set AND whether the channel’s RXnHDP is 0;
  - If all conditions are true, the software checks whether PbdEnd’s next pointer is non-zero AND, if so, whether the next PBD is owned by the EMAC;
    - If this is true, it sets the channel’s RXnHDP to this PBD;
    - Otherwise, if the to-be-released PBD is still owned by the EMAC, the software sets the channel’s RXnHDP to this PBD;
    - Otherwise we currently have no free buffer to set RXnHDP to, and we expect that the software will process and then release another filled PBD.
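
To make the sequence above easier to discuss, here is a minimal host-side sketch of it in C. It is not our production driver: the type and function names (Pbd, RxChannel, rx_release), the flag bit values, and the use of plain struct fields in place of the memory-mapped RXnHDP and RXnCP registers are all assumptions made for illustration. It also assumes that the two "Otherwise" branches only apply when the restart condition (EndOfQueue set AND RXnHDP is 0) holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; check the real bit positions in the TRM. */
#define PBD_FLAG_OWNER 0x2000u  /* descriptor is owned by the EMAC */
#define PBD_FLAG_EOQ   0x1000u  /* EMAC reached the end of the queue here */

typedef struct Pbd {
    struct Pbd *next;    /* next pointer, NULL (0) terminates the queue */
    uint8_t    *buffer;  /* receive buffer */
    uint16_t    bufLen;  /* buffer length */
    uint16_t    pktLen;  /* packet length, written by the EMAC */
    uint16_t    flags;   /* OWNER, EOQ, ... */
} Pbd;

/* Host-side stand-in for the channel registers (hypothetical model). */
typedef struct {
    Pbd *hdp;  /* models RXnHDP */
    Pbd *cp;   /* models RXnCP */
} RxChannel;

/* On the target this would execute the dsb instruction; stub in this model. */
static void dsb(void) { }

/* Releases pbd to the channel. Returns the PBD written to RXnHDP,
 * or NULL if RXnHDP was left untouched. */
Pbd *rx_release(RxChannel *ch, Pbd **tail, Pbd *pbd,
                uint8_t *buf, uint16_t bufLen)
{
    assert(ch != NULL && tail != NULL && pbd != NULL);

    /* Initialize all fields: only OWNER set, lengths reset, next = 0. */
    pbd->buffer = buf;
    pbd->bufLen = bufLen;
    pbd->pktLen = 0;
    pbd->flags  = PBD_FLAG_OWNER;
    pbd->next   = NULL;

    /* Link the PBD behind the current last PBD of the queue. */
    if (*tail != NULL) {
        (*tail)->next = pbd;
    }
    *tail = pbd;

    dsb();

    /* Restart check: RXnCP points to a PBD with EOQ set and RXnHDP is 0. */
    Pbd *end = ch->cp;
    if (end != NULL && (end->flags & PBD_FLAG_EOQ) != 0u && ch->hdp == NULL) {
        Pbd *cand = end->next;
        if (cand != NULL && (cand->flags & PBD_FLAG_OWNER) != 0u) {
            ch->hdp = cand;   /* successor of PbdEnd is EMAC-owned */
            return cand;
        }
        if ((pbd->flags & PBD_FLAG_OWNER) != 0u) {
            ch->hdp = pbd;    /* fall back to the PBD just released */
            return pbd;
        }
    }
    return NULL;              /* no restart needed, or no free PBD */
}
```

In this model the window we are worried about lies between the ownership check and the write to the head pointer: on the real hardware the EMAC may act on the descriptor in that interval, which the sketch cannot reproduce.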
Nevertheless, in rare conditions, when there is a heavier reception load on the EMAC receive channel, the channel is stopped with MACSTATUS.RXERRCODE set to “Ownership bit not set in SOP buffer”. This means that although RXnHDP is 0 AND the software has checked that the PBD to be assigned to RXnHDP is still owned by the EMAC, the EMAC finds the PBD already owned by the software when it next examines RXnHDP. In other words, we have no clear rule to determine whether the EMAC is currently processing this PBD before we set RXnHDP to it.
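
For reference, this is roughly how the error condition can be recognised from a MACSTATUS value. It is only a sketch: the field positions (RXERRCODE in bits 15:12, RXERRCH in bits 10:8) and the code value 2 for “Ownership bit not set in SOP buffer” are written down from memory and are assumptions to be checked against the MACSTATUS register description in the TRM. The names decode_rx_error and is_sop_ownership_error are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MACSTATUS layout; verify against the TRM before relying on it. */
#define MACSTATUS_RXERRCODE_SHIFT 12u
#define MACSTATUS_RXERRCODE_MASK  (0xFu << MACSTATUS_RXERRCODE_SHIFT)
#define MACSTATUS_RXERRCH_SHIFT   8u
#define MACSTATUS_RXERRCH_MASK    (0x7u << MACSTATUS_RXERRCH_SHIFT)

/* Assumed code for "Ownership bit not set in SOP buffer". */
#define RXERRCODE_SOP_NOT_OWNED   0x2u

/* Compile-time sanity check of the mask arithmetic above. */
static_assert(MACSTATUS_RXERRCODE_MASK == 0xF000u, "RXERRCODE mask");

typedef struct {
    uint32_t code;     /* RXERRCODE field */
    uint32_t channel;  /* RXERRCH field: channel that raised the error */
} RxHostError;

/* Extracts the receive host error fields from a raw MACSTATUS value. */
RxHostError decode_rx_error(uint32_t macstatus)
{
    RxHostError e;
    e.code    = (macstatus & MACSTATUS_RXERRCODE_MASK) >> MACSTATUS_RXERRCODE_SHIFT;
    e.channel = (macstatus & MACSTATUS_RXERRCH_MASK) >> MACSTATUS_RXERRCH_SHIFT;
    return e;
}

/* True if MACSTATUS reports the SOP ownership error on any RX channel. */
bool is_sop_ownership_error(uint32_t macstatus)
{
    return decode_rx_error(macstatus).code == RXERRCODE_SOP_NOT_OWNED;
}
```

On the target, the argument would be the value read from the MACSTATUS register in the host error (HOSTPEND) interrupt handler.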
Could you please provide us with the following information:
- What is the exact order of activities that the EMAC performs when processing a PBD, from detecting that an Ethernet packet is arriving to the point in time where the EMAC clears the Ownership flag and sets RXnCP? Preferably with timing information and an indication of which activities are performed as atomic read-modify-writes.
- What is the proper way to handle the problem described above?
This issue is really severe for us, as the Technical Reference Manual states that a non-zero value of MACSTATUS.TXERRCODE requires a hardware reset (cf. TMS570LC43x TRM May 2014, sec. 32.5.30, Table 32-69, Field TXERRCODE: “Transmit host error code. These bits indicate that EMAC detected transmit DMA related host errors. The host should read this field after a host error interrupt (HOSTPEND) to determine the error. Host error interrupts require hardware reset in order to recover. …”).
With best regards
Martin