Hello TI-support-Team,
we have a customer who is facing issues with the TI PHY DP83867 in combination with Cisco switches.
TI has already confirmed this issue to both our customer and us, so I am creating this ticket to draw attention to the topic and to start the search for a solution.
Here is our customer's issue description:
They have problems with the PHY DP83867 on the SA7.
The problem is called the 'IPG/IFG 8 Byte issue'; TI has already confirmed it.
Here is a detailed explanation of why we see the CRC errors (loss of data packets) on the conga-SA7:
On the module we run our Audio-over-IP (AoIP) device.
AoIP is low-latency network traffic made up of many small packets (packet rate >3 kHz, packet size about 256 bytes, UDP).
When using a Cisco switch (e.g. from the SG350 series) we see the CRC errors on RX.
The switch relies on the 8-byte RX minimum of the IEEE standard and, depending on the number and size of the packets, sends with less than 12 bytes of IFG/IPG, down to 8 bytes.
This reduces the latency and increases the throughput with small packets.
The IPG is also used for orientation; Cisco describes this in their documentation.
The Cisco switch is recommended by many manufacturers of AoIP devices, which is why it is used by many end users. Devices with a different PHY do not have problems with 8-byte IFG/IPG. The IEEE standard for 1GbE also says that an 8-byte gap has to be supported on RX.
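To put the gap sizes into time, here is a minimal sketch that converts an IPG in bytes into nanoseconds on the wire. It only assumes that one byte occupies 8 bit times at the nominal link rate; the function name is made up for illustration.

```python
def ipg_duration_ns(ipg_bytes: int, link_bits_per_s: int) -> float:
    """Time occupied by an inter-packet gap, assuming 8 bit times per byte."""
    return ipg_bytes * 8 * 1_000_000_000 / link_bits_per_s


if __name__ == "__main__":
    for name, speed in (("1 Gbit/s", 1_000_000_000), ("100 Mbit/s", 100_000_000)):
        for ipg in (12, 8):
            print(f"{name}: {ipg} byte IPG = {ipg_duration_ns(ipg, speed):.0f} ns")
```

At 1 Gbit/s the difference between a 12-byte and an 8-byte gap is only 32 ns, while at 100 Mbit/s the same gaps are ten times longer in absolute time.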
The CRC errors are detected by the MAC and reported in the Linux kernel, and the affected packets are then lost. The CRC check can be done in software or in hardware; this has no effect on the number of CRC errors. The kernel module of the MAC increments the statistics counters. You can see this with a simple tool like ethtool.
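As a rough sketch of how such counters can be compared before and after a test run, the snippet below parses the text printed by `ethtool -S <interface>` and reports the change of one counter. The counter name `rx_crc_errors` and the sample text are assumptions for illustration only; the actual counter names depend on the MAC driver.

```python
import re


def parse_ethtool_stats(text: str) -> dict[str, int]:
    """Parse 'name: value' lines as printed by `ethtool -S`; other lines are skipped."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\S.*?):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def counter_delta(before: dict[str, int], after: dict[str, int], name: str) -> int:
    """Increase of one counter between two snapshots (missing counters count as 0)."""
    return after.get(name, 0) - before.get(name, 0)


# Illustrative snapshots, NOT real measurements; counter names vary by driver.
SAMPLE_BEFORE = """NIC statistics:
     rx_packets: 1000
     rx_crc_errors: 3
"""
SAMPLE_AFTER = """NIC statistics:
     rx_packets: 2000
     rx_crc_errors: 5
"""

if __name__ == "__main__":
    before = parse_ethtool_stats(SAMPLE_BEFORE)
    after = parse_ethtool_stats(SAMPLE_AFTER)
    print("new CRC errors:", counter_delta(before, after, "rx_crc_errors"))
```

In practice the two snapshots would come from running `ethtool -S` on the affected interface before and after the iperf3 test.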
We have also run this test with different settings of the VTM register. This affects the error rate, see the TI forum thread:
"DP83867CS: size-dependent packet loss with minimum IPG" (e2e.ti.com)
Unfortunately the error is not gone completely. With the undocumented setting 0x3 (see forum) we get the fewest errors, about 1 error per hour. 0x2 leads to more or different errors.
TI has already confirmed the issues with an 8-byte IPG/IFG on the PHY we use.
It may cope with gaps down to 10 or 11 bytes, but it fails at lower values.
Here is a description of the test to reproduce the CRC errors:
When using the Cisco switch with the SA7 you get ~2% errors, which can be shown with ethtool -S.
With a direct connection or a different switch (normal, no low latency, not managed) you do not see the CRC errors, as only a 12-byte IFG/IPG is used.
When setting the SA7 to 100 Mbit/s you will also not see the errors with a Cisco switch, as the IPG/IFG is much longer.
SA7 (Server):
iperf3 -s -p 5000
PC (Client):
iperf3 -c IP_ADDRESS -p 5000 -u -b 5.85M -P 16 -l 201 --pacing-timer 125000 -t 60 -S 46
The load in the network corresponds to a 'normal' AoIP load (50-70 Mbit/s).
With a 60 s test you will see about 90 errors (90 packets get lost).
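For orientation, the traffic this command requests can be worked out from its parameters. The sketch below assumes that iperf3 applies `-b` per stream and counts only the UDP payload (`-l`) towards the bitrate, which is how its documentation describes it; the helper name is made up.

```python
def udp_load(streams: int, bitrate_bps: int, payload_bytes: int, duration_s: int) -> dict:
    """Requested UDP load, assuming -b is per stream and counts payload bits only."""
    pps_per_stream = bitrate_bps / (payload_bytes * 8)
    total_pps = pps_per_stream * streams
    return {
        "aggregate_bps": bitrate_bps * streams,
        "pps_per_stream": pps_per_stream,
        "total_pps": total_pps,
        "total_datagrams": total_pps * duration_s,
    }


if __name__ == "__main__":
    # Values taken from: -b 5.85M -P 16 -l 201 -t 60
    load = udp_load(streams=16, bitrate_bps=5_850_000, payload_bytes=201, duration_s=60)
    lost = 90  # approximate number of lost packets reported for a 60 s run
    print(f"aggregate payload rate: {load['aggregate_bps'] / 1e6:.1f} Mbit/s")
    print(f"datagrams per second:   {load['total_pps']:.0f}")
    print(f"datagrams in the test:  {load['total_datagrams']:.0f}")
    print(f"loss fraction:          {lost / load['total_datagrams']:.6%}")
```

Under these assumptions the command asks for roughly 3.5 million datagrams in 60 s, so about 90 lost packets is a small fraction of the total.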
Can you please have a look at this and let me know approximately when the problem will be solved on your side?
Thank you and Best Regards,
Anja Maier
Hi Anja
Thank you for the detailed query with the end application in mind. When you mention the Cisco SG350 switch, how is it connected to the DP83867? You mentioned that other devices do not have this problem when connected to the SG350 switch; could you share a block diagram of what this setup looks like?
In the iperf example you gave, is our DP83867 in the PC (Client)? I could then recreate this on my end and see the results.
Could you also clarify all of the parameters in the command, just so that we're both aligned? My current understanding is below:
iperf3 -c IP_ADDRESS -p 5000 -u -b 5.85M -P 16 -l 201 --pacing-timer 125000 -t 60 -S 46
I currently tried running the command with the following setup:
Processor Board with 867(Client) <-Ethernet Cable-> Linux PC(Server)
The output below is from the Client side, showing 0% errors. Is this similar to your set up?
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 5] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.064 ms 0/217922 (0%) receiver
[ 7] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 7] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.063 ms 0/217922 (0%) receiver
[ 9] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 9] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.063 ms 0/217922 (0%) receiver
[ 11] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 11] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.059 ms 0/217922 (0%) receiver
[ 13] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 13] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.059 ms 0/217922 (0%) receiver
[ 15] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 15] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.059 ms 0/217922 (0%) receiver
[ 17] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 17] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.062 ms 0/217922 (0%) receiver
[ 19] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 19] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.062 ms 0/217922 (0%) receiver
[ 21] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 21] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.062 ms 0/217922 (0%) receiver
[ 23] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 23] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.062 ms 0/217922 (0%) receiver
[ 25] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 25] 0.00-60.00 sec 28.0 MBytes 3.91 Mbits/sec 0.064 ms 0/217922 (0%) receiver
[ 27] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 27] 0.00-60.00 sec 27.9 MBytes 3.91 Mbits/sec 0.067 ms 0/217922 (0%) receiver
[ 29] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 29] 0.00-60.00 sec 27.9 MBytes 3.91 Mbits/sec 0.070 ms 0/217921 (0%) receiver
[ 31] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 31] 0.00-60.00 sec 27.9 MBytes 3.91 Mbits/sec 0.070 ms 0/217921 (0%) receiver
[ 33] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 33] 0.00-60.00 sec 27.9 MBytes 3.91 Mbits/sec 0.064 ms 0/217921 (0%) receiver
[ 35] 0.00-60.00 sec 41.8 MBytes 5.84 Mbits/sec 0.000 ms 0/217922 (0%) sender
[ 35] 0.00-60.00 sec 27.9 MBytes 3.91 Mbits/sec 0.064 ms 0/217921 (0%) receiver
[SUM] 0.00-60.00 sec 668 MBytes 93.4 Mbits/sec 0.000 ms 0/3486752 (0%) sender
[SUM] 0.00-60.00 sec 448 MBytes 62.6 Mbits/sec 0.063 ms 0/3486748 (0%) receiver
iperf Done.
Regards,
Alvaro
Hello Alvaro,
thank you very much for the fast feedback!
First of all, our customer is using the conga-SA7 that uses the DP83867CS for the Ethernet.
For further details, please have a look at the SA7 manual, especially the Block Diagram on Page 21.
Regarding the Test Setup, the DP83867CS (SA7) is used as the server, so you will probably see the issue by exchanging client and server on your side.
For further details, I will align with our customer and come back to you.
Thank you and Best Regards,
Anja
Hi Anja,
I was able to get another Processor board, such that the setup was:
Proc Board <-Ethernet Cable-> Proc Board
Where the DP83867 is on both ends of the link. The command:
iperf3 -c IP_ADDRESS -u -b 5.85M -P 16 -l 201 --pacing-timer 125000 -t 60 -S 46
still ran with 0 errors.
Figure 1 - Client and Server after run
Please let me know when you have further details from the customer, regarding the purpose of the command and what exactly it is doing.
Regards,
Alvaro
Hi Alvaro,
please hold on, I am still waiting for feedback from our customer.
Just a general question regarding your setup: are you using a Cisco-Switch between the Client and Server (or any other network with RX<12Byte, please see my first comment) or are they directly connected? I guess the problem won't be visible using a 'normal' network.
Thank you and BR,
Anja
Hi Anja,
No worries. To clarify, my set up at the moment is a direct connection with no switch in between.
Proc Board <-Ethernet Cable-> Proc Board
Whenever you have an update please feel free to reply here.
Regards,
Alvaro
Hi Anja, Hi Alvaro,
I'm Andreas. We use the conga-SA7 in our Product.
The iperf3 option -S sets the DSCP/TOS bits. Network devices use this field to place specific packets in high-priority queues.
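One detail that may be worth checking: the iperf3 man page describes -S (--tos) as setting the whole IP type-of-service byte and lists a separate --dscp option. Assuming that applies to the iperf3 version in use, the sketch below shows how a TOS byte splits into its DSCP and ECN parts; the helper names are made up for illustration.

```python
def split_tos(tos: int) -> tuple[int, int]:
    """Split an IPv4 TOS byte into (DSCP, ECN): upper 6 bits and lower 2 bits."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return tos >> 2, tos & 0b11


def tos_for_dscp(dscp: int) -> int:
    """TOS byte that carries the given DSCP value with the ECN bits cleared."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


if __name__ == "__main__":
    dscp, ecn = split_tos(46)
    print(f"TOS byte 46 -> DSCP {dscp}, ECN {ecn}")
    print(f"DSCP 46 (EF) -> TOS byte {tos_for_dscp(46)}")
```

Under that assumption, `-S 46` would mark the packets with DSCP 11 rather than DSCP 46 (Expedited Forwarding), which would need a TOS byte of 184. Whether this matters depends on how the switch maps these values to its queues.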
The issue only appears when an IPG lower than 10/11 bytes is used. When you connect the DP83867 directly to another device or use an older switch, the devices will always use a 12-byte IPG. Hence, you see no errors.
When a Cisco SG350 connects all devices, i.e. DP83867 (iperf server) <-> Cisco SG350 <-> PC (iperf client), the error appears, since the Cisco switch uses lower IPG settings to improve performance and to lower the latency. This behavior conforms to IEEE. The IEEE standard for 1 Gbit requires 8-byte support on the RX path. It even says in the standard "it must support 8 bytes".
TI has already confirmed that the DP83867 has problems with an IPG lower than 10/11 bytes. With the VTM register you can set some algorithm parameters in the RX path to allow a 10/11-byte IPG. The default setting used in most implementations, such as the Linux kernel module for the DP83867, works only for an IPG of exactly 12 bytes.
There is an undocumented setting in the VTM register that could work partially with an 8-byte IPG. BUT as already described in the initial post, it creates other issues, since the error correction in the PHY is then very limited and not sufficient. Therefore, it is not documented.
In another product of ours which was using the DP83867 we already switched to another PHY, since that is the only solution for this problem. Other PHYs (non-TI) don't have issues with an IPG lower than 12 bytes. At least we haven't found any.
Regards,
Andreas
Hi Andreas,
Understood, the iperf command is not directly setting the IPG and the setup that I tried was not highlighting the issue that you are currently seeing. I was hoping that something within the command was setting the IPG. Please let me discuss with my team, I will get back to you before end of day tomorrow, Aug 15th.
Regards,
Alvaro
Hi Andreas,
There is a conversation taking place over email, I will close this E2E and continue over there.
Regards,
Alvaro