AM623: Issues with CPSW / CPTS workaround for i2401 (SDK 9.2 and up)

Part Number: AM623
Other Parts Discussed in Thread: SK-AM62B, SK-AM62B-P1

We have been working on transitioning software to use the workaround for i2401 and have encountered a few problems.

1. Third-party software fails when setting HWTSTAMP_FILTER_ALL

This causes the software to exit immediately even though it appears to need timestamps only for PTP packets. As a workaround we have translated the call requesting HWTSTAMP_FILTER_ALL to instead enable HWTSTAMP_FILTER_PTP_V2_EVENT.

2. TI driver is not compatible with multiple PTP services

Due to third-party software, we have two PTP services running. Since poll() deletes the timestamp and every packet is multicast, only one client can receive timestamps. We can work around this by not deleting on poll() and letting the timeout mechanism delete timestamps instead. In our application, the event buffer does not appear to ever fill up.

3. Some packets are still missing timestamps

With the above workarounds, we still encounter an occasional PTP2 packet without a timestamp in the list. I have verified that it is not due to the list filling up, and so far I cannot identify why the timestamp is missing.

  • Hello Matthew,

    Thanks for clearly explaining the issues you observed. 

    First, a couple of questions,

    1. What SDK version are you specifically working with?

    2. Do you observe the issue on a custom board or on a TI AM62x EVM?

    3. We don't have access to your third-party software running two PTP services, and the only example we use for running PTP on Linux is linuxptp (ptp4l). Can you observe this issue only with your third-party software, or do you see the same issue with ptp4l?

    4. I see that the CPSW CPTS Linux driver appears to already implement the i2401 workaround: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.12.y&id=30b3fe0672f2a97f22d96a863f2a8f2ed6c52a54 I'm a little confused, because it sounds like you are separately implementing the workaround in software as well. Were you already aware of the changes in this link?

    -Daolin

  • Hi Daolin,

    1. The issues are observed in SDK 9.03 and SDK 11.0.
    2. We are primarily working with a custom board but these issues can be easily replicated on the EVM.
    3. The issue with multiple PTP clients can be replicated by running two instances of ptp4l. One of the instances will never get timestamps.
    4. The reference to i2401 is that the change you linked is what caused all of these problems. The issues mentioned above do not exist in 9.00 SDK and earlier.

    I will write up some simple steps to replicate with example output.

  • Hi Matthew, 

    Thanks for explaining these details. I ran out of time to look into this issue further today. I will aim to look into it tomorrow or Thursday. I plan to respond with an update on Thursday. I need to check with the developer who made the changes I linked. 

    I will write up some simple steps to replicate with example output.

    Yes, this will be helpful for us to be able to recreate the problem.

    Please kindly ping this thread if you have not heard back by Thursday.

    -Daolin

  • Hi Matthew, 

    After talking with the developer, we concluded that we first need to recreate the problem to understand it better. From the developer's perspective, we have not previously seen PTP packets arriving without a timestamp.

    It would be good if you could share:

    1. What hardware setup is needed to recreate this problem (preferably with TI EVM)

    2. What ptp4l commands are necessary

    3. What is the expected output and what is actually observed.

    4. Any other details that would be helpful in reproducing this issue

    -Daolin

  • Hi Daolin,

    I can replicate these issues with the following setup:

    1. SK-AM62B EVM board
    2. 11.01 firmware from TI web site at https://dr-download.ti.com/software-development/software-development-kit-sdk/MD-BDCgfEXHLk/11.01.05.03/tisdk-default-image-rt-am62xx-evm-11.01.05.03.rootfs.wic.xz
    3. Connect eth0 to a network with a PTP2 master. Any network configuration will work, but I already had a LLN with a master clock, so I used that.

    1. Fail setting HWTSTAMP_FILTER_ALL

    Execute the following line

    hwstamp_ctl -i eth0 -r 1

    Expected behavior as shown from SDK 09.00.00.010

    current settings:
    tx_type 1
    rx_filter 1
    new settings:
    tx_type 1
    rx_filter 1

    Observed behavior

    current settings:
    tx_type 1
    rx_filter 12
    SIOCSHWTSTAMP failed: Operation not supported

    This is a change to supported features. Existing software packages fail to start due to expecting this to be available, even if timestamps for all packets are not needed.

     

    2. Multiple PTP services

    I am working on test steps to replicate this on the SK and will post when completed.

    3. Packets missing timestamps

    Simply run ptp4l and select the appropriate domain:

    ptp4l -E -4 -H -i eth0 -s -l 6 -q -m --domainNumber=0

    You will see that there are occasional complaints of DELAY_REQ without timestamp

    Expected behavior as shown from SDK 09.00.00.010

    ptp4l[1307.950]: master offset        105 s2 freq    +417 path delay      2637
    ptp4l[1308.914]: master offset       -102 s2 freq    +242 path delay      2648
    ptp4l[1309.926]: master offset        -57 s2 freq    +256 path delay      2648
    ptp4l[1310.941]: master offset        108 s2 freq    +404 path delay      2639
    ptp4l[1311.954]: master offset        -92 s2 freq    +236 path delay      2639
    ptp4l[1312.919]: master offset         33 s2 freq    +334 path delay      2612
    ptp4l[1313.932]: master offset         93 s2 freq    +404 path delay      2617
    ptp4l[1314.945]: master offset        -89 s2 freq    +250 path delay      2659
    ptp4l[1315.958]: master offset         16 s2 freq    +328 path delay      2660
    ptp4l[1316.921]: master offset        -67 s2 freq    +250 path delay      2656
    ptp4l[1317.928]: master offset         45 s2 freq    +342 path delay      2656
    ptp4l[1318.941]: master offset        -52 s2 freq    +258 path delay      2677
    ptp4l[1319.949]: master offset         65 s2 freq    +359 path delay      2677
    ptp4l[1320.960]: master offset          5 s2 freq    +319 path delay      2677
    ptp4l[1321.924]: master offset        -90 s2 freq    +225 path delay      2681
    ptp4l[1322.936]: master offset         63 s2 freq    +351 path delay      2681
    ptp4l[1323.949]: master offset         34 s2 freq    +341 path delay      2681
    ptp4l[1324.959]: master offset         -2 s2 freq    +316 path delay      2685
    ptp4l[1325.922]: master offset        -76 s2 freq    +241 path delay      2684
    ptp4l[1326.933]: master offset        105 s2 freq    +399 path delay      2684
    ptp4l[1327.939]: master offset        -91 s2 freq    +235 path delay      2683

    Observed behavior

    ptp4l[62354.761]: master offset        -24 s2 freq   +1052 path delay      2713
    ptp4l[62355.762]: master offset         78 s2 freq   +1147 path delay      2713
    ptp4l[62356.763]: master offset        -30 s2 freq   +1063 path delay      2713
    ptp4l[62357.630]: port 1 (eth0): received DELAY_REQ without timestamp
    ptp4l[62357.763]: master offset         47 s2 freq   +1131 path delay      2715
    ptp4l[62357.784]: port 1 (eth0): received DELAY_REQ without timestamp
    ptp4l[62358.763]: master offset        -86 s2 freq   +1012 path delay      2715
    ptp4l[62359.148]: port 1 (eth0): received DELAY_REQ without timestamp
    ptp4l[62359.764]: master offset        -43 s2 freq   +1029 path delay      2710
    ptp4l[62360.764]: master offset        147 s2 freq   +1206 path delay      2702
    ptp4l[62361.765]: master offset        -58 s2 freq   +1045 path delay      2704
    ptp4l[62362.765]: master offset        -80 s2 freq   +1006 path delay      2704
    ptp4l[62363.766]: master offset         59 s2 freq   +1121 path delay      2706

  • An additional related question:

    Is PTP v1 no longer supported due to the change in hardware timestamping?

  • Hi Matthew, 

    3. Packets missing timestamps

    Does the behavior you see in 3. Packets missing timestamps require changes you are going to share about 2. multiple PTP services? Do the missing timestamps show up quickly or does ptp need to be running for a long time to see it? Once I get a confirmation on this I can try and see if I can reproduce.

    Is PTP v1 no longer supported due to the change in hardware timestamping?

    According to the AM62x TRM, the CPTS is listed as enabling compliance with IEEE 1588-2008 (PTPv2). The TRM doesn't say that PTPv1 is no longer supported, but my understanding is that the CPTS was designed to comply with PTPv2. If PTPv1 is not supported, I don't think the workaround for the i2401 errata is the reason.

    Snippet from the TRM below.

    12.3.1.4.7 Common Platform Time Sync (CPTS)
    The Common Platform Time Sync (CPTS) module is used to facilitate host control of time sync operations. It enables compliance with the IEEE 1588-2008 standard for a precision clock synchronization protocol.

    -Daolin

  • The behavior in #3 does not have any other prerequisites. It happens immediately and you will see it with the command I provided.

  • Hi Matthew, 

    I'm working on replicating the issue. I initially used an SK-AM62B-P1 EVM with SDK 11.0 Linux and had some trouble replicating exactly what you saw, so I'm flashing an SD card with the latest SDK 11.1 to try again. I'll let you know what I find tomorrow.

    -Daolin

  • Hi Daolin,

    If you see different results I think that there may be some difference due to external factors such as PTP master configuration and the number of multicast clients. We can try to make sure that our PTP connections match.

    Also, I am having trouble reproducing #2 now. Recently I have been working on 09.00.00.010 with the patch added for i2401 and my original testing was on 09.03.06 so I am going back to that version to gather more data.

  • Hi Matthew,

    If you see different results I think that there may be some difference due to external factors such as PTP master configuration and the number of multicast clients. We can try to make sure that our PTP connections match.

    I just tried out several different configurations of PTP on my AM62x EVM with SDK 11.1. For some reason, I cannot get E2E mode with the -4 option (UDP IPv4, I believe) to generate meaningful output. However, when running E2E mode with the -2 option (IEEE 802.3 network transport), I didn't see the "DELAY_REQ without timestamp" message, but I do see "received SYNC without timestamp" (the log below has not been updated to reflect this).

    In P2P mode, I am seeing PDELAY_REQ without timestamp. I think this seems to be similar to DELAY_REQ without timestamp but just for P2P instead of E2E.

    In summary, I wasn't able to replicate exactly what you see but I see similar messages like "PDELAY_REQ without timestamp" and "SYNC without timestamp".

    What doesn't make sense to me is that, if you run "ethtool -T eth0", the timestamping capabilities show that CPSW should be capable of the HWTSTAMP_FILTER_PTP_V2_EVENT filter that is specified in the errata workaround (https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.12.y&id=30b3fe0672f2a97f22d96a863f2a8f2ed6c52a54).

    While I haven't looked into it in more detail, my next thought is to check whether am65_cpts_find_rx_ts() is returning anything useful: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/net/ethernet/ti/am65-cpts.c?id=30b3fe0672f2a97f22d96a863f2a8f2ed6c52a54#n946

    Logs:

    AM62x Log using P2P mode and gPTP.cfg: https://gist.github.com/dao-qiu/dd0fb8bba1c3fbbbe9c954ab06c68a1f

    --> able to see "received PDELAY_REQ without timestamp"

    AM62x Log using P2P mode without gPTP.cfg: https://gist.github.com/dao-qiu/5b2dff3c3d9e48540b1fb6ed3fb12e58 

    --> able to see "received PDELAY_REQ without timestamp"

    AM62x Log using E2E mode with -2 option: https://gist.github.com/dao-qiu/9fdd4762199e753b0bf14ddb2da8e3f5

    --> TI not able to replicate "received DELAY_REQ without timestamp" with E2E mode

    AM62x Log using E2E mode with -4 option: https://gist.github.com/dao-qiu/6733f62b2c85dc5f36eea7c9057221f3 

    ---> TI not able to see any meaningful output with -4 option (UDP IPv4 network transport)

    -Daolin

  • Hi Daolin,

    It looks like your IPv4 does not see the master at all. You can test for PTP packets with "tcpdump -n dst port 319 or dst port 320".

    I did some instrumenting of am65_cpts_find_rx_ts() as follows:

    static u64 am65_cpts_get_rx_ts(struct am65_cpts *cpts, struct sk_buff *skb,
    			       u32 skb_mtype_seqid)
    {
    	/* ... Existing function contents ... */

    	/* Added instrumentation: log whether a timestamp was found for this key */
    	if (ns)
    		dev_warn(cpts->dev, "Return/del TS for %u\n", skb_mtype_seqid);
    	else
    		dev_warn(cpts->dev, "TS not found for %u\n", skb_mtype_seqid);

    	return ns;
    }
    

    An excerpt of the resulting log is below:

    Jul 27 14:06:42 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280036
    Jul 27 14:06:42 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215398
    Jul 27 14:06:43 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215399
    Jul 27 14:06:43 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4279609
    Jul 27 14:06:43 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280037
    Jul 27 14:06:44 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215400
    Jul 27 14:06:45 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4279610
    Jul 27 14:06:45 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280038
    Jul 27 14:06:45 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215401
    Jul 27 14:06:46 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280039
    Jul 27 14:06:46 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4279611
    Jul 27 14:06:46 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215402
    Jul 27 14:06:46 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: TS not found for 4279612
    Jul 27 14:06:47 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280040
    Jul 27 14:06:47 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215403
    Jul 27 14:06:47 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4279613
    Jul 27 14:06:48 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280041
    Jul 27 14:06:48 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: TS not found for 4215404
    Jul 27 14:06:49 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4279614
    Jul 27 14:06:49 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280042
    Jul 27 14:06:49 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4215405
    Jul 27 14:06:50 r2d2 kernel: am65-cpsw-nuss 8000000.ethernet: Return/del TS for 4280043

    In this excerpt, 20 of the 22 lookups return a timestamp, so it appears that at least 80% of packets return a timestamp successfully.

  • Hi Matthew,

    Based on your findings from am65_cpts_find_rx_ts(), it could be that the condition in the link below is not true for every PTP packet. You could probably check this quickly as well. I'm planning to meet with the developer tomorrow to understand what this conditional statement is meant to do.

    https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/blame/drivers/net/ethernet/ti/am65-cpts.c?h=ti-linux-6.12.y&id=30b3fe0672f2a97f22d96a863f2a8f2ed6c52a54#n915 

    It looks like your IPv4 does not see the master at all. You can test for PTP packets with "tcpdump -n dst port 319 or dst port 320".

    I realized my problem was not setting up IPv4 addresses. Once that was set up, I do see synchronization happening, but instead of "DELAY_REQ without timestamp" I see "SYNC without timestamp". While this is not exactly what you see, both DELAY_REQ and SYNC are PTP messages, so I think the root of the problem is still that some PTP messages don't receive timestamps, as you have pointed out.

    Actually, I have seen this issue with SYNC messages in the past, albeit with a different test setup involving three boards connected as follows: EVM1 (slave) eth0 <> eth0 EVM2 (master) eth1 <> eth0 EVM3 (slave), with eth0 and eth1 on EVM2 set up as a switch.

    root@am62xx-evm:~# ifconfig eth0
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::1e63:49ff:fe0f:61eb prefixlen 64 scopeid 0x20<link>
        ether 1c:63:49:0f:61:eb txqueuelen 1000 (Ethernet)
        RX packets 202930 bytes 14014449 (13.3 MiB)
        RX errors 0 dropped 57976 overruns 0 frame 0
        TX packets 59280 bytes 4029835 (3.8 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    
    root@am62xx-evm:~# ip addr add 192.168.1.10/24 dev eth0
    root@am62xx-evm:~# ifconfig eth0
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.10 netmask 255.255.255.0 broadcast 0.0.0.0
        inet6 fe80::1e63:49ff:fe0f:61eb prefixlen 64 scopeid 0x20<link>
        ether 1c:63:49:0f:61:eb txqueuelen 1000 (Ethernet)
        RX packets 202938 bytes 14016223 (13.3 MiB)
        RX errors 0 dropped 57976 overruns 0 frame 0
        TX packets 59290 bytes 4031978 (3.8 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    
    root@am62xx-evm:~# ping 192.168.1.11
    PING 192.168.1.11 (192.168.1.11) 56(84) bytes of data.
    64 bytes from 192.168.1.11: icmp_seq=1 ttl=64 time=0.566 ms
    64 bytes from 192.168.1.11: icmp_seq=2 ttl=64 time=0.365 ms
    ^C
    --- 192.168.1.11 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1053ms
    rtt min/avg/max/mdev = 0.365/0.465/0.566/0.100 ms
    root@am62xx-evm:~# 
    root@am62xx-evm:~# 
    root@am62xx-evm:~# 
    root@am62xx-evm:~# 
    root@am62xx-evm:~# ptp4l -E -4 -H -i eth0 -s -l 6 -q -m --domainNumber=0  
    ptp4l[59806.199]: selected /dev/ptp0 as PTP clock
    ptp4l[59806.206]: port 1 (eth0): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[59806.207]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[59806.207]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
    ptp4l[59807.379]: port 1 (eth0): new foreign master 3408e1.fffe.80a7ad-1
    ptp4l[59811.379]: selected best master clock 3408e1.fffe.80a7ad
    ptp4l[59811.379]: port 1 (eth0): LISTENING to UNCALIBRATED on RS_SLAVE
    ptp4l[59814.379]: master offset   -1763 s0 freq -18112 path delay    417
    ptp4l[59815.379]: master offset   -1793 s2 freq -18142 path delay    417
    ptp4l[59815.380]: port 1 (eth0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
    ptp4l[59816.379]: master offset   -1792 s2 freq -19934 path delay    414
    ptp4l[59817.380]: master offset     0 s2 freq -18680 path delay    412
    ptp4l[59818.380]: master offset    533 s2 freq -18147 path delay    412
    ptp4l[59819.380]: master offset    522 s2 freq -17998 path delay    413
    ptp4l[59820.380]: master offset    365 s2 freq -17998 path delay    414
    ptp4l[59821.380]: master offset    201 s2 freq -18053 path delay    417
    ptp4l[59822.380]: master offset     96 s2 freq -18098 path delay    417
    ptp4l[59823.380]: master offset     29 s2 freq -18136 path delay    417
    ptp4l[59824.380]: master offset     2 s2 freq -18154 path delay    415
    ptp4l[59825.380]: master offset    -13 s2 freq -18169 path delay    417
    ptp4l[59826.380]: master offset    -14 s2 freq -18173 path delay    417
    ptp4l[59827.380]: master offset    -10 s2 freq -18174 path delay    417
    ptp4l[59828.381]: master offset     -9 s2 freq -18176 path delay    417
    ptp4l[59829.381]: master offset     -4 s2 freq -18173 path delay    414
    ptp4l[59830.381]: master offset     -3 s2 freq -18174 path delay    414
    ptp4l[59831.381]: port 1 (eth0): received SYNC without timestamp
    ptp4l[59832.381]: master offset     7 s2 freq -18164 path delay    410
    ptp4l[59833.381]: master offset     7 s2 freq -18162 path delay    410
    ptp4l[59834.381]: master offset     4 s2 freq -18163 path delay    409
    ptp4l[59835.381]: master offset     2 s2 freq -18164 path delay    410
    ptp4l[59836.381]: master offset     0 s2 freq -18165 path delay    410
    ptp4l[59837.381]: master offset     -1 s2 freq -18166 path delay    410
    ptp4l[59838.381]: master offset     -7 s2 freq -18173 path delay    410
    ptp4l[59839.381]: master offset     3 s2 freq -18165 path delay    410
    ptp4l[59840.381]: master offset     1 s2 freq -18166 path delay    410
    ptp4l[59841.382]: master offset     4 s2 freq -18163 path delay    410
    ptp4l[59842.382]: master offset     3 s2 freq -18162 path delay    410
    ptp4l[59843.382]: master offset     9 s2 freq -18156 path delay    410
    ptp4l[59844.382]: master offset     7 s2 freq -18155 path delay    411
    ptp4l[59845.382]: master offset     9 s2 freq -18151 path delay    410
    ptp4l[59846.382]: master offset     7 s2 freq -18150 path delay    410
    ptp4l[59847.382]: master offset     8 s2 freq -18147 path delay    411
    ptp4l[59848.382]: master offset     3 s2 freq -18150 path delay    411
    ptp4l[59849.382]: master offset     3 s2 freq -18149 path delay    411
    ptp4l[59850.382]: master offset     0 s2 freq -18151 path delay    411
    ptp4l[59851.382]: master offset     2 s2 freq -18149 path delay    411
    ptp4l[59852.383]: master offset     2 s2 freq -18148 path delay    411
    ptp4l[59853.383]: master offset     4 s2 freq -18146 path delay    411
    ptp4l[59854.383]: master offset     3 s2 freq -18145 path delay    411
    ptp4l[59855.383]: master offset     -2 s2 freq -18149 path delay    411
    ptp4l[59856.383]: master offset     -3 s2 freq -18151 path delay    411
    ptp4l[59857.383]: master offset     -2 s2 freq -18151 path delay    411
    ptp4l[59858.383]: master offset     -1 s2 freq -18151 path delay    411
    ptp4l[59859.383]: master offset     3 s2 freq -18147 path delay    411
    ptp4l[59860.383]: port 1 (eth0): received SYNC without timestamp
    ptp4l[59861.383]: master offset     5 s2 freq -18144 path delay    410
    ptp4l[59862.383]: master offset     6 s2 freq -18141 path delay    410
    ptp4l[59863.383]: master offset     7 s2 freq -18139 path delay    410
    ptp4l[59864.384]: master offset     2 s2 freq -18142 path delay    410
    ptp4l[59865.384]: master offset     3 s2 freq -18140 path delay    410
    ptp4l[59866.384]: master offset     4 s2 freq -18138 path delay    410
    ptp4l[59867.384]: master offset     4 s2 freq -18137 path delay    410
    ptp4l[59868.384]: master offset     4 s2 freq -18136 path delay    410
    ptp4l[59869.384]: master offset     3 s2 freq -18135 path delay    410
    ptp4l[59870.384]: master offset     4 s2 freq -18134 path delay    410

    -Daolin

  • Small update for today:

    First, I am having trouble replicating my issue #2 from the original post. I think I may have been confused by a series of packets without timestamps and assumed that no packets were getting to the second PTP service with timestamps. I also originally saw changes in the number of missing timestamps on the first service when I started and stopped the second service. I no longer see that behavior.

    Second, with some additional instrumentation I am showing that when am65_cpts_find_rx_ts() does not return a timestamp the list &cpts->events is empty (or all events have timed out) as the comparison to skb_mtype_seqid is never performed.

  • Hi Matthew, 

    Thanks for the update. After talking to the developer, we wanted to figure out if this issue is also observed on the other AM6x devices or if it only shows up for AM62x. The point is to narrow down if the problem is really related to the software driver or if it is hardware related. Following this conversation, I tested on AM62Px as a PTP follower and noticed that no "without timestamps" messages show up even after about 3 minutes of runtime. I checked that the i2401 errata is also present for AM62Px; the workaround implemented in the CPSW CPTS drivers should also be the same as AM62x. 

    Second, with some additional instrumentation I am showing that when am65_cpts_find_rx_ts() does not return a timestamp the list &cpts->events is empty (or all events have timed out) as the comparison to skb_mtype_seqid is never performed.

    Are you specifically referencing this line regarding &cpts->events? https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/blame/drivers/net/ethernet/ti/am65-cpts.c?h=ti-linux-6.12.y&id=30b3fe0672f2a97f22d96a863f2a8f2ed6c52a54#n903 

    Thanks for pointing this out. The developer is under the impression that the issue occurs only on AM62x, which implies that it could be hardware specific. Let me find out what the next step is and get back to you tomorrow.

    -Daolin

  • Additional instrumenting produces some interesting results. Note that calls to the logging system will change timing and can impact system behavior.

    Aug 05 13:15:02.102953 r2d2 kernel: TS Q->L 0040d686, 4295050901
    Aug 05 13:15:02.115171 r2d2 ptp4l[1087]: [634.434] port 1 (eth0): received SYNC without timestamp
    Aug 05 13:15:02.168280 r2d2 kernel: TS missing 0040d686, 4295050901

    These lines show:

    1. am65_cpts_fifo_read() places a timestamp for mtype_seqid=0040d686 in event->list at jiffies=4295050901
    2. ptp4l reports a missing timestamp
    3. am65_cpts_get_rx_ts() does not find a timestamp for mtype_seqid=0040d686 in event->list at jiffies=4295050901

    This indicates one of:

    • am65_cpts_get_rx_ts() calls am65_cpts_fifo_read() and immediately does not find the item just added to the list
    • am65_cpts_get_rx_ts() does not find the item and a subsequent interrupt call to am65_cpts_fifo_read() adds it to the list

    The second seems less likely because both events have the same value for jiffies. I will investigate further to identify when the timestamp is removed from the list and to better identify the sequence of events.
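
    When reading these logs it can help to split the mtype_seqid key into its fields. The sketch below is a standalone userspace helper, not driver code. It ASSUMES the key packs the PTP sequence ID in bits 15:0, the PTP message type in bits 19:16 and the CPTS event type in bits 23:20; that layout is not confirmed anywhere in this thread and should be checked against am65-cpts.c. The names key_fields, decode_key(), msg_name() and print_key() are hypothetical.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* ASSUMPTION (not confirmed in this thread): the key packs
     * bits 15:0  = PTP sequenceId,
     * bits 19:16 = PTP messageType,
     * bits 23:20 = CPTS event type. */
    struct key_fields {
        unsigned int event_type;
        unsigned int msg_type;
        unsigned int seq_id;
    };

    static struct key_fields decode_key(uint32_t key)
    {
        struct key_fields f;

        f.seq_id = key & 0xffffu;
        f.msg_type = (key >> 16) & 0xfu;
        f.event_type = (key >> 20) & 0xfu;
        return f;
    }

    /* messageType names for the four PTP event messages (IEEE 1588-2008). */
    static const char *msg_name(unsigned int msg_type)
    {
        switch (msg_type) {
        case 0: return "SYNC";
        case 1: return "DELAY_REQ";
        case 2: return "PDELAY_REQ";
        case 3: return "PDELAY_RESP";
        default: return "other";
        }
    }

    /* Print one key in decoded form. */
    static void print_key(uint32_t key)
    {
        struct key_fields f = decode_key(key);

        printf("key=%08x event_type=%u msg=%s seq=0x%04x\n",
               (unsigned int)key, f.event_type, msg_name(f.msg_type), f.seq_id);
    }
    ```

    Under that assumed layout, 0040d686 decodes to message type 0 (SYNC) with sequence ID 0xd686, which is consistent with the "received SYNC without timestamp" line from ptp4l in the log above.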

  • Hi Matthew, 

    I will investigate further to identify when the timestamp is removed from the list and to identify sequence of events better.

    Appreciate the update on what you discovered. Understanding the sequence of events leading to the timestamp being removed would also help us understand whether this is really something wrong with the software implementation of the workaround or whether there may be some underlying hardware issue (given that this issue doesn't appear to show up on AM62Px, which uses the same CPSW driver and has the same i2401 errata).

    -Daolin

  • I think that I have found a resolution to the missing timestamps in the following patch:

    diff --git a/drivers/net/ethernet/ti/am65-cpts.c b/drivers/net/ethernet/ti/am65-cpts.c
    index 29e0b9ff23bf..cd78519676ab 100644
    --- a/drivers/net/ethernet/ti/am65-cpts.c
    +++ b/drivers/net/ethernet/ti/am65-cpts.c
    @@ -826,7 +826,6 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts)
     
     	spin_lock_irqsave(&cpts->lock, flags);
     	list_splice_init(&cpts->events, &events);
    -	spin_unlock_irqrestore(&cpts->lock, flags);
     
     	list_for_each_safe(this, next, &events) {
     		event = list_entry(this, struct am65_cpts_event, list);
    @@ -837,7 +836,6 @@ static void am65_cpts_find_ts(struct am65_cpts *cpts)
     		}
     	}
     
    -	spin_lock_irqsave(&cpts->lock, flags);
     	list_splice_tail(&events, &cpts->events);
     	list_splice_tail(&events_free, &cpts->pool);
     	spin_unlock_irqrestore(&cpts->lock, flags);
    

    The middle section which was previously not protected by a spinlock edits the list. When running concurrently with other functions this apparently interferes with list searching. This is the only use of the event list which was not protected by spinlock.
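
    One way to picture the window that the patch closes is the single-threaded replay below. It is a simplified userspace model, not the driver code: the list type and the names ts_list, splice_all(), lookup() and replay() are hypothetical. It assumes that the RX lookup takes the same lock and searches the shared event list, which matches the earlier observation that the list was empty whenever a timestamp was not found.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct ts_event {
        uint32_t key;
        struct ts_event *next;
    };

    struct ts_list {
        struct ts_event *head;
    };

    /* Move every node from src to dst, leaving src empty (the role the
     * list_splice calls play in the patch context). */
    static void splice_all(struct ts_list *src, struct ts_list *dst)
    {
        struct ts_event *last = src->head;

        if (!last)
            return;
        while (last->next)
            last = last->next;
        last->next = dst->head;
        dst->head = src->head;
        src->head = NULL;
    }

    /* Search a list for a key; models the RX timestamp lookup. */
    static bool lookup(const struct ts_list *list, uint32_t key)
    {
        const struct ts_event *e;

        for (e = list->head; e; e = e->next)
            if (e->key == key)
                return true;
        return false;
    }

    /* Replay one interleaving of the scanner and an RX lookup.
     * hold_lock_across_scan == false: the scanner drops the lock after
     * splicing, so the lookup can run while the shared list is empty.
     * hold_lock_across_scan == true: the lookup has to wait until the
     * scanner has spliced the events back.
     * Returns whether the lookup found the key. */
    static bool replay(bool hold_lock_across_scan, uint32_t key)
    {
        struct ts_event ev = { key, NULL };
        struct ts_list shared = { &ev };
        struct ts_list local = { NULL };
        bool found;

        splice_all(&shared, &local);        /* scanner takes the events */
        if (!hold_lock_across_scan) {
            found = lookup(&shared, key);   /* lookup runs inside the gap */
            splice_all(&local, &shared);    /* scanner puts the events back */
        } else {
            splice_all(&local, &shared);    /* scanner finishes first */
            found = lookup(&shared, key);   /* lookup runs afterwards */
        }
        return found;
    }
    ```

    In this model, replay(false, key) misses the event even though it was never deleted, while replay(true, key) finds it. The trade-off of the patch is that the lock is held, with interrupts disabled, for the whole scan.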

  • Hi Matthew, 

    Apologies for the delay in response.

    The middle section which was previously not protected by a spinlock edits the list. When running concurrently with other functions this apparently interferes with list searching. This is the only use of the event list which was not protected by spinlock.

    Thanks for sharing this solution. I tested your patch on several different PTP configurations using an AM62x as a PTP follower and I no longer see the "without timestamps" messages. I will discuss internally with the developer to see if this is a solution we can integrate as a fix. 

    -Daolin