AM6548: PRU Gigabit Ethernet Issue w/ SDK8.6

Part Number: AM6548

I have had issues with the PRU Ethernet driver at gigabit speeds since SDK8.0. Luckily, I haven't really needed to use it until now. I updated to SDK8.6, and it looks like there have been a lot of changes to the PRU Ethernet driver as well as, presumably, the PRU Ethernet firmware. I am attempting to use the autoforwarding feature described in TRM section 6.5.11.2.1.1. As far as I can tell, the new PRU firmware and driver work pretty well for network captures using tcpdump or my own custom capture program. For the sake of simplicity, though, I am just using tcpdump here since it ships with the SDK image.

 

My setup here is as follows:

 

On ICSSG2, I enable auto packet forwarding with the following register settings (address: value):

 

0x0B232000: 0x259

0x0B232010: 0x000B1201

0x0B232004: 0x251

0x0B232014: 0x000B1301

0x0B233000: 0x1042D
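
The writes above can be scripted. The sketch below is only an illustration: it assumes the `devmem2` utility mentioned later in this thread and its usual `devmem2 <address> w <value>` form for 32-bit writes, and it copies the addresses and values from the list verbatim, in the listed order. The names `ICSSG2_WRITES` and `devmem2_commands` are made up for this example.

```python
# Register writes copied from the list above, in the order given.
ICSSG2_WRITES = [
    (0x0B232000, 0x259),
    (0x0B232010, 0x000B1201),
    (0x0B232004, 0x251),
    (0x0B232014, 0x000B1301),
    (0x0B233000, 0x1042D),
]


def devmem2_commands(writes):
    """Return one 32-bit devmem2 write command per (address, value) pair."""
    return [f"devmem2 0x{addr:08X} w 0x{value:08X}" for addr, value in writes]


# Print the commands so they can be reviewed before running them on the board.
for command in devmem2_commands(ICSSG2_WRITES):
    print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run on a development PC; on the board the output could be pasted into a shell.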

 

I leave ICSSG0 in the default configuration.

 

I then start two tcpdump instances on my AM65x board: one sniffing ICSSG0_PRU0, and one sniffing one of the autoforwarded connections, ICSSG2_PRU0. I get identical results between the two, except that the second byte of the destination IP address occasionally gets corrupted when capturing on the autoforward side. Both when it happens and what gets written to that byte seem random. It's as if a pointer isn't handled properly and occasionally overwrites that byte. I'm unsure why that would happen only with autoforwarding enabled, though. The rest of the packet data looks identical between the two captures, though I've only done random spot checks at this point. I can provide the two capture files if that would be helpful.

 

Any ideas of where I could look to solve this issue? Or any ideas what may be causing this?

  • Hi Keith,

    I am trying to reproduce the issue at my end.

    Will get back to you soon.

    Thanks and Regards,

    Rimika

  • Hi, any update on this? My steps to recreate were to build the SDK with default options, except that I add all three ICSSG instances to the am654 base board device tree file. I compile this, then use devmem2 to set the register values I specified. I then use tcpdump to capture on both interfaces from two separate SSH sessions. The tap I am using is a Dualcomm ETAP-XG with their 10/100/1000 SFP modules installed (4x DPM-1G-TA-01).

  • Hi Keith,

    Thanks for these inputs, I am working on it.

    Thanks and Regards,

    Rimika

  • Hi, any updates on this? Let me know if you need me to provide any more detail.

  • Hi Keith, 

    I would like to apologize for the late response; the dev team is currently occupied.

    Will try to resolve as soon as possible.

    Thanks and Regards,

    Rimika

  • Hi Keith,

    Can you share the two pcap files (sent and received)?

    Thanks,

    Rimika

  • Hi Rimika,

    Thanks for the response. I'm not too sure how to share the pcap files. They're both 6.4MB, and I'm assuming that's over the attachment limit for this forum. I tried to upload them but it doesn't seem to work. Do you have an email or somewhere I can send them to? 

  • Hi Keith,

    Can you send me just 3-4 packet instances? I'll replicate those. Alternatively, you can share the files at r-gupta2@ti.com.

    Thanks and Regards,

    Rimika

  • Hi Keith,

    I have generated the temporary pcap files.

    Can you explain what you mean by "the second byte of the destination IP address when capturing on the autoforward side occasionally gets screwed up"? If the destination address gets altered, you might not receive the data at the expected end, so how are you able to capture the data at the receiving end?

    Thanks and Regards,

    Rimika

  • Hi Rimika

    I just sent you my two pcap files. Let me know if there are any issues receiving them, and I can upload small snippets here instead.

    So what I mean by the second byte of the destination IP getting altered is that the captured packet gets corrupted when auto forwarding is enabled. As you can see from my pcaps, the captures are identical, except that when the packets are captured using the auto forwarded interface, the second byte of the destination IP address occasionally gets overwritten. In my captures, all traffic should have been from 10.10.0.3 to 10.10.0.101, but occasionally we find garbage written into the second byte of the destination. So packet 130 is 10.4.0.101, packet 172 is 10.113.0.101, etc. It seems that something in the PRU firmware or PRU Ethernet driver is occasionally writing over this byte accidentally.
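
    To make the location of the damaged byte concrete, here is a minimal sketch that compares two frames byte by byte. It assumes untagged Ethernet II frames carrying IPv4 (14-byte Ethernet header, destination address at offset 16 of the IP header). The helper names `build_frame` and `differing_offsets` are made up for this example, and the frames are synthetic header-only stand-ins, not data from the actual captures.

```python
import struct


def build_frame(dst_ip):
    """Build a header-only untagged Ethernet + IPv4 frame (MACs and checksum left at 0)."""
    eth = bytes(6) + bytes(6) + b"\x08\x00"  # dst MAC, src MAC, EtherType 0x0800 (IPv4)
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, 6, 0,         # version/IHL, TOS, length, id, frag, TTL, TCP, checksum
        bytes([10, 10, 0, 3]),               # source 10.10.0.3
        bytes(dst_ip),                       # destination
    )
    return eth + ip


def differing_offsets(reference, observed):
    """Return the 0-based offsets at which two equal-length frames differ."""
    if len(reference) != len(observed):
        raise ValueError("frames differ in length")
    return [i for i, (a, b) in enumerate(zip(reference, observed)) if a != b]


tap_frame = build_frame([10, 10, 0, 101])     # destination as expected on the wire
snooped_frame = build_frame([10, 4, 0, 101])  # destination as reported for packet 130

offsets = differing_offsets(tap_frame, snooped_frame)
print(offsets)  # [31]
```

    With these assumptions the only differing offset is 31 (counting from 0), which is the 32nd byte of the frame and the second octet of the destination IP address.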

  • Hi Keith,

    I have received the files and as you have pointed out I am able to observe the altered IP address you mentioned.

    Working on it, will get back to you.

    Thanks and Regards,

    Rimika

  • Hi Keith,

    Can you try using switch mode instead of writing to address locations directly?

    If the address is getting altered and you are still receiving the packet at the intended address, that suggests the address is getting altered while decrypting metadata, i.e. at the receiving end.

    Thanks and Regards,

    Rimika

  • Hi Rimika,

    So if you look at my test setup from the first post, my AM6548 board is not the recipient of the data at all. It's just sniffing the data both with a tap and through one of the interfaces that is using auto packet forwarding. The setup is highlighting that when packet auto forwarding is enabled, the PRU eth driver (I suspect at least) is not handling the traffic correctly. Both captures are from the AM6548 board, so clearly it is capable of capturing at gigabit speeds as shown with the testTap.pcap file. I can try to change the interface that is using autoforwarding to use switch mode though if you think that might be useful. How might I do that?

  • Hi Keith,

    Try using AM65 in switch mode for packet forwarding once.

    PC1 -> AM65(icssg_port0) [switch mode] AM65(icssg_port1) -> PC2

    Thanks and Regards,

    Rimika

  • Okay, I can try that as a test, though it is not ideally what I want to do, as I want the minimal delay that autoforwarding offers. How do I set the ICSSG instance into switch mode? Is there a register setting or command I need to send?

  • Hi Keith,

    These are the commands you need to run on board to enable switch mode:

    Assuming eth1 and eth2 as ICSSG0 interfaces

    ip link set dev eth1 down
    ip link set dev eth2 down
    devlink dev param set platform/icssg0-eth name switch_mode value 1 cmode runtime
    ip link add name br0 type bridge
    ip link set dev eth1 master br0
    ip link set dev eth2 master br0
    ip link set dev br0 up
    ip link set dev eth1 up
    ip link set dev eth2 up
    bridge vlan add dev br0 vid 1 pvid untagged self

    Thanks and Regards,

    Rimika

  • Hi Rimika,

    So when I enable switch mode as you described above, the bridge forwarding works and I can log into the remote PC. However, it seems I cannot capture packets on the AM6548 board. I tried capturing on eth1, eth2 and br0, and none of them capture packets. It would seem that in switch mode these packets aren't presented to the kernel?

  • Hi Keith,

    Can you share the flow you are trying to run?

    Thanks and Regards,

    Rimika

  • So what I did was boot into Linux, wait a minute or so, and then execute all the commands you mentioned in your post. I then started tcpdump with the arguments -i eth1 -w /run/media/nvme0n1/test123.pcap. I tried this with eth2 as well as br0, and it collected nothing. I verified that I could go back and enable auto forwarding with the PRU through the register settings (same settings as in my first post), and it collects fine with that, with the exception that some of the packets are corrupted as mentioned. My test setup is the same as in my first post, except that the network tap wasn't connected.

  • Hi Keith,

    When you are using the board in switch mode, the board acts as an intermediary through which you are forwarding your data, so you won't be able to capture at this point. You need to capture the data at the destination.

    As per my understanding, you want to auto forward your data to another destination with the AM65 acting as the intermediary, so no data capture should be required at this point.

    I may not have interpreted your query correctly. 

    Thanks and Regards,

    Rimika

  • Ah yeah, there is a misunderstanding about what my goal is here. I want to auto forward the data to another destination while simultaneously capturing it on the AM6548 end. The auto forward feature with optional PRU snoop described in TRM section 6.5.11.2.1.1 seems to accomplish this. Using the register settings from my initial post, every packet is auto forwarded without error. And again, with the test setup described in the first post, I show that it does indeed auto forward packets correctly, and I can capture them all correctly when using a network tap connected to the AM6548 board.

    My issue is that I do not want to use the external network tap to sniff the network; I want to use the PRUs that are auto forwarding the packets. This is where I see the corruption in the sniffed packet data. Again, the actual packet data on the network is fine; it is only the data that I sniff that seems to be read incorrectly. Interestingly, if I force both PRUs to 100 Mb/s using ethtool, with the same packet forwarding register settings, I see no corruption in the captured data. All captured data looks good. So this only happens at 1 Gb/s.
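
    Checking a whole capture rather than spot-checking packets can be sketched with the standard library alone. This is a rough illustration, not a tool used in this thread: it assumes a classic (non-pcapng) pcap file with microsecond timestamps, untagged IPv4 frames, and a test in which all IPv4 traffic should go to a single destination. The names `read_frames`, `unexpected_destinations` and `demo_pcap` are hypothetical, and the demo data is synthetic.

```python
import io
import struct


def read_frames(stream):
    """Yield raw frames from a classic (non-pcapng) pcap stream."""
    magic = stream.read(24)[:4]              # 24-byte global header, magic first
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"                         # written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    while True:
        record = stream.read(16)             # per-packet record header
        if len(record) < 16:
            return
        _sec, _usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        yield stream.read(incl_len)


def unexpected_destinations(stream, expected_dst):
    """Return (packet number, destination IP) for IPv4 frames not sent to expected_dst."""
    expected = bytes(int(part) for part in expected_dst.split("."))
    hits = []
    for number, frame in enumerate(read_frames(stream), start=1):
        # Untagged IPv4 only: EtherType 0x0800 at offsets 12-13, destination IP at 30-33.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        dst = frame[30:34]
        if dst != expected:
            hits.append((number, ".".join(str(octet) for octet in dst)))
    return hits


def demo_pcap(destinations):
    """Build an in-memory pcap with one header-only IPv4 frame per destination."""
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    for dst in destinations:
        frame = bytes(12) + b"\x08\x00" + bytes(16) + bytes(dst)
        out += struct.pack("<IIII", 0, 0, len(frame), len(frame)) + frame
    return io.BytesIO(out)


capture = demo_pcap([[10, 10, 0, 101], [10, 4, 0, 101], [10, 10, 0, 101]])
print(unexpected_destinations(capture, "10.10.0.101"))  # [(2, '10.4.0.101')]
```

    For a real capture, the file would be opened in binary mode and passed in place of `demo_pcap(...)`. Running the same check on the 100 Mb/s and 1 Gb/s captures would give a count of affected packets at each speed.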

  • Hi Keith,

    This is our target?

    Thanks and Regards,

    Rimika

  • Hi Rimika,

    Yes, that is the target there! And specifically at 1 Gb/s speeds.

  • Hi Keith,

    This feature is not supported.

    Refer to this E2E thread: "AM6548: Lack of Auto-Forwarding and Preamble Packet Cut at >1G Speed" (Processors - INTERNAL forum, TI E2E support forums).

    I'll confirm it once and get back to you.

    Thanks and Regards,

    Rimika

  • Hi Rimika,

    Is that an internal link? It won't open for me. Can you clarify what part of the feature isn't supported? The autoforwarding or the capturing? 

  • Hi Keith,

    Sorry, it is an internal link.

    Autoforwarding feature is not supported.

    Thanks and Regards,

    Rimika

  • So is autoforwarding not supported by the hardware, by the Linux driver, or by the PRU firmware?

  • Hello Keith,

    Autoforwarding is a hardware feature that exists. However, it would need to be enabled in the PRU Ethernet firmware. At this point in time there are no plans to add that feature into the PRU Ethernet firmware, and then write additional Linux driver code to be able to take advantage of the autoforwarding.

    Apologies for the confusion here. These parts have a ton of hardware features, and our software teams do not have the resources to implement every single feature. It looks like autoforwarding was actually used in some other projects (SORTE_G, which is a kind of custom networking protocol), but not in the Ethernet firmware. We did take a look for you, but it would just take too many manhours to justify adding in.

    Regards,

    Nick

  • Hi Nick,

    Thank you for the helpful response! Just to clarify for my own understanding: autoforwarding seems to work perfectly fine at 100 Mb/s and plays fine with the PRU firmware and Linux drivers. So there is just something missing or not implemented in the PRU firmware itself for it to work reliably at 1 Gb/s? It seems to be 95% working, with just the occasional corrupted byte. If it's in the PRU firmware, I guess there's not much I can do, since that is closed source? If it's something in the Linux driver itself, I can at least play with that.

    Thanks,

    Keith

  • Hello Keith,

    We are starting to get out of my depth from a technical standpoint (I'm on the apps team with Rimika, not on the development side). Rimika and I won't be able to help with specific software tweaks here. However, I'll copy paste some of the discussion from that e2e thread Rimika linked above in case you are able to do something with it:

    "

    The BU cannot support this request [for auto-forwarding in PRU Ethernet], as this is not part of the standard product, and we are unable to provide deeper expertise on modifying the driver to add the requested features. We did connect them to deeper ICSSG experts in the SEM (TI systems) team.

    Capturing their response for the benefit of any future such queries:

    -----

    We added auto-forward mode with SORTE_G protocol. I believe that the issue that your customer might face is that the TX FIFO is overrun at 1G speed.

    There is a programmable forward delay in TXCFG registers called TX_START_DELAY, that will default to the 100Mbit case at device reset. I think it must be shortened for the Gbit case.

    For SORTE_G, we use the following configuration in RXCFG and TXCFG (this is a macro). This will forward the Ethernet traffic in both directions. Note also that TXL2 must be disabled (in case they did enable it).

    The customer can try to use the same register setting for RXCFG and TXCFG, and then the Ethernet traffic should be forwarded w/o PRU firmware interaction. This was tested on AM65x IDK.

     

    M_SET_MIIRT_AF .macro
        ldi     TEMP_REG_1.w0, 0x1f17
        ; 1-byte stores: byte 0 of TEMP_REG_1 goes to RXCFG0, byte 1 to RXCFG1
        sbco    &TEMP_REG_1.b0, ICSS_MII_RT_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_CFG_RXCFG0, 1
        sbco    &TEMP_REG_1.b1, ICSS_MII_RT_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_CFG_RXCFG1, 1
        ldi32   TEMP_REG_1, 0x00330303
        sbco    &TEMP_REG_1, ICSS_MII_RT_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_CFG_TXCFG0, 4
        ; change only byte 1 of TEMP_REG_1 before the TXCFG1 store
        ldi     TEMP_REG_1.b1, 2
        sbco    &TEMP_REG_1, ICSS_MII_RT_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_CFG_TXCFG1, 4
        .endm

    M_DISABLE_TXL2 .macro
        ; read-modify-write of ICSS_G_CFG: clear bit 1
        lbco    &TEMP_REG_1, ICSS_MII_G_RT_CFG_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_G_CFG_REGS_G_ICSS_G_CFG, 4
        clr     TEMP_REG_1, TEMP_REG_1, 1   ; TX_L2_EN
        sbco    &TEMP_REG_1, ICSS_MII_G_RT_CFG_CONST, CSL_ICSS_G_PR1_MII_RT_PR1_MII_RT_G_CFG_REGS_G_ICSS_G_CFG, 4
        .endm

    "

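    The values that the quoted M_SET_MIIRT_AF macro ends up writing can be worked out by following its byte-lane operations. The sketch below does that arithmetic in Python. It assumes only that `.b0` is the least significant byte of a PRU register and `.b1` the next one; the short register names in the result are abbreviations of the CSL offsets in the macro, and the function names are made up for this example. It says nothing about what the individual bits mean.

```python
def get_byte(word, lane):
    """Return byte `lane` (0 = least significant) of a 32-bit word."""
    return (word >> (8 * lane)) & 0xFF


def set_byte(word, lane, value):
    """Return a 32-bit word with byte `lane` (0 = least significant) replaced by `value`."""
    shift = 8 * lane
    return (word & ~(0xFF << shift) & 0xFFFFFFFF) | ((value & 0xFF) << shift)


def miirt_af_values():
    """Follow M_SET_MIIRT_AF step by step and return the value stored to each register."""
    temp = 0x1F17                    # ldi   TEMP_REG_1.w0, 0x1f17
    rxcfg0 = get_byte(temp, 0)       # sbco  &TEMP_REG_1.b0 -> RXCFG0, 1 byte
    rxcfg1 = get_byte(temp, 1)       # sbco  &TEMP_REG_1.b1 -> RXCFG1, 1 byte
    temp = 0x00330303                # ldi32 TEMP_REG_1, 0x00330303
    txcfg0 = temp                    # sbco  &TEMP_REG_1 -> TXCFG0, 4 bytes
    temp = set_byte(temp, 1, 2)      # ldi   TEMP_REG_1.b1, 2
    txcfg1 = temp                    # sbco  &TEMP_REG_1 -> TXCFG1, 4 bytes
    return {"RXCFG0": rxcfg0, "RXCFG1": rxcfg1, "TXCFG0": txcfg0, "TXCFG1": txcfg1}


for name, value in miirt_af_values().items():
    print(f"{name} = 0x{value:X}")
```

    Note that the two RXCFG stores are one byte wide, so the macro only writes the low byte of each RXCFG register, while both TXCFG registers are written as full 32-bit words that differ only in byte 1.
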
  • Hi Nick and Rimika,

    Sorry for the slow response, finally got more time to play around with this. So I have seen their register settings before for SORTE_G, and indeed without the forwarding delay TX_START_DELAY, it would seem the FIFO gets overrun at gigabit speeds. With the appropriate delay though it would seem I can get consistent data as I outlined in my first post. However the problem with the corrupted 32nd byte persists no matter the delays or register settings (second byte of destination IP in a TCP packet).

    Playing with the register settings: if I start with the initial register values that the PRU firmware and driver set, what causes the corrupted byte is the PRE_TX_AUTO_SEQUENCE bit in the TXCFG register. Leaving everything else as it is and enabling only this bit, I can recreate the 32nd byte corruption. Based on this, I suspect that something within the PRU firmware itself doesn't like this behavior, and that the issue is not within the Linux driver. This tells me I need to start with my own PRU firmware. I am assuming the source code for the PRU Ethernet firmware is something that cannot be shared with me as a starting point?

    Thanks,

    Keith

  • Hi Keith,

    It is not recommended to share TI's firmware code.

    Thanks and Regards,

    Rimika