AM335x - CPSW - VRRP

Other Parts Discussed in Thread: AM3352

Hello,

I'm testing a router based on the AM3352 running Linux (kernel 3.12.10). CPSW is set to dual MAC mode and both Ethernet ports work fine except for VRRP. In my experience, some L2 switches have problems with VRRP: when they see the same source MAC address on two different ports within a short period of time, they block one of the ports in some way. It seems to me that the same thing happens with the AM3352. So, is the AM335x family compliant with VRRP? Did you test it? If you claim that everything should be OK with VRRP, then I'll go into the details of the issue.

Thanks,

Michal

  • Hi Michal,

    I will forward this to the factory team.

  • Hello Biser,

    did the factory team find anything?

    Best regards

    Michal

  • No feedback so far. Reply will be posted on this thread when it comes.

  • There's still no feedback from you so I'm going to try to describe the core of the issue as simply as possible.

    3 devices connected to external L2 switch:
    - Router 1 (r1): AM3352, Linux (kernel 3.12.10), CPSW in dual mac mode, VLAN IDs not assigned manually (vconfig) to eth0 nor eth1, VRRP priority=255 (main).
    - Router 2 (r2), Linux, VRRP priority=100 (backup)
    - NTB Lenovo.

    r1 in VRRP master mode transmits a VRRP advertisement periodically (every 1 sec) from eth0 (from CPSW port 0 to port 1):
    Src MAC: 00:00:5e:00:01:01 (ucast)
    Dst MAC: 01:00:5e:00:00:12 (mcast)

    r2 silent.

    NTB pinging r1 constantly (echo req):
    Src MAC: 28:d2:44:22:fb:e6 (ucast)
    Dst MAC: 00:00:5e:00:01:01 (ucast)
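
    For reference, the two VRRP addresses above follow from the VRRP specification: the virtual router MAC is 00:00:5e:00:01:{VRID}, and the advertisement group 224.0.0.18 maps to an Ethernet multicast address by the usual IPv4 rule (01:00:5e plus the low 23 bits of the group). The sketch below is only an illustration; the function names are hypothetical and VRID 1 is an assumption inferred from the MAC address.

```python
import ipaddress


def vrrp_virtual_mac(vrid: int) -> str:
    """IPv4 VRRP virtual router MAC: 00:00:5e:00:01:{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    return "00:00:5e:00:01:%02x" % vrid


def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError("not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF  # only the low 23 bits are copied
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join("%02x" % o for o in octets)


# VRID 1 is assumed here; 224.0.0.18 is the VRRP advertisement group.
print(vrrp_virtual_mac(1))
print(ipv4_multicast_mac("224.0.0.18"))
```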

    Now r1 is rebooted. Once it has finished booting and has automatically switched to VRRP master mode, the lookup table (ALE) of r1 looks like this (my debug output to the kernel log):

    [   20.053265] --> cpsw_ale_match_addr: ale=0xdf2f1a00 ale_entries=1024
    [   20.053277] --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000008
    [   20.053288] --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    [   20.053300] --> idx=0: 0x00000000 0x20010000 0x03030003
    [   20.053312] --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    [   20.053324] --> idx=2: 0x0000000c 0x30010100 0x5e000001
    [   20.053336] --> idx=3: 0x00000000 0x20000000 0x07000007
    [   20.053347] --> idx=4: 0x0000000c 0x30010100 0x5e000012
    [   20.053359] --> idx=5: 0x00000000 0x20020000 0x05050005
    [   20.053371] --> idx=6: 0x00000014 0x3002ffff 0xffffffff
    [   20.053382] --> idx=7: 0x00000000 0x3002000a 0x1482dff9
    [   20.053393] --> idx=8: 0x0000000c 0x3001000a 0x1482dff8
    [   20.053412] --> idx=16: 0x00000004 0xf00128d2 0x4422fbe6
    [   20.053438] --> idx=32: 0x00000000 0xf000000a 0x1482dff8
    [   20.053483] --> idx=69: 0x00000000 0x30010000 0x5e000101

    The important line is the one at idx=69. This entry was inserted manually at the time VRRP master mode was set. The address is assigned to port 0 (CPPI) of CPSW, which is correct. r1 replies to echo requests from the NTB and transmits VRRP advertisements.

    Occasionally, there can be a VRRP advertisement from r2. That is the nature of VRRP and we must count on it. The src and dst MAC addresses are the same as described above for the VRRP advertisement from r1. So there are frames with the same src and dst MAC addresses in both directions (port 0 -> port 1 of CPSW and vice versa). CPSW resolves the situation this way:

    [  107.377134] --> cpsw_ale_match_addr: ale=0xdf2f1a00 ale_entries=1024
    [  107.377148] --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000002
    [  107.377159] --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    [  107.377172] --> idx=0: 0x00000000 0x20010000 0x03030003
    [  107.377184] --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    [  107.377196] --> idx=2: 0x0000000c 0x30010100 0x5e000001
    [  107.377208] --> idx=3: 0x00000000 0x20000000 0x07000007
    [  107.377220] --> idx=4: 0x0000000c 0x00010100 0x5e000012
    [  107.377231] --> idx=5: 0x00000000 0x20020000 0x05050005
    [  107.377243] --> idx=6: 0x00000014 0x3002ffff 0xffffffff
    [  107.377254] --> idx=7: 0x00000000 0x3002000a 0x1482dff9
    [  107.377266] --> idx=8: 0x0000000c 0x3001000a 0x1482dff8
    [  107.377278] --> idx=9: 0x00000000 0xf0000000 0x5e000101
    [  107.377296] --> idx=16: 0x00000004 0x700128d2 0x4422fbe6
    [  107.377321] --> idx=32: 0x00000000 0x4000000a 0x1482dff8
    [  107.377333] --> idx=33: 0x00000004 0x4001000a 0x14823570
    [  107.377377] --> idx=69: 0x00000004 0x30010000 0x5e000101

    The record at idx=69 changed. Now the unicast address 00:00:5e:00:01:01 is assigned to port 1 of CPSW (bits 67:66) instead of port 0 (previous state). This is caused by the VRRP advertisement (in the "opposite" direction) from r2. There is also a new record at idx=9. It was added due to the periodic VRRP advertisement from r1, which is still transmitted regardless of the single VRRP advertisement frame from r2. The important record is the one at idx=69, which now blocks the echo requests (ping) from the NTB. If I put eth0 into promiscuous mode (ALE_BYPASS is set), then the echo requests are successfully received. This proves that the problem is in the ALE.
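
    A minimal sketch of how the dumped words can be decoded. It assumes the debug output prints each entry with the most significant word first, and that the fields sit where the AM335x ALE table layout is usually described: MAC in bits 47:0, VLAN ID in bits 59:48, entry type in bits 61:60, unicast type in bits 63:62 and port number in bits 67:66. The function name is hypothetical and the positions should be checked against the TRM.

```python
def decode_ale_entry(word2: int, word1: int, word0: int) -> dict:
    """Decode one dumped unicast ALE entry (words given MSW first)."""
    raw = (word2 << 64) | (word1 << 32) | word0
    mac = raw & 0xFFFFFFFFFFFF  # bits 47:0
    return {
        "mac": ":".join("%02x" % ((mac >> s) & 0xFF)
                        for s in range(40, -8, -8)),
        "vlan_id": (raw >> 48) & 0xFFF,     # bits 59:48 (assumed)
        "entry_type": (raw >> 60) & 0x3,    # bits 61:60, 3 = VLAN/address
        "unicast_type": (raw >> 62) & 0x3,  # bits 63:62 (assumed), 3 = ageable, touched
        "port": (raw >> 66) & 0x3,          # bits 67:66
    }


# idx=69 before and after the advertisement from r2, and the learned idx=9.
before = decode_ale_entry(0x00000000, 0x30010000, 0x5E000101)
after = decode_ale_entry(0x00000004, 0x30010000, 0x5E000101)
learned = decode_ale_entry(0x00000000, 0xF0000000, 0x5E000101)

for name, entry in (("idx=69 before", before),
                    ("idx=69 after", after),
                    ("idx=9 learned", learned)):
    print(name, entry)
```

    Under these assumptions, idx=69 keeps VLAN ID 1 and only its port field changes, while the entry at idx=9 carries the same MAC address with VLAN ID 0 on port 0.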

    Question 1:
    Can I do something to keep the unicast address 00:00:5e:00:01:01 assigned to port 0 of CPSW? I'd expect that if a VRRP advertisement from port 1 to port 0 of CPSW (the frame from r2) causes the port assignment for unicast 00:00:5e:00:01:01 to change (0 -> 1), then a VRRP advertisement from port 0 to port 1 should do the same thing and restore the original port assignment. But the ALE evidently doesn't behave like that. Why?

    I've been thinking about the NO_LEARN bit in the PORTCTL1 register, but it is probably not a general solution. I suppose I need port 1 to learn continuously.

    Back to the second table. There you can see a record at idx=3. This is our workaround for a problem I'll describe below. If this record is not manually added to the table, then the ALE does not add the record at idx=9. The result looks like this:

    [   77.375914] --> cpsw_ale_match_addr: ale=0xdf2f1a00 ale_entries=1024
    [   77.375927] --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000003
    [   77.375938] --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    [   77.375950] --> idx=0: 0x00000000 0x20010000 0x03030003
    [   77.375962] --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    [   77.375973] --> idx=2: 0x0000000c 0x30010100 0x5e000001
    [   77.375985] --> idx=3: 0x0000000c 0x30010100 0x5e000012
    [   77.375997] --> idx=4: 0x00000000 0x20020000 0x05050005
    [   77.376009] --> idx=5: 0x00000014 0x3002ffff 0xffffffff
    [   77.376021] --> idx=6: 0x00000000 0x3002000a 0x1482dff9
    [   77.376033] --> idx=7: 0x0000000c 0x3001000a 0x1482dff8
    [   77.376052] --> idx=16: 0x00000004 0xf00128d2 0x4422fbe6
    [   77.376093] --> idx=48: 0x00000004 0x4001000a 0x14823570
    [   77.376122] --> idx=68: 0x00000004 0x30010000 0x5e000101

    The purpose of the workaround mentioned above is to remove VLAN tags from frames on egress (ports 1 and 2). I wonder why this is not accomplished by these two lines:

    [   77.375950] --> idx=0: 0x00000000 0x20010000 0x03030003
    [   77.375997] --> idx=4: 0x00000000 0x20020000 0x05050005

    which you can see in all tables above and which are automatically added by linux drivers in dual mac mode.
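
    As an illustration, the VLAN entries can be decoded as follows. This is a sketch that assumes the VLAN entry layout usually given for the AM335x ALE (member list in bits 2:0, unregistered multicast flood mask in bits 10:8, registered multicast flood mask in bits 18:16, force untagged egress in bits 26:24, VLAN ID in bits 59:48); the helper names are hypothetical.

```python
def ports(mask: int) -> list:
    """Expand a 3-bit port mask into a list of port numbers."""
    return [p for p in range(3) if mask & (1 << p)]


def decode_vlan_entry(word2: int, word1: int, word0: int) -> dict:
    """Decode one dumped ALE VLAN entry (words given MSW first)."""
    raw = (word2 << 64) | (word1 << 32) | word0
    if (raw >> 60) & 0x3 != 0b10:
        raise ValueError("not a VLAN entry")
    return {
        "vlan_id": (raw >> 48) & 0xFFF,
        "members": ports(raw & 0x7),
        "unreg_mcast_flood": ports((raw >> 8) & 0x7),
        "reg_mcast_flood": ports((raw >> 16) & 0x7),
        "force_untagged_egress": ports((raw >> 24) & 0x7),
    }


driver_vlan1 = decode_vlan_entry(0x00000000, 0x20010000, 0x03030003)
driver_vlan2 = decode_vlan_entry(0x00000000, 0x20020000, 0x05050005)
workaround = decode_vlan_entry(0x00000000, 0x20000000, 0x07000007)

for entry in (driver_vlan1, driver_vlan2, workaround):
    print(entry)
```

    Read this way, the two driver entries force untagged egress only for VLAN IDs 1 and 2, whereas the workaround entry does so for VLAN ID 0 on all three ports.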

    Question 2:
    Should those two lines do what I expect?

    Best regards

    Michal

  • Hi Michal,

    Sorry for this delay. I have asked the factory team again, notifying them about the new information you posted.

  • Hi Michal,

    Some feedback from the Linux SDK team:

    - Currently we do not test VRRP as part of the TI SDK software. It is a higher-level protocol managed by a user-level application, and there does not seem to be a kernel config option for it.

    What seems to matter for this protocol to work is whether the kernel network calls it makes set up the ALE correctly. The ALE processes every packet seen; if a packet does not match a filter in the ALE, it is dropped.

    - Regarding the ALE, note that some patches (PATCH: drivers: net: cpsw: fix multicast flush in dual emac mode) will be released publicly soon for TI SDK 7.0.
    When they are available, it might make sense to test them on your system, since they relate to the ALE when dual EMAC is used.


    A.
  • Hello,

    I tested the patch you recommended (fix multicast flush in dual emac mode) and the result is negative. There is no change in the behaviour of CPSW regarding my problem.

    Now what to do? Nobody answered the questions from my previous post. It seems to me that mentioning VRRP is a little confusing for you, so I'll try to reformulate the questions while avoiding any complicating details.

    So I have AM3352, running Linux (kernel 3.12.10), CPSW in dual mac mode, VLAN IDs not assigned manually (vconfig) to eth0 nor eth1.

    Unicast MAC address 00:00:5e:00:01:01 (let's call this MAC1) is assigned to eth0 (it means to port 0 (CPPI) of CPSW). This assignment creates this item in ALE:

    [   20.053483] --> idx=69: 0x00000000 0x30010000 0x5e000101

    Here is the whole contents of ALE (+ some important registers):

    [   20.053265] --> cpsw_ale_match_addr: ale=0xdf2f1a00 ale_entries=1024
    [   20.053277] --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000008
    [   20.053288] --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    [   20.053300] --> idx=0: 0x00000000 0x20010000 0x03030003
    [   20.053312] --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    [   20.053324] --> idx=2: 0x0000000c 0x30010100 0x5e000001
    [   20.053336] --> idx=3: 0x00000000 0x20000000 0x07000007
    [   20.053347] --> idx=4: 0x0000000c 0x30010100 0x5e000012
    [   20.053359] --> idx=5: 0x00000000 0x20020000 0x05050005
    [   20.053371] --> idx=6: 0x00000014 0x3002ffff 0xffffffff
    [   20.053382] --> idx=7: 0x00000000 0x3002000a 0x1482dff9
    [   20.053393] --> idx=8: 0x0000000c 0x3001000a 0x1482dff8
    [   20.053412] --> idx=16: 0x00000004 0xf00128d2 0x4422fbe6
    [   20.053438] --> idx=32: 0x00000000 0xf000000a 0x1482dff8
    [   20.053483] --> idx=69: 0x00000000 0x30010000 0x5e000101

    A multicast packet is transmitted from eth0 (i.e. port 0 -> port 1 of CPSW) periodically (every 1 sec). The destination MAC address is 01:00:5e:00:00:12 (mcast). eth0 can be pinged at MAC1. Everything works fine.

    Occasionally, some other source (another device) sends a single multicast packet whose source and destination MAC addresses exactly match those of the periodically transmitted packet described above (source MAC is MAC1 and destination MAC is 01:00:5e:00:00:12). CPSW accepts this packet because we are registered to the appropriate multicast group:

    [   20.053347] --> idx=4: 0x0000000c 0x30010100 0x5e000012

    CPSW solves the conflict of the same source MAC address (MAC1) from two sources (port0/port1) by changing appropriate item in ALE slightly from this state:

    [   20.053483] --> idx=69: 0x00000000 0x30010000 0x5e000101

    to this state:

    [  107.377377] --> idx=69: 0x00000004 0x30010000 0x5e000101

    Now MAC1 is assigned to port 1 of CPSW (see bits 67:66) instead of port 0 (previous state). This is an acceptable reaction of CPSW to the situation (identical packets in opposite directions). But our AM3352 keeps transmitting the periodic multicast packets mentioned above, so I'd expect the assignment of MAC1 to be reverted to port 0 of CPSW (the initial state). That never happens, and eth0 can no longer be reached at MAC1 because CPSW "thinks" that MAC1 is somewhere behind port 1 and filters out all packets coming from port 1 with the destination MAC address set to MAC1. Here is the final "blocking" state of the ALE:

    [  107.377134] --> cpsw_ale_match_addr: ale=0xdf2f1a00 ale_entries=1024
    [  107.377148] --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000002
    [  107.377159] --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    [  107.377172] --> idx=0: 0x00000000 0x20010000 0x03030003
    [  107.377184] --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    [  107.377196] --> idx=2: 0x0000000c 0x30010100 0x5e000001
    [  107.377208] --> idx=3: 0x00000000 0x20000000 0x07000007
    [  107.377220] --> idx=4: 0x0000000c 0x00010100 0x5e000012
    [  107.377231] --> idx=5: 0x00000000 0x20020000 0x05050005
    [  107.377243] --> idx=6: 0x00000014 0x3002ffff 0xffffffff
    [  107.377254] --> idx=7: 0x00000000 0x3002000a 0x1482dff9
    [  107.377266] --> idx=8: 0x0000000c 0x3001000a 0x1482dff8
    [  107.377278] --> idx=9: 0x00000000 0xf0000000 0x5e000101
    [  107.377296] --> idx=16: 0x00000004 0x700128d2 0x4422fbe6
    [  107.377321] --> idx=32: 0x00000000 0x4000000a 0x1482dff8
    [  107.377333] --> idx=33: 0x00000004 0x4001000a 0x14823570
    [  107.377377] --> idx=69: 0x00000004 0x30010000 0x5e000101

    The question is whether such behavior of CPSW is inevitable or can be changed by setting something in control registers of CPSW? This question implies the matter of suitability of AM3352 for protocols like VRRP.

    Best regards

    Michal

  • Michal,

    Could you try with the new SDK 8.0 (based on 3.14)?
    Some CPSW patches have been included compared to SDK 7.0 (including the patch mentioned in this thread):
    http://processors.wiki.ti.com/index.php/Sitara_Linux_SDK_Kernel_Release_Notes
    The Git tree is located at:
    http://gitorious.ti.com/sitara-linux/sitara-linux/commits/sitara-ti-linux-3.14.y

    A.

  • Hello,

    I downloaded all relevant files from

    http://gitorious.ti.com/sitara-linux/sitara-linux/trees/sitara-ti-linux-3.14.y/drivers/net/ethernet/ti

    plus 3 header files from

    http://gitorious.ti.com/sitara-linux/sitara-linux/trees/sitara-ti-linux-3.14.y/include/uapi/linux

    to make it compilable with my kernel 3.12.10.

    The result is negative again. CPSW behaves the same way as before. Are all (CPSW) patches really included in sitara-ti-linux-3.14.y? I've noticed that some changes from the "fix multicast flush in dual emac mode" patch (mentioned in my previous post) are not included in sitara-ti-linux-3.14.y. That seems a little strange to me.

    So the issue still remains.

    Best regards

    Michal

  • Michal,

    I noticed two things when looking at your ALE table dumps.

    1. Table 2 shows that entry #9 was learned, tying port 0/VLAN ID 0 to the unicast MAC address 00:00:5e:00:01:01. This is alongside entry #69, where the same MAC address is tied to port 1/VLAN ID 1. This seems to show that the packets being sent by the host to port 0 have a VLAN header with a VLAN ID of 0 (or the packets are being sent to the switch untagged and the PORT_VLAN register is appending a VLAN header with VLAN ID 0 during switch ingress). I may be wrong, but I think packets sent from the host towards external port 1 should have a VLAN header with VLAN ID 1. This should, in theory, cause address table entry #69 to be updated with port 0/VLAN ID 1 as you expected to happen.

    2. Table 2 also shows that your VLAN/Multicast entry #4 (which was allowing your multicast packets to 01:00:5e:00:00:12 out of the device) has somehow had its entry type (bits 61 and 60) changed to 00, which means that it is now an empty table entry. This will cause your multicast packets destined for 01:00:5e:00:00:12 to become unregistered multicast packets (since no table entry matches the destination address) and subsequently be dropped.
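
    The entry type change in point 2 can be checked directly from the middle word of the dump. This is a small sketch, assuming the words are printed most significant first so that bits 61:60 of the entry are bits 29:28 of the middle word; the type names follow the usual reading of the ALE entry type field.

```python
ENTRY_TYPES = {0: "free", 1: "address", 2: "VLAN", 3: "VLAN/address"}


def entry_type(word1: int) -> str:
    """Entry type from bits 61:60, i.e. bits 29:28 of the middle word."""
    return ENTRY_TYPES[(word1 >> 28) & 0x3]


# idx=4 in the first table and in the second table.
print(entry_type(0x30010100))
print(entry_type(0x00010100))
```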

    Hope this helps,

    Jason Reeder

  • Hello,

    I suppose both issues you have described in your post are caused by this line in ALE:

    [ 107.377208] --> idx=3: 0x00000000 0x20000000 0x07000007

    which we add for the reason I described in the post from Dec 3, 2014. We can solve all issues connected to this line later. First of all, I want to solve the issue of CPSW blocking the receiver by leaving the unicast address assigned to the wrong port of CPSW. So, once again, and now without the confusing line in the ALE.

    Initial state of ALE:

    --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000009
    --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    --> idx=0: 0x00000000 0x20010000 0x03030003
    --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    --> idx=2: 0x0000000c 0x30010100 0x5e000001
    --> idx=3: 0x0000000c 0x30010100 0x5e000012
    --> idx=4: 0x00000000 0x20020000 0x05050005
    --> idx=5: 0x00000014 0x3002ffff 0xffffffff
    --> idx=6: 0x00000000 0x3002000a 0x1482dff9
    --> idx=7: 0x00000014 0x30020100 0x5e000001
    --> idx=8: 0x00000000 0x30010000 0x5e000101
    --> idx=9: 0x0000000c 0x3001000a 0x1482dff8

    Let's have a multicast packet with source MAC address 00:00:5e:00:01:01 (ucast) and destination MAC address 01:00:5e:00:00:12 (mcast). This packet is transmitted from eth0 (i.e. port 0 -> port 1 of CPSW) every second, all the time.

    Now, let's have a single copy of exactly the same packet transmitted in the opposite direction (i.e. port 1 -> port 0 of CPSW). The result is like this:

    --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000009
    --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    --> idx=0: 0x00000000 0x20010000 0x03030003
    --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    --> idx=2: 0x0000000c 0x30010100 0x5e000001
    --> idx=3: 0x0000000c 0x30010100 0x5e000012
    --> idx=4: 0x00000000 0x20020000 0x05050005
    --> idx=5: 0x00000014 0x3002ffff 0xffffffff
    --> idx=6: 0x00000000 0x3002000a 0x1482dff9
    --> idx=7: 0x00000014 0x30020100 0x5e000001
    --> idx=8: 0x00000004 0x30010000 0x5e000101
    --> idx=9: 0x0000000c 0x3001000a 0x1482dff8
    --> idx=10: 0x00000004 0x4001000a 0x14823570
    --> idx=11: 0x00000004 0xf00128d2 0x4422fbe6

    The assignment of ucast address 00:00:5e:00:01:01 (line at idx=8) changed forever from port 0 to port 1. This assignment is not reverted by packets periodically sent from port 0 to port 1 as mentioned above. Why? Could you please answer this question?

    Best regards

    Michal
  • Michal,

    How are these ALE entries making it into the table? Entry #9 seems to be tied to port 3 which doesn't exist on the AM335x device.  Can you discern which ones you are adding manually, which are being learned, and which are being added by Linux drivers?
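
    To illustrate the "port 3" observation: the upper bits of an entry are read differently depending on whether the MAC address is unicast or multicast. The sketch below assumes a 2-bit port number in bits 67:66 for unicast entries and a 3-bit port mask in bits 68:66 for multicast entries; the function name is hypothetical.

```python
def describe_ports(word2: int, word1: int, word0: int) -> dict:
    """Interpret the port field of a dumped address entry (MSW first)."""
    raw = (word2 << 64) | (word1 << 32) | word0
    first_octet = (raw >> 40) & 0xFF
    if first_octet & 0x01:  # I/G bit set: multicast address
        mask = (raw >> 66) & 0x7
        return {"multicast": True,
                "ports": [p for p in range(3) if mask & (1 << p)]}
    return {"multicast": False, "port": (raw >> 66) & 0x3}


# Entry #9 (00:0a:14:82:df:f8) and a multicast entry with the same top word.
entry9 = describe_ports(0x0000000C, 0x3001000A, 0x1482DFF8)
mcast = describe_ports(0x0000000C, 0x30010100, 0x5E000012)
print(entry9)
print(mcast)
```

    With these assumptions, the value 0xc in the top word means "ports 0 and 1" for a multicast entry but "port 3" for a unicast entry.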

    Your previous post seemed to show that an ALE entry was being learned that tied port 0 and VLAN ID 0 to the unicast source address in question (00:00:5e:00:01:01). This would suggest that packets from the host device destined to exit through port 1 are using VLAN ID 0 instead of VLAN ID 1. 

    Have you attempted to set bit 5 in the PORTCTL1 Register? According to the TRM this should stop port 1 from updating source port numbers in existing ALE table entries. So when the single multicast packets enters from port 1 -> port 0 then the ALE entry #8 should remain tied to port 0.

    Jason Reeder

  • Hello,

    "How are these ALE entries making it into the table? Entry #9 seems to be tied to port 3 which
    doesn't exist on the AM335x device. Can you discern which ones you are adding manually, which
    are being learned, and which are being added by Linux drivers?"

    Yes, you're right. This should be patched. The Linux drivers for CPSW are not prepared for the situation where a unicast MAC address is first added to the ALE table as a unicast entry, then removed, and finally added as a multicast entry. One of the user space applications behaves in that strange way. I will try to handle this issue later. There are two ways to do it: I can either clear the whole line when something is removed from the ALE table, or I can modify the user space application to behave better.

    Now, let's concentrate on the main issue. So once again and now without strange entry in ALE table.

    Initial state:

    --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000003
    --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    --> idx=0: 0x00000000 0x20010000 0x03030003
    --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    --> idx=2: 0x0000000c 0x30010100 0x5e000001
    --> idx=3: 0x0000000c 0x30010100 0x5e000012
    --> idx=4: 0x00000000 0x20020000 0x05050005
    --> idx=5: 0x00000014 0x3002ffff 0xffffffff
    --> idx=6: 0x00000000 0x3002000a 0x1482dff9
    --> idx=7: 0x00000014 0x30020100 0x5e000001
    --> idx=8: 0x00000000 0x30010000 0x5e000101

    And final problematic state:

    --> idver=0x00290104 ctrl=0x80000004 uvlan=0x00000000 tab_ctrl=0x00000003
    --> port_ctrl0=0x00000003 port_ctrl1=0x00000003 port_ctrl2=0x00000000
    --> idx=0: 0x00000000 0x20010000 0x03030003
    --> idx=1: 0x0000000c 0x3001ffff 0xffffffff
    --> idx=2: 0x0000000c 0x30010100 0x5e000001
    --> idx=3: 0x0000000c 0x30010100 0x5e000012
    --> idx=4: 0x00000000 0x20020000 0x05050005
    --> idx=5: 0x00000014 0x3002ffff 0xffffffff
    --> idx=6: 0x00000000 0x3002000a 0x1482dff9
    --> idx=7: 0x00000014 0x30020100 0x5e000001
    --> idx=8: 0x00000004 0x30010000 0x5e000101

    Our point of interest is now at idx = 8.

    "How are these ALE entries making it into the table?"

    idx = 0, 1, 4, 5 - Added by CPSW Linux drivers.
    idx = 2, 7 - I don't know precisely. I suppose these entries are not important for us.
    idx = 3, 6, 8 - Added by user space application (vrrp daemon).

    "Have you attempted to set bit 5 in the PORTCTL1 Register? According to the TRM this should stop
    port 1 from updating source port numbers in existing ALE table entries. So when the single
    multicast packets enters from port 1 -> port 0 then the ALE entry #8 should remain tied to port 0."

    I thought about this possibility (see my post from Dec 3, 2014). It's probably a solution to the issue, but I'd like to avoid it because of side effects on the system which I'm not able to predict. But if you state that the wrong assignment of the port number (entry at idx = 8) is an intrinsic property of CPSW and can't be solved in a better way, then I'll have to accept it.


    Best regards

    Michal
  • Michal,

    The switch should follow the updating procedure outlined in section 14.3.2.7.3.2 (Updating Process) of the TRM. We can see this process occurring correctly as entry number 8 updates its port number from 0 to 1 when the SRC:00:00:5e:00:01:01 DST:01:00:5e:00:00:12 packet arrives at port 1.

    So, when the next broadcast comes from the host port 0 with the same SRC, DST, and VLAN ID of 1 then we should see entry 8 update to once again get tied to port 0. However, in an earlier post (when you had a VLAN entry allowing VLAN ID 0) there appeared to be a learned table entry (section 14.3.2.7.3.1 of the TRM) that tied your source address of 00:00:5e:00:01:01 to both port 0 and VLAN ID 0. This makes me think that your daemon may be somehow incorrectly sending packets with VLAN ID 0 and that is why you originally had to add a VLAN entry for VLAN ID 0. If this is the case, then entry number 8 will not get updated because of the third 'if' statement in the updating process of section 14.3.2.7.3.2. 

    Would it be possible to provide me a list of steps so that I can reproduce this issue on my hardware?

    Jason Reeder 

  • Hello,

    I can confirm that the broadcast (SRC:00:00:5e:00:01:01 DST:01:00:5e:00:00:12) from port 0 to port 1 of CPSW contains a VLAN tag which indicates VLAN ID 0. Here is the output from tcpdump (from another device with a different type of MPU, not using an L2 switch in its Ethernet controller, directly connected to the router with the AM3352):

    00:46:56.968592 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 64: vlan 0, p 0, ethertype IPv4, 192.168.2.151 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 255, authtype none, intvl 1s, length 20

    The crucial question is why the packet was assigned VLAN ID 0. You suppose that the vrrp daemon is the source of it. I'm not sure. Here is the part of the kernel log with my debugging messages from the CPSW driver:

    cpsw.c: cpsw_ndo_start_xmit: skb->vlan_proto=0 skb->vlan_tci=0x0000
    davinci_cpdma.c: cpdma_chan_submit: directed=1 len=54 data:
    0x1 0x0 0x5e 0x0 0x0 0x12 0x0 0x0 0x5e 0x0 0x1 0x1 0x8 0x0 0x45 0x0 0x0 0x28 0xd4 0x0 0x0 0x0 0xff 0x70 0x44 0x13 0xc0 0xa8 0x2 0x97 0xe0 0x0 0x0 0x12 0x21 0x1 0xff 0x1 0x0 0x1 0x1c 0xb6
    0xc0 0xa8 0x2 0x9d 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0

    You can see that at the time the broadcast packet is pushed into the DMA TX channel there is no VLAN tag in the data, and that the packet is sent to port 1 (parameter "directed"). So one would expect VLAN ID 1 to be assigned to this packet, because we are in VLAN aware mode (CTRL_ALE=0x80000004) and port 1 appears in a VLAN member list on this line only:

    --> idx=0: 0x00000000 0x20010000 0x03030003
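
    As a cross-check, the 54 bytes from the cpdma_chan_submit debug line can be parsed to confirm that the frame handed to the DMA carries no 802.1Q tag. This is a sketch only; the byte string is copied from the log above and the variable names are hypothetical.

```python
DUMP = """
0x1 0x0 0x5e 0x0 0x0 0x12 0x0 0x0 0x5e 0x0 0x1 0x1 0x8 0x0 0x45 0x0 0x0 0x28
0xd4 0x0 0x0 0x0 0xff 0x70 0x44 0x13 0xc0 0xa8 0x2 0x97 0xe0 0x0 0x0 0x12
0x21 0x1 0xff 0x1 0x0 0x1 0x1c 0xb6
0xc0 0xa8 0x2 0x9d 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
"""

frame = bytes(int(token, 16) for token in DUMP.split())


def mac(raw: bytes) -> str:
    return ":".join("%02x" % b for b in raw)


def ipv4(raw: bytes) -> str:
    return ".".join(str(b) for b in raw)


dst_mac = mac(frame[0:6])
src_mac = mac(frame[6:12])
ethertype = int.from_bytes(frame[12:14], "big")
ip_proto = frame[14 + 9]           # protocol field of the IPv4 header
src_ip = ipv4(frame[14 + 12:14 + 16])
dst_ip = ipv4(frame[14 + 16:14 + 20])
vrid, priority = frame[35], frame[36]

# 0x8100 would indicate an 802.1Q tag; 0x0800 is plain IPv4.
print(len(frame), dst_mac, src_mac, hex(ethertype))
print(src_ip, dst_ip, ip_proto, vrid, priority)
```

    The EtherType directly after the source MAC is 0x0800 (IPv4), so the VLAN tag seen by tcpdump on the wire is not present in the buffer submitted by the driver.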

    But I'm not sure about the assignment of VLAN IDs to packets without a VLAN tag. I found only this:

    If (VLAN_Unaware)
        force_untagged_egress = "000000"
        reg_mcast_flood_mask = "111111"
        unreg_mcast_flood_mask = "111111"
        vlan_member_list = "111111"
    else if (VLAN not found)
        force_untagged_egress = unknown_force_untagged_egress
        reg_mcast_flood_mask = unknown_reg_mcast_flood_mask
        unreg_mcast_flood_mask = unknown_unreg_mcast_flood_mask
        vlan_member_list = unknown_vlan_member_list
    else
        force_untagged_egress = found force_untagged_egress
        reg_mcast_flood_mask = found reg_mcast_flood_mask
        unreg_mcast_flood_mask = found unreg_mcast_flood_mask
        vlan_member_list = found vlan_member_list

    in TRM. But I'm not sure whether it is useful for me in some way.

    "Would it be possible to provide me a list of steps so that I can reproduce this issue on my hardware?."

    I'm using kernel 3.12.10 and simple vrrp daemon which can be downloaded for example from here:

    sourceforge.net/.../vrrpd

    What matters is to make two devices to send vrrp advertisement (above mentioned broadcast packet) at the same time.

    Best regards

    Michal
  • Michal,

    I think I've gotten to the bottom of what's going on here. I created this wiki page to give some (probably way too much) background on how the CPSW switch hardware works: 

    The wiki page also attempts to describe how the Linux driver takes advantage of the switch capabilities to achieve dual emac mode. My initial understanding of the driver was that packets sent by the ARM host would be appended with a VLAN ID of 1 or 2 before being sent but this was incorrect. In fact, as you pointed out, the packets are directed packets that get appended with a VLAN ID of 0 during switch ingress at port 0. 

    The issue that you are facing boils down to the fact that the protocol that you are using creates a situation where one of our driver created Port 0 ALE Table Entries (that are necessary for dual emac mode to work) is getting updated to a new port that stops the ARM host from being able to receive future packets destined to its MAC address. This happens when the switch assumes that a MAC address has moved from port 0 to port 1. Since the Linux driver is sending directed packets (that eventually get appended with VLAN ID 0) the table entry will never be automatically updated by the switch because the entry we want updated is tied to VLAN ID 1.
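
    The behaviour described above can be sketched with a deliberately simplified model of source address learning and updating, keyed on the (MAC address, VLAN ID) pair. This is not the full TRM procedure (no secure/block bits, no aging), and it assumes that the frame from the other router is classified into VLAN ID 1 at port 1 while host frames are classified into VLAN ID 0 at port 0. All names are hypothetical.

```python
MAC1 = "00:00:5e:00:01:01"


def ingress(table, port, src_mac, vlan_id,
            no_learn=False, no_sa_update=False):
    """Apply source-address learning/updating for one received frame."""
    key = (src_mac, vlan_id)
    if key in table:
        if table[key] != port and not no_sa_update:
            table[key] = port  # address appears to have moved
    elif not no_learn:
        table[key] = port      # new (MAC, VLAN) pair is learned


def run(no_sa_update_on_port1: bool) -> dict:
    table = {(MAC1, 1): 0}     # entry added by the daemon via the driver
    # One advertisement from the other router arrives at port 1, VLAN 1.
    ingress(table, port=1, src_mac=MAC1, vlan_id=1,
            no_sa_update=no_sa_update_on_port1)
    # Periodic advertisements from the host enter port 0 with VLAN 0.
    for _ in range(3):
        ingress(table, port=0, src_mac=MAC1, vlan_id=0)
    return table


default_result = run(no_sa_update_on_port1=False)
fixed_result = run(no_sa_update_on_port1=True)
print(default_result)
print(fixed_result)
```

    In this model the host frames only ever touch the (MAC1, VLAN 0) key, so the (MAC1, VLAN 1) entry stays on port 1 unless source address updating is disabled on port 1.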

    To prevent our manually added ALE table entries from being updated (which, as you've seen, stops packet reception at host port 0), we need to disallow source address updating on the two external Ethernet ports. I would take it a step further and say that in dual emac mode we should also disable learning altogether on the two external Ethernet ports (since we manually add the ALE entries we need to receive packets, and the transmitted packets are directed). I would suggest setting both the NO_SA_UPDATE and NO_LEARN bits to 1 in the PORTCTL1 and PORTCTL2 registers.
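
    A sketch of the resulting register values, for illustration only. NO_SA_UPDATE as bit 5 is taken from the earlier post in this thread; NO_LEARN as bit 4 and port state 3 meaning "forward" are assumptions based on the commonly documented PORTCTL layout and should be verified against the TRM before use.

```python
PORT_STATE_FORWARD = 0x3   # bits 1:0 (assumed meaning of the dumped 0x3)
NO_LEARN = 1 << 4          # assumed bit position, verify in the TRM
NO_SA_UPDATE = 1 << 5      # bit 5, as stated earlier in this thread


def fixed_table_portctl(current: int) -> int:
    """Return a PORTCTLn value with learning and SA updating disabled."""
    return current | NO_LEARN | NO_SA_UPDATE


# port_ctrl1 was dumped as 0x00000003 in the tables above.
new_portctl1 = fixed_table_portctl(0x00000003)
print(hex(new_portctl1))
```

    The low two bits are left untouched, so the port state from the dump is preserved while the two extra bits are set.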

    Let me know if anything in the wiki page is unclear or confusing and I will change it.

    Jason Reeder

  • Michal,

    Were you able to try out the suggestion from the previous post? If so, did it correct the issue that you were seeing?

    Jason Reeder
  • Hello,

    I'm back on this project. So let's continue.

    I've read your wiki page and it seems pretty good. Maybe, TI should add it to TRM or insert a link at least. A few lines of text (comparing to hundreds of pages describing CPSW) and the core of the system is explained.

    Finally, I accepted the solution based on a "fixed" ALE table that is not modified/updated by packets going through the switch. I had been thinking about another solution based on inserting VLAN tags (VLAN ID 1/2) into transmitted packets, but it would probably decrease the throughput of the system. So this issue can be considered done.

    Thank you very much for collaboration

    Best regards

    Michal