TDA4VH-Q1: CPSW rxTopOfFifoDrop keeps increasing

Part Number: TDA4VH-Q1
Other Parts Discussed in Thread: TDA4VM

Hello, we encountered a packet loss problem when using the CPSW-9G as a switch. Here is the network topology of our project:

Net device 2 sends data to net device 1 via the CPSW-9G, and we found occasional packet loss in this chain. Here is the log output of the CPSW-9G, printed by the function "EnetAppUtils_printMacPortStats9G" after running for some time:

2024-03-13 23:39:20 [MCU2_0]  54465.384674 s: -------------- host port ----------------
2024-03-13 23:39:20 [MCU2_0]  54465.384791 s:   rxGoodFrames            = 26205323
2024-03-13 23:39:20 [MCU2_0]  54465.384843 s:   rxMcastFrames           = 2749203
2024-03-13 23:39:20 [MCU2_0]  54465.384881 s:   aleDrop                 = 1755431
2024-03-13 23:39:20 [MCU2_0]  54465.384918 s:   rxOctets                = 33460433846
2024-03-13 23:39:20 [MCU2_0]  54465.384954 s:   txGoodFrames            = 13340626
2024-03-13 23:39:20 [MCU2_0]  54465.384987 s:   txBcastFrames           = 2669
2024-03-13 23:39:20 [MCU2_0]  54465.385019 s:   txMcastFrames           = 1238382
2024-03-13 23:39:20 [MCU2_0]  54465.385055 s:   txOctets                = 1331271158
2024-03-13 23:39:20 [MCU2_0]  54465.385088 s:   octetsFrames64          = 26396
2024-03-13 23:39:20 [MCU2_0]  54465.385121 s:   octetsFrames65to127     = 15177207
2024-03-13 23:39:20 [MCU2_0]  54465.385155 s:   octetsFrames128to255    = 1728356
2024-03-13 23:39:20 [MCU2_0]  54465.385188 s:   octetsFrames256to511    = 547119
2024-03-13 23:39:20 [MCU2_0]  54465.385222 s:   octetsFrames512to1023   = 1050898
2024-03-13 23:39:20 [MCU2_0]  54465.385255 s:   octetsFrames1024        = 21015973
2024-03-13 23:39:20 [MCU2_0]  54465.385290 s:   netOctets               = 34791705004
2024-03-13 23:39:20 [MCU2_0]  54465.385325 s:   portMaskDrop            = 1755431
2024-03-13 23:39:20 [MCU2_0]  54465.385358 s:   aleVidIngressDrop       = 33
2024-03-13 23:39:20 [MCU2_0]  54465.385399 s:   txPri[0]                = 13313403
2024-03-13 23:39:20 [MCU2_0]  54465.385435 s:   txPri[6]                = 27222
2024-03-13 23:39:20 [MCU2_0]  54465.385471 s:   txPriBcnt[0]            = 1328411968
2024-03-13 23:39:20 [MCU2_0]  54465.385507 s:   txPriBcnt[6]            = 2857995
2024-03-13 23:39:20 [MCU2_0]  54465.385534 s: ----------- mac port 2 -----------
2024-03-13 23:39:20 [MCU2_0]  54465.385569 s:   rxGoodFrames            = 11577923
2024-03-13 23:39:20 [MCU2_0]  54465.385604 s:   rxBcastFrames           = 2498
2024-03-13 23:39:20 [MCU2_0]  54465.385638 s:   rxMcastFrames           = 1132294
2024-03-13 23:39:20 [MCU2_0]  54465.385675 s:   aleDrop                 = 82
2024-03-13 23:39:20 [MCU2_0]  54465.385709 s:   rxOctets                = 905449005
2024-03-13 23:39:20 [MCU2_0]  54465.385744 s:   txGoodFrames            = 32753380
2024-03-13 23:39:20 [MCU2_0]  54465.385799 s:   txBcastFrames           = 145
2024-03-13 23:39:20 [MCU2_0]  54465.385835 s:   txMcastFrames           = 7819695
2024-03-13 23:39:20 [MCU2_0]  54465.385875 s:   txOctets                = 38531878477
2024-03-13 23:39:20 [MCU2_0]  54465.385909 s:   octetsFrames64          = 43956
2024-03-13 23:39:20 [MCU2_0]  54465.385942 s:   octetsFrames65to127     = 15095509
2024-03-13 23:39:20 [MCU2_0]  54465.385976 s:   octetsFrames128to255    = 2299516
2024-03-13 23:39:20 [MCU2_0]  54465.386010 s:   octetsFrames256to511    = 409184
2024-03-13 23:39:20 [MCU2_0]  54465.386044 s:   octetsFrames512to1023   = 3101162
2024-03-13 23:39:20 [MCU2_0]  54465.386078 s:   octetsFrames1024        = 23381976
2024-03-13 23:39:20 [MCU2_0]  54465.386114 s:   netOctets               = 39437327482
2024-03-13 23:39:20 [MCU2_0]  54465.386148 s:   portMaskDrop            = 82
2024-03-13 23:39:20 [MCU2_0]  54465.386180 s:   aleVidIngressDrop       = 34
2024-03-13 23:39:20 [MCU2_0]  54465.386212 s:   aleUnknownUcast         = 18
2024-03-13 23:39:20 [MCU2_0]  54465.386244 s:   aleUnknownUcastBcnt     = 1268
2024-03-13 23:39:20 [MCU2_0]  54465.386277 s:   aleUnknownMcast         = 152
2024-03-13 23:39:20 [MCU2_0]  54465.386310 s:   aleUnknownMcastBcnt     = 12492
2024-03-13 23:39:20 [MCU2_0]  54465.386345 s:   aleUnknownBcast         = 96
2024-03-13 23:39:20 [MCU2_0]  54465.386377 s:   aleUnknownBcastBcnt     = 6160
2024-03-13 23:39:20 [MCU2_0]  54465.386412 s:   alePolicyMatch          = 11520066
2024-03-13 23:39:20 [MCU2_0]  54465.386454 s:   txPri[0]                = 32752300
2024-03-13 23:39:20 [MCU2_0]  54465.386493 s:   txPri[6]                = 1077
2024-03-13 23:39:20 [MCU2_0]  54465.386530 s:   txPriBcnt[0]            = 38531794932
2024-03-13 23:39:20 [MCU2_0]  54465.386566 s:   txPriBcnt[6]            = 80203
2024-03-13 23:39:20 [MCU2_0]  54465.386599 s:   txPriDrop[0]            = 386773
2024-03-13 23:39:20 [MCU2_0]  54465.386633 s:   txPriDrop[6]            = 48
2024-03-13 23:39:20 [MCU2_0]  54465.386668 s:   txPriDropBcnt[0]        = 223083172
2024-03-13 23:39:20 [MCU2_0]  54465.386704 s:   txPriDropBcnt[6]        = 3482
2024-03-13 23:39:20 [MCU2_0]  54465.386724 s: ----------- mac port 5 -----------
2024-03-13 23:39:20 [MCU2_0]  54465.386779 s:   rxGoodFrames            = 11484631
2024-03-13 23:39:20 [MCU2_0]  54465.386819 s:   rxBcastFrames           = 171
2024-03-13 23:39:20 [MCU2_0]  54465.386853 s:   rxMcastFrames           = 8070930
2024-03-13 23:39:20 [MCU2_0]  54465.386890 s:   rxOctets                = 5959081594
2024-03-13 23:39:20 [MCU2_0]  54465.386925 s:   txGoodFrames            = 1964207
2024-03-13 23:39:20 [MCU2_0]  54465.386957 s:   txBcastFrames           = 2493
2024-03-13 23:39:20 [MCU2_0]  54465.386990 s:   txMcastFrames           = 1747697
2024-03-13 23:39:20 [MCU2_0]  54465.387025 s:   txOctets                = 192297929
2024-03-13 23:39:20 [MCU2_0]  54465.387059 s:   octetsFrames64          = 26251
2024-03-13 23:39:20 [MCU2_0]  54465.387092 s:   octetsFrames65to127     = 6007071
2024-03-13 23:39:20 [MCU2_0]  54465.387127 s:   octetsFrames128to255    = 1860744
2024-03-13 23:39:20 [MCU2_0]  54465.387160 s:   octetsFrames256to511    = 953251
2024-03-13 23:39:20 [MCU2_0]  54465.387194 s:   octetsFrames512to1023   = 2134951
2024-03-13 23:39:20 [MCU2_0]  54465.387227 s:   octetsFrames1024        = 2466570
2024-03-13 23:39:20 [MCU2_0]  54465.387262 s:   netOctets               = 6151379523
2024-03-13 23:39:20 [MCU2_0]  54465.387297 s:   rxTopOfFifoDrop         = 386821
2024-03-13 23:39:20 [MCU2_0]  54465.387330 s:   aleUnknownUcast         = 60
2024-03-13 23:39:20 [MCU2_0]  54465.387361 s:   aleUnknownUcastBcnt     = 5036
2024-03-13 23:39:20 [MCU2_0]  54465.387393 s:   aleUnknownBcast         = 19
2024-03-13 23:39:20 [MCU2_0]  54465.387427 s:   aleUnknownBcastBcnt     = 1216
2024-03-13 23:39:20 [MCU2_0]  54465.387461 s:   alePolicyMatch          = 9842036
2024-03-13 23:39:20 [MCU2_0]  54465.387499 s:   ietRxSmdError           = 1
2024-03-13 23:39:20 [MCU2_0]  54465.387535 s:   txPri[0]                = 1964206
2024-03-13 23:39:20 [MCU2_0]  54465.387570 s:   txPriBcnt[0]            = 192297871

As shown in the log, 386821 packets were dropped by mac port 5 and counted as rxTopOfFifoDrop. This drop is possibly caused by txPriDrop[0] + txPriDrop[6] of mac port 2.
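As a quick cross-check, the two counters can be compared directly. The snippet below is only a sketch: `parse_stats` and its regular expressions are hypothetical helpers written against the log format shown above (they are not part of the Enet driver), and the sample lines are copied from that log.

```python
import re

# Assumed line formats, inferred from the log above (not an official format).
SECTION_RE = re.compile(r"s:\s*-+\s*(.+?)\s*-+\s*$")
COUNTER_RE = re.compile(r"s:\s+(\w+(?:\[\d+\])?)\s*=\s*(\d+)\s*$")

# A few lines copied from the statistics dump above.
LOG = """\
2024-03-13 23:39:20 [MCU2_0]  54465.385534 s: ----------- mac port 2 -----------
2024-03-13 23:39:20 [MCU2_0]  54465.386599 s:   txPriDrop[0]            = 386773
2024-03-13 23:39:20 [MCU2_0]  54465.386633 s:   txPriDrop[6]            = 48
2024-03-13 23:39:20 [MCU2_0]  54465.386668 s:   txPriDropBcnt[0]        = 223083172
2024-03-13 23:39:20 [MCU2_0]  54465.386724 s: ----------- mac port 5 -----------
2024-03-13 23:39:20 [MCU2_0]  54465.387297 s:   rxTopOfFifoDrop         = 386821
"""


def parse_stats(log_text):
    """Return {section name: {counter name: value}} for a statistics dump."""
    stats = {}
    section = None
    for line in log_text.splitlines():
        header = SECTION_RE.search(line)
        if header:
            section = header.group(1)
            stats[section] = {}
            continue
        counter = COUNTER_RE.search(line)
        if counter and section is not None:
            stats[section][counter.group(1)] = int(counter.group(2))
    return stats


stats = parse_stats(LOG)

# Sum the per-priority frame drop counters only, not the byte counters
# (txPriDropBcnt[...]).
tx_pri_drop_total = sum(
    value
    for name, value in stats["mac port 2"].items()
    if name.startswith("txPriDrop[")
)
rx_top_of_fifo_drop = stats["mac port 5"]["rxTopOfFifoDrop"]

print(tx_pri_drop_total, rx_top_of_fifo_drop)  # 386821 386821
```

The two totals are equal (386773 + 48 = 386821), which is why the counters look related.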

I found these two items in the manual, described as:

but I still cannot understand what these two items actually mean.

So could you please explain them to me in a more comprehensible way? And in what situations can this drop happen?

Thank you very much!

  • Hi,

    As shown in the log, 386821 packets were dropped by mac port 5 and counted as rxTopOfFifoDrop. This drop is possibly caused by txPriDrop[0] + txPriDrop[6] of mac port 2.

    Yes, these two statistics are interrelated.

    txPriDrop indicates how many packets were dropped due to an SOF (start-of-frame) overrun: by the time the port tries to send the packet out, the packet's SOF is missing from the FIFO because it may have been overrun by other data, so those packets are dropped.

    When a packet is dropped at an egress port, the rxTopOfFifoDrop count increments at the ingress port of that same packet.
    Also, the rxTopOfFifoDrop count increments once for each egress port that dropped the packet. For example, if a packet received on Port-2 is to be forwarded to Port-1, Port-3 and Port-5, and it is dropped at Port-1 and Port-3 but sent successfully at Port-5, then rxTopOfFifoDrop increments by 2 because the packet was dropped at Port-1 and Port-3.
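    The counting rule can be sketched as a toy model. This is not driver code: `record_forward` and the `(port, counter)` keys are hypothetical names chosen for illustration, and the model only mirrors the increments described above.

```python
from collections import Counter


def record_forward(ingress_port, egress_results, stats):
    """Update counters for one frame.

    egress_results maps each egress port to True (sent) or False (dropped).
    """
    for egress_port, sent in egress_results.items():
        if not sent:
            # One drop is counted on the egress side...
            stats[(egress_port, "txPriDrop")] += 1
            # ...and one on the ingress port, per egress port that dropped.
            stats[(ingress_port, "rxTopOfFifoDrop")] += 1


stats = Counter()
# Frame received on Port-2, dropped at Port-1 and Port-3, sent at Port-5.
record_forward(2, {1: False, 3: False, 5: True}, stats)

print(stats[(2, "rxTopOfFifoDrop")])  # 2
print(stats[(1, "txPriDrop")], stats[(3, "txPriDrop")], stats[(5, "txPriDrop")])  # 1 1 0
```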

    but I still cannot understand what these two items actually mean.

    So could you please explain them to me in a more comprehensible way? And in what situations can this drop happen?

    The above explains the counters and the relation between txPriDrop and rxTopOfFifoDrop.

    The drop can happen when the ingress rate is higher than the egress rate.

    If an ingress packet is egressed on multiple ports, the combined egress rate must be higher than the ingress rate. For example, if the Port-2 ingress rate is 100 Mbps, then the combined egress rate of Port-1 and Port-3 should be more than 200 Mbps, because the same ingress packet is duplicated on both Port-1 and Port-3.
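    The rate example can be worked through numerically. The sketch below is a simplified steady-state model that ignores bursts and FIFO depth; both function names are hypothetical, and the per-port check is an added assumption (each egress link has to carry its own full copy of the stream), not something stated in the TRM.

```python
def required_combined_egress_mbps(ingress_mbps, egress_port_count):
    """Combined egress rate needed when every frame is replicated to each egress port."""
    return ingress_mbps * egress_port_count


def every_port_keeps_up(ingress_mbps, egress_link_mbps):
    """Assumption: each egress link must carry at least the full ingress rate."""
    return all(link >= ingress_mbps for link in egress_link_mbps.values())


# Port-2 ingress at 100 Mbps, replicated to Port-1 and Port-3.
print(required_combined_egress_mbps(100, 2))  # 200

print(every_port_keeps_up(100, {"Port-1": 1000, "Port-3": 1000}))  # True
print(every_port_keeps_up(100, {"Port-1": 10, "Port-3": 1000}))    # False
```

    In the second case the 10 Mbps link cannot drain its copy of the stream, so that port's FIFO is where drops would be expected.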

    Best Regards,
    Sudheer

  • Hi Sudheer,

    I'm a colleague of Haijun.

    For this issue I have another question. From the log Haijun attached, we can see that the number of frames received from the Host Port is much larger than that from Port-5. Meanwhile, packets ingressed on both the Host Port and Port-5 should be egressed on Port-2. So why is there no rxTopOfFifoDrop on the Host Port, but only on Port-5? Does the Host Port have higher priority? Or are the statistical rules different for the Host Port?

    Best Regards,

    Mian

  • Hi,

    Or are the statistical rules different for the Host Port?

    No, the rules are the same.
    However, the Rx of the External Ports will be the Tx of the Host Port, and similarly the Tx of the Host Port is the Rx of the External Port, when ALE entries exist for forwarding data between the Host Port and the External Ports.

    The Host Port receives (ingress) the data from the internal cores and forwards it to the External Ports (egress).

    Host Port operations are based on the CPPI clock (320MHz default with 128bytes width), which is much faster than the maximum supported speed of any External Port.

    In the above case the data is forwarded between external ports, which depends on the PHY interface (such as SGMII/USXGMII) and the link speed with the remote device.
    (FIFOs are emptied at a rate set by the clock used for the port link speed.)

    Best Regards,
    Sudheer

  • Hi,

    The Host Port receives (ingress) the data from the internal cores and forwards it to the External Ports (egress).

    Yes, I understand this. All the Rx and Tx statistics are from the perspective of the switch.

    What I meant above is that in our case packets from the SoC, i.e. rxGoodFrames of the Host Port, are also transmitted to external Port-2. And rxGoodFrames of the Host Port is much larger than that of external Port-5.

    As you mentioned, the Host Port is much faster than the external ports. So I don't understand why the txPriDrop of Port-2 only causes rxTopOfFifoDrop on Port-5, but not on the Host Port.

    BTW, what's the theoretical bandwidth of the Host Port, in both the Tx and Rx directions?

    Best Regards,

    Mian

  • Hi, 

    As you mentioned, the Host Port is much faster than the external ports. So I don't understand why the txPriDrop of Port-2 only causes rxTopOfFifoDrop on Port-5, but not on the Host Port.

    In your test, the data is from port-5 to port-2 between net device-1 & net device-2.

    So, while forwarding the ingress data from port-5 to egress on port-2, the drop above happens because the SOF might be overwritten by ingress data on port-5 before egress on port-2.

    Also, as you mentioned, Host Port data also egresses on port-2, which could also be a reason for FIFO overflow at port-2.

    But the data forwarded from the Host Port may not be lost, so we may not observe any drop in that path.

    Best Regards, 

    Sudheer

  • Hi,

    In your test, the data is from port-5 to port-2 between net device-1 & net device-2.

    I'm sorry but the topology schema above is not a complete one. Please refer to the picture below.

    Frames ingressed on the Host Port and Port-5 egress on Port-2 at the same time. And the amount of data ingressed on the Host Port is larger than on Port-5.

    However, the txPriDrop count on Port-2 is identical to rxTopOfFifoDrop on Port-5. And from the log, there is no rxTopOfFifoDrop on the Host Port. So I wonder why this is the case when the Host Port carries more data and operates faster.

    Best Regards,

    Mian

  • Hi,

    In your test, the data is from port-5 to port-2 between net device-1 & net device-2.

    I'm sorry but the topology schema above is not a complete one. Please refer to the picture below.

    I understand from the statistics log above that the Host Port is also sending traffic on Port-2.

    In this case the statistics show that only the packets coming from Port-5 to Port-2 overflowed.

    Let me check with the expert/IP team why the Host Port data does not overflow at Port-2.

    Best Regards,
    Sudheer

  • Hi,

    I have some more questions concerning the FIFO of the CPSW9G. I'll use the datasheet of the TDA4VM below as an example.

    First, in the datasheet, I can see a description of the FIFO in section 12.2.2.4.6.5, as in the picture below.

    What is meant by "Each of the two CPSW_9G ports"? Does it mean two ports share one common FIFO? Or does it mean there is a FIFO of 20K size between every two ports? And does this description apply to the Host Port as well?

    Then, I see another description of the FIFO in section 12.2.2.4.6.10.11, as below.

    Here it says all ports have a FIFO of 80K in size. I'm a little confused. What's the difference between the FIFOs described in these two sections? Are they the same thing? Similarly, does this section apply to the Host Port, too?

    Besides, I also wonder what the bandwidth of the Host Port is. As far as I know, we can set the speed of external ports to 10M, 100M or 1000M. Can we do the same for the Host Port?

    Looking forward to your reply.

    Best Regards.

    Mian

  • Hi,

    Here it says all ports have a FIFO of 80K in size. I'm a little confused. What's the difference between the FIFOs described in these two sections? Are they the same thing? Similarly, does this section apply to the Host Port, too?

    Each port has a 20KB FIFO, which is divided into 8 Tx queues (for Priority0 to Priority7) and 1 Rx queue.

    The size of each Tx queue is 2K by default, and Rx uses 4K.

    The same applies to the Host Port. There might be some typos in the TRM.
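    As a sanity check, the partition adds up to the 20KB total. The arithmetic below is a sketch under stated assumptions: "K" is taken as 1024 bytes, and the frame estimate uses the standard 1518-byte maximum untagged Ethernet frame while ignoring any per-packet overhead inside the FIFO.

```python
KB = 1024  # assumption: "K" in the reply means 1024 bytes

TX_QUEUE_COUNT = 8        # Priority0 to Priority7
TX_QUEUE_BYTES = 2 * KB   # default size of each Tx queue
RX_QUEUE_BYTES = 4 * KB   # size used by the Rx queue

total_bytes = TX_QUEUE_COUNT * TX_QUEUE_BYTES + RX_QUEUE_BYTES
print(total_bytes // KB)  # 20

# Rough estimate of how many maximum-size frames fit in one default Tx queue.
MAX_FRAME_BYTES = 1518
frames_per_tx_queue = TX_QUEUE_BYTES // MAX_FRAME_BYTES
print(frames_per_tx_queue)  # 1
```

    Under these assumptions a default Tx queue holds only about one full-size frame, so even short bursts toward a slower egress port could fill it.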

    Besides, I also wonder what the bandwidth of the Host Port is. As far as I know, we can set the speed of external ports to 10M, 100M or 1000M. Can we do the same for the Host Port?

    The link speeds are 10M, 100M and 1000M, and the External Port FIFOs are emptied at the link speed rate.

    The Host Port is internal to the CPSW, so a link speed is not applicable to it.

    Best Regards,
    Sudheer

  • Hi,

    The rxTopOfFifoDrop above results from the following flow when forwarding packets from one port's ingress to another port's egress.
    At Ethernet port ingress, packets are either:
    (A) forwarded, if there is space in the destination port's FIFO,
    or
    (B) dropped, if there is no space in the destination port's FIFO (rxTopOfFifoDrop); the destination port then increments txPriDrop based on the packet priority.

    The same cannot be seen at Host Port ingress, because there packets are either:
    (A) forwarded, if there is space in the destination port's FIFO,
    or
    (B) held back: Host Port ingress is halted if the destination port's FIFO is full (the Rx FIFO reaches the full condition and does not fetch new descriptors).
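    The difference between the two ingress paths can be sketched as a toy model. This is an illustration only, not how the hardware or driver is implemented: `EgressFifo`, `offer` and the `hostStalls` counter are hypothetical names, and FIFO capacity is counted in frames rather than bytes for simplicity.

```python
from collections import Counter, deque


class EgressFifo:
    """Destination port FIFO with a fixed capacity, counted in frames."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def has_space(self):
        return len(self.queue) < self.capacity

    def drain(self, count=1):
        """Transmit (remove) up to `count` frames at the link rate."""
        for _ in range(min(count, len(self.queue))):
            self.queue.popleft()


def offer(fifo, source, frame, counters):
    """Offer one frame to the egress FIFO.

    Returns True if the frame was consumed (queued or dropped) and False if
    the source has to hold it and retry (Host Port back-pressure).
    """
    if fifo.has_space():
        fifo.queue.append(frame)   # case (A): forwarded
        return True
    if source == "mac":
        # Case (B) for an Ethernet port: the frame is dropped and counted.
        counters["rxTopOfFifoDrop"] += 1
        counters["txPriDrop"] += 1
        return True
    # Case (B) for the Host Port: ingress is halted, nothing is dropped.
    counters["hostStalls"] += 1    # hypothetical counter, for illustration
    return False


counters = Counter()
fifo = EgressFifo(capacity=2)

for i in range(3):                  # the third MAC frame finds the FIFO full
    offer(fifo, "mac", f"mac-{i}", counters)

accepted = offer(fifo, "host", "host-0", counters)  # FIFO still full: stall
fifo.drain(1)                       # the egress link sends one frame
retried = offer(fifo, "host", "host-0", counters)   # now there is space

print(accepted, retried)            # False True
print(dict(counters))
```

    In this model the MAC-port frame is lost and shows up in both drop counters, while the Host Port frame is only delayed, which matches the absence of rxTopOfFifoDrop on the Host Port in the log.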

    Best Regards,
    Sudheer