AM2432: EtherCAT CRC errors

Part Number: AM2432

Hi TI experts,

Our end customer is seeing CRC errors on site.

As you can see, for nodes 113-119 (XMC 4800) the CRC error count of the IN ports was 21 on every node. For nodes 128 - 118, the CRC error count of the OUT ports was 28. For node 127 (AM2432), however, the IN port CRC error count was only 4, and for node 129 (XMC 4800) it was 0.

According to ETG 1000.3, a slave should not correct the CRC if it detects an error; the forwarded frame should still carry the incorrect CRC. Nodes 113-119 (XMC 4800) seem to follow this standard, but the AM2432 does not seem to.

ETG1000.3_CRCerror.png

The PRU version on the drives at the customer's site is 0x532.

My question is whether the PRU firmware is implemented in accordance with ETG 1000.3. The current behavior may make it hard for us to locate the faulty node that is causing the CRC errors.

 

  • To clarify the attached image: Omron reads ESC registers 0x300 and 0x302 for the IN and OUT port CRC error counts.

  • Hi Jianyu,

    Apologies for the delayed response; I was out for a couple of days. Could you share the complete Wireshark logs for this scenario?

    But for node 127(AM2432), the crc errors of IN port was only 4. Node 129 (XMC 4800) was 0.

    Do you see the forwarded error counters incrementing for these nodes?

    Regards,
    Aaron

  • As you can see, from node 113-119 (XMC 4800), crc error count of IN ports were all 21. From node 128 - 118, crc error count of OUT ports were 28. But for node 127(AM2432), the crc errors of IN port was only 4. Node 129 (XMC 4800) was 0.

    Can you also share a topology diagram if possible? It is not clear from the screenshot. Are there any EtherCAT junctions or hubs in the network?

  • Hi Jianyu,

    One more clarification: is the ENABLE_MULTIPLE_SM_ACCESS_IN_SINGLE_DATAGRAM macro set to 1? That is, is TX_START_DELAY set to 0x98 from the application?

    Regards,
    Aaron

  • Could you share the complete wireshark logs for this scenario?

    This is at the customer's site. Inserting a sniffer would halt their production, so that is not possible.

    Can you also share topology diagram if possible as it is not clear from screenshot - are there any EtherCAT junctions or hubs in the network

    Yes, there are. The screenshot shows only one side branch from an EtherCAT hub. The following picture shows a general view of the topology; it should be similar but not exactly the same.

    One more clarification I need, is ENABLE_MULTIPLE_SM_ACCESS_IN_SINGLE_DATAGRAM macro set to 1, that is, is the TX_START_DELAY set to 0x98 from the application?

    ENABLE_MULTIPLE_SM_ACCESS_IN_SINGLE_DATAGRAM is 0. 

    This reminds me of something we are testing on our side. When ENABLE_MULTIPLE_SM_ACCESS_IN_SINGLE_DATAGRAM is 1, the AM2432 constantly picks up CRC errors. In this case, however, it is the XMC 4800 that collects the CRC errors.

    So I am wondering whether you could run a test. Say you have 1 master and 9 slaves. The first slave is an intentionally bad node that passes one frame with a random CRC every second. The first 4 slaves after the bad slave use an ET1100 solution like P65, and the last 4 slaves are normal AM2432 devices.

    If my guess is correct, the IN port CRC count of slaves 2-5 would increase by 1 each second, and the CRC count of slaves 6-9 would stay at 0.

  • Hi Jianyu,

    This is at the customer's site, insert a sniffer would halt their production, which would not be possible.

    Understood. Is the customer able to reproduce this issue?

    The following picture shows a general view of the topology. It should be similar but not completely same.

    Does this mean that Nodes 113-119 and Nodes 127-135 are in different branches, that is, the nodes are not all sequentially connected?

    Regards,
    Aaron

  • Does this mean that Node 113 - 119 and Nodes 127-135 are in different branches, that is, all the nodes are not sequentially connected

    They are sequentially connected.

  • Omron uses the yellow lines on the left to display the topology. Nodes 113-119 and Nodes 127-135 are sequentially connected to the X3 port of a second-level splitter. You can see the X4 and X5 ports of that splitter in the image.

  • If my guess is correct, the crc cnt of IN port of slave 2-5 would increase 1 each second, and the crc cnt of slave 6-9 would stay 0.

    Can you clarify which ESC error counters you are referring to here? 0x300 and 0x302 alone, or does it include the forwarded error counters 0x308 and 0x30a as well?

    Which PHY is used in your design with the AM243? Is odd nibble detection enabled in the PHY?

  • Can you clarify which ESC error counters are you referring to here? 0x300 and 0x302 alone or does it include forwarded error counters 0x308 and 0x30a as well?

     0x300 and 0x302 alone

    Which PHY is used in your design with AM243

    DP83826

    Is odd nibble detection enabled in PHY

    Is it recommended to be enabled? I need to check the hardware design

  • 0x300 and 0x302 alone

    These error counters track CRC errors originating at the given ESC.

    0x308 and 0x30a track forwarded errors. If the previous device already detected the CRC error, it appends an odd nibble after the FCS of the packet, and the ESC then counts the error as a forwarded CRC error here instead. Can you also check these counters on the AM243?

    Is it recommended to be enabled? I need to check the hardware design

    Yes. Bit 1 (Odd-Nibble Detection Disable) needs to be set to 1 so that the PHY forwards the odd nibble to the PRU firmware and the PRU firmware can insert the odd nibble in outgoing packets.

  • Is odd nibble detection enabled in PHY ?

    Disabled

  • Can you check whether the forwarded error counters (0x308 and 0x30a) are incrementing in the AM243 ESC? This is the expected behavior per the spec if a previous SubDevice is introducing errors.
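For anyone checking these registers, the sketch below shows one way to split a raw read of the ESC error counter block into per-port values. It is a minimal illustration, not TI or Omron code. The layout is an assumption based on the common ESC register map (two bytes per port starting at 0x0300, then one forwarded error byte per port starting at 0x0308), so verify it against the register description for your firmware. The function names and the sample bytes are hypothetical.

```python
from __future__ import annotations

# Assumed layout, offsets relative to 0x0300 (verify for your device):
#   0x0300 + 2*port : invalid frame (CRC) counter for that port
#   0x0301 + 2*port : RX error counter (PHY RX_ERR) for that port
#   0x0308 + port   : forwarded error counter for that port
NUM_PORTS = 4
BLOCK_LEN = 12  # 0x0300 .. 0x030B


def decode_error_counters(block: bytes) -> list[dict]:
    """Split a 12-byte read starting at 0x0300 into per-port counters."""
    if len(block) < BLOCK_LEN:
        raise ValueError("need at least 12 bytes starting at 0x0300")
    ports = []
    for port in range(NUM_PORTS):
        ports.append({
            "port": port,
            "invalid_frame": block[2 * port],
            "rx_error": block[2 * port + 1],
            "forwarded": block[8 + port],
        })
    return ports


def classify(counters: dict) -> str:
    """Give a rough interpretation of one port's counters."""
    if counters["invalid_frame"] or counters["rx_error"]:
        return "error first detected at this port"
    if counters["forwarded"]:
        return "error detected upstream and forwarded"
    return "clean"


# Made-up sample bytes for illustration only (not taken from this thread).
raw = bytes([0, 0, 2, 1, 0, 0, 0, 0, 3, 0, 0, 0])
ports = decode_error_counters(raw)
for p in ports:
    print(p["port"], classify(p))
```

A port that shows only forwarded errors points to a fault further upstream, while a non-zero invalid frame or RX error count suggests the error entered on the link directly in front of that port.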

  • Hi,

    I want to update the status of the issue. At the end customer's site, we updated the firmware of the AM2432 drives because of other issues. Surprisingly, the CRC errors were gone after the update. This may or may not be related.

    For now the end customer has accepted this, but we are also unable to reproduce the issue.

    So I would like to ask what the standard behavior is when a faulty node constantly generates CRC errors. When it happens again, we will take a closer look at these registers (0x300, 0x302, 0x308 and 0x30A).

  • Hi Jianyu,

    I want to update the status of the issue. At the end customer's side, we have updated the firmware of AM2432 drives due to other issues. Surprisingly, the CRC error is gone after the update. It might be related, or irrelevant.

    Thank you for the update. Glad that migration to the latest EtherCAT firmware resolved the issue. 

    So I would like to ask like what a standard scenario is, when a fault node constantly generates crc errors. When it happens again, we will take a closer look into these registers (0x300, 0x302, 0x308 and 0x30A)

    0x300 and 0x301 increment on CRC errors and PHY RX_ERR, respectively. The ESC appends an odd nibble to mark the source of the error as it forwards the packet to the next device. As a result, the following node increments its forwarded error counter (0x308/309) instead.

    Regards,
    Aaron
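The behavior described in the last reply can be illustrated with a small simulation of the test proposed earlier in the thread: a faulty node sits upstream of a line of slaves and corrupts frames. This is a simplified sketch, not the PRU firmware or any vendor tool. It models only the IN port in the outbound direction, treats the odd nibble as a boolean flag, and assumes 8-bit counters that stop at 0xFF. The `Node`, `run`, and `locate_first_detector` names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    name: str
    invalid_frame: int = 0  # models 0x300 on the IN port
    forwarded: int = 0      # models 0x308 on the IN port

    def receive(self, crc_bad: bool, marked: bool) -> tuple[bool, bool]:
        """Count the frame on the IN port and return the state it is forwarded with."""
        if crc_bad and marked:
            # An upstream node already flagged this frame: count it as forwarded.
            self.forwarded = min(self.forwarded + 1, 0xFF)
        elif crc_bad:
            # First node to see the bad CRC: count it and mark the frame.
            self.invalid_frame = min(self.invalid_frame + 1, 0xFF)
            marked = True
        return crc_bad, marked


def run(nodes: list[Node], bad_frames: int) -> None:
    """Send frames that were corrupted by a faulty node upstream of nodes[0]."""
    for _ in range(bad_frames):
        crc_bad, marked = True, False
        for node in nodes:
            crc_bad, marked = node.receive(crc_bad, marked)


def locate_first_detector(nodes: list[Node]) -> str | None:
    """Return the first node with a non-zero invalid frame count, if any."""
    for node in nodes:
        if node.invalid_frame:
            return node.name
    return None


# Slaves 2-9 sit behind the faulty slave 1, as in the proposed test.
nodes = [Node(f"slave{i}") for i in range(2, 10)]
run(nodes, bad_frames=5)
for node in nodes:
    print(node.name, node.invalid_frame, node.forwarded)
print("first detector:", locate_first_detector(nodes))
```

Under these assumptions, only the first slave after the faulty node accumulates counts in 0x300, and every slave behind it accumulates counts in 0x308. The fault would then be in the device or cable immediately upstream of the first detector, which is why reading both counter groups helps when locating the bad node.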