AM2432: Omron NX701 set drive back to init if the drive is power cycled

Part Number: AM2432

Hi TI experts,

We are having an issue with an Omron PLC and AM2432 drives, while XMC4800 drives are OK.

The simplified connection is:

Omron NX701 - network analyzer - XMC4800 (adp 2) - AM2432(adp 1)

1. On first power-up / PLC program download, the drives go to OP correctly.

2. When a clear-fault command is issued, nothing happens.

3. The XMC4800 is powered off and on; a clear fault from the PLC sets all drives back to OP.

4. When further clear-fault commands are issued, nothing happens.

5. The AM2432 is powered off (drive_power_off.pcap) and on; a clear fault sets all drives to OP (drive_power_on_reset_to_op.pcap).

6. When a 2nd clear-fault command is issued, the AM2432 is set back to Init (After_power_on_reset_to_Init.pcap).

7. When a 3rd clear-fault command is issued, the AM2432 is set to OP (After_power_on_Init_reset_to_OP.pcap).

8. Steps 6 & 7 can be repeated to reproduce the issue.

9. If the cable between the AM2432 and the XMC4800 is disconnected and reconnected, repeating steps 6 & 7 no longer reproduces the issue.

TI_SDK.zip

So I think the AM2432's connection behavior after a power cycle differs from its behavior after a cable disconnect and reconnect.

The AM2432 code was reduced to the Beckhoff slave stack built from the SDK, and the issue persists.

One notable observation is that when powering off the AM2432, packets are lost for 150 ms.

So we think the PHY might still be active while the PRU is not.
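
One way to check the 150 ms figure against the captures is to export the frame timestamps and look for silent intervals. The sketch below assumes the timestamps are already available as a list of seconds (for example from a capture export); `find_gaps` and the 100 ms threshold are illustrative names and values, not something taken from the SDK or the captures.

```python
from typing import List, Tuple


def find_gaps(timestamps_s: List[float],
              threshold_s: float = 0.100) -> List[Tuple[float, float]]:
    """Return (start_time, gap_length) for every gap longer than threshold_s."""
    ordered = sorted(timestamps_s)
    gaps = []
    # Compare each timestamp with its successor and keep the long gaps.
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur - prev
        if delta > threshold_s:
            gaps.append((prev, delta))
    return gaps


if __name__ == "__main__":
    # Hypothetical timestamps: 1 ms cycle with a hole after t = 0.003 s.
    sample = [0.000, 0.001, 0.002, 0.003, 0.153, 0.154]
    for start, length in find_gaps(sample):
        print(f"gap of {length * 1000:.1f} ms starting at t={start:.3f} s")
```

Comparing the gap positions for the power-cycle case and the cable-reconnect case may show whether the two really differ in how long the link stays silent.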

Now the PHY is reset when an AC loss fault is detected (this does not fix the Omron issue).

A PHY reset is also applied on the first Init to PreOP transition (which could fix the issue).

However, we are wondering whether the reset can be done at power-up instead (tested, but not working), and whether anyone can explain the fundamental difference that causes the issue.

Thanks.

  • Hi,

    So I think the AM2432's connection behavior after a power cycle differs from its behavior after a cable disconnect and reconnect.
    • Can you give more details on the DL Status in Steps 6 & 7? We will have to see whether the DL Status shows an active link or the PHY goes into a reset state when the device goes to INIT, since you mentioned that the issue is resolved after reconnecting the cable.

    How is the PHY reset logic implemented in the SubDevice? The recommended sequence is to do a PHY reset only once, during startup. Note that a PHY reset is an expensive step, and doing it at an unintended time can cause a link drop.

    A PHY reset is also applied on the first Init to PreOP transition (which could fix the issue).
    • This is not recommended and can be problematic because of the scenario mentioned above. The PHY reset should be done before stack initialization or any state transition, as the EtherCAT MainDevice expects the SubDevice to be responsive during a state transition, and a reset can break the link.
    One notable observation is that when powering off the AM2432, packets are lost for 150 ms.
    • This is expected, keeping in mind that PHY initialization and EtherCAT firmware initialization take some time before a valid link is established. This can be confirmed by logging timestamps for the PHY reset and initialization, the first successful MDIO communication, EtherCAT stack initialization, etc.
    However, we are wondering whether the reset can be done at power-up instead (tested, but not working)
    • Yes, this is the recommended sequence. By "not working", do you mean the link is not established, or the link is established but the SubDevice won't transition to OP (step 6)?

    Regards,
    Aaron 

  • Hi Aaron,

    Sorry for the late reply; the customer spent a lot of time debugging this. Let me share the update and answer your questions here.

    • How is the PHY reset logic implemented in the SubDevice? The recommended sequence is to do a PHY Reset only once during the startup. Note that PHY reset is an expensive step and doing this at un-intended time can cause a link drop.

    A hard reset was not performed; only a soft reset was attempted. We tried modifying the pull-down configuration and pulling high during initialization, but both methods resulted in OP drop issues. The solution was to perform a soft reset once after the first entry into OP mode, which significantly reduced link drops during subsequent PLC resets.

    • Yes this is the recommended sequence. By not working, do you mean the link is not established or the link is established but the SubDevice won't transition to OP (step6)? 

    After power-on, the first PLC reset brings the drive into OP, but triggering the PLC reset a second time causes it to fall back to Init.

    Based on the packet capture analysis, the PLC sends a command to transition the slave to the INIT state, but no subsequent actions follow. In contrast, the customer's other drives, using TI's DSP or the XMC4800 microcontroller, do not receive this INIT command from the PLC. The PLC reports a device WDT error specifically for the AM2432 model. Only the AM2432 has this problem.

    Could you provide some more suggestions for the customer, please?

    Thanks,

    Kevin

  • Hi Aaron,

    Adding more details after further debugging.

    There is currently a peculiar phenomenon. We have observed that the issue is somewhat related to the first node. If all first nodes are restarted simultaneously, the OP state loss does not occur. However, if the first node is not restarted, the system enters OP normally after the first PLC reset, but OP is lost during the second PLC reset.

    Thanks,

    Kevin

  • Hi Kevin,

    The solution was to perform a soft reset once after the first entry into OP mode, which significantly reduced link drops during subsequent PLC resets.
    • I understand this workaround helps with this issue, but as mentioned earlier, it is not recommended, as packets can be dropped if a soft reset is done during SAFEOP to OP.
    We have observed that the issue is somewhat related to the first node.
    • Is the first node AM2432 running EtherCAT SubDevice?
    If all first nodes are restarted simultaneously, the OP state loss does not occur.
    • You mean with this, the AM2432 doesn't fall back from OP to INIT after multiple PLC resets? Looks like the first device is not starting up correctly? And is the reset logic different in AM2432 as compared to other nodes?

    Regards,
    Aaron

  • Hi Aaron,

    • Is the first node AM2432 running EtherCAT SubDevice?

    We have tried several different models as the first node; the result is the same in every case.

    You mean with this, the AM2432 doesn't fall back from OP to INIT after multiple PLC resets? Looks like the first device is not starting up correctly? And is the reset logic different in AM2432 as compared to other nodes?

    Yes. If the first node is power-cycled together with the other AM2432 nodes, the AM2432 doesn't fall back from OP to INIT after multiple PLC resets.

    However, if the first node is not restarted, the system enters OP normally after the first PLC reset, but OP is lost during the second PLC reset. Only the AM2432 in the network loses OP. The other drives test OK.

    Today we tested an AM243x LaunchPad with TI's Beckhoff SSC demo project in the network. It also has this issue.

    So it does not seem to be related to the PHY reset.

  • I would like to add an explanation.

    "PLC reset" here refers to the "Reset All" button in Sysmac, the PLC's host PC software. It is something like clearing all faults.

  • Hi

    We have now characterized the problem better, and we see the same issue with the AM243x-LP running the "ethercat_slave_beckhoff_ssc_demo" example.

    The way to reproduce the problem is as follows:

    1. Use at least 2 nodes, where the second node runs on AM243x with TI's EtherCAT stack (the first node can be the same but doesn't have to be), connected to an OMRON NX701 controller.

    2. Power up the nodes and set up the network - all nodes are in OP state.

    3. Power cycle the second node (LP or our drive, depending on the test setup).

    4. Click the "Reset All" button in the OMRON Sysmac software - both nodes go to OP, and all errors in the controller log are cleared.

    5. Click the "Reset All" button again - only the second node falls from OP to INIT, and a Process Data Communication Error appears.

    6. Click the "Reset All" button again - the node goes to OP, and all errors in the controller log are cleared.

    Steps 5 and 6 then repeat indefinitely.

    The problem can be fixed in one of two ways:

    1. Disconnect and reconnect the ECAT cable at the second node's IN port.

    2. Power cycle the first node.

    I have attached a dump of the ESC registers (0x30010000 - 0x30010ECF) from the same drive, once when the issue is "fixed" and once when the issue occurs.

    Could you please help identify if there are values you suspect can cause this issue based on the description above?

    Thank you,

    Sahar Schwartz 

    0x30010000_0x30010ECF_no_issue.dat
    0x30010000_0x30010ECF_with_issue.dat
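
A byte-wise comparison of the two dumps can narrow down which ESC registers differ between the two cases. The following is a minimal sketch that assumes the `.dat` files are raw binary images starting at 0x30010000 (the actual file format is not stated in the post); `load`, `diff_dumps` and `format_diff` are hypothetical helper names.

```python
ESC_BASE = 0x30010000  # start address of the dumps, per the attachment names


def load(path: str) -> bytes:
    """Read a dump file as raw bytes (assumption: the file is a raw image)."""
    with open(path, "rb") as f:
        return f.read()


def diff_dumps(good: bytes, bad: bytes):
    """Return (esc_offset, good_byte, bad_byte) for every differing byte."""
    return [(off, g, b)
            for off, (g, b) in enumerate(zip(good, bad))
            if g != b]


def format_diff(diffs, base: int = ESC_BASE):
    """Render each difference with its absolute address and ESC offset."""
    return [f"0x{base + off:08X} (ESC 0x{off:04X}): 0x{g:02X} -> 0x{b:02X}"
            for off, g, b in diffs]
```

Usage would be along the lines of `format_diff(diff_dumps(load(no_issue_path), load(with_issue_path)))`. Counters and time registers will differ between any two dumps, so the output still needs manual filtering.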

  • Hi Sahar,

    Comparing the Wireshark logs for the no-issue and with-issue cases, I see that the 0x0442 register value is 1 (PD watchdog expired) in the failing case, whereas in the working case it's 0.

    0x0443 (PDI watchdog expiry count) is incremented in both cases. Apart from this, I don't see any other issues with the ESC registers.
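
The two watchdog values mentioned above can be pulled out of a register dump directly. This sketch assumes the dump is a raw byte image whose index 0 corresponds to ESC offset 0x0000; the register meanings in the comments follow the descriptions given in this reply, and the function names are hypothetical.

```python
PD_WDT_REG = 0x0442         # PD watchdog value discussed in the reply above
PDI_WDT_COUNT_REG = 0x0443  # PDI watchdog expiry count


def read_watchdog(dump: bytes) -> dict:
    """Extract the two watchdog bytes from a raw ESC register image."""
    if len(dump) <= PDI_WDT_COUNT_REG:
        raise ValueError("dump too short to contain offset 0x0443")
    return {
        "pd_watchdog": dump[PD_WDT_REG],
        "pdi_watchdog_count": dump[PDI_WDT_COUNT_REG],
    }


def compare_watchdog(good: bytes, bad: bytes) -> dict:
    """Return {name: (good_value, bad_value)} for values that differ."""
    g, b = read_watchdog(good), read_watchdog(bad)
    return {name: (g[name], b[name]) for name in g if g[name] != b[name]}
```

Taking several dumps across consecutive "Reset All" clicks and comparing them this way would show whether the 0x0442 value changes exactly when the node drops to INIT.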

    Additionally, we want to understand, from the PLC's point of view, which type of frame the PLC sends when Reset All is done and how the ESC handles that frame/datagram. For that, could you share the Wireshark logs for Step 4 to Step 6?

    Regards,
    Aaron 

  • 79_clearfault_drives_lose_OP.rar

    Hi Aaron,

    I'm trying to find differences between the AM2432 and other types of slaves in the EtherCAT packets before the master sends AL Control INIT to the AM2432.

    The attached file contains the sniffer capture from the second PLC reset. Slave 0x4F is the AM2432.

    There are indeed some differences, but I'm not sure which ones affect the result.

    1. The AM2432 uses 8-byte access and PDI-emulated EEPROM, while the others use a 2-byte address.

    2. The AM2432 has no MII ext. link detection bit in ESC features.

     ...

    I wonder if we can change these ESC register values?

    It seems the master reads 0xA, 0xC, 0xE, 0x40 and 0x910 and then sends PREOP to the other, normal drives, but sends FPWR 0x600 and INIT to the AM2432.

    Could you please help to look at the sniffer packets?

     

  • 3515.79_clearfault_drives_lose_OP.rar

    Hi Aaron,

    I'm trying to find differences between the AM2432 and other types of slaves in the EtherCAT packets before the master sends AL Control INIT to the AM2432.

    The attached file contains the sniffer capture from the second PLC reset. Slave 0x4F is the AM2432.

    There are indeed some differences, but I'm not sure which ones affect the result.

    1. The AM2432 uses 8-byte access and PDI-emulated EEPROM, while the others use a 2-byte address.

    2. The AM2432 has no MII ext. link detection bit in ESC features.

    I wonder if we can change these ESC registers' values?

    It seems the master reads 0xA, 0xC, 0xE, 0x4 and 0x910 and then sends PREOP to the other, normal drives, but sends FPWR 0x600 and INIT to the AM2432.

    Could you please help to analyze the sniffer packets?

    3. In the AM2432's 0x141 (PDI control), the Enhanced Link Detection, Enable DC SYNC Out and Enable DC Latch In bits are not set.
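
To locate the exact frame in which the master requests INIT, the capture can be scanned for configured-address writes (FPWR) to the AL Control register. The sketch below is a hedged illustration: it assumes the input is the raw EtherCAT payload (the bytes following the 0x88A4 EtherType), uses the standard datagram layout (10-byte header, data, 2-byte working counter), and takes 0x0120 as AL Control with the usual state codes. The function names are hypothetical, and extracting the payload bytes from the capture file is left out.

```python
import struct

CMD_NAMES = {1: "APRD", 2: "APWR", 4: "FPRD", 5: "FPWR",
             7: "BRD", 8: "BWR", 10: "LRD", 11: "LWR", 12: "LRW"}
AL_CONTROL = 0x0120
AL_STATES = {1: "INIT", 2: "PREOP", 3: "BOOT", 4: "SAFEOP", 8: "OP"}


def parse_datagrams(payload: bytes):
    """Split an EtherCAT payload into its datagrams."""
    (frame_hdr,) = struct.unpack_from("<H", payload, 0)
    end = min(2 + (frame_hdr & 0x07FF), len(payload))  # 11-bit length field
    pos, more, out = 2, True, []
    while more and pos + 10 <= end:
        cmd, idx, adp, ado, len_flags, irq = struct.unpack_from(
            "<BBHHHH", payload, pos)
        length = len_flags & 0x07FF          # data length in bytes
        more = bool(len_flags & 0x8000)      # "more datagrams follow" flag
        data = payload[pos + 10:pos + 10 + length]
        out.append({"cmd": CMD_NAMES.get(cmd, str(cmd)),
                    "adp": adp, "ado": ado, "data": data})
        pos += 10 + length + 2               # header + data + working counter
    return out


def al_control_requests(datagrams):
    """Return (station_address, requested_state) for FPWR writes to 0x0120."""
    hits = []
    for d in datagrams:
        if d["cmd"] == "FPWR" and d["ado"] == AL_CONTROL and d["data"]:
            state = d["data"][0] & 0x0F      # low nibble holds the state
            hits.append((d["adp"], AL_STATES.get(state, hex(state))))
    return hits
```

Listing the datagrams addressed to station 0x4F in the frames just before the INIT request, next to the same window for a working drive, would make the divergence in the master's sequence easier to spot.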

  • Hi Aaron,

    One more question concerns the system time of the AM2432 nodes.

    Before the master sends INIT to the AM2432, it reads 0x910 (DC system time).

    It shows that the AM2432's DC system time is a bit larger.

    Will this have an impact?
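
To put a number on "a bit larger", the values read from 0x910 can be decoded and subtracted. This is a small sketch under two assumptions: the register data is little-endian, and the value counts nanoseconds in either a 32-bit or 64-bit field. Both function names are hypothetical.

```python
def decode_systime(raw: bytes) -> int:
    """Decode the data bytes of a 0x0910 read into an integer (assumed ns)."""
    if len(raw) not in (4, 8):
        raise ValueError("expected 4 or 8 bytes of system time data")
    return int.from_bytes(raw, "little")


def systime_delta_ns(a: bytes, b: bytes) -> int:
    """Return decode(a) - decode(b), i.e. how far node A is ahead of node B."""
    return decode_systime(a) - decode_systime(b)
```

Since each node sees the frame at a slightly different moment, a small offset between nodes is not necessarily meaningful by itself; a difference that stands out compared with the working drives is the more interesting signal.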

  • Hi Aaron

    Attached are Wireshark logs for the case where "Reset All" causes the fall to INIT and the case where it doesn't.

    In addition, I attached the PHY register values of the two drives in the chain, one in a "good" condition and the other in a "bad" condition.

    Sahar 

    wireshark_rec.zip
    PHY_Compare.xlsx

  • In the Wireshark recording you can find 2 nodes.

    The first node is 0x002c, and the second, the one that falls to INIT, is 0x000d.

  • Hi

    I just found out that ESC register 0x141 is 0 even though we enable enhanced link indication.

    Once I write 1 to this register, the problem seems to be solved.

  • Hi Sahar,

    Once I write 1 to this register, the problem seems to be solved.
    • So you're saying if you set 0x0110.bit2, then the AM243x does not go to INIT during "Reset All" issued from the Omron PLC?
  • I uploaded the wrong picture. The register is 0x0141.

  • But we are enabling enhanced link indication.

    Shouldn't the value of the register be 0x1 in the first place?

  • Hi Sahar,

    0x0141 is updated by the EEPROM:

    Snapshot taken from bsp_eeprom_load_esc_registers() API in tiescbsp.c:

    The contents of eeprom_cache[1] are written to the 0x0141 ESC register.

  • I actually misled you yesterday, sorry about that.

    I intended to write 1 to bit 1, but in fact, I wrote 1 to bit 0. So this is not related to Enhanced Link Detection.

    It’s actually related to the behavior described as: "AL Status will be set to the value written to the AL Control register."

    Any idea why this would make a difference?
    In addition, we recorded the AL Status and AL Control values, and they don't seem to change together, contrary to what the description of bit 0 suggests.
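
Because the bit positions in 0x0141 were easy to mix up here, a small decoder can help when checking a value before writing it. The bit meanings below are assumptions: bits 0 and 1 follow the descriptions quoted in this thread, and bits 2 and 3 follow the standard ESC register description for the DC SYNC Out and Latch In units. `decode_esc_config` and `set_bit` are hypothetical helpers.

```python
ESC_CONFIG_BITS = {
    0: "Device emulation (AL Status follows AL Control)",
    1: "Enhanced Link Detection (all ports)",
    2: "DC SYNC Out unit enabled",
    3: "DC Latch In unit enabled",
}


def decode_esc_config(value: int):
    """Return the names of the bits set in an ESC 0x0141 value."""
    return [name for bit, name in sorted(ESC_CONFIG_BITS.items())
            if value & (1 << bit)]


def set_bit(value: int, bit: int) -> int:
    """Return value with the given bit set, limited to one byte."""
    return (value | (1 << bit)) & 0xFF
```

With this mapping, writing 1 to the register sets bit 0 (device emulation), while Enhanced Link Detection would require the value 0x02, which matches the correction above.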

  • Hi Sahar,

    We will review the firmware behavior for ESC Reg0x141.bit0 = 1 and get back to you with more details. 

    Regards,
    Aaron

  • In addition, we recorded the AL Status and AL Control values, and they don't seem to change together, contrary to what the description of bit 0

    This makes sense: at present the FW does not use 0x141.bit0, and its value is a don't-care. Unfortunately, we have no good explanation for why it helps. Do you need to write to 0x141 specifically to see the improvement, or does a write to any other register help as well?

  • I didn't try other registers.

    I did try writing to register 0x141 bit 1, and that didn't help. In addition, when I set bit 0 back to 0, the issue returns.

    I also tried writing 0 a few consecutive times, and that didn't help.

  • Hi Team,

    You have provided the ESC Register space earlier. In addition to that, could you provide the complete ICSS memory dump for the non-working case and working case (when 0x141.bit0 = 1)? I'm referring to the following Registers in the AM64x/AM243x Processors - Technical Reference Manual:

    Regards,
    Aaron