AM335x TwinCAT DCs

Other Parts Discussed in Thread: AM3359

Dear Sirs,


For quite some time we have had problems getting the DC Sync0 pulses to start reliably on our AM3359 controllers with TwinCAT. Sometimes one of n controllers does not output any pulses on Sync0. Only a cold start of that controller removes the problem; TwinCAT cannot reinitialize the controller so that the Sync0 pulses work again. OP mode is reached and works, and we see the problem only indirectly, when a second processor does not receive any DC Sync0 pulses. But why are the pulses not generated?

Today I found some threads here and wonder whether they describe similar problems.

When such an error occurs, I will now check the following registers:

ECT_REG_DCSYSTIME (0x0910)
ECT_REG_DCSTART0 (0x0990) (in your case it should be somewhere around the previously read value + 1_000_000)
ECT_REG_DCCYCLE0 (0x09A0) (i.e. 1.000.000)

0x982:0x983, the Sync Impulse Length
ECT_REG_DCSYNCACT (0x0981)

Could it be a problem with LRW or LRD / LWR?

The difficulty is that this problem occurs very rarely for us, but very often at one of our customers...

Can you give me any tips on this problem? Currently we are using Ind/SDK v01.01.01.01 with its PRUSS firmware. But there is also the PRUSS firmware v1.0 (for AM3359) from Ind/SDK v2.1.1.2. Which one is the correct one?

Best Regards

Frank Gottschalt

  • The Industrial team has been notified. They will respond here.
  • With my own test stand I have now reproduced one such failure situation.

    The registers

    ECT_REG_DCSYSTIME (0x0910)
    ECT_REG_DCSTART0 (0x0990)
    ECT_REG_DCCYCLE0 (0x09A0)
    0x982:0x983, the Sync Impulse Length
    ECT_REG_DCSYNCACT (0x0981)

    seem to be OK, the same as on the correctly running controllers.

    But

    ECT_REG_DCCTRLERR (0x092c:0x092e) is showing fluctuating values, some of them greater than 1000!

    // The following test function reports an error when the absolute
    // system time difference is 1000 or more:
    EC_RET_T ec_dcGetError(void)
    {
        UNSIGNED16 dcactivation_status;
        UNSIGNED16 dc_ctrl_L;
        UNSIGNED16 dc_ctrl_H;
        UNSIGNED16 dc_delay_L;
        UNSIGNED16 dc_delay_H;
        INTEGER32 dcSystemTimeDifference;
        INTEGER32 dcSystemTimeDelay;

        // Read the control error and the system time delay as 16-bit halves
        dc_ctrl_H = ESCREG_READW(DcCtrlError.dcctrl_error_H);
        dc_ctrl_L = ESCREG_READW(DcCtrlError.dcctrl_error_L);
        dc_delay_H = ESCREG_READW(DcCtrlError.dcsys_time_delay_H);
        dc_delay_L = ESCREG_READW(DcCtrlError.dcsys_time_delay_L);

        // Read for inspection in the debugger only; not evaluated below
        dcactivation_status = (UNSIGNED16)ESCREG_READW(DcInterrupt.DCactivationStatus);
        (void)dcactivation_status;

        // The difference is sign and magnitude: bit 15 of the high word is the sign
        dcSystemTimeDifference = ((INTEGER32)dc_ctrl_L & 0x0FFFF) | ((INTEGER32)(dc_ctrl_H & 0x07FFF) << 16);
        dcSystemTimeDelay = ((INTEGER32)dc_delay_L & 0x0FFFF) | ((INTEGER32)dc_delay_H << 16);
        if ((dc_ctrl_H & 0x08000) == 0x08000) {
            dcSystemTimeDifference *= -1;
        }

        // Error: large difference with a delay set, or any difference without a delay
        if (((labs(dcSystemTimeDifference) >= 1000) && (dcSystemTimeDelay != 0)) ||
            ((dcSystemTimeDelay == 0) && (dcSystemTimeDifference != 0)))
        {
            return(RET_ERROR);
        }
        return(RET_OK);
    }

    I have no idea how to clear this situation without a cold start (the only thing that seems to help, but this is not acceptable for our customer)!
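The sign-and-magnitude decoding used in ec_dcGetError above can be exercised in isolation, without hardware. The following is a minimal sketch, assuming the layout that function implies (bit 15 of the high word is the sign, the remaining 31 bits are the magnitude). The names decode_sign_magnitude, dc_error_suspicious and decode_self_check are hypothetical and not part of any SDK or stack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode a 32-bit sign-and-magnitude value delivered as two 16-bit words.
 * Assumed layout: bit 15 of the high word is the sign, the other 31 bits
 * are the magnitude. */
int32_t decode_sign_magnitude(uint16_t low, uint16_t high)
{
    uint32_t magnitude = ((uint32_t)(high & 0x7FFFu) << 16) | (uint32_t)low;
    int32_t value = (int32_t)magnitude;   /* at most 0x7FFFFFFF, always fits */
    return (high & 0x8000u) ? -value : value;
}

/* Hypothetical plausibility rule mirroring the thresholds in ec_dcGetError:
 * returns 1 when the difference/delay pair looks suspicious, 0 otherwise. */
int dc_error_suspicious(int32_t difference, int32_t delay)
{
    if (delay == 0) {
        return difference != 0;           /* no delay set: any difference is odd */
    }
    return labs((long)difference) >= 1000;
}

/* Usage sketch with hand-picked register words. */
void decode_self_check(void)
{
    assert(decode_sign_magnitude(0x03E8u, 0x0000u) == 1000);
    assert(decode_sign_magnitude(0x03E8u, 0x8000u) == -1000);
    assert(dc_error_suspicious(-1000, 120) == 1);
    assert(dc_error_suspicious(999, 120) == 0);
}
```

Keeping the decoding separate from the register reads makes it possible to test the thresholds on a PC before trusting the values seen on the target.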
  • (Video: correctly working DC on a DC slave in the same bus)

    (Video: faulty DC on one controller in the bus, also a DC slave)

    Do you know what is happening here?

  • I have now tested several things:

    restarting TwinCAT,
    unplugging and replugging the bus cables,
    changing the OP modes,
    setting some DC register values manually

    - nothing seems to fix this failure situation.

    The only remedy is a cold start of this controller. (But with warm/cold starts we see a separate effect: while rebooting, the ROM bootloader loads wrong program data for the second-stage bootloader. Perhaps the SPI reinitialization fails here, or the SPI flash chip is going bad. We want to check all signals after the summer holiday period. As long as we have this issue, we cannot use a software-triggered cold/warm start as a workaround for this problem.)
  • Hi Frank, thanks a lot for your comprehensive information, let me digest it and come back to you soon.
    Paula
  • Hi Frank, I will work with the developers in trying to reproduce the issue, however, in the meantime I was wondering if you can give a try to EtherCAT master firmware from latest Industrial SDK (downloads.ti.com/.../index_FDS.html) ?

    Some bugs for EtherCAT were fixed in this release:
    processors.wiki.ti.com/.../SYSBIOS_Industrial_SDK_02.01.02_Release_Notes
    There is one related to slave DC mode issues with the Acontis master stack.

    Also for your reference (in case you want to see what has been fixed between firmware versions): processors.wiki.ti.com/.../AM335x_EtherCAT_Slave_Errata

    Thank you,
    Paula
  • Hello Paula,

    1. I have now tested TwinCAT's LRD/LWR mode instead of LRW on all AM3359 controllers in the bus. The DC errors are still there.

    2. OK, I will now try the EtherCAT PRUSS firmware (v1.0 for AM335x) from Ind-SDK v2.1.2.2.

    3. In addition, I will compare all stack<->PRUSS register interactions between a normal case and an error case. Maybe there is a timing issue during the transition from INIT to OP mode. I have created a logging module for all register interactions, so if I find something wrong on our side, I can fix it. But if the cause lies between the PRUSS and TwinCAT, I will need your help.

    Best Regards
    Frank
  • Hello Paula,

    The second point, testing SDK v2.1.2.2, is very difficult for me because TI changed large parts of the driver interface and other OS drivers in SDK v2.0. That is why I have so far used the last v1 SDK (v1.1.1.1) instead of SDK 2.0.

    I will now port the PRUSS handling and PRUSS driver changes from SDK v2.0 into my project. This may take some time.

    In addition, we have another big problem at our customer. In some cases, after the machine is switched on, EtherCAT Port B of one of our controllers does not establish a link to the controller behind it.



  • Hi Frank, I am wondering if you have had the chance to finish porting/testing the latest EtherCAT slave firmware? Also, could you please share more details about your second issue? Maybe in a new E2E thread, so we don't mix issues, which can create confusion.

    thank you,
    Paula
  • Hello Paula,

    After 3 long, hard days of coding, I have just finished porting SDK 2.1.2.2 to our AM3359 board. Phew, that was crazy! TI has changed a great deal in the SDK driver interface.


    At one point there is an error in the SDK: the PRUSS pin muxing is no longer board specific in the SDK, but our board needs a different pin muxing.

    tiescutil.c:

    line 578: PRUICSS_pinMuxConfig(pruIcss1Handle, 0x00);   // PRUSS pinmuxing

    Here I had to call this function in a board-specific way, as in older SDKs, for example like this:

    PRUICSS_pinMuxConfig(pruIcss1Handle, (board == BOARD_TYPE_IDK) ? 0x01 : 0x00);   // PRUSS pinmuxing

    Now the current PRUSS firmware and driver are running on our boards.


    1. It seems the problems with closed ports/links are gone. Communication looks very good in the early tests; tests at our customer will follow. Maybe the communication/link problems from my last post are solved by this new SDK and its drivers. Thank you for this!

    2. The DC Sync0 problem still seems to be there, but it is rarer! I will now capture such an error and check whether the DC registers show the same behavior as with the older SDK.


    I will now test the new SDK and give you more information.

    Best Regards

    Frank

  • Correction to my last post: I am now using SDK v2.1.2.2!
  • Hello Paula,

    1. The second issue (closed ports/links after a while) has gone away with the new SDK v2.1.2.2. (Yes, I know, it is not good to mix two different issues in one thread.)

    2. But the DC Sync0 problem still occurs from time to time on one or more AM3359 controllers in the EtherCAT line.

    - The problem is rarer than with SDK v1.1.1.1, but it is still present.

    - The DC registers now show a different picture.

    - The large fluctuating DCCtrlError values are gone. DCCtrlError now shows good, small values.

    - But on the 'bad' controller, which is not generating Sync0 pulses, the registers 0x0990:0x0992 (DC StartTime0 L) show a frozen value, while the 'good' controllers show changing values in these registers!! That is the difference in the registers I have seen between good and bad ones.

    The correctly operating controller:

    (three videos)

    The faulty controller without Sync0 pulses:

    (three videos)

    Best Regards

    Frank

  • Hi Frank, we haven't been able to reproduce the issue. We used four AM3 boards in DC mode and 10,000 frames/sec. We checked for ~30 minutes.

    A few requests/questions:
    - Could you share more details of your configuration?
    - Typically, how much time does it take until boards start showing DC Sync0 problems?
    - Typically, how many boards fail? How big is the bus?

    Thank you,
    Paula
  • Hello Paula,

    The issue is always a boot/initialization effect! Once all controllers are running in OP and generating Sync0 pulses, everything is fine for hours, maybe days (or years)! But our customer and we are hunting an issue that occurs directly after powering on a machine. Then, in about 1 of 10 to 20 power-ups, one or sometimes two of the controllers are not synchronized because they generate no Sync0 pulses (without any error in TwinCAT). The only way to clear the issue is to power the machine down and up again. That is why our customer does not accept this as a usable product.

    I have 5 AM3359 controllers as a test line.

    Our customer has 7 of our controllers and 13 other slave devices in the line. We see nearly the same error rate.

    Now, with SDK v2.1.2.2, the issue occurs in maybe 1 of 10 to 20 power-ups of the controllers, with TwinCAT running the whole time.

    When one of our controllers shows this behavior, it does so directly after DC initialization and the transition to OPERATIONAL by our TwinCAT 3.1. The issue then persists until the controller gets a warm reset or cold start! We can detect the loss of Sync0 pulses with additional hardware and signal it via an LED and/or a CoE-readable object. That makes it easier to detect the issue.

    Maybe you could monitor registers 0x990:0x992 for changes. If these registers do not change their values after going to OP, you may have caught the issue. To catch the issue, you have to switch all slave controllers off and on again repeatedly.

    Our EtherCAT PDO cycle time is 500 µs. The DC Sync0 period is set to 9000 µs (yes, an uncommon value). Could this high value be a reason for the issue? I will set it to 500 µs and test again!

    Best regards

    Frank

  • PS:

    - I have also detected the issue on the first controller (the DC reference clock master), just as on the DC slaves.

    - The issue occurs with a 9000 µs Sync0 period and also with a 500 µs period (the same as the PDO cycle time of 500 µs).

    The screenshots show a controller with the issue (0x910:0x912: running system time; 0x990:0x992: frozen DC StartTime0).

    When the issue occurs, the DC StartTime0 register stays frozen and shows a time close to the moment TwinCAT initialized OP/DC. DC SysTimeL keeps running with the current time.

    On the other controllers, without the issue, the DC StartTime0 register keeps changing, with time values close to DC SysTimeL.
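The symptom described above (system time at 0x910 running, StartTime0 at 0x990 frozen) lends itself to an automated check. The following is a minimal sketch, assuming the low 32 bits of both registers are sampled twice by whatever register access the stack provides and that the values are in nanoseconds. The type dc_sample_t and the function sync0_start_time_frozen are hypothetical names, and the rule that StartTime0 must change within one Sync0 cycle is based only on the behavior observed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One snapshot of the two DC registers (low 32 bits each, assumed ns). */
typedef struct {
    uint32_t sys_time_ns;    /* DC system time, register 0x0910 */
    uint32_t start_time_ns;  /* DC StartTime0, register 0x0990  */
} dc_sample_t;

/* Returns true when StartTime0 did not move between two snapshots that
 * are at least one Sync0 cycle apart. Snapshots taken closer together
 * than one cycle are treated as inconclusive and return false. */
bool sync0_start_time_frozen(dc_sample_t first, dc_sample_t second,
                             uint32_t cycle_ns)
{
    assert(cycle_ns > 0u);

    /* Unsigned subtraction stays correct across a 32-bit wraparound. */
    uint32_t elapsed_ns = second.sys_time_ns - first.sys_time_ns;
    if (elapsed_ns < cycle_ns) {
        return false;        /* too early to judge */
    }
    return second.start_time_ns == first.start_time_ns;
}
```

With the 9000 µs Sync0 period mentioned in this thread, cycle_ns would be 9000000; taking the two snapshots a few cycles apart after the transition to OP leaves some margin. Such a check could feed the same LED or CoE object that the external pulse detector drives.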

  • Hi Frank, we haven't been able to reproduce the issue. Below is what we tried:

    Options checked:

    Hard reset ~30 times

    Soft reset ~50 times

    Disconnecting the In-Port of slave1 ~30 times

     

    We did not see the issue (constant values at register 0x0990 and 0x0992).

     

    Additional information:

    4 AM3 boards in the network.

    10,000 frames/sec and 2000 frames/sec

    We are wondering what the differences are between your controller (with AM335x) and our boards; if you can help us with this information, we would appreciate it. The same applies if you think we are missing anything in our setup that would help us reproduce the Sync0 issue you observed.

    thank you,

    Paula

  • Hello Paula,

    We use TLK106 PHYs, and our connection layout is routed like your old IDK board.

    Because of this, I have to call the PRUSS init function

    PRUICSS_pinMuxConfig(pruIcss1Handle, 0x01);


    with the value '1'. Maybe some difference like this is what triggers the issue? Your AM3 boards are more like the ICE V2, right?


    We are also using a different EtherCAT stack. I am now looking for differences in the stack<->PRUSS register interactions between the successful case and the failing case. That currently takes some time. If I find no such difference, I think the stack is not involved in this issue.

    Would it be possible to send you some of our controllers for testing? Maybe you could connect to the JTAG interface to see what the PRUSS is doing. In which country are you testing?

    Best Regards

    Frank

  • Hi Frank, I am located in US, we have our protocol developers in India and also a team for industrial support in Germany. I will discuss internally where it would make more sense for you to send your board.
    You are correct, we are using ICEv2 for testing, I also tried AM437x IDK.. We are also currently checking that we don't have any time stamp overflow which could cause the issue.

    Thank you,
    Paula
  • Hello Paula,

    To show you the issue on our controllers, we could give you remote access to my PC, with a JTAG-connected controller in the failing state, so that you can look inside the AM335x and its PRUSS. We are discussing whether we can do this with TeamViewer or Skype. We also have to check whether our company allows it. But I think it would be a possibility.

    If you are interested, please let me know.

    Best Regards

    Frank

  • Frank, we have been working on reproducing the issue. I am collecting information from the team which is working on this and I will give you an update soon.

    Thank you,

    Paula

  • Hi Frank, we are currently testing a potential fix in the firmware. We will keep you informed

    Thank you,

    Paula

  • Frank, could you please try the attached EtherCAT firmware (build 0x035E) in your setup and let us know if you still see the SYNC0 lost issue?

    (removed link to firmware)

    Let me know if  you have any question. thank you,

    Paula

  • Hello Paula,


    I have now tested your firmware for several hours and more than 150 power on/off cycles, and I have not detected the issue a single time. You made our day! Good job! Thanks! The issue seems to be gone.

    Best Regards

    Frank
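How convincing a run of clean power cycles is can be estimated with simple arithmetic. The following is a minimal sketch, assuming each power-up fails independently with a fixed probability; the function name prob_all_pass is hypothetical. Using the lower rate reported earlier in this thread (about 1 failure in 20 power-ups), the chance of 150 clean cycles in a row by luck alone is roughly 0.05%, whereas 30 clean cycles would still happen by chance in roughly 21% of attempts.

```c
#include <assert.h>

/* Probability that n independent power cycles all pass when each cycle
 * fails with probability p_fail. Computed by repeated multiplication so
 * that no math library is needed. */
double prob_all_pass(double p_fail, unsigned n)
{
    assert(p_fail >= 0.0 && p_fail <= 1.0);

    double result = 1.0;
    for (unsigned i = 0; i < n; ++i) {
        result *= 1.0 - p_fail;
    }
    return result;
}
```

For example, prob_all_pass(0.05, 150) gives the 150-cycle figure and prob_all_pass(0.05, 30) the 30-cycle figure. The independence assumption is a simplification; if failures cluster on particular boards or temperatures, the real numbers will differ.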

  • Thanks a lot for the confirmation Frank!
    Best regards,
    Paula
  • Hi Frank, just to let you know that we are still testing the firmware we shared. So please take it as an engineering drop and not as a production drop. We will let you know when it is officially released.

    thank you,
    Paula
  • Hello Paula,

    I understand. Nevertheless, we are letting one of our customers test this firmware as part of our firmware to show the ongoing progress. We have informed the customer about the beta status of this firmware.

    I'll wait for the next release of ind-sdk to build a release version.

    Thanks

    Frank

  • Hello Paula,

    I have a question about the current IndSdk v02.01.03.02.

    PINDSW-925 EtherCAT: SYNC0 lost (0x2c) error seens occasionaly when Device is powered on and off (or INIT to OP transitions) during regression tests

    Does this fix refer to our problem from last year? Is it now fixed in release version v02.01.03.02?

    Best Regards
    Frank
  • Hi Frank,
    yes, this is listed as closed and fixed in 2.1.3 release. And it refers to this thread too.

    Best regards,