AM6442: PRU/DMA communication with R5F performance problems

Part Number: AM6442
Other Parts Discussed in Thread: SYSCONFIG

Hello everyone,

I am contacting you because I have noticed performance problems in my system.
To give some context, I work with the AM6442 EVM board in order to establish EtherCAT communication with a slave network.
Performance is critical for our system, which has a cycle time of around 100 μs, so we have to optimize every step to make it as fast as possible.

As such, our application uses DMA and PRU to interface with the Ethernet PHY which allows sending the EtherCAT frame. 

My problem is the following: the time between when the frame arrives at the PHY of the AM6442 and when the frame is actually available in our application is abnormally long (it takes 15 μs for a frame of 800 bytes).
I would have expected this time to be much shorter, about 2-3 μs. Our application runs on a Cortex-R5F core (MAIN_Cortex_R5_0_0).
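To put the 15 μs and 2-3 μs figures in perspective, the arithmetic can be sketched as below. This is only a rough illustration, assuming the standard 100 Mbit/s EtherCAT link and ignoring preamble and inter-frame gap. The helper names are hypothetical, and the throughput figure is only meaningful if the whole frame is moved after its last byte has arrived.

```c
#include <assert.h>

/* Time on the wire for `bytes` bytes at `mbit_per_s` Mbit/s, in microseconds.
 * Preamble, SFD and inter-frame gap are ignored (assumption). */
static double wire_time_us(unsigned bytes, double mbit_per_s)
{
    assert(mbit_per_s > 0.0);
    return (bytes * 8.0) / mbit_per_s;   /* bits / (Mbit/s) = microseconds */
}

/* Average throughput in MB/s implied by moving `bytes` bytes in `latency_us`
 * microseconds, i.e. if nothing is copied before the last byte arrives. */
static double implied_mbyte_per_s(unsigned bytes, double latency_us)
{
    assert(latency_us > 0.0);
    return bytes / latency_us;           /* bytes / us = MB/s */
}
```

For the 800-byte frame, `wire_time_us(800, 100.0)` gives 64 μs on the wire, while `implied_mbyte_per_s(800, 15.0)` is about 53 MB/s and `implied_mbyte_per_s(800, 2.5)` is 320 MB/s.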

If I understand correctly, the frame arrives on the Ethernet port, the PRU processes it in real time (so we hope that it takes very little time) and then gives the frame to the DMA which puts it in the desired memory space (in our application in MSRAM). 

From what I understand, the DMA is used correctly, and we configure the PRU in SysConfig as follows:






I am aware that I have not given much information, but I do not know what else to provide.

Do you know how I can reduce this time? Is it possible to reduce it significantly? Do you have any idea how long this could take?
Don’t hesitate to let me know if you need information.

Thank you very much for your help.

Maxime. 

  • What EtherCAT master stack are you running on the R5?

    How are you measuring the 15us? Is it always 15us or does it vary?

    What SW are you running on the R5? SDK version?

      Pekka

  • Dear Pekka, 

    Many thanks for your quick answer. 

    We are using the SOEM master stack. 

    We always measure the same time of 15 µs.
    Currently, we are using the SDK "mcu_plus_sdk_am64x_08_01_00_36". 



  • Hi Nicaise, 

    > We always measure the same time of 15 µs.

    May I ask how exactly you are measuring this latency between when the Ethernet frame arrives at the PHY on the AM64x EVM and when the frame is available in your application/EtherCAT Master?

    -Daolin 

  • Hello and thank you for your answer, 

    The way we proceeded is perhaps not very scientific, but the order of magnitude is about right. On the slave network side, we instrumented the first slave of the network (which is also the last slave to return the frame to the master once it has traveled through the network) with a signal on port 0. This signal is triggered when the last byte of the EtherCAT frame passes through this port (see the red cross in the diagram).

    The time between this latched instant and the arrival of the full frame at the PHY of the Sitara is very short, because it consists only of the propagation time on the Ethernet cable and the decoding time of the master's PHY.



    On the application side, the SOEM stack has an "ecx_receive_processdata_group" function, which itself calls an "ecx_waitinframe" function. We placed a measurement point in this "ecx_waitinframe" function, at the point where the frame becomes available to our application.
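A measurement point like this can also record whether the latency varies from cycle to cycle. The following is only a minimal sketch, assuming a free-running 32-bit tick counter is read at the start and end points. How that counter is read (for example the R5F cycle counter) is platform-specific and left out, and all names here are hypothetical rather than part of SOEM or the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Running statistics over latency samples, in timer ticks. */
typedef struct {
    uint32_t min_ticks;
    uint32_t max_ticks;
    uint64_t sum_ticks;
    uint32_t count;
} latency_stats_t;

static void latency_stats_init(latency_stats_t *s)
{
    s->min_ticks = UINT32_MAX;
    s->max_ticks = 0;
    s->sum_ticks = 0;
    s->count = 0;
}

/* Record one sample. Unsigned subtraction yields the correct delta even if
 * the 32-bit counter wrapped once between the two readings. */
static void latency_stats_add(latency_stats_t *s, uint32_t start_ticks, uint32_t end_ticks)
{
    uint32_t delta = end_ticks - start_ticks;
    if (delta < s->min_ticks) s->min_ticks = delta;
    if (delta > s->max_ticks) s->max_ticks = delta;
    s->sum_ticks += delta;
    s->count++;
}

/* Mean latency in ticks, or 0 if no sample has been recorded yet. */
static uint32_t latency_stats_mean(const latency_stats_t *s)
{
    return s->count ? (uint32_t)(s->sum_ticks / s->count) : 0;
}

/* Convert ticks to nanoseconds for a counter running at `timer_hz`. */
static uint64_t ticks_to_ns(uint32_t ticks, uint32_t timer_hz)
{
    assert(timer_hz != 0);
    return ((uint64_t)ticks * 1000000000ull) / timer_hz;
}
```

Calling `latency_stats_add()` once per cycle and comparing `min_ticks` with `max_ticks` after a few thousand cycles would show whether the delay is really constant or only looks constant in a single capture.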


    Please let me know if this is not clear to you.

    Kind regards, 
    Maxime



  • > The way we proceeded is perhaps not very scientific, but the order of magnitude is about right. On the slave network side, we instrumented the first slave of the network (which is also the last slave to return the frame to the master once it has traveled through the network) with a signal on port 0. This signal is triggered when the last byte of the EtherCAT frame passes through this port (see the red cross in the diagram).

    > The time between this latched instant and the arrival of the full frame at the PHY of the Sitara is very short, because it consists only of the propagation time on the Ethernet cable and the decoding time of the master's PHY.

    This also assumes no system time offset between SOEM Main-Device and first Sub-Device? Are you measuring this with DC enabled?

  • Yes, DC feature is enabled. But I don't get the idea with the system time offset. How could this have an impact on the measured time?

  • > Yes, DC feature is enabled. But I don't get the idea with the system time offset. How could this have an impact on the measured time?

    Are you comparing the time captured in the S-Device with a timer in the M-Device?

  • Yes, we measured the time from when the last byte of the frame leaves the first slave of the network (which is therefore the last slave the frame passes through on the return path) to when the frame is actually available in our application.

  • > Currently, we are using the SDK "mcu_plus_sdk_am64x_08_01_00_36".

    Update to something more recent. Many bugs have been fixed and updates made, especially to Ethernet and ICSSG, in the dozen or so releases since 2021. 11.0 was released a month or so ago: https://www.ti.com/tool/download/MCU-PLUS-SDK-AM64X

    > On the application side, the SOEM stack has an "ecx_receive_processdata_group" function, which itself calls an "ecx_waitinframe" function. We placed a measurement point in this "ecx_waitinframe" function, at the point where the frame becomes available to our application.

    Generally running 3rd party SW like SOEM and debugging the modifications you needed to make is not something we support. I would recommend running https://www.ibv-augsburg.de/en/products/icnet/ethercat-master/ or https://public.acontis.com/manuals/EC-Master/3.2/html/ec-master-class-b/emlltieneticssg.html for EtherCAT master/main. They will also document the performance they can reach.

    If you still want to proceed with your port of SOEM, my suggestion, as Pratheesh pointed out, is to first establish which clocks are used and how you measured the 15us. If you are comparing times on the master/main and the slave/subdevice, first measure the clock accuracy between the two using an oscilloscope and a pulse-per-second type setup. You need to know how accurate any timestamp on one device is compared to the other.

      Pekka
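The clock-accuracy check suggested above can be illustrated with a short sketch. It assumes that both devices can timestamp the same pulse edge in their own clock, in nanoseconds. The struct and function names are hypothetical and not taken from SOEM or the SDK. Two edges give the offset and the relative drift, and a cross-device interval can then be corrected for the offset.

```c
#include <assert.h>
#include <stdint.h>

/* One pulse edge timestamped by both devices, each in its own clock (ns). */
typedef struct {
    int64_t master_ns;
    int64_t slave_ns;
} edge_pair_t;

/* Offset of the slave clock relative to the master clock at one edge. */
static int64_t clock_offset_ns(edge_pair_t e)
{
    return e.slave_ns - e.master_ns;
}

/* Relative drift of the slave clock between two edges, in parts per billion.
 * A positive value means the slave clock runs fast. */
static int64_t clock_drift_ppb(edge_pair_t first, edge_pair_t last)
{
    int64_t master_span = last.master_ns - first.master_ns;
    int64_t slave_span = last.slave_ns - first.slave_ns;
    assert(master_span > 0);
    return ((slave_span - master_span) * 1000000000ll) / master_span;
}

/* Interval from an event stamped on the slave to an event stamped on the
 * master, expressed in master time after removing the known offset. */
static int64_t corrected_interval_ns(int64_t slave_event_ns,
                                     int64_t master_event_ns,
                                     int64_t offset_ns)
{
    return master_event_ns - (slave_event_ns - offset_ns);
}
```

With an uncorrected offset of, say, 5 μs, a true 10 μs interval would be read as 15 μs or 5 μs depending on the sign, which is why the offset needs to be known before a 15 μs figure can be trusted.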