AM3356: PRU-ICSS

Part Number: AM3356
Other Parts Discussed in Thread: AM3354

We are using the AM3356 processor, which performs TCP/UDP communication over the eth0 and eth1 ports.

We are using the CPSW to handle the 2 ethernet ports.

Our implementation is CPU-intensive, and the CPU load is becoming a concern. I am considering using the PRU-ICSS for TCP/UDP communication to lower the CPU load on the AM3356 main core.

I have some questions regarding the PRU units on the AM3356:

1) Can I run a full UDP/TCP/IP stack in the PRU? Is such a TCP/UDP stack available to run fully in the PRU? Are there any limitations (memory, etc.) on running such a stack in the PRU?

2) What would be the method of exchanging data between the main core and the PRU? Will such an architecture lower the CPU load on the main core?

3) Can I run eth0 through the CPSW and eth1 through the PRU?

  • Hello Martin,

    Which Linux SDK releases support AM335x PRU Ethernet?

    First off, please note that we support AM335x PRU Ethernet on AM335x Linux SDK releases up to AM335x Linux SDK 8.2 (Linux kernel 5.10). We did NOT port AM335x PRU Ethernet to the latest Linux SDK 9.1. We are planning to re-add PRU Ethernet support in the 2024 AM335x SDK release.

    Unfortunately this descope did not make it into the AM335x SDK 9.1 documentation. I am working to get the docs updated. Future readers, you should find information about that descope here:
    https://software-dl.ti.com/processor-sdk-linux/esd/AM335X/09_01_00_001/exports/docs/devices/AM335X/linux/Release_Specific_Release_Notes.html 

    OK, I'm using an AM335x SDK that supports PRU Ethernet. Now what?

    Keep in mind that the software stacks will still run on the ARM Linux core. So TCP/UDP are supported with both CPSW and PRU Ethernet.

    On older AM335x software releases, we supported Ethernet networking protocols like HSR & PRP. For those protocols, the PRU is able to offload some of the processing. However, for generic Ethernet, I would expect the load on the Linux core to be similar for both CPSW and PRU Ethernet. At a conceptual level, you can think of it like this: the PRU is basically a programmable set of cores, while the CPSW is a bunch of circuits. However, the PRU cores can be programmed to behave like the CPSW circuits when Linux is interacting with it.

    Yes, you can have CPSW control one Ethernet interface, and PRU control one Ethernet interface. Just note that even though one single PRU core will be used to control one PRU Ethernet interface, the other PRU core should not be used for anything in that usecase.

    Regards,

    Nick

  • Nick,

    Thanks for the reply.

    I have just realized that the PRU is a small CPU with only 8 KB of instruction and data RAM. My understanding is that at most 8 KB of code can be loaded into the PRU, so I should carefully consider what goes into those 8 KB of instruction memory.

    We are considering three alternatives:

    1. Use the PRU as a dedicated CPU which periodically sends/receives Ethernet packets from/to the shared memory. In this design the ARM main core reads/writes the shared memory for data.

    2. The main ARM core periodically sends/receives Ethernet packets through the PRU Ethernet port.

    3. The main ARM core periodically sends/receives Ethernet packets through the CPSW Ethernet port.

    Question: Is there any advantage of option 1 over options 2 and 3 in terms of ARM core CPU load? Is option 2 heavier on the ARM core than option 1? Is option 3 heavier on the ARM core than option 2 or 1?

    Option 1 looks more deterministic in sending/receiving data (Ethernet packets) from/to the shared memory in the PRU; however, for the main ARM core it also depends on the rate at which the shared memory is read/updated.

    Option 3 looks better because we do not have to make any changes in software or hardware.

    Please share your opinion.

  • Hello Martin,

    Option 1: This is not a usecase TI supports. The PRU core handles getting signals into and out of the chip, and then a different core handles the networking stack. On AM335x that different core is the ARM core, while on later devices like AM64x the networking stack can be handled by the Linux A53 cores or the RTOS R5F cores.

    For a general purpose Ethernet usecase, I am not aware of specific benefits to using PRU Ethernet instead of CPSW Ethernet. The primary benefit would be if your usecase needs 3-4 Ethernet ports instead of 1-2 Ethernet ports.

    If you are not using the PRU-ICSS, you may consider using a different AM335x part number, like AM3354, where the Linux core can run at up to 1000MHz. That could get you slightly more performance out of the ARM core.

    On the other hand, if your Linux core is spending a bunch of cycles doing non-networking tasks like monitoring an external voltage value, that is the kind of task that you could offload to the PRU subsystem. Let me know if you have additional questions about PRU.

    Regards,

    Nick

  • I double-checked with the engineer who supports CPSW Ethernet, and he confirmed that he would expect CPSW Ethernet to require the same or less ARM processing than PRU Ethernet. That, combined with CPSW Ethernet being supported on all AM335x Linux SDK releases, would lead me to suggest using CPSW Ethernet for your usecase.

    Regards,

    Nick

  • Hello Nick,

    Thanks again for your answer.

    I am not familiar with the code and architecture of the Ethernet subsystem in the PRU or with how it interconnects with the AM335x main core.

    I assume that the Ethernet driver on the AM3356 core somehow transfers/receives the Ethernet data packets to/from the PRU. I assume that the PRU implements the MAC layer and controls the MII lines which are connected to a PHY. Am I correct?

    The guys on my team insist on clarifying the following option:

    Is it feasible to change the Ethernet driver within the PRU so that it aggregates several incoming Ethernet or UDP packets into one packet and forwards the single aggregated packet to the AM335x ARM core? The idea is to reduce the time spent servicing multiple packets to that of servicing one incoming packet. I am not sure whether I can control, within the PRU, the Ethernet data traffic that goes through the PRU to the Ethernet driver on the AM335x main core.

    Please advise on this issue.
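
Independent of where it would run, the aggregation idea in this question can be sketched as simple length-prefixed packing. The following is a minimal Python illustration only: the 2-byte length prefix, the function names `aggregate` and `split`, and the 1472-byte default (a 1500-byte MTU minus IPv4 and UDP headers) are assumptions made for the example, not a TI or PRU firmware format.

```python
import struct

# Hypothetical framing: each datagram is stored as a 2-byte big-endian
# length followed by its payload. This is not a TI or PRU firmware format.
_LEN = struct.Struct("!H")


def aggregate(datagrams, max_size=1472):
    """Pack as many datagrams as fit into one payload of at most max_size bytes.

    Returns (payload, remaining), where remaining holds the datagrams
    that did not fit and should go into the next aggregated payload.
    """
    out = bytearray()
    for i, d in enumerate(datagrams):
        # A datagram that can never fit would stall the caller, so reject it.
        if len(d) > 0xFFFF or _LEN.size + len(d) > max_size:
            raise ValueError("datagram too large to aggregate")
        if len(out) + _LEN.size + len(d) > max_size:
            return bytes(out), list(datagrams[i:])
        out += _LEN.pack(len(d))
        out += d
    return bytes(out), []


def split(payload):
    """Recover the original datagrams from an aggregated payload."""
    items = []
    offset = 0
    while offset < len(payload):
        if offset + _LEN.size > len(payload):
            raise ValueError("truncated length prefix")
        (n,) = _LEN.unpack_from(payload, offset)
        offset += _LEN.size
        if offset + n > len(payload):
            raise ValueError("truncated datagram")
        items.append(payload[offset:offset + n])
        offset += n
    return items
```

The receiver then handles one packet per batch and calls `split` once, at the cost of added latency while a batch fills.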

  • Hello Martin,

    It has been a while since I dug into exactly where the division of labor occurs between the Linux Ethernet driver and the PRU firmware, but your assumption is generally correct. For more information, refer to the SDK docs, starting at section "How It Works":

    https://software-dl.ti.com/processor-sdk-linux/esd/AM335X/08_02_00_24/exports/docs/linux/Foundational_Components/PRU-ICSS/Linux_Drivers/PRU-ICSS_Ethernet.html

    Another thing you'll note is that the AM335x's PRU-ICSS version of PRU Ethernet uses a shared memory strategy to pass frames between Linux and the PRU, instead of using DMA. That is another point in favor of CPSW, as I believe that uses DMA for data transfer.
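
To make the shared-memory point concrete, here is a toy Python model of a frame queue in a flat buffer. It is only a sketch: the class name `SharedFrameRing`, the slot layout, and the sizes are invented for illustration and do not reflect the actual PRU firmware or Linux driver interface. What it shows is that the consumer's CPU copies every payload byte itself, which is the work a DMA engine would otherwise take over.

```python
class SharedFrameRing:
    """Toy single-producer/single-consumer frame queue over one flat buffer.

    Hypothetical layout for illustration only, not the PRU firmware's.
    Each slot holds a 2-byte length followed by up to slot_size - 2 bytes.
    """

    def __init__(self, slots=4, slot_size=1536):
        self.slots = slots
        self.slot_size = slot_size
        self.mem = bytearray(slots * slot_size)  # stands in for shared RAM
        self.head = 0  # next slot the producer (PRU side) writes
        self.tail = 0  # next slot the consumer (ARM side) reads
        self.cpu_bytes_copied = 0  # bytes the consumer's CPU had to move

    def push(self, frame):
        """Producer side: store one frame, or return False if the ring is full."""
        if len(frame) > self.slot_size - 2:
            raise ValueError("frame too large for slot")
        if (self.head + 1) % self.slots == self.tail:
            return False  # one slot is kept empty to tell full from empty
        base = self.head * self.slot_size
        self.mem[base:base + 2] = len(frame).to_bytes(2, "big")
        self.mem[base + 2:base + 2 + len(frame)] = frame
        self.head = (self.head + 1) % self.slots
        return True

    def pop(self):
        """Consumer side: copy one frame out, or return None if the ring is empty."""
        if self.tail == self.head:
            return None
        base = self.tail * self.slot_size
        n = int.from_bytes(self.mem[base:base + 2], "big")
        frame = bytes(self.mem[base + 2:base + 2 + n])  # the CPU-driven copy
        self.cpu_bytes_copied += n
        self.tail = (self.tail + 1) % self.slots
        return frame
```

In this model `cpu_bytes_copied` grows with every received byte, so the consumer's cost scales with throughput and not just with packet count.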

    What about aggregating Ethernet packets?

    From a "modify how the Ethernet driver / PRU firmware works" standpoint:
    I would NOT suggest trying to modify the Linux driver / PRU firmware interface. Technically I can point you to the PRU firmware source code, but it is very complex, undocumented assembly code. The interface also changed from SDK 7.3 --> SDK 8.2.

    I would expect that the Linux Ethernet driver has an option to aggregate packets instead of interrupting the ARM core every single time a packet is received. I would look into that option with CPSW before trying anything with PRU Ethernet.

    I'm reassigning your thread to the CPSW guy in case you have follow-up questions about that subject.

    Regards,

    Nick

  • Actually, I'll hold onto the thread for a bit longer.

    Please start by taking a look at 
    https://software-dl.ti.com/processor-sdk-linux/esd/AM335X/09_01_00_001/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/Network/CPSW.html

    section "Interrupt Pacing"
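
As a rough feel for what interrupt pacing buys, the Python sketch below estimates the upper bound on RX interrupt rate for a given pacing interval and builds the matching command using ethtool's standard `-C <interface> rx-usecs <value>` coalescing option. The function names and the example values (500 microseconds, 20,000 packets per second) are assumptions for illustration; check the linked page for the values your SDK release accepts.

```python
def max_interrupts_per_second(rx_usecs):
    """Upper bound on RX interrupt rate when interrupts are paced rx_usecs apart."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive")
    return 1_000_000 / rx_usecs


def pacing_command(interface, rx_usecs):
    """Build (but do not run) the ethtool coalescing command as an argument list."""
    return ["ethtool", "-C", interface, "rx-usecs", str(rx_usecs)]


def interrupt_reduction(packets_per_second, rx_usecs):
    """Ratio of per-packet interrupts to paced interrupts (1.0 means no gain)."""
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    paced = min(packets_per_second, max_interrupts_per_second(rx_usecs))
    return packets_per_second / paced


# Illustrative numbers only: 20,000 packets/s paced at 500 us.
print(pacing_command("eth0", 500))
print(interrupt_reduction(20_000, 500))
```

The estimate covers interrupt count only; the per-packet work in the network stack remains, so the actual load reduction has to be measured on the target.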

    Regards,

    Nick

  • Nick,

    Thanks for the help. We will use the 1 GHz processor, which will give us some additional processing power. We will also use the distributed network that we have to reduce the load on the main CPU.

    Thanks again for the assistance. 

  • No problem Martin!

    One additional note on interrupt pacing - I would expect that to help with RX Ethernet packets (though I'm not sure how much it would improve things since I have not tested myself). That would not lower processor load for TX Ethernet packets generated by the ARM core though.

    Regards,

    Nick