
SK-AM64: TMDS64EVM - Support for ICSSG Ethernet Driver on A53 CPU of AM64x

Part Number: SK-AM64

Is TI willing to provide information and some help if I port the ICSSG Ethernet driver to run on the A53 CPU?  Our CEO is willing to give TI the results of any porting effort in exchange for some information or help by TI.

If TI already has plans to port the driver, I am also willing to do alpha or beta testing and provide feedback or do debugging.

Our application requires that the A53 CPU can transmit multiple Ethernet messages within a millisecond, with some time to spare.  Our testing with the Ethernet driver running on the R5F shows that there is too much latency in the inter-CPU communication used to send and receive messages through the R5F.  In addition, cache operations on the R5F are slower than on the A53.

The most straightforward solution to the problem is to run the Ethernet driver on the A53 CPU and have it perform the cache operations along with starting DMA.

I looked at the possibility of running the UDMA driver on the A53 with the Ethernet driver on the R5F.  There are too many dependencies between the two drivers for that to be practical.  There also seems to be no mechanism to get internal configuration information from the Ethernet driver that is needed by the UDMA driver.  The Ethernet driver depends on calling the UDMA driver directly to configure that.

The A53 CPU has the performance that we need for our calculations and logic, but the end-to-end Ethernet delay is too large to meet a 1000-samples-per-second (1 ms) calculation cycle.  Our redundancy scheme depends on Ethernet to keep the CPUs synchronized.  I can provide more details about our application and the measured performance on the AM64x if that is helpful.

  • Hi,

    Thanks for your query.

    Which OS are you planning to run on the AM64x-SK board, Linux or FreeRTOS?

    Best Regards

    Ashwani

  • I'm assuming you are thinking of an A53 FreeRTOS ICSSG driver. ICSSG is supported on the A53 using Linux today: https://software-dl.ti.com/processor-sdk-linux/esd/AM64X/latest/exports/docs/linux/Foundational_Components/PRU-ICSS/Linux_Drivers/PRU_ICSSG_Ethernet.html . Note that the Linux A53 driver uses IO cache coherency (ASEL 14, 15).

    We are not currently planning A53 freeRTOS support for ICSSG, although there is no HW restriction. Are you not able to use Linux where this is supported today?

    Do you have your own board design? Are you already working with someone from TI sales or a field applications engineer? The primary test platform for the MCU+ SDK and ICSSG with AM64x is not the SK-AM64 (that is mainly for Linux development) but the https://www.ti.com/tool/TMDS64EVM .

      Pekka

  • We are actually using our own RTOS. I have written the required Driver Porting Layer for our OS to support TI's software. I can also build our software to use FreeRTOS.

    We have obtained safety qualification for products using our RTOS in the past and that is the main reason for using it.  We use our own network stack for TCP/IP communication, and have safety qualified that in the past.

    We will be using the AM6422 on our board, with three Gigabit Ethernet ports and a single SPI port for communication with our IO card products.  The system supports two or three CPUs in a redundant configuration with a 1 ms sample rate.  Ethernet is used for inter-CPU communication and for communication with host computers.

    We have that software running on the Raspberry Pi Compute Module 4, but we don't have confidence that we can get the required technical information to safety qualify a system using the CM4. The AM6422 is better documented and we believe it will have better pricing and availability versus the CM4.

    When we have considered other solutions, the problems are usually the requirement for at least three gigabit Ethernet ports and the performance required for fast end-to-end communication on Ethernet.  We have been using the Intel PCIe 82576 dual port chip as an Ethernet solution, but that is expensive with limited availability.  The solution using the CM4 requires the Intel 82576 chip to get two ports in addition to the built-in Ethernet.

  • We are using the TMDS64EVM development board.  We first purchased the SK-AM64B and discovered that it had no transceivers for the PRU Ethernet ports.  Our application requires three Ethernet ports.  We designed our own board and it connects the three Ethernet ports to the PRUs.  Unfortunately the TMDS64EVM development board doesn't support three PRU Ethernet ports using the on-board transceivers. We are currently testing using only two ports.

    When I spoke to our TI FAE, he told me that TI wanted us to work through the TI support site, and not through a FAE.  I have been using the resources on TI's web site so far.

    I was surprised to find that the SDK does not support direct use of Ethernet from the A53 CPU.  The other surprise was that the SDK does not support direct access to both the CPSW and ICSSG Ethernet drivers on the same R5F CPU.  We decided that the extra "throw away" code needed to run the ICSSG driver on an additional R5F CPU core was not worth the time.

    The most important current concern is the end-to-end communication delay from the A53 CPU through the R5F CPU to and from Ethernet.  I am trying to find a way to reduce that delay.  The system sends three small Ethernet messages and receives three small Ethernet messages every 1 ms to keep the redundant CPUs in sync.  That entire exchange on the Raspberry Pi Compute Module 4 with the Intel 82576 Ethernet takes less than 300 microseconds.  The same message exchange takes about 500 microseconds on the AM64x.  That reduces the processing time available for calculation by 20%.

    I am trying to find a way to reduce that time.  My first thought was to put the ICSSG Ethernet driver on the A53 CPU.  I ruled out the idea of just doing the DMA transfer on the A53 with the Ethernet driver on the R5F as there is no easy way to separate the two.

    I am currently working on an implementation that will use the A53 CPU to directly access the DMA buffers and also perform cache management.  The R5F CPU will still start the DMA and handle DMA completion.  However, the R5F will not access the data in the DMA buffers and will not do any cache management for those. That might reduce the time.
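    To make that split concrete, here is a minimal, self-contained sketch of the A53 side of the scheme. It is not the actual implementation: `CacheP_wb`/`CacheP_inv` stand in for the MCU+ SDK DPL cache calls (the real ones in kernel/dpl/CacheP.h take an additional cache-type argument), the buffer layout is invented, and the IPC notification to the R5F is left as a placeholder.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the MCU+ SDK DPL cache calls (kernel/dpl/CacheP.h).
 * On the real A53 target these perform actual cache maintenance; here
 * they only count invocations so the sketch is self-contained. */
static int g_wbCount, g_invCount;
static void CacheP_wb(void *addr, uint32_t size)  { (void)addr; (void)size; g_wbCount++; }
static void CacheP_inv(void *addr, uint32_t size) { (void)addr; (void)size; g_invCount++; }

#define FRAME_LEN 64U
static uint8_t gTxBuf[FRAME_LEN] __attribute__((aligned(64)));  /* cache-line-aligned DMA buffers */
static uint8_t gRxBuf[FRAME_LEN] __attribute__((aligned(64)));

/* A53 side: fill the TX buffer and write back the cache BEFORE
 * notifying the R5F, which only starts the DMA and never touches
 * the buffer contents. */
static void a53_prepare_tx(const uint8_t *payload, uint32_t len)
{
    memcpy(gTxBuf, payload, len);
    CacheP_wb(gTxBuf, FRAME_LEN);      /* data reaches memory before DMA reads it */
    /* ipc_notify_r5f(TX_READY);  -- placeholder for the IPC mechanism */
}

/* A53 side: after the R5F reports DMA completion, invalidate BEFORE
 * reading so stale cache lines are not consumed. */
static void a53_consume_rx(uint8_t *out, uint32_t len)
{
    CacheP_inv(gRxBuf, FRAME_LEN);
    memcpy(out, gRxBuf, len);
}
```

    The key property is the ordering: write-back before the R5F starts the transmit DMA, invalidate before the A53 reads a received frame, and no cache operations for those buffers on the R5F at all.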

  • 500 us for 3 Ethernet frames does seem very high. A couple of thoughts on optimizing: are you using the TCM memory of the R5 and copying from there to the A53 (cached DDR)? Or, an alternative suggestion since it is your own RTOS: just send the A53 a notification that the frame is at location X.

    For any ICSSG Ethernet development please use https://www.ti.com/tool/TMDS64EVM , not the SK boards.

    For safety-related detailed information, please use the request link on the product page, copied here: https://www.ti.com/licreg/docs/swlicexportcontrol.tsp?form_id=225507&prod_no=AM64x_RESTRICTED_DOCS_SAFETY&ref_url=EP%3EProc%3ESitara .

      Pekka

  • Pekka,

    I apologize for taking so long to respond.

    I modified our application so that the A53 CPU stores the data in the DMA buffers and reads the data from the DMA buffers.  The A53 CPU also performs the necessary cache operations.  The R5F performs all of the Ethernet driver calls including those to start DMA and check for completion. Our software sets the "disableCacheOps" flag in the segment list for each buffer.  The time to send three messages and receive the three responses was reduced by about 200 microseconds.  That got us closer to our target.
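    For illustration, here is a sketch of what that per-segment flag usage might look like. The struct below is a simplified stand-in with the same shape as the Enet LLD scatter-gather segment list described above; the real types (the EnetDma packet descriptor and its segment list) live in the MCU+ SDK headers, so treat the field names here as illustrative rather than exact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for one Enet LLD scatter-gather segment.
 * Only the fields relevant to the cache-ownership split are shown. */
typedef struct {
    uint8_t *bufPtr;
    uint32_t segmentFilledLen;
    bool     disableCacheOps;   /* true: driver skips wb/inv, the A53 owns it */
} SGListEntry;

typedef struct {
    SGListEntry list[4];
    uint32_t    numSegments;
} SGList;

/* R5F-side packet setup: hand the driver a buffer the A53 has already
 * written back, and tell the driver not to repeat the cache work. */
static void pkt_init_a53_owned(SGList *sg, uint8_t *buf, uint32_t len)
{
    sg->numSegments = 1U;
    sg->list[0].bufPtr = buf;
    sg->list[0].segmentFilledLen = len;
    sg->list[0].disableCacheOps = true;  /* cache maintenance is done on the A53 */
}
```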

    I found that packets are being dropped if I try to queue multiple packets at the same time.  I reported that issue in a separate thread.  When I submit each packet individually using a separate queue, then the packets are transmitted correctly.  It is unclear to me how much time would be saved by submitting multiple packets at the same time. Our application may send up to four packets at the same time to four different destination addresses.
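    The difference between per-packet submission and batched submission can be sketched with a toy queue. This mirrors the call pattern of the Enet LLD queue helpers (enqueue each packet locally, then one submit call for the whole queue) but uses invented stand-in types; the point is only that N frames batched into one queue cost one driver round trip instead of N.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the Enet LLD packet queue. */
typedef struct Pkt { struct Pkt *next; } Pkt;
typedef struct { Pkt *head, *tail; int count; } PktQ;

static int g_submitCalls;   /* each submit is one driver/doorbell round trip */

static void q_init(PktQ *q) { q->head = q->tail = NULL; q->count = 0; }

static void q_enq(PktQ *q, Pkt *p)
{
    p->next = NULL;
    if (q->tail != NULL) { q->tail->next = p; } else { q->head = p; }
    q->tail = p;
    q->count++;
}

/* Stand-in for a batched TX submit call: takes the whole queue at once. */
static int submit_tx_pktq(PktQ *q)
{
    g_submitCalls++;
    int n = q->count;
    q_init(q);          /* driver has taken ownership of the packets */
    return n;
}
```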

    I am hoping to reduce the time further, but the only idea I have is to port the Ethernet driver completely to the A53 CPU.  My concern is that this might expose problems because of the faster CPU.  I have already encountered a problem when sending multiple packets as fast as possible.

    The hardware design of our board is complete and we are manufacturing some prototype boards.  The board is based on the TI reference design for the AM6422.  It has three gigabit Ethernet ports connected to the PRUs.  We are using the ICSSG dual-MAC driver.  We are using an SD card for booting and data storage.  One SPI controller is used to communicate with IO devices on other boards.

    Our application depends on high performance communication through Ethernet ports in order to support CPU redundancy and distributed IO.  Every node sends the data acquired by its SPI interface and inter-CPU data for redundancy.  For example, every command sent by a host computer to a redundant CPU is also forwarded to the other CPUs.  CPUs send each other the results of calculations to compare results.  CPUs vote on which data is correct and shut down CPUs that produce incorrect results.

    In the past we have been using Intel CPUs and the Intel 82576 PCIe dual port Ethernet chip.  That has more Ethernet performance than we need, and is expensive.  The upgrade path from the 82576 requires writing a new Ethernet driver for some new chip.

    The Raspberry Pi Compute Module 4 (CM4) has more calculation performance than we need, but it has only a single Ethernet port.  We used the Intel 82576 to get the other two Ethernet ports.  That is not optimal because the supply of 82576 chips is limited.  The availability and price of the CM4 have also been inconsistent.  But the major issue with the CM4 is the lack of documentation for much of the hardware.  We plan to have a safety-qualified version of this product that is more cost effective than our older Intel-based products.

    TI's AM64x is the closest solution that we have found to our actual requirements.  Having three built-in Ethernet ports makes it attractive because it eliminates the need for the Intel 82576 chip and a PCIe driver.  A single core of the A53 CPU has about the same performance as a single core of the Raspberry Pi CM4.  Our application only requires two cores, and we found that the R5F has sufficient performance to run the software for one of the cores, while the other core's software requires the A53.  The CM4 uses more power and creates more heat than the AM6422, and we are not even using the graphics processor on the CM4.

    The challenges using the AM6422 are related to the Ethernet software.  Our software running on the A53 generates and processes most of the Ethernet frames and it runs our TCP/IP network stack. Our Ethernet driver is running on the same A53 CPU because it also handles the distribution of messages to the other CPU core and network redundancy decisions.  We moved only the TI driver related portion of the Ethernet to the second R5F core.

    The remaining challenge is how to meet the performance requirements for Ethernet communication.  So far with identical configurations, the TMDS64EVM is taking about 200 microseconds longer than the Raspberry Pi for the processing being done every millisecond.  I still need to verify that the 200 microseconds is because of Ethernet communication and why it occurs.

  • Are you still looking for further support from TI for your own driver? We don't have an A53-based bare-metal Ethernet driver planned currently, so the R5 driver is the example we support.

  • I am still asking whether TI is willing to support us if we port the ICSSG dual-MAC Ethernet driver to the A53.  That will also include support for the PHYs on the TMDS64EVM.  The support from TI would be answering technical questions about details of the SDK software and the AM64x hardware.  Porting the driver to the A53 may be necessary to meet our performance goals.  I got some improvement in performance by doing calculations on the second A53 CPU core in parallel with the first A53 CPU core.

  • We can support the Enet LLD interface level https://software-dl.ti.com/mcu-plus-sdk/esd/AM64X/latest/exports/docs/api_guide_am64x/enetlld_top.html (note: the 9.0 SDK will be updated with ICSSG-based Ethernet examples tomorrow, Sep 8). Routing interrupts to the GIC is likely the main difference from the R5; the PktDMA that the ICSSG Enet LLD uses comes from the UDMA driver https://software-dl.ti.com/mcu-plus-sdk/esd/AM64X/latest/exports/docs/api_guide_am64x/DRIVERS_UDMA_PAGE.html , so that support should come from there.

    Beyond this level, for more use-case-specific drivers, I'd need to involve some business folks to prioritize the extra work. Are you already in contact with TI sales and/or an FAE? If not, I will contact you via email to get some project background.