AM6442: Options for sharing data between PRU cores across a slice

Part Number: AM6442

I'm trying to understand what options are available for sharing data between PRU cores across a slice (i.e., sharing the same data between PRU0, RTU_PRU0, TX_PRU0, PRU1, RTU_PRU1, and TX_PRU1). This question is general, but the first application will be processing data from multiple ADCs and writing commands to multiple DACs. Since the PRUs are the only cores with access to GPIO, I want to transfer data from the PRUs to the RTU_PRUs and TX_PRUs for processing, and then write the resulting commands back to the PRUs. I'm thinking I'll need all 4 aux cores (RTU_PRU0/1 and TX_PRU0/1) for processing the ADC streams and generating the DAC commands, and both PRU cores (in a slice) for reading/writing GPIO to the ADCs and DACs.

We originally planned to share data using DRAM or SRAM, but it appears this incurs arbitration delays if 6 PRU cores are reading or writing the same endpoint at the same time. Since our application is latency-sensitive I'm concerned these delays could be a problem.
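To put a rough number on that concern, here is a minimal sketch of a worst-case wait estimate. It assumes a simple model that is not taken from the TRM: the endpoint serves one access at a time, arbitration is fair, and every access occupies the endpoint for a fixed number of cycles. The function names and all example values are hypothetical; real figures should come from the TRM, the latency FAQ, or measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cycles one requester waits behind all the others,
 * ASSUMING fair arbitration and a fixed occupancy per access. */
static uint32_t worst_case_wait_cycles(uint32_t requesters,
                                       uint32_t cycles_per_access)
{
    if (requesters == 0u) {
        return 0u;
    }
    return (requesters - 1u) * cycles_per_access;
}

/* Convert a cycle count to nanoseconds for a given core clock in MHz. */
static double cycles_to_ns(uint32_t cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return (double)cycles * 1000.0 / clock_mhz;
}
```

With 6 requesters and a hypothetical 3 cycles per access, `worst_case_wait_cycles(6, 3)` gives 15 cycles, which `cycles_to_ns(15, 250.0)` converts to 60 ns at an assumed 250 MHz core clock.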

Looking for alternatives, it seems AM6442 doesn't include dual-port memories that would allow the PRU cores to avoid arbitration during read/write. Is that correct?

The TRM (rev C) on pages 3372 and 3373 indicates that there might be broadside connections to transfer between the various PRU cores ("Broadside (32 Byte) connection to...RTU_PRUm and TX_PRUm (where m = 0 or 1)") but I can't find any broadside IDs for inter-core transfer in Table 6-428 on page 3427. Other comments on the forums suggest that PRUs on AM6442 do not actually have broadside connections between cores.

The IPC Scratchpad looks like an option, but this scratchpad is small and only supports RTU_PRU, not the TX_PRU.

XFR2VBUS allows access to DDR and other external memories, but we've benchmarked this and the latency is too high for our application.

Are there any options I've missed for sharing data between the various PRU cores that can avoid arbitration delays if all 6 cores in a slice are reading from the same endpoint at the same time?

  • Hello Steven,

    Please read through the FAQ on PRU read latencies if you have not already done so: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1096933/faq-pru-how-do-i-calculate-read-and-write-latencies. I finished the last draft of that FAQ last week.

    In general, broadside interfaces are the fastest way to get smaller amounts of data between PRU cores. AM64x does not provide the option to copy registers directly from one PRU core into the other. However, they can still use scratchpads to pass register values from one core to another as demonstrated in the PRU Software Support Package (PSSP) example at https://git.ti.com/cgit/pru-software-support-package/pru-software-support-package/tree/examples/am64x/PRU_RPMsg_Echo_Interrupt0.

    Shared local memory is probably the better option if you need to pass larger quantities of data between cores. If you calculate the potential arbitration delays in your system and they are unacceptably high, note that interrupts, the PRU_ICSSG spinlocks, etc. can be used to coordinate core access to the memories.
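    One way to pass a stream of values through shared memory without taking a lock on every access is a single-producer, single-consumer ring buffer. This is a different technique from the interrupts and spinlocks mentioned above, and it does not remove interconnect arbitration; it only avoids software locking. The sketch below models the shared region as an ordinary C struct so it can run on a host. The type and function names are hypothetical, the placement of the struct in PRU shared RAM is not shown, and it assumes that writes from one core become visible to the other in program order.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SLOTS 8u /* hypothetical depth; must be a power of two */

/* Hypothetical mailbox layout placed in memory that both cores can reach. */
typedef struct {
    volatile uint32_t head;             /* written only by the producer */
    volatile uint32_t tail;             /* written only by the consumer */
    volatile uint32_t slot[RING_SLOTS]; /* payload words */
} ring_t;

static_assert((RING_SLOTS & (RING_SLOTS - 1u)) == 0u,
              "RING_SLOTS must be a power of two");

/* Producer side: returns 1 if the sample was queued, 0 if the ring is full. */
static int ring_put(ring_t *r, uint32_t sample)
{
    uint32_t head = r->head;
    if (head - r->tail == RING_SLOTS) {
        return 0;
    }
    r->slot[head & (RING_SLOTS - 1u)] = sample;
    r->head = head + 1u; /* publish only after the payload is written */
    return 1;
}

/* Consumer side: returns 1 and fills *sample, or 0 if the ring is empty. */
static int ring_get(ring_t *r, uint32_t *sample)
{
    uint32_t tail = r->tail;
    if (r->head == tail) {
        return 0;
    }
    *sample = r->slot[tail & (RING_SLOTS - 1u)];
    r->tail = tail + 1u; /* free the slot only after the payload is read */
    return 1;
}
```

    Because each index has exactly one writer, the producer and consumer never need to hold a lock at the same time. Each producer/consumer pair would need its own ring.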

    Note that the AM64x MCU+ SDK 8.3 just released with new software for AM64x PRUs controlling external ADCs. This project may be what you are looking for: https://software-dl.ti.com/mcu-plus-sdk/esd/AM64X/08_03_00_18/exports/docs/api_guide_am64x/PRU_IO.html > "PRU ADC", https://software-dl.ti.com/mcu-plus-sdk/esd/AM64X/08_03_00_18/exports/docs/api_guide_am64x/EXAMPLES_PRU_IO.html

    Regards,

    Nick

  • Hi Nick,

    Thank you for writing that PRU read latency article, I've referenced it many times. I have a few questions about latencies that I'll ask in a separate thread to keep this focused on options for sharing data between PRU cores.

    When you say "Shared local memory", are you talking about DRAM and/or PRU SRAM? It looks like those memories are only accessible through the PRU internal CBASS interconnect, and hence limited to 32-bit transfers. I'm really looking for a memory that is accessible through the broadside interface but also shared between PRU cores on AM6442.

    I was wondering if the 8kB buffers in the FDB could be used for this, but the TRM indicates that only 1 PRU core can read/write the FDB buffers. The scratchpad memories are close to what I'm looking for, but they are very small, and if I use them I cannot simultaneously use the Task Manager since it uses scratchpads when context switching.

    The 2kB Broadside buffers appear to be attached to each core individually and cannot be accessed by other cores.

    Is it correct to say that AM6442 doesn't have any broadside accessible memory that can transfer data from one PRU core to another?

  • Hello Steven,

    Yes, I mean memories internal to the PRU subsystem (both DRAM and SRAM). You are correct that there is a broadside interface on ICSSG cores between a core and its dedicated 2kB broadside RAM. However, other processors do not have visibility into the broadside RAM.

    I do not think your understanding of the Task Manager is correct. The PRU firmware saves the registers when context switching; all the Task Manager does is save and restore the PC and flags. So the PRU firmware can save the other registers to the SPAD, SRAM, DRAM, etc.

    I have not spent much time looking at XFR2VBUS at this point in time, but I think this is exclusively used to read and write to memories outside of the PRU subsystem. (I could be wrong, no time to dig deeper this evening). If that is the case, then there are not any shared memories connected to the broadside interface, and the easiest way to read/write more values than could be stored in the SPAD would be reading and writing to DRAM / SRAM.

    Regards,

    Nick

  • Hi Nick,

    After reviewing the TRM, it looks like the Task Manager doesn't explicitly require the scratchpad banks. However, since the TM only changes the program counter and flags, the SPAD memories are the most practical way to save/restore registers when using the TM.

    It sounds like the most efficient option on AM6442 is to use the 32-byte IPC SPAD, which can share data between the PRU and RTU_PRU. All other memories go through the internal CBASS and are limited to 32-bit transfers.
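    To illustrate the 32-byte size constraint, here is a minimal sketch of a frame that fills the IPC SPAD exactly. The layout (a sequence counter plus fourteen 16-bit samples) and all names are hypothetical, and the byte buffer only stands in for the registers that would be moved by a broadside transfer; the actual XFR instruction and device ID are not shown and would have to come from the TRM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SPAD_BYTES   32u /* size of the IPC SPAD stated in this thread */
#define FRAME_SAMPLES 14u /* hypothetical: fills the rest of the 32 bytes */

/* Hypothetical frame layout: 4-byte counter + 14 x 2-byte samples = 32 bytes. */
typedef struct {
    uint32_t seq;                   /* incremented by the sender per frame */
    uint16_t sample[FRAME_SAMPLES]; /* e.g. raw ADC codes */
} spad_frame_t;

static_assert(sizeof(spad_frame_t) == SPAD_BYTES,
              "frame must fill exactly 32 bytes");

/* Build a frame in a 32-byte staging buffer; unused samples are zeroed. */
static void frame_pack(uint8_t out[SPAD_BYTES], uint32_t seq,
                       const uint16_t *samples, uint32_t count)
{
    spad_frame_t f;
    memset(&f, 0, sizeof f);
    f.seq = seq;
    if (count > FRAME_SAMPLES) {
        count = FRAME_SAMPLES; /* drop anything that does not fit */
    }
    memcpy(f.sample, samples, count * sizeof f.sample[0]);
    memcpy(out, &f, sizeof f);
}

/* Recover a frame from the 32-byte staging buffer on the receiving side. */
static spad_frame_t frame_unpack(const uint8_t in[SPAD_BYTES])
{
    spad_frame_t f;
    memcpy(&f, in, sizeof f);
    return f;
}
```

    The receiver can compare `seq` against the last value it saw to tell whether a new frame has arrived.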

    The PRUs are a powerful differentiator on TI processors and the main reason we choose Sitara over competing SoCs based around FPGAs. For future products, a memory buffer that is broadside accessible and also shared between PRU cores would make the PRUs even more useful, especially if this new buffer were a dual-port or tri-port RAM that allowed multiple reads or writes without arbitration. The main scenario I'm considering is one where the PRU core reads/writes from I/O (whether the I/O port is Ethernet, A/Ds, D/As, encoders, etc.) while efficiently passing that data to and from the TX_PRU and RTU_PRU for processing via XFR instructions.