AM3358: PRU to DDR memory access times

Part Number: AM3358
Other Parts Discussed in Thread: TIDA-01555

Hi,

I am using the PRU to access some of the main DDR memory on the AM3358. A block of memory is excluded from use by the OS using an overlay, and the PRU writes to it (in a ping-pong buffer style). The main application (on the ARM core) reads from this memory after the receipt of an RPMsg message informing it which area (ping or pong) has been written to.
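
For reference, the PRU side follows roughly the pattern sketched below. The base address, buffer size, and 8-byte sample layout here are placeholders for illustration, not my actual values:

#include <stdint.h>

/* Placeholder values only: the real base address comes from the memory
 * region carved out of Linux, and the buffer size is application specific. */
#define DDR_BASE        0x9E000000u             /* hypothetical reserved-DDR address */
#define SAMPLES_PER_BUF 512u

typedef struct {
    uint32_t timestamp;                         /* hypothetical 8-byte sample layout */
    uint32_t adc_value;
} sample_t;

#define PING ((volatile sample_t *)DDR_BASE)
#define PONG ((volatile sample_t *)(DDR_BASE + SAMPLES_PER_BUF * sizeof(sample_t)))

void sampling_loop(void)
{
    volatile sample_t *buf = PING;
    uint32_t i = 0;

    for (;;) {
        sample_t s = {0, 0};                    /* fill from the ADC here */

        buf[i] = s;                             /* this store is the write toward DDR */

        if (++i == SAMPLES_PER_BUF) {
            i = 0;
            /* notify the ARM side (e.g. via RPMsg) which half is ready,
             * then switch halves */
            buf = (buf == PING) ? PONG : PING;
        }
    }
}

Each buf[i] store has to cross the interconnect out to DDR, which is where the latency shows up; writes to PRU local RAM stay inside the PRU subsystem.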

The problem I am experiencing is significant memory access latency (around 1-2 usec, regardless of how much is being written), which is causing jitter in my PRU (ADC) sampling loop. This problem does not occur when writing to the PRU's local shared memory.

My questions are:

1) Is this expected behaviour?

2) Is it possible to mitigate the variance in these access times?

3) Can the ARM core access the PRU shared memory?

4) Any suggestions on an alternate approach?

Sampling to DDR: [capture not shown]

Sampling to PRU mem: [capture not shown]

Thanks,

Tom

  • Hello Tom,

    "The problem I am experiencing is significant memory access latency (around 1-2 usec, regardless of how much is being written)"

    1) Is this 1-2 usec for the PRU core to write the memory? Or 1-2 usec for the ARM core to access the memory?

    This result does NOT make sense to me as the time for the PRU core to write the memory. Refer to https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1096933/faq-pru-how-do-i-calculate-read-and-write-latencies, section "Calculating write latencies with SBBO".

    If you have a single buffer, AND it is taking a longer time for the ARM core to access the memory, AND the PRU is just waiting for the ARM to complete a read before it continues writing, then I would suggest using a ping-pong buffer approach. That way the PRU can continue writing to the second memory buffer at the same time that the ARM is reading from the first memory buffer.

    2) What OS are you running on the ARM core? If you are running Linux, is the PRU communicating with Linux userspace, or Linux kernel space?

    The A8 core should also be able to access the PRU's internal memory. The details would depend on what software is running on the A8 core.

    Regards,

    Nick

  • Hi Nick,

    Thanks for your reply. Just a few things to clarify:

    1) The latency is for a write and yes, I am using a ping pong buffer. The PRU notifies the Linux userspace application via RPmsg as to which buffer it should read from. I will read through the SBBO latencies note - thanks.

    2) Linux userspace

    3) I am using an AM3358 (A8 core, not an A53)

    Regards,

    Tom

  • Hello Tom,

    Question

    How much memory is the PRU writing when it is taking 1-2us per write?

    Design ideas

    If your code is written on a single PRU core, and looks like this:

    sample ADC
    sample ADC
    ...
    sample ADC
    write ADC data to DDR
    sample ADC
    sample ADC
    ...

    Then there are a couple of things you could do to help with the latency:

    1) divide up your code between the 2 PRU cores. Often we see customers using one PRU core to control the pins (i.e., the ADC sampling), and the other PRU core to move data around between the PRU subsystem and the other processor core(s); a rough sketch of this split is shown after this list.

    2) write less data to DDR at a time

    3) have Linux read directly from the PRU local memory, as you indicated above
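
    As a rough sketch of idea (1), assuming the shared PRU data RAM (12 KB at local address 0x00010000 on AM335x, check the TRM and your linker command file) is used as a small ring buffer between the two cores: PRU0 keeps the deterministic sampling loop and never touches DDR, while PRU1 drains the ring and absorbs the DDR write latency. Sizes and layout below are placeholders:

    #include <stdint.h>

    /* Shared PRU data RAM, visible to both PRU cores at local address 0x00010000
     * on AM335x (verify against the TRM / your linker command file). */
    #define SHARED_RAM  0x00010000u
    #define RING_SLOTS  256u                    /* power of two so masking works */

    typedef struct {
        volatile uint32_t head;                 /* written by PRU0 (producer) */
        volatile uint32_t tail;                 /* written by PRU1 (consumer) */
        volatile uint64_t slot[RING_SLOTS];     /* 8-byte samples */
    } ring_t;

    #define RING ((ring_t *)SHARED_RAM)

    /* PRU0: deterministic sampling loop, writes only to PRU shared RAM. */
    void pru0_sampler(void)
    {
        for (;;) {
            uint64_t sample = 0;                /* fill from the ADC here */
            RING->slot[RING->head & (RING_SLOTS - 1u)] = sample;
            RING->head++;                       /* publish after the data is written */
        }
    }

    /* PRU1: moves samples from shared RAM out to the DDR buffers, so any
     * DDR stall lands here instead of in the sampling loop.
     * (Overflow handling and the ping-pong switch are omitted.) */
    void pru1_mover(volatile uint64_t *ddr_buf)
    {
        for (;;) {
            while (RING->tail != RING->head) {
                ddr_buf[RING->tail & (RING_SLOTS - 1u)] =
                    RING->slot[RING->tail & (RING_SLOTS - 1u)];
                RING->tail++;
            }
        }
    }

    The point of the split is that PRU0's loop timing never depends on the interconnect, only on PRU-local accesses.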

    Linux userspace considerations when accessing PRU local memory 

    If you decide to go this route, we need to think about security considerations. As a general rule, Linux userspace is NOT given direct access to device memory, and usually has to go through a Linux driver.

    If you have a headless system with no way (or reason) for a bad actor to hack into the computer, some customers just give userspace elevated permissions in their final design. That allows them to directly access the PRU memory with /dev/mem.
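
    For example, a minimal userspace sketch of the /dev/mem route could look like the following. The 0x4A310000 address is the AM335x PRU-ICSS shared data RAM as I read the memory map, but please verify it against the TRM, and note the program has to run with root permissions:

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* AM335x PRU-ICSS shared data RAM: 12 KB, assumed at physical 0x4A310000
     * (verify against the AM335x TRM memory map). */
    #define PRU_SHARED_RAM_PA  0x4A310000u
    #define PRU_SHARED_RAM_SZ  0x3000u

    int main(void)
    {
        int fd = open("/dev/mem", O_RDWR | O_SYNC);     /* needs root permissions */
        if (fd < 0) {
            perror("open /dev/mem");
            return 1;
        }

        volatile uint32_t *shram = mmap(NULL, PRU_SHARED_RAM_SZ,
                                        PROT_READ | PROT_WRITE, MAP_SHARED,
                                        fd, PRU_SHARED_RAM_PA);
        if (shram == MAP_FAILED) {
            perror("mmap");
            close(fd);
            return 1;
        }

        /* Example: read the first word the PRU firmware placed in shared RAM. */
        printf("word 0 = 0x%08x\n", shram[0]);

        munmap((void *)shram, PRU_SHARED_RAM_SZ);
        close(fd);
        return 0;
    }

    Opening with O_SYNC typically gives an uncached mapping, which is what you want when the PRU is updating this memory behind Linux's back.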

    Another option if you don't want to give Linux userspace sudo permissions is to write a custom character driver to expose a specific memory region up to userspace.
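
    A compressed sketch of what such a character driver could look like is below. It uses the misc device framework and remap_pfn_range() to hand the same PRU shared RAM window up to userspace; the physical address is the same assumption as above, and a real driver would add devicetree integration, offset checks, and proper locking:

    #include <linux/module.h>
    #include <linux/miscdevice.h>
    #include <linux/fs.h>
    #include <linux/mm.h>

    #define PRU_SHARED_RAM_PA  0x4A310000UL    /* assumed AM335x PRU shared RAM, check the TRM */
    #define PRU_SHARED_RAM_SZ  0x3000UL        /* 12 KB */

    static int pru_shmem_mmap(struct file *filp, struct vm_area_struct *vma)
    {
        unsigned long size = vma->vm_end - vma->vm_start;

        if (size > PRU_SHARED_RAM_SZ)
            return -EINVAL;

        /* Map the PRU shared RAM into the calling process, uncached. */
        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
        return remap_pfn_range(vma, vma->vm_start,
                               PRU_SHARED_RAM_PA >> PAGE_SHIFT,
                               size, vma->vm_page_prot);
    }

    static const struct file_operations pru_shmem_fops = {
        .owner = THIS_MODULE,
        .mmap  = pru_shmem_mmap,
    };

    static struct miscdevice pru_shmem_dev = {
        .minor = MISC_DYNAMIC_MINOR,
        .name  = "pru_shmem",                  /* shows up as /dev/pru_shmem */
        .fops  = &pru_shmem_fops,
    };

    module_misc_device(pru_shmem_dev);
    MODULE_LICENSE("GPL");

    Userspace can then open and mmap() /dev/pru_shmem without root, and you can restrict access with normal file permissions or a udev rule.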

    Some examples to get you thinking are here:

    https://e2e.ti.com/support/processors/f/791/t/689315

    Or see the TIDA-01555 documentation and code:

    http://www.ti.com/tool/TIDA-01555

    The example code is in your AM335x SDK at example/applications/pru-adc-x.x/ (accessing /dev/mem requires sudo permissions).

    Regards,

    Nick

  • Hi Nick,

    The PRU is not writing much data - 8 bytes every 2.5 usec. When writing to local PRU memory there is only a small latency (a few tens of nsec).

    Thank you very much for the follow up ideas. I will try suggestion (3) first, as this is a fairly simple change to the current user space application as well as the PRU code. It is also a headless application so elevated permission access to /dev/mem is acceptable.

    For now I will mark this as resolved.

    Thanks again for the suggestions and counsel.

    Regards,

    Tom

  • Sounds good! Feel free to reach out if something else comes up, either on here, or on a different thread if you have a different question.

    Regards,

    Nick