OMAP-L138 PRU Memory Bandwidth

Hello,

I have been running performance tests on the OMAP-L138 PRU to determine the speed of accessing the various kinds of memory that the PRU can touch (PRU DRAM, DDR, memory from the two cores). My tests have been showing that each 32-bit write, independent of the type of memory, takes roughly one PRU cycle. 32-bit reads take 1-5 PRU cycles depending on the type of memory.

My question is, why are writes faster than reads? I'm guessing that the PRU puts the write request onto the SCR (switched central resource) and continues execution without waiting for a response, while it must wait for the response when performing a read. Am I on the right track here? Is there somewhere I could go to read more about how this works?

In case it's relevant, the PRU code I am using for testing is essentially the following:

LD32    r1, pruCycleCountRegister      // sample the cycle counter before the test
SBBO    r14, memAddr, 0, 0x40          // or LBBO for read test; 0x40 bytes = sixteen 32-bit words
SBBO    r14, memAddr, 0x40, 0x40       // or LBBO for read test
// many similar SBBOs or LBBOs
LD32    r2, pruCycleCountRegister      // sample the cycle counter after the test
// divide (r2 - r1) by the number of 32-bit accesses to get PRU cycles per access
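
For reference, the final division can be sketched on the host side in C. This is only an illustration of the arithmetic described in the comment above: the function name `cycles_per_access` and its parameters are hypothetical, it assumes the two counter samples are 32-bit values, and it does not subtract the overhead of the counter reads themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average PRU cycles per 32-bit access.
 * start and end are the two cycle-counter samples (r1 and r2 in the test),
 * num_bursts is the number of SBBO/LBBO instructions between them, and
 * bytes_per_burst is the byte count of each one (0x40 in the test). */
double cycles_per_access(uint32_t start, uint32_t end,
                         uint32_t num_bursts, uint32_t bytes_per_burst)
{
    /* Each burst must move a whole, non-zero number of 32-bit words. */
    assert(num_bursts > 0);
    assert(bytes_per_burst >= 4 && bytes_per_burst % 4 == 0);

    /* Unsigned subtraction stays correct if the counter wrapped once. */
    uint32_t elapsed = end - start;
    uint32_t accesses = num_bursts * (bytes_per_burst / 4u);

    return (double)elapsed / (double)accesses;
}
```

With twenty 0x40-byte bursts, for example, the elapsed count is divided by 320 accesses.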

Thanks very much to anyone who can help.

-Emily

  • Hi Emily,

    That is correct.  The PRU must wait for the response and read data from the target.  The PRU does not wait for a response for write transactions.

    Regards,

    Melissa

  • Hi Melissa,

    I have some follow-up questions for you if you don't mind. If the PRU does not wait for a response on write transactions, I assume the pending write transactions must be queued somewhere. Where are they queued, and what is the queue size? What happens when the queue is full?

    Thanks very much for your help.

    -Emily

  • Hi Emily,

    There are no internal queues for PRU write transactions to the internal PRUSS data memory.  If another master is accessing the internal data memory, then the PRU will get wait states until the data memory is free.

    There are several queues for PRU write transactions to external targets.  If the nearest queue gets full, then the PRU will receive wait states until the queue is not full.

    Regards,

    Melissa

  • Thanks a lot Melissa!

  • Hi Melissa,

    Could you please give me more information about the PRU write queues? How many are there and how deep are they? What would the PRU need to do to fill the queues and receive a wait state?

    Thanks!

    -Emily

  • Hi Emily,

    The PRU write queues/FIFOs fill up only if there is back pressure (i.e., if the downstream targets cannot keep up with the data rate).

    The basic data flow during a write is PRU -> fabric -> memory controller -> DDR.  There are two basic FIFOs, one in the fabric and one in the memory controller.  I will need to get back to you regarding the depth of these FIFOs.

    Regards,

    Melissa
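
The back-pressure behavior described above can be illustrated with a toy model in C. This is a sketch under stated assumptions, not a description of the actual hardware: it lumps the fabric and memory-controller FIFOs into a single queue, the depth and drain rate are hypothetical parameters (the real depths are not given in this thread), and the producer attempts one write per cycle. The function name `count_write_stalls` is likewise made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of posted writes with back pressure.
 * The producer (standing in for the PRU) tries to push one write per cycle.
 * The consumer (standing in for the downstream target) retires one queued
 * write every drain_period cycles. Returns how many cycles the producer
 * spent stalled because the queue was full.
 * fifo_depth and drain_period are hypothetical, not OMAP-L138 values. */
uint32_t count_write_stalls(uint32_t num_writes, uint32_t fifo_depth,
                            uint32_t drain_period)
{
    assert(fifo_depth > 0 && drain_period > 0);

    uint32_t occupancy = 0;  /* writes currently queued */
    uint32_t issued = 0;     /* writes accepted so far */
    uint32_t stalls = 0;     /* cycles spent in wait states */
    uint32_t cycle = 0;

    while (issued < num_writes) {
        cycle++;

        /* Consumer side: retire one entry every drain_period cycles. */
        if (occupancy > 0 && cycle % drain_period == 0) {
            occupancy--;
        }

        /* Producer side: post the write if there is room, else wait. */
        if (occupancy < fifo_depth) {
            occupancy++;
            issued++;
        } else {
            stalls++;
        }
    }
    return stalls;
}
```

In this model, a consumer that keeps up (`drain_period` of 1) never causes a stall, and a short burst is absorbed by the queue even when the consumer is slower. Wait states appear only once a sustained stream of writes outpaces the drain rate for long enough to fill the queue.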