Hi,
I need to transfer data from MSRAM to PRU DRAM or PRU shared DRAM and back. The background is that I programmed PRU0 to act as an RGMII driver that transmits and receives at 1 Gbps. I need to transmit around 300 bytes, which takes around 3 us for the RGMII alone. I took a few measurements to determine the fastest way to get the data from PRU DRAM to MSRAM and vice versa.
PruIcss_writeMemory is, depending on the amount of data, the slowest option and can't be used for my purpose.
Using memcpy to copy data directly from R5F memory to PRU DRAM takes 4 us for ~300 bytes, which is also too slow.
I used the udma_memcpy_interrupt example and modified it for the two measurements. Both seem really slow for a DMA. Maybe that can be improved?
The fastest way I found is letting the PRU handle the copy. For that I use lbbo (MSRAM) -> sbco (DRAM) and lbco (DRAM) -> sbbo (MSRAM).
Another time-consuming factor is the interrupts between PRU and R5F. An R5F-to-PRU interrupt takes around 200 ns, a PRU-to-R5F interrupt around 890 ns.
The next graph shows the data transfer time for the sequence: copy the data from R5F to MSRAM -> send an interrupt to the PRU -> PRU copies the data from MSRAM to DRAM. And the other way around:
Combining that with the 3 us for RGMII results in around 6.5 us for reading data and 5.5 us for writing. I need these times closer to 4 us each. Is there a way to get the data from MSRAM to PRU DRAM faster?
Thanks and Regards
Lucas
Hi Lucas,
Firstly, all the previous work done on read/write latencies can be found in this FAQ. Please take a look if you have not already:
https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1096933/faq-pru-how-do-i-calculate-read-and-write-latencies
For higher read speed I would also suggest looking into the XFR2VBUS hardware accelerator (refer to TRM section 6.4.6.3.1 PRU_ICSSG XFR2VBUS Hardware Accelerator).
The programming guide for the same is available in the TRM and you can also refer to this thread.
Regards,
Nitika
Combining that with the 3 us for RGMII results in around 6.5 us for reading data and 5.5 us for writing. I need these times closer to 4 us each. Is there a way to get the data from MSRAM to PRU DRAM faster?
As Nitika mentioned, use XFR2VBUS from MSRAM directly for read and write (it can do 64 bytes in 120 ns or so). Use different banks (128 KB per bank on AM64/AM243) for read and write to avoid contention.
Hi,
Thank you for the response. I'm still verifying the timings with XFR2VBUS. I will add my timings for future reference and close the issue as soon as I have it working.
Regards Lucas
Hi Nitika,
Thanks for the reply. I followed the TRM and this FAQ entry to program the XFR2VBUS read: "SK-AM64: Follow up question: How to read/write data with the PRU_ICSSG XFR2VBUS Hardware Accelerator".
I also run into the problem of the PRU getting stuck because RD_BUSY is never cleared. Even worse, the RD_BUSY flag always gets set when issuing the read and never gets cleared, so the next time I start the program it gets stuck in the first loop. As mentioned in the other thread, it can only be cleared by power-cycling the board. I also tried stopping the automatic read with zero &r18, 12; xout 0x60, &r18, 12 after reading all bytes, but that also doesn't change the state of RD_BUSY or any other flag.
This is my assembler code:
xfr2vbus_mem_read:
    zero &r18, 12
    ldi r19.w2, 0x7000 ; Set address to MSRAM start
    lbbo &r0.w2, r19, 0, 2 ; Read data size from 0x70000000
xfr2vbus_wait_rd_busy_clear:
    xin 0x60, &r18, 1
    qbbs xfr2vbus_wait_rd_busy_clear, r18.b0, 0 ; Wait for RD_BUSY to be clear; program stalls here since RD_BUSY is never cleared
    ldi r18, 0x0007 ; 64-byte auto mode
    add r19.w0, r0.w0, 0 ; Add data offset (currently not used)
    xout 0x60, &r18, 10 ; Start XFR2VBUS data read
xfr2vbus_wait_for_data:
    xin 0x60, &r18, 1 ; Clear command FIFO? Always reads 0x05
    qbbc xfr2vbus_wait_for_data, r18.b0, 2 ; Wait until RD_DATA_FL is set
    xin 0x60, &r2, 67 ; Read 64 bytes / clear data FIFO
    sbco &r2, c24, r0.w0, 64 ; Store data in DRAM
    sub r0.w2, r0.w2, 64 ; Decrement data counter
    add r0.w0, r0.w0, 64 ; Increment DRAM position
    qblt xfr2vbus_wait_for_data, r0.w2, 64 ; Repeat until all data is read
    zero &r18, 12
    xout 0x60, &r18, 12 ; Stop automatic read
(Edit: Fixed last qblt label and added indentation for readability)
Hi Lucas,
Allow me some time to test this on my setup and get back to you.
Regards,
Nitika
Hi Lucas,
Thank you for the ping. I was not available most of the last week due to other commitments.
I was able to reproduce your issue on my setup as well and saw the same behaviour.
I experimented with a few things and got it working with the approach below:
xfr2vbus_wait_rd_busy_clear:
    xin 0x60, &R18, 1
    qbbs xfr2vbus_wait_rd_busy_clear, R18.b0, 0 ; Wait for RD_BUSY
    ldi32 R19, 0x30080000 ; DMEM0 address
    ldi R20, 0x0
xfr2vbus_read_data:
    ldi32 R18, 0x4
    xout 0x60, &R18, 12
wait_for_read_data_fifo:
    xin 0x60, &R18, 1
    qbbc wait_for_read_data_fifo, R18.b0, 2
    xin 0x60, &r2, 32 ; Perform read
    add R19, R19, 0x20 ; Set address for read
    add R27, R27, 0x1 ; Increment count
    qbge xfr2vbus_read_data, R27, 4 ; repeat until all data is read
    halt ; end of program
From what I observed, the line of code below sets the RD_BUSY bit of R18, and on restarting the program the bit is still set, causing the loop to get stuck:
xout 0x60, &r18, 12 ;stop automatic read
The code above fixes the issue by rearranging the sequence of operations so that no XOUT operation is issued right before the program is re-run.
Can you please try the above logic in your code and let me know if it resolves the issue?
Regards,
Nitika