MCU-PLUS-SDK-AM243X: MSRAM to PruIcss DRAM Communication

Part Number: MCU-PLUS-SDK-AM243X

Hi,
I need to transfer data from the MSRAM to the PRU DRAM or PRU shared DRAM and back. The background is that I programmed PRU0 to act as an RGMII driver that transmits and receives at 1 Gbps. I need to transmit around 300 bytes, which takes around 3 us for the RGMII transfer alone. I took a few measurements to determine the best way to get the data from PRU DRAM to MSRAM and vice versa.

PruIcss_writeMemory is, depending on the amount of data, the slowest option and can't be used for my purpose.
Using memcpy to copy data directly from R5F memory to PRU DRAM takes 4 us for ~300 bytes, which is also too slow.
I used the udma_memcpy_interrupt example and modified it for the two measurements. Both seem really slow for a DMA. Maybe that can be improved?

The fastest way I found is letting the PRU handle the copy. For that I use lbbo (MSRAM) -> sbco (DRAM) and lbco (DRAM) -> sbbo (MSRAM).
Another time-consuming factor is the interrupts between the PRU and the R5F. An R5F-to-PRU interrupt takes around 200 ns, a PRU-to-R5F interrupt around 890 ns.
The next graph shows the data transfer time for copying the data from the R5F to MSRAM -> sending an interrupt to the PRU -> the PRU copying the data from MSRAM to DRAM, and the other way around:

Combining that with the 3 us for RGMII results in around 6.5 us for reading data and 5.5 us for writing. I need these times to be closer to 4 us each. Is there a way to get the data from MSRAM to PRU DRAM faster?
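
For reference, the budget implied by these numbers can be worked out as below. This is only an arithmetic sketch using the figures quoted in this post; the constant and function names are made up for illustration and are not part of the MCU+ SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the post, in nanoseconds (approximate measurements). */
enum {
    RGMII_NS       = 3000, /* ~300 bytes on the wire at 1 Gbps        */
    READ_TOTAL_NS  = 6500, /* measured total for the read direction   */
    WRITE_TOTAL_NS = 5500, /* measured total for the write direction  */
    TARGET_NS      = 4000  /* desired total per direction             */
};

/* Time spent outside the RGMII transfer itself (copy plus signalling). */
int32_t overhead_ns(int32_t total_ns)
{
    assert(total_ns >= RGMII_NS);
    return total_ns - RGMII_NS;
}

/* How much time has to be saved to reach the target (0 if already met). */
int32_t required_saving_ns(int32_t total_ns)
{
    return (total_ns > TARGET_NS) ? (total_ns - TARGET_NS) : 0;
}
```

With these inputs the read path carries about 3.5 us of overhead and the write path about 2.5 us, while the 4 us target leaves only about 1 us for copying and interrupt signalling in each direction.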

Thanks and Regards

Lucas

  • Hi Lucas,

    Firstly, all the previous work done around read/write latencies can be found on this FAQ. Please take a look if you have not already:
    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1096933/faq-pru-how-do-i-calculate-read-and-write-latencies

    For higher read speed, I would also suggest looking into the XFR2VBUS hardware accelerator (refer to TRM section 6.4.6.3.1, PRU_ICSSG XFR2VBUS Hardware Accelerator).

    The programming guide for it is available in the TRM, and you can also refer to this thread.

    Regards,

    Nitika

  • Combining that with the 3 us for RGMII results in around 6.5 us for reading data and 5.5 us for writing. I need these times to be closer to 4 us each. Is there a way to get the data from MSRAM to PRU DRAM faster?

    As Nitika mentioned, use XFR2VBUS to read and write MSRAM directly (it can move 64 bytes in 120 ns or so). Use different banks (128 KB per bank on AM64/AM243) for read and write to avoid contention.
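
To put the "64 bytes in 120 ns or so" figure in context for a ~300 byte frame, here is a rough estimate in C. It assumes the bursts run strictly back to back with no overlap and no setup cost, which is a simplification; the function names are hypothetical and the 120 ns value is just the approximate number quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size bursts needed to move `bytes`, rounded up. */
uint32_t burst_count(uint32_t bytes, uint32_t burst_bytes)
{
    assert(burst_bytes > 0);
    return (bytes + burst_bytes - 1u) / burst_bytes;
}

/* Rough transfer time, assuming every burst costs `ns_per_burst`
 * and bursts do not overlap (a simplifying assumption). */
uint32_t transfer_ns(uint32_t bytes, uint32_t burst_bytes, uint32_t ns_per_burst)
{
    return burst_count(bytes, burst_bytes) * ns_per_burst;
}
```

For example, transfer_ns(300, 64, 120) gives 5 bursts and roughly 600 ns, which would fit inside the ~1 us left over after the 3 us RGMII transfer if the target is 4 us. Actual timing has to be measured on the target.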

  • Hi,

    Thank you for the response. I'm still verifying the timings with XFR2VBUS. I will add my timings for future reference and close the issue as soon as I have it working.

    Regards Lucas

  • Hi Nitika,

    Thanks for the reply. I followed the TRM and this FAQ entry to program the XFR2VBUS read: SK-AM64: Follow up question: How to read/write data with the PRU_ICSSG XFR2VBUS Hardware Accelerator

    I also run into the problem of the PRU getting stuck because RD_BUSY is never cleared. Even worse, the RD_BUSY flag always gets set when issuing the read and never gets cleared, so the next time I start the program it gets stuck in the first loop. As mentioned in the other thread, it can only be cleared by power-cycling the board. I also tried stopping the automatic read with zero &r18, 12; xout 0x60, &r18, 12 after reading all bytes, but that also doesn't change the state of RD_BUSY or any other flag.


    This is my assembler code:

    xfr2vbus_mem_read:
        zero &r18, 12
        ldi r19.w2, 0x7000 ; Set address to MSRAM start
        lbbo &r0.w2, r19, 0, 2 ; Read data size from 0x70000000

    xfr2vbus_wait_rd_busy_clear:
        xin 0x60, &r18, 1
        qbbs xfr2vbus_wait_rd_busy_clear, r18.b0, 0 ; Wait for RD_BUSY to be clear; program stalls here since RD_BUSY is never cleared

        ldi r18, 0x0007 ; 64-byte auto mode
        add r19.w0, r0.w0, 0 ; Add data offset (currently not used)
        xout 0x60, &r18, 10 ; Start XFR2VBUS data read

    xfr2vbus_wait_for_data:
        xin 0x60, &r18, 1 ; Clear command FIFO? Always reads 0x05
        qbbc xfr2vbus_wait_for_data, r18.b0, 2 ; Wait until RD_DATA_FL is set
        xin 0x60, &r2, 67 ; Read 64 bytes / clear data FIFO
        sbco &r2, c24, r0.w0, 64 ; Store data in DRAM
        sub r0.w2, r0.w2, 64 ; Decrement data counter
        add r0.w0, r0.w0, 64 ; Increment DRAM position
        qblt xfr2vbus_wait_for_data, r0.w2, 64 ; Repeat until all data is read

        zero &r18, 12
        xout 0x60, &r18, 12 ; Stop automatic read

    (Edit: Fixed last qblt label and added indentation for readability)
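
As a side check on the loop counter in the listing above, its arithmetic can be modelled on a host in C. This sketch models only the sub/qblt pair, not the XFR2VBUS hardware, and assumes the documented PRU semantics that qblt LABEL, REG, OP branches when OP < REG; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t chunks;    /* number of 64-byte reads performed          */
    uint16_t remaining; /* value left in the 16-bit counter (r0.w2)   */
} loop_result;

/* Host-side model of:
 *     sub  r0.w2, r0.w2, 64
 *     qblt xfr2vbus_wait_for_data, r0.w2, 64   ; branch if 64 < r0.w2
 */
loop_result model_read_loop(uint16_t size)
{
    loop_result r = { 0u, size };
    do {
        r.remaining = (uint16_t)(r.remaining - 64u); /* wraps at 16 bits */
        r.chunks++;
    } while (64u < r.remaining);
    return r;
}
```

Under this model a size of 300 yields 4 chunks with 44 bytes still counted as remaining, and a size of 128 yields 1 chunk with 64 remaining, so the final partial or full chunk may need separate handling. This is independent of the RD_BUSY problem described above.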

  • Hi Lucas,

    Allow me some time to test this on my setup and get back to you.

    Regards,

    Nitika

  • Hi Nitika,

    Did you find the time to test it on your setup?

    Regards,
    Lucas

  • Hi Lucas, 

    Thank you for the ping. I was not available most of the last week due to other commitments.

    I was able to reproduce your issue on my setup as well and saw the same behaviour.

    I experimented with a couple of things and got it working with the approach below:

    xfr2vbus_wait_rd_busy_clear:
        xin     0x60, &R18, 1
        qbbs    xfr2vbus_wait_rd_busy_clear, R18.b0, 0 ; Wait for RD_BUSY

        ldi32   R19, 0x30080000 ; DMEM0 address
        ldi     R20, 0x0
        ldi     R27, 0x0 ; Clear loop counter

    xfr2vbus_read_data:
        ldi32   R18, 0x4
        xout    0x60, &R18, 12

    wait_for_read_data_fifo:
        xin     0x60, &R18, 1
        qbbc    wait_for_read_data_fifo, R18.b0, 2

        xin     0x60, &r2, 32 ; Perform read

        add     R19, R19, 0x20 ; Set address for read
        add     R27, R27, 0x1 ; Increment count

        qbge    xfr2vbus_read_data, R27, 4 ; Repeat until all data is read

        halt ; End of program

    From what I observed, the line of code below sets the RD_BUSY bit of R18, and on restarting the program the bit is still set, causing the loop to get stuck.

        xout 0x60, &r18, 12 ;stop automatic read

    This code fixes the issue by rearranging the sequence of operations so that no XOUT operation occurs right before the program is re-run.

    Can you please try the above logic in your code and let me know if it resolves the issue?

    Regards,

    Nitika