AM623: AM623 DMA

Part Number: AM623

Greetings TI Support Team,


I am looking for information related to the DMA capabilities on the AM62x processor.

In our application, we are connecting the AM623 OSPI to an external FPGA. The plan is to retrieve a block of multiple 64-bit words from the FPGA upon receiving an interrupt from it. We would like to know how to trigger a DMA transfer when that interrupt arrives.


The following are the questions that we have currently.

  • Is there an interrupt-driven DMA interface available for AM62x?
  • Can this interface be used to efficiently write data to or read data from memory or peripherals?
  • Are there any recommended drivers or examples available in the SDK for implementing such transfers?

Any documentation or guidance you could provide would be helpful.

  • Hello Maneesh,

    I am looking at your queries; you can expect a reply in one or two days.

    Regards,

    Anil.

  • Hi Anil,

    Alright! I will be waiting for your response.

    Thanks,
    Maneesh N

  • Hello Maneesh,

    On which core is this feature intended to be implemented (A53 or DM R5F)?

    If it is going to be implemented on the A53 core, which OS runs on the A53?

    Based on my experience, most customers use the GPMC interface to connect to an FPGA.

    On the FPGA side, it can be configured to behave like NOR or PSRAM memory.

    On the SoC side, the GPMC can be configured as either NOR or PSRAM memory.

    To trigger automatic transactions such as reads or writes, it is possible to use a GPIO input.

    The FAQ below explains how to trigger DMA from a GPIO input on the RTOS/NORTOS side.

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1378150/faq-how-to-trigger-dma-with-the-help-of-gpio-on-am64x-am243-and-am62x-devices

    For your information, OSPI with DMA is supported in read mode, but write mode is not supported due to some timing issues on the write side.

    Regards,

    Anil.

  • Hello Anil,

    Thank you for your feedback.
    Regarding the first two questions, I have asked my BSP team for the answers, as they know which core is being used. I will respond as soon as I have that information.

    Coming to the question of the OS, we are using Yocto Linux. We have customized it to suit our application and hardware.

    I am not familiar with the GPMC interface. Can you please provide more details about it?
    Details such as the driver and example user-space code for a basic communication interface would be helpful.

    Also, we are using the OSPI bus as the communication interface between the FPGA and the ARM processor.

    The FPGA sends each 64-bit sample in 10us, with random delays between samples (typically 10us to 100us). The problem is that I am not able to read the data fast enough to clear the FIFO.

    Hence, we want to check whether moving to DMA would help us read faster.

  • Hello Maneesh,

    > I am not aware of the GPMC interface, can you please provide more details regarding that?
    > Details such as Driver, example user space code for communication basic interface will be helpful.
    > So, the FPGA is basically sending 64bit data at 10us with some delays (these delays between them are random, typically from 10us to 100us). The problem is that I am not able to read the data fast enough to clear the FIFO.

    My expertise is on the RTOS/NORTOS side, so I am routing your query to a Linux expert to help with the questions above.

    Regards,

    Anil.

  • Hi Maneesh,

    First of all, Linux currently doesn't support having a GPIO interrupt directly trigger a DMA transfer.

    > So, the FPGA is basically sending 64bit data at 10us with some delays (these delays between them are random, typically from 10us to 100us).

    Does this mean the FPGA generates interrupts at intervals of 10~100us? That is a lot of interrupts.

  • Hi Bin Liu,

    I'm sorry for a delayed response. I will have the correct details and respond back ASAP.

    Thank you for being patient.

    Regards,
    Maneesh 

  • Hi Bin Liu,

    Register reads and writes between the FPGA and the ARM go through the OSPI interface.
    The ARM reads the FPGA's 32-bit FIFO register through OSPI.

    A status register in the FPGA lets me check the state of the FIFO (empty, not empty, full).

    The FPGA's interrupt pin is driven from that status register: when the FIFO has data and the status register shows FIFO not empty, the FPGA drives a GPIO pin.

    So, when I get the GPIO interrupt, I am supposed to read all the data out of the FIFO until the status register indicates that the FIFO is empty.

    It's basically one interrupt, and in the interrupt callback there are multiple reads until the status register indicates an empty FIFO.

    The data samples arrive with a variable, random delay between them (10us - 100us), and each sample is 64 bits in 10us. Hence, we wanted to check whether a DMA interface can be used to read and write data between the FPGA and the ARM.

  • Hi Maneesh,

    Thanks for the details. It sounds to me like the ISR flow is as follows:

    isr () {
        while (1) {
            LEN = read FPGA register for data length in FIFO;
            if (!LEN)
                return;
            start DMA to transfer LEN bytes from OSPI/FPGA;
        }
    }

    and the while loop has to finish within 10us.

    Is my understanding correct? If so, I am not sure this is achievable.

  • Hi Bin Liu,

    The data samples written to the FIFO look something like this.
    | D3 64bit (10us) |---(Delay <random>us)---| D2 64bit (10us) |---(Delay <random>us)---| D1 64bit (10us) |

    The entire while loop doesn't need to complete in 10us.

    I just need to be able to drain the FIFO fast enough. It is 4096 x 64bit deep.

    I hope I was able to clarify
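
As a rough check on what "fast enough" means here, the figures above can be turned into a time budget. The sketch below assumes each sample occupies its 10 us slot plus one random gap, so the sample period is 10 us + gap. That reading of the timing diagram, and the function names, are assumptions and are not confirmed anywhere in the thread.

```c
#include <stdint.h>

#define FIFO_DEPTH_ENTRIES 4096u  /* from the thread: FIFO is 4096 x 64-bit deep */
#define SAMPLE_SLOT_US       10u  /* from the thread: one 64-bit sample per 10 us slot */

/* Assumed sample period: the 10 us slot plus the random gap (10..100 us). */
uint32_t sample_period_us(uint32_t gap_us)
{
    return SAMPLE_SLOT_US + gap_us;
}

/* Time in microseconds for an empty FIFO to fill if nothing drains it. */
uint32_t fifo_fill_time_us(uint32_t gap_us)
{
    return FIFO_DEPTH_ENTRIES * sample_period_us(gap_us);
}
```

Under these assumptions an undrained FIFO fills in about 82 ms at the shortest gap (10 us) and about 451 ms at the longest (100 us), and the reader has between 20 us and 110 us on average to fetch each 64-bit sample.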

  • Hi Maneesh,

    In a single 4096x64bit frame, does the GPIO interrupt happen once or 4096 times?

    I don't think the DMA can support either case. If the interrupt happens once per 4096x64bit frame, the DMA is unable to pause for 10+us until the next 64 bits arrive. If the interrupt happens once per 64 bits, I think we can program the DMA in cyclic mode so that each transfer is triggered by a GPIO interrupt; however, the AM62x DMA kernel driver doesn't currently support this feature.

  • Hi Bin Liu,

    Let me try to clarify.

    As soon as there is data in the FIFO, I get an interrupt through an input pin. The current plan is to implement an ISR (callback function) that drains the FIFO through OSPI reads.

    Basically,

    uint32_t global_buffer[4096 * 2]; // Just assume
    isr () {
        int i = 0;

        while (1) {
            reg_status = read FPGA register;

            if (FpgaDataSamplesEmpty(reg_status)) {
                break;
            }

            if (FpgaDataSamplesEmpty(reg_status)) {
                global_buffer[i++] = ospi_read(); // 32-bit reads are stored here
            }

            if (FpgaDataSamplesFull(reg_status)) {
                // I should not hit this point.
                printf("FIFO full.. \n");
            }
        }
    }

    The FIFO is 4096 x 64bit deep, so I must be able to drain it fast enough.

    But after running some read benchmarks, I observed that I am not able to drain the FIFO fast enough.

    So the follow-up question is: on arrival of the interrupt, can the DMA be programmed to read the data coming from the OSPI and write it to a location where Linux can later read it from?

    Does this clarify my query?

  • Hi, I am out of office for the next two weeks. Please expect delayed response. 

  • Hi Bin Liu,

    Are you back at the office? Do you have any updates for us?

  • Hi Maneesh,

    Sorry for the delay.

    > if (FpgaDataSamplesEmpty(reg_status)) {
    >     break;
    > }
    >
    > if (FpgaDataSamplesEmpty(reg_status)) {

    Do both 'if' statements have the same Empty condition? Is the second one a typo?

    > Can the DMA be programmed to read the data coming from the OSPI and write it to a location where linux can later read it from?

    Yes, DMA can read data from OSPI, but there seem to be problems with your use case.

    In your description, it seems the FPGA generates one interrupt per 4096x16bit data, but the data arrives in the FIFO as 64 bits per 10us, so the DMA transfer has to pause every ~10us. What, then, would be the single transfer length to program into the DMA? If it is 64 bits, there is no further interrupt to resume the DMA for the next 64 bits. If it is 4096x16bit, the DMA transfer won't pause and would read garbage data from the FIFO.

  • Hi Bin Liu,

    Sorry, a small typo. It was supposed to be

        if (FpgaDataSamplesEmpty(reg_status) == EMPTY) {
            break;
        }

        if (FpgaDataSamplesEmpty(reg_status) == DATA) {


    It is not quite that. Whenever the FIFO has data, it generates an interrupt, and the ISR's role is to empty the FIFO.
    If more data comes in, it generates an interrupt again. My concern is that with normal OSPI reads I am not able to empty the FIFO fast enough, so I want to know whether the DMA can read from the OSPI and store the data, so that Linux can later collect it from the DMA buffer.

    Also, please share some resources (Drivers, Application codes, etc.. ) for me to test it out.

    Thanks.
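
With the typo corrected, the drain loop described above might look like the following in C. This is a sketch only: the status bit positions, the `fpga_io` accessors, and the `sim_fifo` stand-in are hypothetical (the thread gives neither the FPGA register layout nor the OSPI read API), and the loop is additionally bounded by the buffer capacity, which the pseudocode in the thread did not check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; the real FPGA register layout is not given. */
#define FPGA_STATUS_EMPTY (1u << 0)
#define FPGA_STATUS_FULL  (1u << 1)

/* Register access is abstracted so the loop can run without hardware. */
struct fpga_io {
    uint32_t (*read_status)(void *ctx);
    uint32_t (*read_data)(void *ctx);  /* one 32-bit read from the FIFO */
    void *ctx;
};

/* Drain the FIFO into buf until it reports empty or buf is full.
 * Returns the number of 32-bit words stored. *overflowed is set to 1
 * if the FULL flag was ever observed (samples may have been lost). */
size_t fpga_drain_fifo(const struct fpga_io *io, uint32_t *buf,
                       size_t cap, int *overflowed)
{
    size_t n = 0;

    *overflowed = 0;
    while (n < cap) {
        uint32_t status = io->read_status(io->ctx);

        if (status & FPGA_STATUS_FULL)
            *overflowed = 1;  /* record it; avoid printf in interrupt context */
        if (status & FPGA_STATUS_EMPTY)
            break;
        buf[n++] = io->read_data(io->ctx);
    }
    return n;
}

/* Software stand-in for the FPGA FIFO, for testing the loop only. */
#define SIM_FIFO_DEPTH 16u

struct sim_fifo {
    uint32_t words[SIM_FIFO_DEPTH];
    size_t head;
    size_t count;
};

uint32_t sim_read_status(void *ctx)
{
    const struct sim_fifo *f = ctx;
    uint32_t status = 0;

    if (f->count == 0)
        status |= FPGA_STATUS_EMPTY;
    if (f->count == SIM_FIFO_DEPTH)
        status |= FPGA_STATUS_FULL;
    return status;
}

uint32_t sim_read_data(void *ctx)
{
    struct sim_fifo *f = ctx;
    uint32_t word = f->words[f->head];

    f->head = (f->head + 1) % SIM_FIFO_DEPTH;
    f->count--;
    return word;
}
```

Note that this loop issues one status read for every data read, which doubles the number of OSPI transactions per word. If the FPGA could report a fill count instead of only an empty flag, the status read could be amortized over a burst of data reads.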

  • Hi Maneesh,

    Sorry, but it is still not clear to me exactly how the FPGA delivers data to the FIFO relative to the interrupt events. But I believe I have already explained how DMA works on the OSPI interface. DMA does one transfer per interrupt event, for a preprogrammed transfer size. DMA has two different ways to do so:

    - single slave mode: whenever an interrupt is received, the kernel driver programs the DMA channel with the transfer size, then starts the DMA transfer. When the transfer is complete, the DMA generates an interrupt to the CPU. Every DMA transfer can be programmed with a different transfer size. This mode is currently implemented in the OSPI driver. It won't work if the "data-ready" interrupts arrive at high frequency, because of the DMA channel setup time.

    - cyclic mode: the kernel driver programs the DMA channel with the transfer size, and the 'data-ready' interrupt is routed directly to the DMA instead of the CPU. Whenever the interrupt happens, the DMA automatically transfers the data without CPU involvement until the driver explicitly tears down the DMA channel. In this mode, the transfer size is fixed, since the DMA channel is programmed only once at the beginning. Currently the kernel driver doesn't support this mode on the OSPI interface.

  • Hi Bin Liu,

    |--64bit data @ 10us--|---Delay between samples (10us to 100us)---|--64bit data @ 10us--|...

    - The GPIO interrupt triggers as soon as the first sample is in the FIFO. The FPGA is interfaced to the ARM via OSPI.


    - For your reference, these are the benchmarks I have measured so far:

    * 32-bit OSPI read times
    Average read time:   0.000038298 seconds (~38.3 us)
    Minimum read time:   0.000022815 seconds (~22.8 us)
    Maximum read time:   0.002297820 seconds (~2.3 ms)

    A 64-bit sample takes two 32-bit reads, because the read end of the FIFO is 32 bits wide.

    * Time taken for the interrupt callback to kick in after the GPIO interrupt arrives
    Min time taken to kick in the ISR: 0.004383 seconds (~4.38 ms)
    Max time taken to kick in the ISR: 0.005025 seconds (~5.02 ms)

    So, by the time the ISR callback kicks in, I already have at least 72 samples in the FIFO. And while I read those samples, newer samples are filling the FIFO.

    Hence, I want the data coming from the OSPI to be written directly to memory by DMA.
    Once the ISR kicks in, I would just program the DMA to read a bulk of data from the OSPI.

    So, please point me to resources that show how to program the DMA to read from and write to the OSPI.

    Does this clarify my query? 
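
The benchmark figures above can be combined into a quick feasibility check. The sketch below uses integer nanoseconds throughout; the function names are illustrative, and the sample period passed in is an assumption (10 us slot plus the random gap), since the thread does not state an average gap.

```c
#include <stdint.h>

/* Cost of fetching one sample when it takes several fixed-cost reads. */
uint64_t drain_time_per_sample_ns(uint64_t read32_ns, unsigned reads_per_sample)
{
    return read32_ns * reads_per_sample;
}

/* Samples that accumulate in the FIFO while the ISR has not yet started. */
uint64_t backlog_samples(uint64_t isr_latency_ns, uint64_t sample_period_ns)
{
    if (sample_period_ns == 0)
        return 0;  /* avoid division by zero on a bad input */
    return isr_latency_ns / sample_period_ns;
}

/* Non-zero if the reader drains a sample faster than a new one arrives. */
int reader_keeps_up(uint64_t drain_ns_per_sample, uint64_t sample_period_ns)
{
    return drain_ns_per_sample < sample_period_ns;
}
```

With the average read time quoted above (38,298 ns), two 32-bit reads cost about 76.6 us per sample before any status-register read is counted, so polling only keeps up when samples arrive less often than that. At an assumed 20 us sample period, a 4.383 ms ISR latency would leave about 219 samples queued; the 72 samples reported above would correspond to an average period of roughly 60 us.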

  • Hi Maneesh,

    As I explained, the DMA transfer has to use a fixed frame length per interrupt. But in your use case, the OSPI delivers 64 bits of data per 10us with no interrupt, or with an unknown data length per interrupt. I don't think the DMA can be used in your OSPI/FPGA use case.

  • OK, now I understand.

    I can configure the interrupt to trigger after 2048 such samples fill the FIFO. Let's say I fix the frame size at 2048 x 64 bits.

    Is it possible for me to read that data?
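
For a fixed-frame scheme like this, two numbers matter: the size of one frame, and how long the remaining FIFO space lasts after the threshold interrupt fires. The sketch below computes both; the function names are illustrative and the sample period is again an assumed input, not a figure confirmed in the thread.

```c
#include <stdint.h>

#define FIFO_DEPTH_ENTRIES 4096u  /* from the thread: 4096 x 64-bit */
#define BYTES_PER_ENTRY       8u  /* one 64-bit sample */

/* Size in bytes of one frame of threshold_entries samples. */
uint32_t frame_bytes(uint32_t threshold_entries)
{
    return threshold_entries * BYTES_PER_ENTRY;
}

/* Time in microseconds between the threshold interrupt and FIFO overflow,
 * assuming one new sample every period_us and no draining in between. */
uint32_t overflow_headroom_us(uint32_t threshold_entries, uint32_t period_us)
{
    if (threshold_entries >= FIFO_DEPTH_ENTRIES)
        return 0;
    return (FIFO_DEPTH_ENTRIES - threshold_entries) * period_us;
}
```

A 2048-sample frame is 16,384 bytes. At an assumed 20 us sample period, the other 2048 entries give about 41 ms of headroom, which is well above the ~5 ms ISR latency measured earlier. The sustained read rate still has to exceed the arrival rate, or the FIFO eventually overflows regardless of the threshold.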

  • Hi Maneesh,

    I checked the kernel OSPI driver (drivers/spi/spi-cadence-quadspi.c). It uses DMA MEMCPY mode to read from OSPI, which means the entire 2048x64bits of data has to be memory mapped. But it seems your OSPI data FIFO is 16 bits wide, so the kernel OSPI driver still cannot use DMA to read the OSPI/FPGA data in your use case.

    For your OSPI/FPGA configuration, with its 16-bit-wide FIFO, the DMA should use a DEV_TO_MEM transfer; however, the current kernel DMA driver does not support DEV_TO_MEM transfers on the AM62x OSPI interface.

  • Hi Bin Liu,

    From what I understand, the current implementation of 'spi-cadence-quadspi.c' supports only memory-to-memory transfers, i.e., DMA MEMCPY, and no DEV_TO_MEM right now. Is that correct?

  • That is correct.

    Also, the kernel DMA driver (drivers/dma/k3-udma.c) doesn't support DEV_TO_MEM on peripherals that do not have a PDMA module (if you are not familiar with PDMA, please review the TRM), and the OSPI interface doesn't have PDMA. So even if spi-cadence-quadspi.c were updated to use a DMAEngine DEV_TO_MEM transfer, k3-udma.c still wouldn't support it.

  • Hi Bin Liu,

    If we switch to RT Linux, do you think I would see a significant increase in performance?

    Additionally, does kernel version 6.1.46-g7d494fe58c have a patch for, or support for, RT Linux?

  • Hi Maneesh,

    Yes, the RT kernel is supported in the Processor SDK. For the equivalent of kernel 6.1.46, please use branch ti-rt-linux-6.1.y and release tag 09.01.00.008-rt.
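
Whether the RT kernel improves the latencies measured above is not answered in this thread. If it is tried, one common step on a PREEMPT_RT system is to run the thread that services the FPGA under a real-time scheduling policy with its memory locked. The sketch below uses the standard Linux `sched_setscheduler()` and `mlockall()` calls; the function name `make_thread_realtime` and any priority value passed to it are illustrative assumptions, and the call needs suitable privileges (for example CAP_SYS_NICE) to succeed.

```c
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>

/* Switch the calling thread to SCHED_FIFO at the given priority and lock
 * its memory to avoid page-fault latency.
 * Returns 0 on success or a negative errno value on failure. */
int make_thread_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    /* Reject priorities outside the range the kernel allows for SCHED_FIFO. */
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO))
        return -EINVAL;

    /* Lock current and future pages into RAM. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;

    /* A pid of 0 applies the policy to the calling thread. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -errno;

    return 0;
}
```

A reader thread would call this once at startup and treat a negative return value (commonly -EPERM when run without privileges) as a reason to fall back to the default scheduling policy.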

  • Hi Bin Liu,

    Is https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-rt-linux-6.1.y
    the link for this RT kernel (branch ti-rt-linux-6.1.y)?

    Also, please share some instructions for applying or using the patch.

  • Yes, https://git.ti.com/git/ti-linux-kernel/ti-linux-kernel.git is the kernel repo, ti-rt-linux-6.1.y is the RT branch for 6.1 kernel.

    > Also, share some instructions to apply or use the patch.

    I am not sure which patch you are referring to. You just need to clone the kernel, check out the ti-rt-linux-6.1.y branch, and build the kernel.

  • Bin Liu,

    I have built the kernel. There was only one defconfig for arm64. Are there any particular changes to make in menuconfig, or should it build as is? Also, is there a way to reuse my old dts files?

  • Hi Maneesh,

    I don't typically build the RT branch, but does the kernel/configs/ directory contain the file ti_rt.config?

    You need to create the kernel .config using the following command to build the RT kernel:

    $ make ARCH=... CROSS_COMPILE=... defconfig ti_arm64_prune.config ti_rt.config