AM62A7: Can SPI use BCDMA in cyclic mode?

Part Number: AM62A7
Other Parts Discussed in Thread: AM4376

Hello,

I am now more familiar with the DMA in this CPU, and it appears that cyclic mode is not supported by PKTDMA but is supported by BCDMA. I am using SPI DMA in PKTDMA mode, but it is highly inefficient because I have to start a new DMA transfer every 11 ms. Is there any reason why SPI would not work with BCDMA?

Thanks,

Victor

  • Hello Victor,

    I am looking into your queries; you can expect a reply in one or two days.

    Regards,

    Anil.

  • Hello Victor,

    Cyclic mode can be achieved with McSPI + BCDMA channels, because BCDMA channels support infinite TR loading, which PKTDMA does not support.

    At the SoC level, however, the user can go with either PKTDMA or BCDMA without problems.

    Which core is this requirement for (A53, DM R5F, or MCU R5F), and which OS are you using?

    In the MCU+ SDK and Linux, all peripherals can use PKTDMA channels to transfer data from peripheral to memory or from memory to peripheral.

    SPI with BCDMA channel support is not available in either the Linux SDK or the MCU+ SDK.

    Regards,

    Anil.

  • Hi Victor,

    In the E2E post linked below, you mentioned that you have used cyclic mode on the AM437x SPI interface. Can you please provide some detail about the use case? Which kernel version and which kernel driver are used? What modifications were made to use DMA cyclic mode? As far as I understand, the kernel SPI framework is not suitable for DMA cyclic mode.

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1476789/sk-am62-dma-has-corrupt-data-reading-using-ospi-controller/5671628#5671628

  • Our AM437x-based systems do not use Linux. I wrote the EDMA driver myself to do cyclic mode.

  • Okay, this would make sense.

    In Linux, due to the design of the SPI framework, DMA cyclic mode cannot be used on the AM62Ax McSPI interface. In cyclic mode, the RX buffer is fixed and is programmed into the DMA controller during channel initialization. However, the kernel SPI core passes a different RX buffer to the SPI controller driver in each SPI RX request.

  • That is correct for the current implementation, because the kernel can only use PKTDMA for SPI. But can SPI use BCDMA, since it appears that BCDMA supports cyclic mode?

  • How could BCDMA solve the problem if the SPI core still provides a different RX buffer for each RX request?

  • Below is the code that we use on the iMX6 to perform the same function. It is similar to what you used for the GPMC cyclic mode. I was hoping to use this exact code on the AM62A7. The code fails at dmaengine_prep_dma_cyclic() because k3-udma.c does not allow cyclic mode for PKTDMA. In the code below, the call to dmaengine_prep_dma_cyclic() takes the address of the destination buffer, the size of the destination buffer, and a notification length. Basically, this destination buffer is a ring buffer and the notification length is size/32. If the DMA using BCDMA triggers a completion interrupt each time the notification length is reached, the same as with PKTDMA, then cyclic mode should work.

    I am currently doing this using PKTDMA, and I have to pass in the address of a different RX buffer every time the DMA transfer is manually started. These are addresses within the same ring buffer, with the same notification length that I would have used for cyclic mode. I can avoid all of this if I use cyclic mode.

    static void spot_init_dma(void)
    {
        int ret;
        struct dma_async_tx_descriptor *desc_rx;
        struct dma_slave_config rx = {};

        /* RX only: the device address at s_priv->base is the source. */
        rx.direction = DMA_DEV_TO_MEM;
        rx.src_addr = s_priv->base + 0;
        rx.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
        rx.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
        rx.dst_addr = (dma_addr_t) s_priv->dma_phys_data;
        rx.device_fc = false;
        rx.src_maxburst = 32;
        ret = dmaengine_slave_config(s_priv->dma_chan_rx, &rx);
        if (ret)
            pr_err("--->spot dmaengine_slave_config err\r\n");
        else
            PRINT_DEB("\r\n--->RX Slave Configured.\r\n");

        /* SPOT_BUF_LEN is the whole ring; SPOT_NOTIFY_LEN is one period. */
        desc_rx =
            dmaengine_prep_dma_cyclic(s_priv->dma_chan_rx,
            (dma_addr_t)s_priv->dma_phys_data,
            SPOT_BUF_LEN,
            SPOT_NOTIFY_LEN,
            DMA_DEV_TO_MEM,
            DMA_PREP_INTERRUPT);

        if (!desc_rx)
        {
            /* Report a failed prep instead of falling through silently. */
            pr_err("--->spot dmaengine_prep_dma_cyclic failed\r\n");
            return;
        }

        {
            dma_cookie_t cookie;

            /* The callback runs once per completed period. */
            desc_rx->callback = spot_dma_rx_callback;
            desc_rx->callback_param = (void *)s_priv;
            cookie = dmaengine_submit(desc_rx);
            s_priv->dma_chan_rx->cookie = cookie;

            /* Reuse the outer ret rather than shadowing it. */
            ret = dma_submit_error(cookie);
            if (ret) {
                PRINT_DEB("\r\n--->cannot submit DMA cyclic\r\n");
                dmaengine_terminate_async(s_priv->dma_chan_rx);
                return;
            }
            reinit_completion(&s_priv->dma_rx_completion);
            dma_async_issue_pending(s_priv->dma_chan_rx);
        }
        PRINT_DEB("\r\n--->spot_init_dma done!\r\n");
    }
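
    For reference, the ring layout this call asks for can be modeled in plain userspace C. This is only a sketch of the arithmetic, not driver code. The sizes are placeholders (the real SPOT_BUF_LEN and SPOT_NOTIFY_LEN values are not shown in this thread), and ring_model, ring_init(), and ring_on_period_complete() are hypothetical names. It assumes the callback fires once per completed period and that periods complete in order.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Placeholder sizes: the thread only states that the ring is 32x the notify length. */
    #define MODEL_PERIODS     32u
    #define MODEL_NOTIFY_LEN  256u
    #define MODEL_BUF_LEN     (MODEL_PERIODS * MODEL_NOTIFY_LEN)

    /* Hypothetical bookkeeping for a cyclic RX ring. */
    struct ring_model {
        size_t buf_len;     /* total ring size in bytes (SPOT_BUF_LEN in the post) */
        size_t period_len;  /* bytes per notification (SPOT_NOTIFY_LEN in the post) */
        size_t next_period; /* index of the period the next callback will report */
    };

    static void ring_init(struct ring_model *r, size_t buf_len, size_t period_len)
    {
        /* The ring must hold a whole number of periods. */
        assert(period_len > 0 && buf_len % period_len == 0);
        r->buf_len = buf_len;
        r->period_len = period_len;
        r->next_period = 0;
    }

    /*
     * Called once per completion callback. Returns the byte offset of the
     * period that was just filled and advances to the next one, wrapping
     * at the end of the ring.
     */
    static size_t ring_on_period_complete(struct ring_model *r)
    {
        size_t periods = r->buf_len / r->period_len;
        size_t offset = r->next_period * r->period_len;

        r->next_period = (r->next_period + 1) % periods;
        return offset;
    }
    ```

    With the PKTDMA workaround the driver has to do this same index arithmetic and also resubmit a transfer for every period. In cyclic mode only the callback-side bookkeeping remains.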

  • Hi Victor,

    Can you please send me the entire driver code? I'd first like to understand how it fits in the kernel SPI framework and what is the dst buffer (s_priv->dma_phys_data).

    What event triggers each cyclic transfer, and what does the DMA completion callback do?

  • I will send you the driver code privately.

    The DMA in the iMX6 will send an interrupt upon completion of each buffer in the ring buffer.  No different than the SPI in the AM4376 and AM62A7.

  • The dma completion callback wakes up a thread to process the recently filled buffer.  Right now, the driver in the AM62A7 does the same thing and also starts the next DMA transfer since there is no cyclic mode.

  • Hi Victor,

    I am not sure how you have tried cyclic mode on AM62Ax McSPI, but here is my thought.

    McSPI and McASP on AM62Ax have a similar DMA architecture. Refer to the BCDMA channel configuration in the mcasp0 devicetree node in k3-am62a-main.dtsi:

    mcasp0: audio-controller@2b00000 {
        ...
        dmas = <&main_bcdma 0 0xc500 0>, <&main_bcdma 0 0x4500 0>;
        dma-names = "tx", "rx";
        ...
    };

    You would need a similar configuration for McSPI. The McSPI BCDMA channel IDs are in the TISCI documentation linked below.

    https://software-dl.ti.com/tisci/esd/latest/5_soc_doc/am62ax/psil_cfg.html

    You wouldn't need the sideband GPIO trigger in this case. The McSPI has an internal event that triggers the DMA to start a transfer whenever the RX FIFO reaches its configured watermark.

    Please let me know if this is enough information for you to enable cyclic mode in your McSPI use case.

  • Hello,

    I did what you suggested and made the following change in the device tree:

    -                dmas = <&main_pktdma 0x4304 0>;
    +                dmas = <&main_bcdma 1 0x4304 0>;
     

    I changed my driver to call dmaengine_prep_dma_cyclic() instead of dmaengine_prep_slave_sg(). I had changed k3-udma.c to the changes you suggested in the other post for DMA on GPMC since I am using DMA on transferring data using the GPMC. I will privately send you a copy of my k3-udma.c.

    Those changes caused this error on the console.

    [   10.118599] ti-udma 485c0100.dma-controller: udma_prep_dma_cyclic: chan0 is for MEM_TO_MEM, not supporting DEV_TO_MEM

    I believe you also saw this error when you were developing the code for DMA on GPMC. Can you tell me what to modify in my version of k3-udma.c to get past this error?

    Thanks,

    Victor

  • Hi Victor,

    Before we get into the details, the title of this thread is about SPI (McSPI), but

    I had changed k3-udma.c to the changes you suggested in the other post for DMA on GPMC since I am using DMA on transferring data using the GPMC

    are you moving to GPMC now?

    I received a list of 4 requests offline about DMA support a few weeks ago and we will be discussing them tomorrow, but GPMC is not on the list.

    Even if you do have a project using GPMC and need DMA support, can you please create a separate e2e thread for it? I'd like to keep this thread only for SPI DMA discussion.

  • Hello,

    No, the changes that were made to k3-udma.c were to support DMA transfers from GPMC, which is working fine in our system. I mentioned it because I am using a modified version of k3-udma.c due to the changes made to support DMA on GPMC.

    Basically, I do not know whether the changes that were made for DMA on GPMC could be causing this MEM_TO_MEM issue with SPI trying to use BCDMA.

    Thanks,

    Victor

  • No, the changes that were made to k3-udma.c were to support DMA transfers from GPMC, which is working fine in our system. I mentioned it because I am using a modified version of k3-udma.c due to the changes made to support DMA on GPMC.

    Understood. Thanks for the clarification.

    +                dmas = <&main_bcdma 1 0x4304 0>;

    The second parameter '1' should be '0' for McSPI, because the McSPI module has a PDMA and does not use a global trigger for DMA.

    Please let me know if you still get the "chan0 is for MEM_TO_MEM" error with this change.
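
    As a rough illustration of why that cell matters, the sketch below classifies a BCDMA dmas specifier the way I understand the ti,k3-bcdma devicetree binding: the first cell after the phandle selects the channel type (0 for a split channel tied to a PSI-L thread, 1 to 3 for a block copy channel), the next cell is the PSI-L thread ID for split channels, and the last cell is ASEL. The enum, struct, and function names are hypothetical, and the thread ID ranges (0x4000 to 0x7fff for RX sources, 0x8000 to 0xffff for TX destinations) are an assumption taken from that binding, so please check the binding document for your kernel version.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    enum bcdma_chan_kind {
        BCDMA_BLOCK_COPY,  /* memory-to-memory channel */
        BCDMA_SPLIT_RX,    /* peripheral to memory (DEV_TO_MEM) */
        BCDMA_SPLIT_TX,    /* memory to peripheral (MEM_TO_DEV) */
        BCDMA_INVALID
    };

    /* The three cells that follow &main_bcdma in a dmas entry. */
    struct bcdma_spec {
        uint32_t type;   /* channel type selector */
        uint32_t param;  /* PSI-L thread ID when type is 0 */
        uint32_t asel;   /* ASEL value */
    };

    /* Assumed decoding rules; not copied from k3-udma.c. */
    static enum bcdma_chan_kind bcdma_classify(const struct bcdma_spec *s)
    {
        assert(s != NULL);

        if (s->type != 0)
            return (s->type <= 3) ? BCDMA_BLOCK_COPY : BCDMA_INVALID;
        if (s->param >= 0x4000 && s->param <= 0x7fff)
            return BCDMA_SPLIT_RX;
        if (s->param >= 0x8000 && s->param <= 0xffff)
            return BCDMA_SPLIT_TX;
        return BCDMA_INVALID;
    }
    ```

    Under this reading, <&main_bcdma 1 0x4304 0> requests a block copy channel, which would explain the "chan0 is for MEM_TO_MEM" message, while <&main_bcdma 0 0x4304 0> requests the split RX channel paired with PSI-L thread 0x4304.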

  • Victor,

    [   10.118599] ti-udma 485c0100.dma-controller: udma_prep_dma_cyclic: chan0 is for MEM_TO_MEM, not supporting DEV_TO_MEM

    In your driver spot_spi_imx6.c, init_sdma(), please also add

    dma_cap_set(DMA_CYCLIC, mask);

    after

    dma_cap_set(DMA_SLAVE, mask);

    to see if this fixes the DMA channel type error. I see the DMA_CYCLIC flag is also set in snd_dmaengine_pcm_request_channel() in the kernel sound framework's pcm_dmaengine.c.

  • Hello, 

    I made the above two changes: "dmas = <&main_bcdma 0 0x4304 0>;" in the .dts file and the call to dma_cap_set(DMA_CYCLIC, mask);

    I got a different error now.  Got a total of 48 of these errors.

    [    4.881017] ti-udma 485c0100.dma-controller: Only TR mode is supported (psi-l thread 0x4304)
    [    4.889479] ti-udma 485c0100.dma-controller: Only TR mode is supported (psi-l thread 0x4304)
    [    4.897940] ti-udma 485c0100.dma-controller: Only TR mode is supported (psi-l thread 0x4304)

    and followed by this.

    [    5.285961] ti-udma 485c0100.dma-controller: get channel fail in udma_of_xlate.

    Any advice on how to fix this problem?

    Thanks,

    Victor

  • Victor,

    [    4.881017] ti-udma 485c0100.dma-controller: Only TR mode is supported (psi-l thread 0x4304)

    Please apply the following kernel patch to see if it fixes this error.

    diff --git a/drivers/dma/ti/k3-psil-am62a.c b/drivers/dma/ti/k3-psil-am62a.c
    index 4cf9123b0e93..c5777caebb8d 100644
    --- a/drivers/dma/ti/k3-psil-am62a.c
    +++ b/drivers/dma/ti/k3-psil-am62a.c
    @@ -89,7 +89,7 @@ static struct psil_ep am62a_src_ep_map[] = {
            PSIL_PDMA_XY_PKT(0x4301),
            PSIL_PDMA_XY_PKT(0x4302),
            PSIL_PDMA_XY_PKT(0x4303),
    -       PSIL_PDMA_XY_PKT(0x4304),
    +       PSIL_PDMA_MCASP(0x4304),
            PSIL_PDMA_XY_PKT(0x4305),

  • Hello,

    The latest change fixed the previous error and I am now getting this error.

    [    4.939558] ti-udma 485c0000.dma-controller: Descriptor pool allocation failed
    [    4.946994] ti-udma 485c0000.dma-controller: get channel fail in udma_of_xlate.

    Thanks,

    Victor

  • Victor,

    Thanks for all the tests. It seems more code changes are needed. Let me test this on my board to see what else is missing.

  • Victor,

    I finally restored my setup for DMA debugging and can check the kernel execution.

    [    4.939558] ti-udma 485c0000.dma-controller: Descriptor pool allocation failed

    I got a different error, but one that occurs later in execution than this point. Looking at k3-udma.c, function bcdma_alloc_chan_resources(), the line that prints this error message is wrapped in the condition

    if (uc->config.dir == DMA_MEM_TO_MEM  && !uc->config.tr_trigger_type) {

    and uc->config.dir should be '2' (DMA_DEV_TO_MEM) in our McSPI BCDMA case, so this message should not be printed. Can you please check your code execution to understand how it gets into this 'if' block?

    if (uc->config.dir == DMA_MEM_TO_MEM  && !uc->config.tr_trigger_type) {
            uc->config.hdesc_size = cppi5_trdesc_calc_size(
                                    sizeof(struct cppi5_tr_type15_t), 2);

            uc->hdesc_pool = dma_pool_create(uc->name, ud->ddev.dev,
                                             uc->config.hdesc_size,
                                             ud->desc_align,
                                             0);
            if (!uc->hdesc_pool) {
                    dev_err(ud->ddev.dev,
                            "Descriptor pool allocation failed\n");

  • Hello,

    I had to rebuild the kernel and now I am able to successfully start the DMA using dmaengine_prep_dma_cyclic().  However, I don't get a RX completion callback.  Any idea on why?

    I will try to dump the RX buffer tomorrow to see if data is being read from the McSPI. Hopefully, that will tell us where the DMA transfer is broken.

    Thanks,

    Victor

  • Victor,

    Well, you get further than I do. The dma channel request still failed on my setup with error:

    [   26.042450] ti-udma 485c0100.dma-controller: Failed to get ring irq (index: 12294)
    [   26.050221] ti-udma 485c0100.dma-controller: get channel fail in udma_of_xlate.

    However, I don't get a RX completion callback.  Any idea on why?

    Our developer told me that

    +       PSIL_PDMA_MCASP(0x4304),

    is probably not correct. Please try to use

    +       PSIL_PDMA_XY_TR(0x4304),

    to see if it is related to the dma completion issue.

  • Hello,

    Reverting back to PSIL_PDMA_XY_TR(0x4304) will result in the "Only TR mode is supported" error.

    I will try to see at the very least if the DMA is reading data from the SPI controller or not.

    Thanks,

    Victor

  • Reverting back to PSIL_PDMA_XY_TR(0x4304)

    Reverting back? The original mode was PSIL_PDMA_XY_PKT.

  • My bad.  This latest change has cyclic mode working.

    Thanks so much for all your help on this issue.

    Victor

  • Wow, such great news!

  • Driver is so much cleaner now.  Lot less code.

    Thanks again,

    Victor

  • Hello,

    I ran into a small problem and tried to figure it out, but I don't know k3-udma.c well enough to find the cause.

    Cyclic mode works great when running through a boot sequence. The problem appears when I want to run GDB on my unit. My system calls dmaengine_terminate_async() to stop the DMA transfer, and this appears to work. When GDB then starts up the system, everything seems to work as normal with no error messages. However, I don't get the RX completion callback, which probably means that the DMA engine did not start. I also tried dmaengine_terminate_sync(), with the same result.

    I do know that both functions call udma_reset_chan() with the "hard" input parameter set to false.

    Any idea on how to fix this problem?

    Thanks,

    Victor

  • Hi Victor,

    I don't understand whether the dmaengine flow is different with and without GDB.

    If you don't use GDB, but stop your application by calling dmaengine_terminate_async() to clean up DMA, then start your application again, does DMA still work?

  • Bin,

    We don't stop the app unless we are doing a boot cycle or running GDB. For GDB, we don't go through a boot cycle. We simply kill the app, which involves calling dmaengine_terminate_async(). It appears that this function calls udma_terminate_all(). In GDB, we then restart the app; the code that starts the DMA does not return any errors, but the DMA does not start. So it looks like udma_terminate_all() did not clean up the cyclic BCDMA channel. I am trying to compare the code executed in dma_prep_dma_cyclic() between a normal working startup and a non-working GDB startup.

  • Hi Victor,

    I am wondering if you need the following kernel patch. Please let me know if you can manually apply it to your kernel and if it resolves the problem.

    https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.6.y-cicd&id=e0a0ce8c2684c13fab0e65be767d036dfa592ee1

  • Bin,

    I manually applied the patch and it did fix the GDB problem.  My unit does not freeze up any more when stopping my app and then starting it again in GDB.  So the patch did fix the teardown issue.

    Problem now is that my cyclic DMA transfer, which is in RX mode, no longer reads in correct data.  I believe the DMA is only writing to the first buffer of the ring buffer.

    I edited the change in udma_prep_dma_cyclic_tr() to set CPPI5_TR_CSF_EOP only on the last period. That fixed the problem, so that when I boot the unit it works fine like before. I then kill the app by starting GDB and restart the app in GDB, and it continues to run without locking up. The problem now is that my change does not work 100% like before when running from boot: my cyclic DMA transfer appears to get bad data every so often.

    I am pretty sure this change will not work for TX mode, but I don't need TX mode.

    I put my changes below.   Can you or have the person who did the original patch look into this?

    Thanks,

    Victor

    @@ -3495,6 +3508,7 @@ udma_prep_dma_cyclic_tr(struct udma_chan *uc, dma_addr_t buf_addr,
            u16 tr0_cnt0, tr0_cnt1, tr1_cnt0;
            unsigned int i;
            int num_tr;
    +       int tr_idx = -1;

            num_tr = udma_get_tr_counters(period_len, __ffs(buf_addr), &tr0_cnt0,
                                          &tr0_cnt1, &tr1_cnt0);
    @@ -3517,8 +3531,17 @@ udma_prep_dma_cyclic_tr(struct udma_chan *uc, dma_addr_t buf_addr,
                    period_addr = buf_addr |
                            ((u64)uc->config.asel << K3_ADDRESS_ASEL_SHIFT);

    +       /*
    +        * For BCDMA <-> PDMA transfers, the EOP flag needs to be set on the
    +        * last TR of a descriptor, to mark the packet as complete.
    +        * This is required for getting the teardown completion message in case
    +        * of TX, and to avoid short-packet error in case of RX.
    +        *
    +        * As we are in cyclic mode, we do not know which period might be the
    +        * last one, so set the flag for each period.
    +        */
            for (i = 0; i < periods; i++) {
    -               int tr_idx = i * num_tr;
    +               tr_idx = i * num_tr;

                    cppi5_tr_init(&tr_req[tr_idx].flags, CPPI5_TR_TYPE1, false,
                                  false, CPPI5_TR_EVENT_SIZE_COMPLETION, 0);
    @@ -3549,6 +3572,14 @@ udma_prep_dma_cyclic_tr(struct udma_chan *uc, dma_addr_t buf_addr,

                    period_addr += period_len;
            }
    +       if (uc->config.ep_type == PSIL_EP_PDMA_XY &&
    +               uc->ud->match_data->type == DMA_TYPE_BCDMA &&
    +               tr_idx >= 0) {
    +               if (!(flags & DMA_PREP_INTERRUPT))
    +                       cppi5_tr_csf_set(&tr_req[tr_idx].flags, CPPI5_TR_CSF_EOP|CPPI5_TR_CSF_SUPR_EVT);
    +               else
    +                       cppi5_tr_csf_set(&tr_req[tr_idx].flags, CPPI5_TR_CSF_EOP);
    +       }

            return d;
     }

  • Victor,

    Problem now is that my cyclic DMA transfer, which is in RX mode, no longer reads in correct data. 

    Does the DMA read data correctly after the first boot, with data corruption happening only after GDB stops the application and restarts it? In other words, the DMA doesn't work properly after tearing down and starting again?

    Can you or have the person who did the original patch look into this?

    Jai is no longer with TI, but I will take a look at your change to see if I can identify anything.

  • Bin,

    Yes, that is correct. The DMA fails to work properly only after tearing down and starting again.

    Thanks,

    Victor

  • Bin,

    I just updated my kernel and the GDB problem went away.  So, everything works fine now.  Sorry about the false alarm.

    Now thinking of my change to the patch, I believe the patch would have only allowed cyclic mode to work with only the first buffer in the ring buffer, for both TX and RX. Unfortunately, I have no way to test the TX mode.

    Thanks,

    Victor

  • Hi Victor,

    I just updated my kernel and the GDB problem went away.  So, everything works fine now.  Sorry about the false alarm.

    Glad to hear it is working. But do you still need to apply the change you made, which sets CPPI5_TR_CSF_EOP only on the last period?

    Now thinking of my change to the patch,

    Sounds like you still need this change.

    I believe the patch would have only allowed cyclic mode to work with only the first buffer in the ring buffer

    I am not sure what exactly you meant by ring buffer. Your application has

        desc_rx =
            dmaengine_prep_dma_cyclic(s_priv->dma_chan_rx,
            (dma_addr_t)s_priv->dma_phys_data,
            SPOT_BUF_LEN,
            SPOT_NOTIFY_LEN,
            DMA_DEV_TO_MEM,
            DMA_PREP_INTERRUPT);

    it should have just one buffer, "s_priv->dma_phys_data", which has a size of SPOT_BUF_LEN, and the DMA fills this entire buffer after each DMA completion interrupt, doesn't it?

    Unfortunately, I have no way to test the TX mode.

    The patch I referred to is only for DMA_DEV_TO_MEM mode, so it should be for RX only. Why do you think it impacts TX too?

  • The parameter SPOT_BUF_LEN in this situation is 32x the size of SPOT_NOTIFY_LEN. Thus you have a ring buffer of 32 buffers, each SPOT_NOTIFY_LEN bytes long. So the original patch probably only writes to the first buffer, since the CPPI5_TR_CSF_EOP bit is set for that buffer.

    The original patch has comments stating that this will work for both TX and RX.  Since the original patch didn't work for RX, I have to assume it won't work for TX either.
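
    To make the two policies concrete, here is a small userspace model of which TRs in the descriptor would carry the EOP flag. It is not k3-udma.c code: MODEL_EOP is a stand-in bit rather than the real CPPI5_TR_CSF_EOP encoding, mark_eop() is a hypothetical helper, and it says nothing about how the hardware reacts to the flag. It only shows where the flag ends up, assuming each period is described by num_tr consecutive TRs and the flag goes on the last TR of a period.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define MODEL_EOP 0x1u  /* stand-in flag bit, not the real encoding */

    enum eop_policy {
        EOP_EVERY_PERIOD,  /* original patch comment: set the flag for each period */
        EOP_LAST_PERIOD    /* modified patch: set the flag only for the last period */
    };

    /*
     * flags[] must have periods * num_tr entries, one per TR. The array is
     * cleared first. Returns the number of TRs that end up with MODEL_EOP set.
     */
    static unsigned int mark_eop(uint32_t *flags, unsigned int periods,
                                 unsigned int num_tr, enum eop_policy policy)
    {
        unsigned int i, count = 0;

        assert(flags && periods > 0 && num_tr > 0);

        for (i = 0; i < periods * num_tr; i++)
            flags[i] = 0;

        for (i = 0; i < periods; i++) {
            /* Index of the last TR that belongs to period i. */
            unsigned int last_tr = i * num_tr + (num_tr - 1);

            if (policy == EOP_EVERY_PERIOD || i == periods - 1) {
                flags[last_tr] |= MODEL_EOP;
                count++;
            }
        }
        return count;
    }
    ```

    For a ring of 32 periods, the first policy marks 32 TRs and the second marks exactly one, the final TR of the descriptor.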

  • With the original patch, have you checked in the DMA completion callback whether the second buffer in the ring has valid data? This would tell us whether the original patch only transfers one buffer.

  • I didn't bother to check, since my upper-layer software shows that the data being read in is corrupted. Since my change to the original patch was simply to set that bit only for the last buffer, you have to assume that only the first buffer was being written to by the DMA.

  • The original patch was to fix a DMA teardown problem for McASP. It is interesting that the audio application doesn't have any issue with the patch. I will talk to our dev team about this.

    Anyway, with your change (setting the EOP flag only on the last TR), your McSPI application runs fine now, right?

  • While the McASP was using cyclic mode and BCDMA, perhaps it was using only one buffer?  Then you won't see the problem.

    Yes, with my change to the original patch, the McSPI application works like before after a boot cycle and now in GDB.