AM4377: Linux SPI performance

Part Number: AM4377

Hi

We are developing firmware for a custom AM437x-based board. We are not using the TI SDK; instead, the Linux-based firmware is built with Buildroot and a standard kernel.

While upgrading from our previous kernel 4.19.45 to 5.7.1, I noticed a significant increase in CPU usage, which appeared to be related to SPI operations. Total CPU usage as measured by "top" increased from 5-10% to around 25-30% following the kernel upgrade, with SPI-based drivers (including a dual UART and a small OLED panel) showing up as the dominant users. The SPI unit is not currently configured for DMA.

I tried downgrading just one file drivers/spi/spi-omap2-mcspi.c back to the 4.19.45 version and the CPU usage returned to close to its original level.

From looking at the history of this file, I wonder whether the changes to add support for slave mode SPI in kernel 4.20 may be causing this.

Have you observed this? Can the driver implementation be changed to avoid this increase in CPU usage, or can a config option be added to disable this change? In our application we don't need slave mode SPI.

Thanks!

  • Hi Jeremy,

    Have you checked the function mcspi_wait_for_reg_bit() and tried restoring its 4.19 implementation? I see a difference there in the non-DMA PIO path: in kernels after 4.19 it sleeps for microseconds between polls, whereas in 4.19 it busy-waits with cpu_relax() under a timeout on the order of milliseconds.

  • I agree that mcspi_wait_for_reg_bit() could be at play here. Looking at the commits between the two versions, one stands out: 13d515c796ad ("spi: omap2-mcspi: Switch to readl_poll_timeout()"), which is a likely suspect. Try reverting that commit (git revert 13d515c796ad) on top of v5.7.1 to see if this fixes things. If it does, we can get this fixed in the upstream kernel.

    $ gitl v4.19.45...v5.7.1 drivers/spi/spi-omap2-mcspi.c
    32f2fc5dc399 spi: spi-omap2-mcspi: Support probe deferral for DMA channels
    e4e8276a4f65 spi: spi-omap2-mcspi: Handle DMA size restriction on AM65x
    8d8584912a43 spi: omap2-mcspi: Remove redundant checks
    c942fddf8793 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157
    baf8b9f8d260 spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch
    842aeeac335e spi: omap2-mcspi: Add missing suspend and resume calls
    91b9deefedf4 spi: omap2-mcspi: Add missing suspend and resume calls
    89e8b9cb8465 spi: omap2-mcspi: Add slave mode support
    b682cffa3ac6 spi: omap2-mcspi: Set FIFO DMA trigger level to word length
    13d515c796ad spi: omap2-mcspi: Switch to readl_poll_timeout()
    $ git show 13d515c796ad
    commit 13d515c796adc49a49b0cd2212ccd7f43a37fc5a
    Author: Vignesh R <vigneshr@ti.com>
    Date:   Mon Oct 15 12:08:27 2018 +0530
    
        spi: omap2-mcspi: Switch to readl_poll_timeout()
    
        Use standard readl_poll_timeout() macro for polling on status bits.
    
        Signed-off-by: Vignesh R <vigneshr@ti.com>
        Signed-off-by: Mark Brown <broonie@kernel.org>
    
    diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
    index 508c61c669e7..985f00d8a964 100644
    --- a/drivers/spi/spi-omap2-mcspi.c
    +++ b/drivers/spi/spi-omap2-mcspi.c
    @@ -33,6 +33,7 @@
     #include <linux/of.h>
     #include <linux/of_device.h>
     #include <linux/gcd.h>
    +#include <linux/iopoll.h>
    
     #include <linux/spi/spi.h>
     #include <linux/gpio.h>
    @@ -353,19 +354,9 @@ static void omap2_mcspi_set_fifo(const struct spi_device *spi,
    
     static int mcspi_wait_for_reg_bit(void __iomem *reg, unsigned long bit)
     {
    -       unsigned long timeout;
    -
    -       timeout = jiffies + msecs_to_jiffies(1000);
    -       while (!(readl_relaxed(reg) & bit)) {
    -               if (time_after(jiffies, timeout)) {
    -                       if (!(readl_relaxed(reg) & bit))
    -                               return -ETIMEDOUT;
    -                       else
    -                               return 0;
    -               }
    -               cpu_relax();
    -       }
    -       return 0;
    +       u32 val;
    +
    +       return readl_poll_timeout(reg, val, val & bit, 1, MSEC_PER_SEC);
     }
    
     static void omap2_mcspi_rx_callback(void *data)
    

    Regards, Andreas
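
For readers wondering why this one-line change could matter in PIO mode, here is a minimal userspace sketch of the two polling styles in the diff above. It is a model, not driver code: `fake_reg`, `fake_readl`, `poll_busy`, `poll_sleeping` and `POLL_TIMED_OUT` are hypothetical names, the "register" is a counter, and the sleep is only counted rather than performed. The assumption being illustrated is that a non-zero sleep argument to readl_poll_timeout() results in a usleep_range() call after every unsuccessful read, so a status bit that sets within a few microseconds may cost a timer setup and a reschedule per poll instead of a short spin.

```c
#include <stdint.h>

#define POLL_TIMED_OUT (-1)  /* stand-in for the kernel's -ETIMEDOUT */

/* Hypothetical stand-in for a memory-mapped status register:
 * bit 0 reads as set only after `reads_until_ready` reads. */
struct fake_reg {
    unsigned reads_until_ready;
    unsigned reads;              /* number of reads performed so far */
};

static uint32_t fake_readl(struct fake_reg *r)
{
    r->reads++;
    return r->reads > r->reads_until_ready ? 1u : 0u;
}

/* 4.19-style loop: re-read immediately; the kernel version calls
 * cpu_relax() between reads and never gives up the CPU. */
static int poll_busy(struct fake_reg *r, uint32_t bit,
                     unsigned max_reads, unsigned *sleeps)
{
    *sleeps = 0;
    for (unsigned i = 0; i < max_reads; i++) {
        if (fake_readl(r) & bit)
            return 0;
        /* cpu_relax() would go here in the kernel */
    }
    return POLL_TIMED_OUT;
}

/* readl_poll_timeout()-style loop with a non-zero sleep: every failed
 * read is followed by a sleep (modelled here as a counter only). */
static int poll_sleeping(struct fake_reg *r, uint32_t bit,
                         unsigned max_reads, unsigned *sleeps)
{
    *sleeps = 0;
    for (unsigned i = 0; i < max_reads; i++) {
        if (fake_readl(r) & bit)
            return 0;
        (*sleeps)++;             /* usleep_range() would go here */
    }
    return POLL_TIMED_OUT;
}
```

With a register that becomes ready on the fourth read, both loops succeed after the same number of reads, but the sleeping variant goes through three sleep/wake cycles where the busy variant goes through none. For a driver that waits on a status bit for every word of every transfer, that per-poll overhead would add up quickly.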

  • Thanks Andreas and Dwarakesh,

    I reverted the change in mcspi_wait_for_reg_bit() and the CPU usage is basically back to what it was with the 4.19 kernel. I'm not sure I fully understand the exact issue with readl_poll_timeout(), but I'm happy to have the workaround.

    cheers

    Jeremy

  • Jeremy,

    thanks for the quick confirmation, glad to hear that solves it. I've already followed up internally to get this behavior improved in the official Kernel.

    Also from your case it sounds like you are constantly/non-stop updating an OLED display which is connected via SPI, correct?

    Regards, Andreas

  • Hi Andreas

    That's right, our device uses SPI fairly heavily, for OLED display updates and also frequently polling other devices on an RS485 backplane via an SPI UART.

    One of the things on my to-do list is to try to get DMA working with the SPI unit. But for the moment, PIO mode is adequate.

    Thanks for following this up with the standard kernel.

    Jeremy