PROCESSOR-SDK-AM64X: Poor write performance of mmcblk block device under Linux

Part Number: PROCESSOR-SDK-AM64X

Hello,

Using the SK-AM64 as well as the GPEVM, I'm doing some performance analysis of various interfaces through the block device abstraction layer of Linux.

In particular I'm interested in the PCIe interface (by using NVMe SSDs), the USB 3.0 interface (by using USB flash drives) and the SD/MMC interface (here by using microSD cards).

I'm benchmarking by reading/writing the block devices directly (i.e. /dev/sd..., /dev/nvme..., /dev/mmcblk...) using my own benchmark program. Right now I'm only interested in strictly sequential transfers with rather large block sizes.
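
For reference, a measurement of this kind could look roughly like the minimal sketch below. It is not the actual benchmark program used here: the function name `sequential_write_mib_s` and its parameters are made up for illustration, and it uses ordinary buffered writes followed by `fsync` rather than direct I/O. Note that pointing it at a block device overwrites the contents of that device.

```python
import os
import time


def sequential_write_mib_s(path, block_size, block_count):
    """Write block_count blocks of block_size bytes sequentially; return MiB/s.

    Hypothetical helper for illustration only. 'path' may be a regular file
    or a block device such as /dev/mmcblk... (its contents are overwritten).
    """
    payload = memoryview(bytes(block_size))  # zero-filled buffer
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(block_count):
            written = 0
            # os.write may write fewer bytes than requested, so loop until
            # the whole block has been handed to the kernel.
            while written < block_size:
                written += os.write(fd, payload[written:])
        os.fsync(fd)  # include flushing to the medium in the measured time
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return (block_size * block_count) / (1024 * 1024) / elapsed
```

A call such as `sequential_write_mib_s("/dev/mmcblk1", 16 * 1024 * 1024, 64)` would then write 1GiB in 16MiB blocks; the device name is only an example.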

While the results for USB 3.0 and PCIe are more or less as expected (speeds of 300-400MiB/s and beyond), I did spot some unexpected behaviour for the SD/MMC interface.

I'm seeing read speeds of around 90MiB/s, which is as expected for UHS-I media, but write speeds significantly below that level. I tested state-of-the-art SanDisk and Samsung cards here. According to other (independent) tests, these cards should achieve around 90MiB/s. However, I'm getting just around 54MiB/s for the Samsung card and around 63MiB/s for the SanDisk card (with strictly sequential writes and even a 16MiB block size, which is rather large).

The fact that the two tested cards perform differently might indicate that the cards themselves are the limiting factor and simply behave differently. Although I have not (yet) tested these cards in other environments, I doubt that they are limiting the performance at that level, because they are reportedly faster in independent tests. More testing with these cards is to follow (for instance testing the very same cards in appropriately fast USB card readers on the SK-AM64). However, I wanted to ask in parallel whether there is any other known experience with the performance of the SD/MMC interface.

Thanks,

Mario

  • Ok, there we go.....

    I tested the very same microSD cards using a USB 3.0 card reader on the SK-AM64. I intentionally did not use the card readers provided by the respective companies (here SanDisk and Samsung), as they overclock the SD interface in order to reach speeds beyond UHS-I. This would have skewed the results, or rather would have been like comparing apples with pears.

    In the end, the system is achieving speeds of around 85MiB/s for both cards via USB through the card reader.  This is a reasonable speed one would expect here.

    So we have the situation that we are faster via USB plus card reader than via the native SD/MMC interface. If anything, that path should be slower due to the extra overhead introduced by the card reader. Furthermore, this is not a matter of the SD/MMC interface speed as such, since we are achieving the expected 90MiB/s during reads. This means that the clock is running at a proper rate.

    So where could the bottleneck be? To me this looks like an issue, or at least suboptimal handling, within the kernel driver.

  • Maybe one more update:

    Performance analysis indicates that throughput saturates at a block size of 512KiB. This looks as if the system is breaking larger transfers down into multiple blocks of 512KiB, which would explain the rather low performance.

    I'm not sure whether this is the right place, but I think that the entry function for writing to the block device is something like mmc_blk_ioctl_copy_from_user() in drivers/mmc/core/block.c. There I find a check of the data block size against a maximum, MMC_IOC_MAX_BYTES. MMC_IOC_MAX_BYTES is defined in include/uapi/linux/mmc/ioctl.h and is set to (512L * 1024), which is exactly 512KiB. What a coincidence....

    I raised that value and recompiled the kernel. However, at first glance this does not change anything. I still have to check whether I did something wrong when recompiling the kernel. There is also the question of what purpose this default limit of 512KiB serves. It probably exists for a reason and maybe cannot simply be increased....
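
One way to cross-check the 512KiB observation might be to look at the request-size limits that the Linux block layer exposes in sysfs, since MMC_IOC_MAX_BYTES lives in the ioctl header and may only bound the MMC ioctl command path rather than regular block I/O. The following is a minimal sketch under assumptions: the device name `mmcblk0` is only an example, the helper names are made up for illustration, and whether this limit is actually what caps the write throughput here is unverified. `max_sectors_kb` and `max_hw_sectors_kb` are the standard queue attributes under `/sys/block/<device>/queue/`.

```python
import math
from pathlib import Path


def read_queue_limit_kib(device, name="max_sectors_kb", sysfs_root="/sys/block"):
    """Return a block-queue limit in KiB for a device such as 'mmcblk0'.

    Hypothetical helper; 'name' can also be 'max_hw_sectors_kb'.
    """
    attr = Path(sysfs_root) / device / "queue" / name
    return int(attr.read_text().strip())


def requests_per_transfer(transfer_bytes, limit_kib):
    """Number of requests needed if a transfer is split at limit_kib."""
    return math.ceil(transfer_bytes / (limit_kib * 1024))
```

With a limit of 512KiB, `requests_per_transfer(16 * 1024 * 1024, 512)` returns 32, i.e. a single 16MiB write would reach the driver as 32 separate requests. Comparing `read_queue_limit_kib("mmcblk0")` with the value reported for the USB card reader's device could show whether the two paths differ in this respect.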