OMAP 3525 Dhrystone Performance

We have some concerns about performance measurements we have made.

Searching the TI website, we found the following document on Cortex-A8 performance:
http://processors.wiki.ti.com/index.php?title=Cortex-A8_Features#OMAP_ARM_Cores_Performance_Dhrystome_V2.1
It claims 2.01 Dhrystone V2.1 DMIPS per MHz. Since we are running our OMAP at 500 MHz, we should expect about 1005 DMIPS.

We obtained the "official" Dhrystone V2.1 benchmark source code from
http://www.netlib.org/benchmark/dhry-c
and compiled and ran it. We converted the Dhrystone figure (Dhrystones per second) to DMIPS by dividing by 1757, which we understand to be the standard conversion.

On our OMAP 3525 running LynxOS SE at 500 MHz, compiled with the LynuxWorks cross-compiler, we get 237.1 DMIPS.
For comparison, we ran the same benchmark on an older board with the OMAP 3430 (which we used before the 3525 was available) under MontaVista mobile Linux, built with their cross-compiler, and got 258.7 DMIPS.

We realize that Dhrystone measurements will vary somewhat due to differences in compilers and OSs, but we would not expect to get only about 25% of TI's figure.
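
For reference, the arithmetic behind the figures above can be sketched in C roughly as follows. The function names are made up for this post and are not taken from the Dhrystone source or from any TI code.

```c
/* Illustrative helpers only; the names are hypothetical. */

/* DMIPS expected from a quoted DMIPS/MHz rating at a given core clock. */
double expected_dmips(double dmips_per_mhz, double clock_mhz)
{
    return dmips_per_mhz * clock_mhz;
}

/* Measured DMIPS as a fraction of the expected DMIPS. */
double fraction_of_expected(double measured_dmips, double expected)
{
    return measured_dmips / expected;
}
```

With the numbers in this post, expected_dmips(2.01, 500.0) comes to about 1005, and fraction_of_expected gives roughly 0.236 for the 237.1 DMIPS result and roughly 0.257 for the 258.7 DMIPS result.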

Can you suggest why we might be getting such drastically lower figures?

  • Terry,

    We have found that Dhrystone results can vary by 25% or more depending on which compiler you use. Even different versions of the same compiler can produce varying results. We recommend the Code Sourcery toolchain for our customers, and we have seen differences of over 25% in performance between different versions of the Code Sourcery tools. In some of our recent Software Development Kits (SDKs) we include a statically built Dhrystone based on the arm-2008q1 version of Code Sourcery because it produces 1.9 DMIPS/MHz for all our Cortex-A8 based parts. We use the exact same Dhrystone benchmark you are using.

    Side Note:

    Dhrystone is somewhat outdated and does not reflect the current capabilities of embedded microprocessors. It was designed when Ada and Pascal were still popular programming languages. TI still quotes Dhrystone numbers only because customers still ask for them. You may want to investigate other benchmarks such as CoreMark, which is a small subset of the EEMBC benchmark suite. I have found that with most other benchmarks, like CoreMark, the compiler toolchain or toolchain version has much less impact, more like 5%. So far Dhrystone is the only benchmark where I have seen this wide a variation from toolchain to toolchain.

  • Thanks for the response.  We are certainly aware that Dhrystones are not truly relevant for embedded software like what we are developing, but need to quote it anyway.  And thanks for describing some of the issues we can check.

    But note that our discrepancy is far more than 25%. We are getting only about 1/4 of the TI figure: 237 rather than 1005. We were not expecting that great a difference due to the compiler alone.

    We were wondering whether the 2.01 DMIPS/MHz (or 1.9) figure depends on careful tuning, a special compiler or compiler options, and possibly use of NEON.

  • Terry,

    You should also be able to get 1.9 DMIPS/MHz.  As mentioned, we statically build dhrystone with arm-2008q1. You can find that binary here: https://gforge.ti.com/gf/project/am_benchmarks/frs/

    You can download dhrystone-static.tar.gz and execute it on your board.

    If you want to duplicate this effort, you need to have the arm-2008q1 version of Code Sourcery installed. I used this build line:

    arm-none-linux-gnueabi-gcc -O3 -DTIME -march=armv7-a dhrystone.c -o dhrystone

    We have modified dhrystone so that you can pass in the <number of iterations> and <cpu freq in KHz> as input parameters, and you will get back DMIPS/MHz using the same calculation you mentioned. This modification does not affect performance.

    You can also find the source code and makefiles we use at this Gforge Repository: https://gforge.ti.com/gf/project/am_benchmarks/
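
As a rough illustration of the DMIPS/MHz calculation described above, here is a minimal C sketch. It is not the code from the Gforge repository; the function name, parameter names, and the guard against zero inputs are assumptions made for this example. It divides Dhrystones per second by 1757 to get DMIPS, then divides by the clock in MHz.

```c
/* Hypothetical helper, NOT TI's modified dhrystone source.
 * iterations:  number of Dhrystone loops that were run
 * elapsed_sec: measured wall-clock time for those loops, in seconds
 * cpu_khz:     core clock in kHz, matching the <cpu freq in KHz> input
 */
double dmips_per_mhz(unsigned long iterations, double elapsed_sec,
                     unsigned long cpu_khz)
{
    /* Assumed guard: avoid dividing by zero on bad input. */
    if (elapsed_sec <= 0.0 || cpu_khz == 0) {
        return 0.0;
    }

    double dhrystones_per_sec = (double)iterations / elapsed_sec;
    double dmips = dhrystones_per_sec / 1757.0; /* VAX 11/780 reference */
    double clock_mhz = (double)cpu_khz / 1000.0;

    return dmips / clock_mhz;
}
```

For example, 16,691,500 iterations completed in 10 seconds at 500000 kHz works out to about 1.9 DMIPS/MHz.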