TMS570LC43x with HalCoGen Dhrystone Benchmark

Other Parts Discussed in Thread: HALCOGEN, TMS570LC4357

I followed ARM Application Note 273 (Dhrystone Benchmarking for ARM Cortex Processors) to benchmark the TMS570LC43x.
infocenter.arm.com/.../DAI0273A_dhrystone_benchmarking.pdf

According to the datasheet, the Cortex-R5 should be capable of 1.66 DMIPS/MHz (=498 DMIPS @300MHz).
However, my best score was 270 DMIPS @300MHz using the HalCoGen "out-of-the-box" configuration (code from flash).
I have used the latest IAR compiler, and tested various compiler optimization settings.

A) Setup

  • Project created with TI HalCoGen 04.05.00 Released 15.July.2015
  • IAR Embedded Workbench for ARM 7.40.3.8938
  • Hercules TMS570LC43x LaunchPad Development Kit

B) Measurement configuration

  • 1'000'000 dhrystone runs
  • RTI compare interrupt @1ms
  • General Options -> Library Options -> Buffered terminal output
  • IAR Setup as recommended by the HalCoGen PDF (ARM processor mode)

C) Results

Compiler Optimization Configuration    DMIPS
None                                   269
Low                                    277
Medium                                 167
High Balanced                          232
High Size                              189
High Speed                             235
High Speed, no size constraints        236

DMIPS = Dhrystones per second / 1757
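
As a minimal sketch of the formula above, the score can be computed from the number of Dhrystone runs and the elapsed time, for example a count of 1 ms RTI compare ticks. The function names `dmips_from_run` and `dmips_per_mhz` are illustrative assumptions and are not taken from the attached project.

```c
#include <assert.h>
#include <stdint.h>

/* Reference score used by the formula above:
 * 1757 Dhrystones per second correspond to 1 DMIPS. */
#define DHRYSTONES_PER_DMIPS 1757.0

/* DMIPS from the number of Dhrystone runs and the elapsed time in
 * milliseconds (e.g. the number of 1 ms timer ticks counted). */
double dmips_from_run(uint32_t runs, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    double dhrystones_per_second = (double)runs / (elapsed_ms / 1000.0);
    return dhrystones_per_second / DHRYSTONES_PER_DMIPS;
}

/* Normalize a DMIPS score to the CPU clock in MHz. */
double dmips_per_mhz(double dmips, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return dmips / cpu_mhz;
}
```

With a 1 ms tick, the timing resolution is 1 ms, so a run of 1'000'000 iterations that lasts a few seconds keeps the quantization error well below one percent.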

D) Questions

  1. Are there any options I forgot to configure in HalCoGen in order to get the maximum performance out of the TMS570LC43x?
  2. What is the maximum DMIPS I can expect when running the code from flash? Is it possible to get 498 DMIPS from flash?
  3. Why is the Semihosted printf / scanf so slow (it takes up to 5s to print a single character)?
  4. Before you blame the IAR compiler, can you run the Dhrystone v2.1 benchmark with Code Composer and publish your results?

E) Attachments

I have attached the following files:

  • HalCoGen Project (Dhrystone.hcg)
  • Complete IAR Workspace and projects including generated HAL files

 5618.TMS570LC43x_Dhrystone.zip

  • Hello,

    It certainly looks like there is some issue with the porting.

    I will spend some time on your codebase and let you know if I can figure out what is going wrong.

    Some quick pointers from our experts until I look at your code base:

    a. Use the HalCoGen project without FreeRTOS, as we do not need the RTI in this case.
    b. The FPU needs to be enabled, and the runtime library with FPU support should be used.
    c. Proper MPU settings are needed for enabling the cache, etc.
  • Something must be wrong in your MCU or compiler settings, for example a disabled cache.
    I made the same measurement on the TMS570LC4357 and the result is 496 DMIPS @ 300MHz (1.654 DMIPS/MHz).

    The compiler is ARM GCC 5 2015q3 (patched for big endian); the board is the TI TMS570LC4357 HDK.

    I am not able to attach the complete sources because they need a non-free RTOS to run. The test part of the sources is the same as yours.

  • Hi Jiri,

    Wow I'm impressed because you're getting something better than 1.6DMIPs/MHz!

    I got 330DMIPs/MHz testing with the TI compiler and checked the memory settings to make sure they were correct.
    I'm finding that the tall pole is in the string copy routines.

    So maybe the runtime lib of GCC has something optimized compared to the TI or the IAR libraries.
    Will be interesting to look into.. Thanks for pointing this out.
  • Frankly speaking, the measured value was 439 DMIPS, but I had to compensate for another RTOS task with higher priority (100us period, 13% CPU load). The test ran 111000 repetitions and took 143.8ms; after compensating for the load of the higher-priority task, the result is 496 DMIPS.
    Anthony: what do you mean by 330DMIPs/MHz? That is far too large for DMIPS per MHz (the datasheet value is "1.66 DMIPS/MHz With 8-Stage Pipeline").
  • Jiri,
    Sorry - I got 330DMIPs at 300MHz.
  • Jiri,

    One and a half years ago I did some benchmarking with the device. I mainly concentrated on CoreMark, but I also ran Dhrystone.

    Here are my results on Dhrystone:

    Dhrystone Benchmark, Version 2.1 (Language: C)
    Compiler          Options     Inst. Set   DMIPS     DMIPS/MHz
    TI ARM v5.2.5     -o4 -mf5    Thumb2      543.78    1.81
    TI ARM v5.1.12    -o4 -mf5    Thumb2      540.33    1.80
    TI ARM v5.2.5     -o4 -mf5    ARM         499.99    1.67
    TI ARM v5.1.12    -o4 -mf5    ARM         490.65    1.64

    Best Regards,
    Christian
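
For the load compensation mentioned earlier in the thread (439 DMIPS measured while a higher-priority task used about 13% of the CPU), one simple model is to divide the measured score by the CPU share left for the benchmark. This is only a sketch under that assumption. The exact method used in the thread is not stated, and the name `compensate_for_load` is hypothetical.

```c
#include <assert.h>

/* Estimate the score the benchmark would reach with the whole CPU,
 * given the score measured while other tasks consumed `load_fraction`
 * (0.0 up to, but not including, 1.0) of the CPU time.
 * Assumes the benchmark simply loses that share of the CPU; cache and
 * context-switch effects of the preempting task are ignored. */
double compensate_for_load(double measured_dmips, double load_fraction)
{
    assert(load_fraction >= 0.0 && load_fraction < 1.0);
    return measured_dmips / (1.0 - load_fraction);
}
```

With the rounded figures quoted in the thread, `compensate_for_load(439.0, 0.13)` gives roughly 505 DMIPS, slightly above the 496 DMIPS reported, so the reported value was presumably based on a more precise load figure than the rounded 13%.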