Compiler/CC1312R: C++ compilation optimization

Part Number: CC1312R

Tool/software: TI C/C++ Compiler

Hi,

I am investigating the difference in the generated assembly code for a simple dispatch function written in two versions:

  • switch/case
  • if/else

I use the ARM GNU Linaro 9.2.1 compiler and target an Arm Cortex-M4F processor (CC1312).
For comparison, I also checked the results with the TI v20.2.1.LTS compiler.
The optimization levels are -O3 and -O4, respectively.

Version 1 - switch/case

    uint32_t read1(const uint32_t index) const {
        switch (index) {
        case 0:
            return field1.read1();
        case 1:
            return field2.read1();
        case 2:
            return field3.read1();
        ...
        ...
        }
        return 0;
    }

When the dispatch function is written with a switch/case statement, the generated code is optimal: it compiles to a jump table (TBB or TBH) when the number of branches is high enough, or to a series of compare (CMP) instructions when the number of branches is low.

000013f8:   F2008146            bhi.w      unknown
000013fc:   E8DFF013            tbh        [pc, r3, lsl #1]
00001400:   0142                lsls       r2, r0, #5
00001402:   0140                lsls       r0, r0, #5
00001404:   013E                lsls       r6, r7, #4
00001406:   013C                lsls       r4, r7, #4
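
For readers less familiar with the instruction: `tbh` uses the value in `r3` (already range-checked by the preceding `bhi.w`) to index a table of halfword offsets placed directly after it, then branches to the selected target. The halfwords listed above as `lsls` are those table entries, which the disassembler has decoded as instructions. The dispatch cost therefore does not grow with the number of cases. A rough C++ analogue is sketched below; it uses a table of member pointers rather than branch offsets, the names `Field`, `Device` and `kTable` are hypothetical, and it is not the compiler's actual output.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the field types used in the original class.
struct Field {
    std::uint32_t value;
    std::uint32_t read1() const { return value; }
};

// Hypothetical owner class with three fields, mirroring field1..field3.
struct Device {
    Field field1{10};
    Field field2{20};
    Field field3{30};

    std::uint32_t read1(std::uint32_t index) const {
        // Table of pointers to data members, indexed like the TBH table.
        static constexpr Field Device::* kTable[] = {
            &Device::field1, &Device::field2, &Device::field3};
        constexpr std::size_t kCount = sizeof(kTable) / sizeof(kTable[0]);

        if (index >= kCount) {  // plays the role of the range check (bhi)
            return 0;
        }
        return (this->*kTable[index]).read1();  // one indexed lookup
    }
};
```

The point of the sketch is only that selecting a case is a bounds check plus one indexed load, independent of how many cases exist.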


Version 2 - if/else

    uint32_t read1(const uint32_t index) const {
        if (index == 0) {
            return field1.read1();
        }
        else if (index == 1) {
            return field2.read1();
        }
        else if (index == 2) {
            return field3.read1();
        }
        ...
        ...
        return 0;
    }

Unfortunately, when the same code is rewritten as an if/else chain, it compiles to a sequence of compare (CMP) instructions regardless of the number of branches. This is inefficient: with 20 branches it takes roughly 1.5 times as many cycles as the jump-table (switch/case) version (973,134 vs. 660,117 in the results below).

289               else if (index == 1) {
0000142c:   2B01                cmp        r3, #1
0000142e:   D06F                beq        unknown
292               else if (index == 2) {
00001430:   2B02                cmp        r3, #2
00001432:   F00080DD            beq.w      unknown
295               else if (index == 3) {
00001436:   2B03                cmp        r3, #3
00001438:   F00080E1            beq.w      unknown
298               else if (index == 4) {
0000143c:   2B04                cmp        r3, #4
0000143e:   F00080EE            beq.w      unknown


The index sequence has no gaps (0, 1, 2, 3, 4, 5, ...), so in theory a jump table is the optimal solution for almost any number of branches.

The results are as follows:

/**
 * Comparison over 10,000 invocations:
 *
 * Number of cycles (size) - M4F (TI CC1312):
 *             -------------------------------------
 *             |     SWITCH      |       IF        |
 *             -------------------------------------
 *             |       GNU Linaro 9.2.1 - -O3      |
 * -------------------------------------------------
 * 2 elements  |    280,008      |    280,008      |
 *             |    (27873)      |    (27873)      |
 * -------------------------------------------------
 * 20 elements |    660,117      |    973,134      |
 *             |    (28241)      |    (28273)      |
 * -------------------------------------------------
 */


Am I missing something?
I would expect the generated code to be similar in both versions.

Why is this so?
Is there any compiler flag to deal with such situations?

What can I do to force the compiler to generate an optimal result for version 2?

The if/else version is crucial for me, because it lets me generalize the code further using C++ template metaprogramming.

How can I keep the if/else version and still get code based on a jump/branch table?
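
One possible direction, sketched here under assumptions rather than taken from the attached project: if the if/else chain is only needed so that templates can generate the dispatch, the table itself can be generated at compile time instead. The sketch below builds an array of function pointers from a `std::index_sequence` (C++14). The names `Field`, `Dispatcher`, `read_at` and `dispatch` are hypothetical, and at least one field is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the field types in the original project.
struct Field {
    std::uint32_t value;
    std::uint32_t read1() const { return value; }
};

// Hypothetical generic dispatcher over any number of fields (at least one).
template <typename... Fields>
class Dispatcher {
public:
    explicit Dispatcher(Fields... fields) : fields_(fields...) {}

    std::uint32_t read1(std::uint32_t index) const {
        return dispatch(index, std::index_sequence_for<Fields...>{});
    }

private:
    using Handler = std::uint32_t (*)(const Dispatcher&);

    // One small function per tuple element; its address goes into the table.
    template <std::size_t I>
    static std::uint32_t read_at(const Dispatcher& self) {
        return std::get<I>(self.fields_).read1();
    }

    template <std::size_t... Is>
    std::uint32_t dispatch(std::uint32_t index,
                           std::index_sequence<Is...>) const {
        // Table expanded at compile time from the index pack.
        static constexpr Handler table[] = {&read_at<Is>...};
        if (index >= sizeof...(Is)) {
            return 0;  // same fallback as the original "return 0;"
        }
        return table[index](*this);
    }

    std::tuple<Fields...> fields_;
};
```

This trades the inlined branch targets of a real switch for one indirect call per dispatch, so whether it actually beats the compare chain on the CC1312 would have to be measured.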

I have attached a minimal reproducible example below.

7043.main.cpp
Any help is appreciated.

/Adam

  • The conditional part of an if statement can be much more complex than a comparison for equality with an integer constant.  For that reason, I am not aware of any compiler which, for a series of such if statements, generates a branch table.

    Thanks and regards,

    -George
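
To illustrate the point about arbitrary conditions, here is a small sketch (all names hypothetical) of an if/else chain with no direct switch equivalent. Its conditions mix a range test, a comparison against a run-time value, and a call with a side effect that must run only when the earlier tests have failed.

```cpp
#include <cstdint>

// Hypothetical counter, used only to make the side effect observable.
static std::uint32_t g_polls = 0;

// Hypothetical helper with a side effect: every call is counted.
inline bool poll_ready(std::uint32_t index) {
    ++g_polls;
    return (index % 2u) == 0u;
}

std::uint32_t classify(std::uint32_t index, std::uint32_t limit) {
    if (index < 4) {                 // range test, not an equality test
        return 1;
    } else if (index == limit) {     // compared against a run-time value
        return 2;
    } else if (poll_ready(index)) {  // runs only if the tests above failed
        return 3;
    }
    return 0;
}
```

When every condition is an equality test of the same variable against distinct constants, as in Version 2, a table-based rewrite is possible in principle; whether a given compiler performs it is an implementation choice.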