Compiler/CC1312R: __delay_cycles() waits too many cycles

Part Number: CC1312R

Tool/software: TI C/C++ Compiler

Hi,

When I call this:

__delay_cycles(1000000ul);

The compiler generates this:

F2416014            movw       r0, #0x1614
F2C00005            movt       r0, #5
$1_$43:
1E40                subs       r0, r0, #1
D1FD                bne        $1_$43
BF00                nop        
BF00                nop        

Why does the compiler think 333332 loop iterations plus 2 NOPs would take 1000000 cycles? It actually takes 333332 × 4 + 2 = 1333330 cycles! I tried other values and the delay is always one third too long. O0 or O3 makes no difference.

Please help!

I'm using CCS 10.0, TI compiler v20.2.0.LTS.

  • The TI ARM compiler manual documentation on __delay_cycles states the following ...

    Cycle timing is based on 0 wait states. Results vary with additional wait states.

    Do memory wait states, or other memory effects outside the CPU, account for the extra cycles?

    Thanks and regards,

    -George

  • Hi,

    How can I check this? I'm using a CC1312 on custom hardware. No other task is running; it's all "bare metal" (no TI-RTOS).

    I am, however, running it through an XDS110 debugger. Will that affect things?

  • Hi again,

    I investigated some more and there is something fishy going on:

    If I single-step this code with R0 preset to 10, it takes exactly 20 clock cycles:

    1E40                subs       r0, r0, #1
    F47FAFFD            bne.w      #-6

    However, if I run it instead of stepping through it, it takes 38!? That is according to the hardware clock cycle counter.

    R0 = 20 gives 78
    R0 = 40 gives 158
    R0 = 100 gives 398

    So it's 4n-2 if I run it, 2n if I step. How can it run faster if I single-step it? Should it do that?

    (Also note that in my original question it did BNE (D1xx) instead of BNE.W (F47FAFxx), which probably explains the factor difference. I didn't try single-stepping 1000000 steps but it (the original code) did the same +33% thing for all values.)

    So it looks like the clock cycle counter is somehow broken (?), but I'm now entering voodoo territory.

  • Tomas Gradin said:
    how can I check this?

    Normally I would expect the device documentation to mention the number of wait states for the flash memory. However, after searching the CC1312R datasheet and the CC13x2, CC26x2 SimpleLink™ Wireless MCU Technical Reference Manual, I can't find the number of wait states specified.

    For other Cortex-M4F devices I created a program to measure the number of wait states; see "MSP432P401R: Performance with SRAM_CODE at 48 MHz". I don't have any CC1312R devices at the moment to run the measurement on.