Inline ARM Coprocessor Instruction

Hi,

The compiler I'm using is a 5.10 alpha. I will try a release version if needed, but I suspect this isn't something new.

I think my problem is close to the issue in this thread: http://e2e.ti.com/support/development_tools/compiler/f/343/t/213287.aspx but I'm after a different result, and maybe a more difficult one.


For benchmarking, we have a built-in cycle counter in the Cortex's PMU, and an assembly function to access it is:

_pmuGetCycleCount_:
    MRC             P15, #0, R0, C9, C13, #0
    BX              R14

Now the function call really isn't necessary, since the function itself is only one line of code. I need to put two of these function calls around a small body of code (about 5 instructions) that accesses flash memory, and I don't want the error / overhead of the function call itself (the branches) in the result.

I couldn't find any sort of directive in assembly to tell the compiler that an assembly function may be inlined.

So, I moved the function to 'C' like this:

unsigned int _pmuGetCycleCount_(void) {
    asm("     mrc   p15, #0, r0, c9, c13, #0");
}

and use it like this:

    cyclecount[0] = _pmuGetCycleCount_();
    FunctionToBenchmark();
    cyclecount[1] = _pmuGetCycleCount_();


Now, when the above function is called as is, the compiled code is fine: the value returned in R0 is saved in a temporary register and then finally written to memory correctly.

After adding the keyword 'inline', the mrc instruction is inlined and there isn't any function call.

However, the compiler then writes over the result stored in R0 and doesn't save the result to memory anymore.

The last observation seems like a compiler issue: why would the compiler know to save the result of the function to cyclecount[0] and cyclecount[1] when not inlined, but then throw away the result when the function is inlined?

-Anthony

PS: I think it would be a lot simpler if there were generic intrinsics for the mcr and mrc instructions, not just the specific intrinsics hardcoded to interrupt enable/disable.

There are plenty of these coprocessor registers, and for example this thread:

 http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/t/103040.aspx

could have benefited too if the intrinsic allowed the complicated mrc ...

  • The ARM compiler manual is clear on this point ...

    ... asm statements that attempt to interface with the C/C++ environment or access C/C++ variables can have unexpected results.

    So, I'm not surprised that inlining _pmuGetCycleCount_ fails.  I am surprised that calling it normally works.  I can see why it works.  But it is really a series of fortunate accidents more than anything else.

    Thanks and regards,

    -George
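
For comparison, here is a minimal sketch of how the same read might look on a toolchain that accepts GCC-style extended asm. That is an assumption: the 5.10 compiler discussed in this thread may not accept this syntax. The point is the output operand, which tells the compiler where the result lives so it is not lost after inlining. The names pmu_get_cycle_count, elapsed_cycles, and fake_cycles are hypothetical, and the non-ARM branch is only a stand-in so the sketch compiles on a host machine.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__arm__)
/* Assumes GCC/Clang extended asm and a privileged mode (or user-mode
   PMU access enabled); neither is confirmed for the TI compiler here. */
static inline uint32_t pmu_get_cycle_count(void)
{
    uint32_t cycles;
    /* "=r" declares an output register, so the compiler knows the asm
       produces a value and keeps it even when the call is inlined. */
    __asm__ volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r" (cycles));
    return cycles;
}
#else
/* Hypothetical host stand-in: a software counter, not a real PMU. */
static uint32_t fake_cycles;
static inline uint32_t pmu_get_cycle_count(void)
{
    fake_cycles += 10u;
    return fake_cycles;
}
#endif

/* Unsigned subtraction stays correct across one 32-bit counter wrap. */
static inline uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

It would be used the same way as in the original post: read the counter before and after FunctionToBenchmark(), then pass the two values to elapsed_cycles().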