I have a customer who has some questions about how much control the programmer has over the compiler when it comes to routing different instruction types to different processing units in the ARM subsystem. They are working with the AM3503 and are concerned about floating point performance. Originally they were going to use the VFP, but because that unit is not pipelined they cannot get the performance they are looking for. They need both single and double precision float support. Here are their questions:
Here's a possible option, please advise.
Can we compile all of our code with Neon switched on for all floating point ops? Here are the assumptions we would be making; please let us know whether they are reasonable and whether this approach would help.
- Any single-precision float operation would be trapped and executed by the Neon
- Any double-precision float operation would be executed by the VFP
- Everything else would be executed by the MPU A8 core.
In other words, is the compiler smart enough to match the operation with the most appropriate resource? Is any of this feasible? Does it buy us anything, or do we lose too many cycles forcing some sort of context switching between the A8, VFP and Neon cores? Is there a particular compiler that does a better job with code optimization?
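To make the question concrete, here is a minimal sketch of the kind of code involved. The function names are hypothetical and are not taken from the customer's source, and the GCC command line in the comment is only an assumption about how such a build might look (the toolchain prefix in particular is a placeholder). The first loop is the single-precision case they hope would run on Neon; the second is the double-precision case they assume would stay on the VFP.

```c
/*
 * Hypothetical illustration only; names are not from the customer's code.
 *
 * Assumed build line for a GCC cross toolchain (prefix is a placeholder):
 *   arm-none-linux-gnueabi-gcc -O3 -mcpu=cortex-a8 -mfpu=neon \
 *       -mfloat-abi=softfp -ftree-vectorize -ffast-math -c dot.c
 *
 * Whether the compiler actually emits Neon instructions for the float
 * loop depends on the compiler version and options, so the generated
 * assembly would need to be inspected to confirm it.
 */
#include <stddef.h>

/* Single-precision dot product: the case hoped to be handled by Neon. */
float dot_f32(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}

/* Double-precision dot product: the case assumed to remain on the VFP. */
double dot_f64(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}
```

The question is essentially whether a single set of compiler options can send the first function to Neon and the second to the VFP without any changes to the source.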
I'm not familiar enough with the compiler to know if this level of control is possible.
Thanks.