Part Number: AMIC110
Hi,
I’ve received a question about cache behavior on the AMIC110. Could you help answer the questions below?
< Questions >
Is it possible for the execution time to vary depending on the placement address of .text when caches are disabled?
If so, could you explain the cause?
< Background >
On the AMIC110 (Cortex-A8), we implemented a simple “for” loop that iterates 33,554,432 (2^25) times, roughly 33.5 million, as follows:
for (int i = 0; i < 33554432; i++) {
    // simple operation
}
Even though the “for” loop itself has not been modified, changing unrelated parts of the code changes the execution time of the loop.
When the function containing the loop is moved out of .text into a custom section placed at a fixed address, the execution time no longer varies even if unrelated code is modified.
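To illustrate the kind of placement described above, here is a minimal sketch. It is not the actual project code. It assumes a GCC-compatible toolchain; the section name `.fixed_loop` and the function name `timed_loop` are hypothetical, and the fixed address itself would have to be assigned to that section in the linker script, which is not shown. The `aligned(32)` attribute is an additional assumption that requests a 32-byte-aligned function entry.

```c
#include <stdint.h>

/* Hypothetical section name. The fixed address is NOT set here; it would
 * come from the linker script that places this section. */
#if defined(__ELF__)
#define LOOP_SECTION __attribute__((section(".fixed_loop")))
#else
#define LOOP_SECTION
#endif

/* noinline keeps the loop in its own function so its address stays stable;
 * aligned(32) asks the compiler to start the function on a 32-byte boundary. */
LOOP_SECTION __attribute__((noinline, aligned(32)))
uint32_t timed_loop(uint32_t iterations)
{
    /* volatile prevents the compiler from removing the loop body. */
    volatile uint32_t counter = 0;
    for (uint32_t i = 0; i < iterations; i++) {
        counter++;
    }
    return counter;
}
```

Note that aligning the function entry does not by itself guarantee that the loop body stays inside one 32-byte block; that depends on the generated code, so the map file or disassembly still needs to be checked.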
Upon investigation, we found that if the “for” loop instructions fit entirely within a single 32-byte-aligned block, the loop completes faster; if they cross a 32-byte boundary, it becomes slower.
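The condition above can be written as a small check, shown here as a sketch. The helper name `crosses_fetch_block` and the constant `FETCH_BLOCK_BYTES` are hypothetical; the start address and size of the loop would be taken from the disassembly or map file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 32 bytes is the boundary size observed in the measurements, used here
 * as an assumption, not as a documented hardware parameter. */
#define FETCH_BLOCK_BYTES 32u

/* Returns true if the byte range [start, start + size) spans more than
 * one 32-byte-aligned block. */
bool crosses_fetch_block(uintptr_t start, size_t size)
{
    if (size == 0) {
        return false;
    }
    uintptr_t first_block = start / FETCH_BLOCK_BYTES;
    uintptr_t last_block = (start + size - 1u) / FETCH_BLOCK_BYTES;
    return first_block != last_block;
}
```

For example, with illustrative addresses, a 16-byte loop starting at offset 0x10 within a block stays inside it, while the same loop starting at offset 0x18 crosses into the next block.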
Enabling the I-cache eliminates this timing difference, so we believe it is related to some hardware-specific behavior when fetching instructions from internal SRAM.
< Conditions >
- No operating system is used.
- Interrupts are disabled.
- I-cache and D-cache are disabled.
- The function’s .text section is placed in internal SRAM.
Thanks and regards,
Hideaki