Hi,
I am trying to verify that the bus clock of an OMAP3530 is set correctly, using simple assembler functions with a known cycle count.
The idea is to turn off the D-cache and run bus and core at the same clock speed.
In that case a 'load multiple' from internal memory is expected to have a cycle count very close to that of the D-cache-on case
(for 'load multiple', every bus clock cycle should deliver one data fetch).
In addition, chapter 11.3.3.2 of the OMAP3530 technical reference manual (spruf98b) states: "The device-embedded RAM [...] Operates at full L3 interconnect clock frequency"
See the measurements below.
p1 looks fine in all cases.
With the data cache turned off, the profiling numbers exceed the expected values by far!
In p2, the cycle count for internal memory increases by a factor of 52(!) compared to p1.
In p3, the cycle counts increase further even though only the processor clock is raised.
How can the effects seen on the hardware be explained? What can drive the required core/bus cycles up so high?

p1: D-Cache ON (CPU=250MHz, DPLL3=250MHz)
Function                       cycles  cycles/10^6
testLoop                      2000471         2.00
testMem_SRAM (internal)      10000649        10.00
testMem2_SRAM (internal)      7000484         7.00
testMem_SDRAM (external)     10000471        10.00
testMem2_SDRAM (external)     7000306         7.00

p2: D-Cache OFF (CPU=250MHz, DPLL3=250MHz)
Function                       cycles  cycles/10^6
testLoop                      2001381         2.00
testMem_SRAM (internal)     520204489       520.20
testMem2_SRAM (internal)    272099033       272.10
testMem_SDRAM (external)    685616405       685.62
testMem2_SDRAM (external)   347525487       347.53

p3: D-Cache OFF (CPU=500MHz, DPLL3=250MHz)
Function                       cycles  cycles/10^6
testLoop                      2001249         2.00
testMem_SRAM (internal)     760001521       760.00
testMem2_SRAM (internal)    402406243       402.41
testMem_SDRAM (external)   1087827357      1087.83
testMem2_SDRAM (external)   544892391       544.89

p4: D-Cache OFF (CPU=500MHz, DPLL3=332MHz)
Function                       cycles  cycles/10^6
testLoop                      2001727         2.00
testMem_SRAM (internal)     654763809       654.76
testMem2_SRAM (internal)    341618941       341.62
testMem_SDRAM (external)    897161839       897.16
testMem2_SDRAM (external)   448581711       448.58
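
The cycle counts above can also be read as wall-clock time, which makes it easier to compare profiles that use different CPU clocks. The following is only a rough sketch: it assumes the profiling counter ticks at the CPU clock in every profile (testLoop staying at 2.00 suggests this, but it is not stated anywhere), and the names `runs` and `seconds` are made up for the illustration.

```python
# testMem_SRAM cycle counts copied from the tables above.
# Assumption: the counter runs at the CPU clock given for each profile.
runs = {
    "p1": (250e6, 10000649),
    "p2": (250e6, 520204489),
    "p3": (500e6, 760001521),
    "p4": (500e6, 654763809),
}


def seconds(cpu_hz, cycles):
    """Convert a CPU cycle count into elapsed seconds."""
    return cycles / cpu_hz


for name, (cpu_hz, cycles) in runs.items():
    print(f"{name}: {seconds(cpu_hz, cycles):.3f} s")
```

Under that assumption p2 comes out at about 2.08 s and p3 at about 1.52 s, so the p3 run is shorter in real time even though it consumes more CPU cycles.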
Code:
--------------->
testLoop
        ;; r0 is loop-counter
        subs    r0, r0, #1          ; decrement loop counter
        bne     testLoop
        [...]
testMem
        ;; r0 is loop-counter
        ;; r1 is address to read from
testMemLoop
        ldr     r2, [r1]            ; execute 10 single loads
        ldr     r2, [r1, #4]
        ldr     r2, [r1, #8]
        ldr     r2, [r1, #12]
        ldr     r2, [r1, #16]
        ldr     r2, [r1, #20]
        ldr     r2, [r1, #24]
        ldr     r2, [r1, #28]
        ldr     r2, [r1, #32]
        ldr     r2, [r1, #36]
        subs    r0, r0, #1          ; decrement loop counter
        bne     testMemLoop
        [...]
testMem2
        ;; r0 is loop-counter
        ;; r1 is address to read from
        [...]
testMem2loop
        ldmia   r1, {r2-r11}        ; load multiple (10 registers)
        subs    r0, r0, #1          ; decrement loop counter
        bne     testMem2loop
        [...]
<-----------------------------
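
Since both memory tests load 10 words per iteration, the totals can be broken down into a rough cost per loaded word. This is a minimal sketch, assuming each function ran 10^6 iterations (inferred from the cycles/10^6 column and the two-cycle testLoop, not stated explicitly); `cycles_per_word` is a hypothetical helper and the result still includes the loop overhead.

```python
ITERATIONS = 1_000_000     # assumption: inferred from the cycles/10^6 column
WORDS_PER_ITERATION = 10   # 10 x ldr in testMem, ldmia {r2-r11} in testMem2


def cycles_per_word(total_cycles):
    """Average CPU cycles per loaded word, loop overhead included."""
    return total_cycles / (ITERATIONS * WORDS_PER_ITERATION)


# p2 (D-Cache OFF, CPU=250MHz, DPLL3=250MHz), internal SRAM
p2_sram = {"testMem (ldr)": 520204489, "testMem2 (ldmia)": 272099033}

for name, cycles in p2_sram.items():
    print(f"{name}: {cycles_per_word(cycles):.1f} cycles per word")
```

For p2 this gives roughly 52 cycles per word for the single loads and roughly 27 cycles per word for the load multiple, far from the one word per bus cycle that was expected.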
thanks!
jhoff