Hi,
I work on the DevKit8000 development board under U-Boot (without Linux). ARM_FCLK is set to 500 MHz, L3_CLK = 133 MHz and L4_CLK = 66 MHz. I have a problem with program execution time: execution is much too slow when the program reads data from RAM (internal or external).
This code executes fast enough:
0x81000060: E2899001 ADD R9, R9, #1
0x81000064: E1540009 CMP R4, R9
0x81000068: CAFFFFFC BGT 0x81000060
One iteration of this loop takes 6 ns at ARM_FCLK = 500 MHz, so everything is fine here.
But this code:
0x81000070: E593E000 LDR R14, [R3]
0x81000074: E28EC001 ADD R12, R14, #1
0x81000078: E583C000 STR R12, [R3]
0x8100007C: E5931000 LDR R1, [R3]
0x81000080: E1540001 CMP R4, R1
0x81000084: CAFFFFF9 BGT 0x81000070
This loop executes in 650 ns. If we assume that the ADD, CMP and BGT instructions each execute in 1 ARM_FCLK cycle (2 ns), then each read from or write to RAM takes about 215 ns! Why does it take so long when L3_CLK is 133 MHz? I think I turned off the ICLK of all peripherals (except UART3), and IVA2.2 is off. The execution time doesn't depend on whether the program runs from internal or external RAM (the data is placed in the same RAM as the program).
I turned off all interrupts. The L1 and L2 caches are turned on (I think, according to PM_PWSTCTRL_MPU and PM_PWSTST_MPU). I load the program via RS232 under U-Boot and debug it via an XDS100v2 (the disassembly listings are copied from CCS during debugging).
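The ~215 ns estimate above can be reproduced with a short sketch. It only restates the arithmetic from the measurement; the one-cycle cost for ADD, CMP and BGT is an assumption, not something that was measured, and the variable names are illustrative.

```python
# Rough per-access estimate for the RAM loop (LDR, ADD, STR, LDR, CMP, BGT).
# Assumption: ADD, CMP and BGT each take exactly one ARM_FCLK cycle.
ARM_FCLK_HZ = 500e6
L3_CLK_HZ = 133e6

cycle_ns = 1e9 / ARM_FCLK_HZ       # 2 ns per core cycle
l3_cycle_ns = 1e9 / L3_CLK_HZ      # about 7.5 ns per L3 cycle

loop_ns = 650.0                    # measured time of one loop iteration
alu_instructions = 3               # ADD, CMP, BGT
memory_accesses = 3                # LDR, STR, LDR

# Whatever is not spent in the ALU instructions is attributed to memory.
memory_ns = loop_ns - alu_instructions * cycle_ns
per_access_ns = memory_ns / memory_accesses
l3_cycles_per_access = per_access_ns / l3_cycle_ns

print(f"time per RAM access: {per_access_ns:.1f} ns")
print(f"L3 clock cycles per RAM access: {l3_cycles_per_access:.1f}")
```

Under these assumptions each RAM access costs nearly 29 L3 clock periods, which is what makes the result look suspicious.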
Reading from peripheral registers is more or less ok.
Test code:
0x810000B0: E592148C LDR R1, [R2, #1164]
0x810000B4: E2833002 ADD R3, R3, #2
0x810000B8: E1540003 CMP R4, R3
0x810000BC: E58D1004 STR R1, [R13, #4]
0x810000C0: E592C48C LDR R12, [R2, #1164]
0x810000C4: E58DC004 STR R12, [R13, #4]
0x810000C8: 1AFFFFF8 BNE 0x810000B0
This loop executes in 480 ns. If we assume that ADD, CMP and BNE each execute in 1 ARM_FCLK cycle (2 ns) and that each of the two STR to [R13, #4] (writes to RAM via the L3 bus) takes 215 ns, then each access via L4 takes ~24 ns (~2 cycles of the L4 bus clock). Another test with only L4 accesses showed a 149 ns read time from a peripheral register, which is not so good.
What can cause such slow operations on the L3 bus?
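The same kind of back-of-the-envelope calculation can be written out for the peripheral test loop. This is only a sketch: it reuses the 215 ns RAM access estimate and the one-cycle assumption for ADD, CMP and BNE, and the variable names are illustrative.

```python
# Rough split of the 480 ns peripheral test loop.
# Assumptions: ADD, CMP and BNE take one ARM_FCLK cycle each, and each
# STR to the stack costs the 215 ns estimated from the RAM loop.
ARM_FCLK_HZ = 500e6
L4_CLK_HZ = 66e6

cycle_ns = 1e9 / ARM_FCLK_HZ       # 2 ns per core cycle
l4_cycle_ns = 1e9 / L4_CLK_HZ      # about 15 ns per L4 cycle

loop_ns = 480.0                    # measured time of one loop iteration
alu_instructions = 3               # ADD, CMP, BNE
ram_stores = 2                     # STR R1/R12, [R13, #4]
ram_access_ns = 215.0              # estimate taken from the RAM loop
l4_reads = 2                       # LDR R1/R12, [R2, #1164]

# Time left over once ALU instructions and RAM stores are accounted for.
remaining_ns = loop_ns - alu_instructions * cycle_ns - ram_stores * ram_access_ns
per_l4_read_ns = remaining_ns / l4_reads
l4_cycles_per_read = per_l4_read_ns / l4_cycle_ns

print(f"time per L4 read: {per_l4_read_ns:.1f} ns")
print(f"L4 clock cycles per read: {l4_cycles_per_read:.2f}")
```

With these inputs the sketch gives about 22 ns per L4 read, in the same range as the ~24 ns figure quoted above; the exact value depends on how the 215 ns RAM estimate is rounded.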
Best regards