Are so many delta cycles believable?

Other Parts Discussed in Thread: AM3352

Hi,

I wrote some assembly code for the SKAM3358 Starter Kit (Cortex-A8 CPU) under CCS 6.1 (Windows 7, 64-bit OS). When I use an XDS100 to capture trace data, I see surprisingly high delta cycle counts, even for very simple assembly code. For example, the trace reports 79 delta cycles for a single 'ADD r0, r0, r2' instruction:

C:\Users\CCS6_1xxx\TI_workspace_StarterWare_A8a\A8_FromStarterWare_c_call_asm0\Debug\..\asm_func.s,asmfunc(),0x403034BC,"ADD R0, R2"," ADD r0, r0, r2",402,79,,

I don't understand why the delta cycle count is so high.

BTW, the project is built with the 'Debug' configuration (not Release). Could you confirm whether these delta cycles are correct?

Here is the source code:

		.global asmfunc
		.global gvar
asmfunc:
		LDR r1, gvar_a
		LDR r2, [r1, #0]
		ADD r0, r0, r2
		STR r0, [r1, #0]
;		LDM r0, {r0,r1}
;		LDM r0!, {r1,r2}
		MOV pc, lr
gvar_a .field gvar, 32



#include <stdio.h>

/*
 * hello.c
 */
extern int asmfunc(int a); /* declare external asm function */
int gvar = 0; /* define global variable */


int main(void) {
    int I = 5;
    I = asmfunc(I); /* call function normally */

    if (I < 5)
        printf("Hello World 1!\n");
    else
        printf("Hello World 2!\n");
    return 0;
}

Thanks

  • Jeff Wong1 said:
    Could you confirm whether these delta cycles are correct?

    I used the hardware trace analyzer to measure the following code on an AM3352 Cortex-A8, using the ETB transport type:

    static void busy_delay (const uint32_t delay)
    {
    	const uint32_t start_time = pmu_get_cycle_count ();
    
    	while ((pmu_get_cycle_count () - start_time) < delay)
    	{
    	}
    }
    

    This loop delays for a given number of CPU cycles by polling the cycle counter in the Cortex-A8, so it can be made to delay for a known number of cycles. When called with busy_delay (10000), the Cortex-A8 trace viewer reported a representative total number of cycles: a few hundred more than the requested delay, which is expected because this simple delay loop doesn't attempt to account for the overhead of calling the function. Therefore, I trust that the Cortex-A8 trace viewer is reporting a valid number of cycles.
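As a rough host-side illustration of why this loop is a usable reference, the sketch below runs the same comparison against a simulated counter. The counter, its fixed cost of 7 cycles per read, and the return value from busy_delay are assumptions made for illustration only; the real pmu_get_cycle_count reads the Cortex-A8 cycle counter and its implementation is not shown in this thread. The point is that the unsigned subtraction keeps working when the 32-bit counter wraps, and that the loop always overshoots the request by a small amount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware cycle counter: each read advances a
 * simulated 32-bit counter by a fixed cost so the sketch can run on a PC. */
static uint32_t sim_counter;
static const uint32_t SIM_CYCLES_PER_READ = 7u; /* assumed value */

static uint32_t pmu_get_cycle_count(void)
{
    sim_counter += SIM_CYCLES_PER_READ;
    return sim_counter;
}

/* Same comparison as the loop above, but returns the cycles that actually
 * elapsed so the overshoot can be inspected. */
static uint32_t busy_delay(const uint32_t delay)
{
    const uint32_t start_time = pmu_get_cycle_count();
    uint32_t elapsed;

    do {
        /* Unsigned subtraction is modulo 2^32, so the result stays correct
         * even if the counter wraps between the two reads. */
        elapsed = pmu_get_cycle_count() - start_time;
    } while (elapsed < delay);

    assert(elapsed >= delay); /* the loop never returns early */
    return elapsed;
}
```

Setting sim_counter to a value just below the wrap point (for example 0xFFFFFF00) before calling busy_delay(10000u) exercises the wraparound case; the returned value is the requested delay plus whatever the last poll added.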

    Jeff Wong1 said:
    I don't understand why the delta cycle count is so high.

    Are the Cortex-A8 MMU and caches enabled?

    For example, with the Cortex-A8 CPU frequency set to 800 MHz, I ran the same code with and without the MMU and caches enabled.

    With the MMU and cache off (the default after a system reset) each loop iteration was reporting a high number of delta cycles:

    [Note that while the above screen shot reports "MMU On" the MMU was actually off when trace was captured]

    When the MMU and cache were enabled, the maximum delta cycles in the first loop iteration was 14 (as opposed to the maximum of 198 seen with the MMU and cache off):

    With the MMU and cache enabled, after a few loop iterations the maximum delta cycles had dropped to 2:

    There are StarterWare functions to enable the MMU and cache.
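To put the delta-cycle figures above into time units, the following small sketch converts a cycle count to nanoseconds for a given CPU clock. The helper name cycles_to_ns is hypothetical, and the 800 MHz value is the frequency used for the measurements in this reply; it may not match the clock on the original poster's board.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a trace delta-cycle count to nanoseconds.
 * cpu_hz is the CPU clock in Hz (800e6 for the measurements above). */
static double cycles_to_ns(uint32_t cycles, double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return (double)cycles * 1e9 / cpu_hz;
}
```

At 800 MHz one cycle is 1.25 ns, so cycles_to_ns(198u, 800e6) gives 247.5 ns for the worst case with the MMU and cache off, against 17.5 ns for 14 cycles and 2.5 ns for 2 cycles with them enabled.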