Should Trace Cycle count match CPU cycle count?

Other Parts Discussed in Thread: SYSBIOS

I am trying to use trace on the DM8148 (Cortex-A8) to do some low-level access timing so that I can optimize the performance of certain routines. I don't seem to be able to get the trace cycle count to match the cycle counter read on the A8 with the mrc instruction.

Here is my simple test code for timing an access to a CPPI RAM location. The first call to profile_write_time returns 295 ticks (presumably because the instructions are not yet cached), and subsequent calls return 160 to 168 ticks. And yet, below is the trace that I am seeing for a ~160-tick iteration. As you can see, the 22 cycles shown for the STR instruction are clearly not 160. Also, the cycle index is occasionally the same for consecutive instructions. It's as if the cycle index is based on a 100 MHz clock, but I don't know where that would be coming from. Is there a configuration option for the frequency of the ETB cycle counter? I found some frequency settings for STM, but I'm just accessing the ETB through JTAG.

unsigned int time32(void){
/* Read the cycle counter (CP15 c9, c13, 0); the value is left in r0, the return register. */
asm(" mrc p15, #0, r0, c9, c13, #0");
}

int profile_write_time(register int *address){
register int time1;
register int time2;
register int time3;
volatile int retVal;
time1 = time32();
time2 = time32();
*address = (int)address;
time3 = time32();
retVal = (time3 - time2) - (time2 - time1);
return retVal;
}

printf("%d CPU ticks\n",profile_write_time((int *)0x4A102000u));
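The subtraction in profile_write_time removes the cost of one time32() call from the measurement. A minimal host-side sketch of that arithmetic follows; the helper name net_ticks and the sample counter values are hypothetical, and it assumes the counter is treated as an unsigned 32-bit value so that a wraparound between reads cancels out.

```c
#include <stdint.h>

/* Hypothetical helper: the same arithmetic as profile_write_time, but on
 * unsigned 32-bit values so that a counter wraparound cancels out. */
static uint32_t net_ticks(uint32_t t1, uint32_t t2, uint32_t t3)
{
    uint32_t overhead = t2 - t1;  /* two back-to-back time32() reads */
    uint32_t measured = t3 - t2;  /* the store plus one time32() read */
    return measured - overhead;   /* approximate cost of the store alone */
}
```

With made-up readings of 100, 107 and 274, net_ticks returns 160; the same result comes out if the counter wraps between the second and third reads.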

Instruction                     Instr Addr   Read Addr   Write Addr   Cycle Index   Cycle delta
MOV  R12, R0                    0x80106B54   -           -            697           0
BL   0x80106B48                 0x80106B58   -           -            697           3
MRC  P15, #0, R0, C9, C13, #0   0x80106B48   -           -            700           0
BX   R14                        0x80106B4C   -           -            700           5
MOV  R2, R0                     0x80106B5C   -           -            705           0
BL   0x80106B48                 0x80106B60   -           -            705           2
MRC  P15, #0, R0, C9, C13, #0   0x80106B48   -           -            707           1
BX   R14                        0x80106B4C   -           -            708           4
MOV  R1, R0                     0x80106B64   -           -            712           22
STR  R12, [R12]                 0x80106B68   -           -            734           1
-                               -            -           0x4A102000   735           -
-                               -            -           -            735           8
BL   0x80106B48                 0x80106B6C   -           -            743           1
MRC  P15, #0, R0, C9, C13, #0   0x80106B48   -           -            744           3
SUB  R12, R0, R1, LSL #1        0x80106B70   -           -            747           1
ADD  R12, R2, R12               0x80106B74   -           -            748           0
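To put a number on the mismatch, the same subtraction that profile_write_time performs can be applied to the Cycle Index column of the trace. The sketch below assumes that the Cycle Index on each MRC row marks the point at which that MRC executed (an assumption about how the trace viewer attributes cycles); the constant and function names are hypothetical.

```c
/* Cycle Index values copied from the three MRC rows of the trace. */
enum { TRACE_MRC1 = 700, TRACE_MRC2 = 707, TRACE_MRC3 = 744 };

/* Apply (t3 - t2) - (t2 - t1) to the trace's own cycle index values. */
static int trace_net_cycles(void)
{
    int overhead = TRACE_MRC2 - TRACE_MRC1;  /* 707 - 700 = 7  */
    int measured = TRACE_MRC3 - TRACE_MRC2;  /* 744 - 707 = 37 */
    return measured - overhead;              /* 37 - 7 = 30    */
}
```

Under that assumption the trace accounts for 30 cycles where the counter reports about 160, a ratio of a little over five.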
  • Hi,

    I was trying to run your testcase but the MRC instruction is throwing an exception (both in ARM and in Thumb modes); could you check the attached test project I did and see if I am missing anything, so we could try to investigate here? I am using a DM8148EVM with the GEL file provided with CCSv5.2.1.

    Also, my board does not have anything mapped at 0x4A102000, thus I changed it to a simple DDR address (my code is in L3 OCM RAM).

    Did you also try to measure the operations using the profile clock from CCS? It reads directly from the CortexA counter register.

    Regards,

    Rafael

    Trace_CPPI_test.zip
  • Are you sure it is the MRC instruction giving you grief? I used a SYS/BIOS hello world as my starting point. It looks like you may have used bare metal, which will give you problems with printf, won't it? The code snippet should work without the printf if you use the debugger to get the data rather than printing it. I'll have to try out your project on Monday.

  • Hi,

    Yes, I am doing assembly stepping (not run) and the MRC instruction throws the core off. I am using TI's CGT 4.9.6, the same toolchain used in any SYS/BIOS project.

    Please let me know if you are able to run the project successfully. That would greatly help the investigation.

    Regards,

    Rafael

  • You either need to be at a privileged level or add a function like the following to enable user-mode access to the counter; that function must itself be called from a privileged level. When the timestamp feature is enabled in SYS/BIOS, it takes care of running code like the following so that user code can access the cycle count register.

    void time_init(void)
    {
    asm (" mov r0, #1");
    asm (" mcr p15, #0, r0, c9, c14, #0");
    asm (" mov r0, #0x8000000f");
    asm (" mcr p15, #0, r0, c9, c14, #2");
    }

  • Here is an adjustment to time_init, along with a quick hack to get it called from privileged mode. I didn't bother looking into how to actually install an SWI handler on this bare-metal platform, if there is even a method, so I just debugged an SWI event and hijacked it for this build. And I have no trouble accessing 0x4A102000 in your project on my DM8148 dev kit. It is the Ethernet descriptor RAM. Maybe the GEL file you're using isn't initializing it. If that's the case, the point of this forum post is trace, so using a DDR3 address should be sufficient.


    void time_init(void)
    {
    asm (" mov r0, #1");
    asm (" mcr p15, #0, r0, c9, c14, #0");
    asm (" mov r0, #0x00000013");
    asm (" mcr p15, #0, r0, c9, c12, #0");
    asm (" mov r0, #0x80000000");
    asm (" mcr p15, #0, r0, c9, c12, #1");
    asm (" mov r0, #0x8000000f");
    asm (" mcr p15, #0, r0, c9, c14, #2");
    }

    unsigned int * hijacked_swi_vector = (unsigned int *)0x4031D028;
    unsigned int old;
    unsigned int dummy;

    void main(void)
    {
    old = *hijacked_swi_vector;
    *hijacked_swi_vector = (unsigned int)time_init;
    asm( " swi #0" );
    *hijacked_swi_vector = old;

    printf("DDR %d CPU ticks\n",profile_write_time((int *)0x80000000));
    printf("CPPI %d CPU ticks\n",profile_write_time((int *)0x4A102000u));

    for(;;);
    }
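For anyone decoding the constants in time_init, here is a sketch that names the bits involved. The macro names are made up for illustration, and the bit positions follow the ARMv7-A performance monitor register descriptions as I understand them, so check them against the Cortex-A8 TRM before relying on them.

```c
/* Hypothetical names for the bits written by time_init (ARMv7-A PMU). */
#define PMUSERENR_EN      (1u << 0)   /* c9, c14, 0: allow user-mode access */
#define PMCR_E            (1u << 0)   /* c9, c12, 0: enable all counters */
#define PMCR_P            (1u << 1)   /* reset the event counters */
#define PMCR_C            (1u << 2)   /* reset the cycle counter (not used here) */
#define PMCR_D            (1u << 3)   /* count every 64th cycle (not used here) */
#define PMCR_X            (1u << 4)   /* export events */
#define PMCNTENSET_C      (1u << 31)  /* c9, c12, 1: enable the cycle counter */
#define PMINTENCLR_C      (1u << 31)  /* c9, c14, 2: cycle counter overflow IRQ off */
#define PMINTENCLR_P_ALL  0xFu        /* event counter overflow IRQs off */

/* The values time_init writes, rebuilt from the named bits. */
#define PMUSERENR_VALUE   (PMUSERENR_EN)                       /* 0x00000001 */
#define PMCR_VALUE        (PMCR_E | PMCR_P | PMCR_X)           /* 0x00000013 */
#define PMCNTENSET_VALUE  (PMCNTENSET_C)                       /* 0x80000000 */
#define PMINTENCLR_VALUE  (PMINTENCLR_C | PMINTENCLR_P_ALL)    /* 0x8000000f */
```

Note that 0x13 leaves the divide-by-64 bit clear, so with these settings the cycle counter should advance on every CPU clock.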

  • I also added the following to the time_init function when I wanted to enable the instruction cache, since I'm not sure whether it is enabled by default in your bare-metal example.

    asm (" mov r0, #0x00001800");
    asm (" mcr p15, #0, r0, c1, c0, #0");
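One caveat with the two lines above: writing 0x1800 replaces the whole system control register, so any other bits that were set (MMU, data cache, and so on) are cleared. A gentler variant would read the register first (mrc p15, #0, r0, c1, c0, #0), OR in the new bits, and write the result back. The bit arithmetic is sketched below as plain C; the macro and function names are hypothetical, and the bit positions (I = bit 12, Z = bit 11) are the ARMv7-A ones as I understand them.

```c
#include <stdint.h>

#define SCTLR_Z  (1u << 11)  /* branch prediction enable */
#define SCTLR_I  (1u << 12)  /* instruction cache enable */

/* Hypothetical helper: given the current register value, return the value
 * to write back so that only the I and Z bits are added. */
static uint32_t sctlr_with_icache(uint32_t current)
{
    return current | SCTLR_I | SCTLR_Z;
}
```

For example, if the register currently reads 0x00000005, the value written back would be 0x00001805 instead of 0x00001800.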

  • Hi,

    I added all the modifications (including a procedure to keep the device in supervisor (SVC) mode), but for some reason I am still getting the exception at the time32 function.

    Still doing some additional testing, but I will try to perform some cycle count measurements with whatever I have.

    Regards,

    Rafael

  • I don't know. I've attached your project with my modifications to get it to time the accesses to CPPI and DDR3. If that still gives you an exception, I don't know what else to suggest.

    On a side note, I'm curious whether you know the "proper" way to enter privileged mode in a bare-metal project like this one.

    8267.Trace_CPPI_test.zip

  • Rafael,

    Have you made any progress on this yet?  Are you at least able to run the example now and take a trace?