I am trying to use trace on the DM8148 Cortex-A8 to time low-level accesses so I can optimize the performance of certain routines. I can't get the trace cycle count to match the cycle counter read on the A8 with the MRC instruction.
Here is my simple test code, which times an access to a CPPI RAM location. The first call to profile_write_time returns 295 ticks (presumably because the instructions are not yet cached), and subsequent calls return 160 to 168 ticks. Yet below is the trace I see for a ~160-tick iteration. The 22 cycles shown around the STR instruction are clearly not 160. Also, the cycle index is occasionally the same for consecutive instructions. It's as if the cycle index were based on a 100 MHz clock, but I don't know where that would come from. Is there a configuration option for the frequency of the ETB cycle counter? I found some frequency settings for STM, but I'm just accessing the ETB through JTAG.
unsigned int time32(void){ ... }
int profile_write_time(register int *address){ ... }
printf("%d CPU ticks\n",profile_write_time((int *)0x4A102000u));
| Instruction | Instr Addr | Read Addr | Write Addr | Cycle Index | Cycle Delta |
|---|---|---|---|---|---|
| MOV R12, R0 | 0x80106B54 | | | 697 | 0 |
| BL 0x80106B48 | 0x80106B58 | | | 697 | 3 |
| MRC P15, #0, R0, C9, C13, #0 | 0x80106B48 | | | 700 | 0 |
| BX R14 | 0x80106B4C | | | 700 | 5 |
| MOV R2, R0 | 0x80106B5C | | | 705 | 0 |
| BL 0x80106B48 | 0x80106B60 | | | 705 | 2 |
| MRC P15, #0, R0, C9, C13, #0 | 0x80106B48 | | | 707 | 1 |
| BX R14 | 0x80106B4C | | | 708 | 4 |
| MOV R1, R0 | 0x80106B64 | | | 712 | 22 |
| STR R12, [R12] | 0x80106B68 | | | 734 | 1 |
| | | | 0x4A102000 | 735 | |
| | | | | 735 | 8 |
| BL 0x80106B48 | 0x80106B6C | | | 743 | 1 |
| MRC P15, #0, R0, C9, C13, #0 | 0x80106B48 | | | 744 | 3 |
| SUB R12, R0, R1, LSL #1 | 0x80106B70 | | | 747 | 1 |
| ADD R12, R2, R12 | 0x80106B74 | | | 748 | 0 |
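Reading the trace, the final SUB/ADD pair computes R0 - 2*R1 + R2, where R2, R1, and R0 hold the first, second, and third counter reads. That appears to subtract the cost of one counter read (the gap between the two back-to-back reads) from the interval that spans the store. A minimal sketch of that arithmetic in C follows; the helper name `net_ticks` is hypothetical and does not come from the original post.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): combine three
 * cycle-counter readings the way the traced SUB/ADD pair does.
 *   t0, t1: back-to-back reads taken before the store
 *   t2:     read taken after the store
 * (t1 - t0) estimates the cost of one counter read, which is then
 * subtracted from the measured interval (t2 - t1). */
static uint32_t net_ticks(uint32_t t0, uint32_t t1, uint32_t t2)
{
    /* Same arithmetic as:
     *   SUB R12, R0, R1, LSL #1   ; t2 - 2*t1
     *   ADD R12, R2, R12          ; + t0
     * Unsigned math is modulo 2^32, so a counter wrap between
     * reads still yields the right difference. */
    return t2 - 2u * t1 + t0;
}
```

As a rough comparison, feeding in the Cycle Index values the trace shows at the three MRC instructions (700, 707, 744) gives 30, against roughly 160 from the counter itself. The trace indices mark where each instruction was traced, not necessarily when the counter was sampled, so this is only an approximation.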
Hi,
I tried to run your test case, but the MRC instruction throws an exception (in both ARM and Thumb modes). Could you check the attached test project I made and see if I am missing anything, so we can investigate here? I am using a DM8148EVM with the GEL file provided with CCSv5.2.1.
Also, my board does not have anything mapped at 0x4A102000, so I changed it to a simple DDR address (my code is in L3 OCM RAM).
Did you also try measuring the operations using the profile clock from CCS? It reads directly from the Cortex-A counter register.
Regards,
Rafael
Are you sure it is the MRC instruction giving you grief? I used a SYS/BIOS hello world as my starting point. It looks like you may have used bare metal, which will give you problems with printf, won't it? The code snippet should work without the printf if you use the debugger to get the data rather than printing it. I'll have to try out your project on Monday.
Hi,
Yes, I am doing assembly stepping (not run), and the MRC instruction throws the core off. I am using TI's CGT 4.9.6, the same toolchain used in any SYS/BIOS project.
Please let me know if you are able to run the project successfully. That would greatly help the investigation.
Regards,
Rafael
You either need to run at a privileged level or add a function like the following to enable user-mode access to the counter; the function itself must also be called from a privileged level. When the timestamp feature is enabled in SYS/BIOS, it takes care of this setup so that user code can access the cycle count register.
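void time_init(void)
{
    asm (" mov r0, #1");
    asm (" mcr p15, #0, r0, c9, c14, #0");  /* PMUSERENR: allow user-mode access to the counters */
    asm (" mov r0, #0x8000000f");
    asm (" mcr p15, #0, r0, c9, c14, #2");  /* PMINTENCLR: disable counter overflow interrupts */
}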
Here is an adjustment to time_init, along with a quick hack to get it called from privileged mode. I didn't look into how to actually install a SWI handler on this bare-metal platform, if there even is a method, so I just debugged a SWI event and hijacked it for this build. I also have no trouble accessing 0x4A102000 in your project on my DM8148 dev kit. It is the Ethernet descriptor RAM. Maybe the GEL file you're using isn't initializing it. If that's the case, this forum post is about trace, so using a DDR3 address should be sufficient.
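void time_init(void)
{
    asm (" mov r0, #1");
    asm (" mcr p15, #0, r0, c9, c14, #0");  /* PMUSERENR: allow user-mode access to the counters */
    asm (" mov r0, #0x00000013");
    asm (" mcr p15, #0, r0, c9, c12, #0");  /* PMCR: enable counters, reset event counters, export events */
    asm (" mov r0, #0x80000000");
    asm (" mcr p15, #0, r0, c9, c12, #1");  /* PMCNTENSET: enable the cycle counter */
    asm (" mov r0, #0x8000000f");
    asm (" mcr p15, #0, r0, c9, c14, #2");  /* PMINTENCLR: disable counter overflow interrupts */
}
/* volatile so the compiler keeps both writes to the vector slot around the swi */
volatile unsigned int * hijacked_swi_vector = (volatile unsigned int *)0x4031D028;
unsigned int old;
unsigned int dummy;
void main(void)
{
    old = *hijacked_swi_vector;                      /* save the existing SWI vector */
    *hijacked_swi_vector = (unsigned int)time_init;  /* point it at time_init */
    asm( " swi #0" );                                /* run time_init in privileged mode */
    *hijacked_swi_vector = old;                      /* restore the vector */
    printf("DDR %d CPU ticks\n",profile_write_time((int *)0x80000000));
    printf("CPPI %d CPU ticks\n",profile_write_time((int *)0x4A102000u));
    for(;;);
}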
I also added the following to the time_init function to enable the instruction cache, since I'm not sure whether it is enabled by default in your bare-metal example.
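    asm (" mov r0, #0x00001800");           /* I (bit 12, instruction cache) and Z (bit 11, branch prediction) */
    asm (" mcr p15, #0, r0, c1, c0, #0");   /* SCTLR: this writes the whole register, not just those two bits */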
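For readers decoding the magic numbers written by time_init and the cache snippet, here is a small host-side sketch that rebuilds them from named bits. The macro and function names are illustrative only, and the bit positions reflect my reading of the ARMv7-A performance monitor and system control register layouts, so check them against the Cortex-A8 TRM before relying on them.

```c
#include <stdint.h>

/* Illustrative names; bit positions are assumptions taken from the
 * ARMv7-A PMU and system control register descriptions. */
#define PMCR_E      (1u << 0)   /* enable all counters */
#define PMCR_P      (1u << 1)   /* reset the event counters */
#define PMCR_C      (1u << 2)   /* reset the cycle counter */
#define PMCR_D      (1u << 3)   /* count every 64th cycle when set */
#define PMCR_X      (1u << 4)   /* export events to external monitors */
#define PMU_CCNT    (1u << 31)  /* cycle counter bit in enable/interrupt masks */
#define PMU_EVENTS  0xFu        /* the four Cortex-A8 event counters */
#define SCTLR_Z     (1u << 11)  /* branch prediction enable */
#define SCTLR_I     (1u << 12)  /* instruction cache enable */

/* Value written to c9, c12, #0 (PMCR) in time_init: 0x13. */
static uint32_t pmcr_value(void)     { return PMCR_E | PMCR_P | PMCR_X; }

/* Value written to c9, c12, #1 (PMCNTENSET): 0x80000000. */
static uint32_t cntenset_value(void) { return PMU_CCNT; }

/* Value written to c9, c14, #2 (PMINTENCLR): 0x8000000f. */
static uint32_t intenclr_value(void) { return PMU_CCNT | PMU_EVENTS; }

/* Value written to c1, c0, #0 (SCTLR) in the cache snippet: 0x1800. */
static uint32_t sctlr_value(void)    { return SCTLR_Z | SCTLR_I; }
```

Decoded this way, 0x13 leaves PMCR.D clear, so the counter read by time32 should be ticking once per CPU cycle rather than once per 64 cycles. It also leaves PMCR.C clear, so the cycle counter is not reset by time_init.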
Hi,
I added all the modifications (including a procedure to keep the device in supervisor (SVC) mode), but for some reason I am still getting the exception at the time32 function.
Still doing some additional testing, but I will try to perform some cycle count measurements with whatever I have.
Regards,
Rafael
I'm not sure what else to suggest. I've attached your project with my modifications to time the CPPI and DDR3 accesses. If that still gives you an exception, I don't know what the cause would be.
On a side note, I'm curious whether you know the "proper" way to enter privileged mode in a bare-metal project like this.
Rafael,
Have you made any progress on this yet? Are you at least able to run the example now and take a trace?