Hello,
I use the RTI module of the TMDXRM46HDK as a timer. I am surprised that a read-modify-write on the timer register takes extremely long.
To measure this, I set an output pin high before the access and low after it. With my oscilloscope I measured that the read-modify-write, including the pin toggling, takes 340 ns.
//340ns
SET_TEST_PIN_1_HIGH(); //(gioPORTA->DSET = 1 << 1)
STOP_CPU_TIMER_2(); //rtiREG1->GCTRL &= ~(1U << (rtiCOUNTER_BLOCK1 & 3U))
SET_TEST_PIN_1_LOW(); //(gioPORTA->DCLR = 1 << 1)
//\340ns
Toggling the pin alone takes 100 ns:
//100ns
SET_TEST_PIN_1_HIGH(); //(gioPORTA->DSET = 1 << 1)
SET_TEST_PIN_1_LOW(); //(gioPORTA->DCLR = 1 << 1)
//\100ns
Even if I subtract the 100 ns, the read-modify-write still takes 240 ns, which is about 53 cycles at 220 MHz!
I had a look at the assembler listing; by my count the read-modify-write should take only 4 cycles = 18 ns.
||$C$CON38||: .field -541644,32 //Pointer to structure gioPORTA=0xFFF7BC34
MOV A1, #0 ; |862| //A1 = 0
LDR V9, $C$CON38 ; |866| CYC=1,LDR=2 0+1=1 //V9 = gioPORTA = 0xFFF7BC34
MOV A2, #2 ; |866| CYC=1,LDR=1 1+1=2 //A2 = 1<<1 = 2
STR A2, [V9, #12] ; |866| CYC=1,LDR=2 2+1=3 //V9[12] = A2 (LDR of V9 already finished, so no waitstates)
LDR A3, [A1, #-1024] ; |867| CYC=1,LDR=2 3+1=4 //A3 = A1[0xFFFFFC00] = *0xFFFFFC00
4+1=5 //1Waitstate for A3
BIC A3, A3, #2 ; |867| CYC=1,LDR=1 5+1=6 //A3 &= ~2
STR A3, [A1, #-1024] ; |867| CYC=1,LDR=2 6+1=7 //A1[0xFFFFFC00] = *0xFFFFFC00 = A3
STR A2, [V9, #16] ; |868| CYC=1,LDR=2 7+1=8 //V9[16] = A2
Where do the extra waitstates for the RTI register come from?
I couldn't find anything in the datasheets, and the information I took from the ARM document (Cortex-R4-white-paper.pdf) doesn't match what I measure.
Is there any way to avoid the wait states?
Is there a document describing the RM46L852 instruction set (similar to spru430d.pdf "TMS320C28x DSP CPU and Instruction Set Reference Guide" for F28xx)?
Thank you,
Norbert
