This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

unusually long time for a loop execution

Other Parts Discussed in Thread: MSP430F6736

Hi anyone...

i have the following function being called from INSIDE a timerA ISR on a MSP430F6736 every 100mS.

In that function, var keyIsPressed is a global declared as "uint8_t  keyIsPressed[4][3]" and only 1 element ever has a value (1..4) at any one time (so the remaining matrix elements would be zero).

With Version A, calling this function causes a later function in the same ISR to execute 500ms later, whereas with version B the later function executes a few micro secs later.

What i do not understand is why there should be ANY difference in execution time....ie, what is it that when i have the 2 compares in version A it takes 500ms yet  switching the compares around in version B and thus executing only 1 compare only results in a few micro secs....i would have expected both versions to just take a few micro secs.

(i realise i could improve this by just saving the one element's value, but the app will be catering for more than 1 element having a value)

VERSION A:

uint8_t keyIsPressed[4][3];

void CheckStartRepeatKeys(void)
{
uint8_t i,j;
for(i = 4; i> 0;)
{
for(--i, j = 3; j > 0;)
{
--j;
if(keyIsPressed[i][j] < 4) // every element EXCEPT one will be < 4
{
if(keyIsPressed[i][j] != 0) // only 1 will be != 0
{
if(++keyIsPressed[i][j] == 4)
kpdTickCounter300mS = 9599;
else
kpdTickCounter100mS = 0;
}
}
}
}
return;
}

VERSION B:

void CheckStartRepeatKeys(void)
{
uint8_t i,j;
for(i = 4; i> 0;)
{
for(--i, j = 3; j > 0;)
{
--j;
if(keyIsPressed[i][j] != 0) //  only 1 will be != 0 
{
if(keyIsPressed[i][j] < 4) // only executed until the one element reaches value 4
{
if(++keyIsPressed[i][j] == 4)
kpdTickCounter300mS = 9599;
else
kpdTickCounter100mS = 0;
}
}
}
}
return;
}

thanks in advance....

  • Avoid making subroutine calls inside a ISR, they have to be fast as other IRQ can not happen until it's done (other IRQ flags are still set though)
    Why Two-Dimensional Array?, try to use bits as if all 16bits in a word is zero one single cmp will check for that.

    instead of just for loops, learn how to use: do{} until/while

    As what is it your are trying to do?

    And why do you have return;?
    a function starting with void states that is does not want to return anything

  • Hi Tony
    this is part of a keypad capture module...i have other modules being called from that same ISR that set variables pertaining to themselves, so the use of subroutines is for ease of maintenance down the line (like in a few months when i've forgotten what is going on).

    the reason for the 2 dim array is also ease of maintenance in a couple of years..the array mimicks the keypad dimensions and i have a built-in delay by using an elt value...

    the reason for the for loop rather than 'do until' or 'while' is that it is a fixed number of iterations every time.

    the reason for using the return keyword at the end of a function is purely so i can easily see where the end of a function is and that the code is laid out correctly.

    However, this above does not address the cause of why 24 compares take 500ms yet 12 compares take the expected few micro secs (if not nano secs).
  • unroll the loop, you only have 12 bytes in the array

     kpdTickCounter300mS=0;
     if(keyIsPressed[0][0] && ++keyIsPressed[0][0] > 3) kpdTickCounter300mS = 9599;
     if(keyIsPressed[0][1] && ++keyIsPressed[0][1] > 3) kpdTickCounter300mS = 9599;
     if(keyIsPressed[0][2] && ++keyIsPressed[0][2] > 3) kpdTickCounter300mS = 9599;
     if(keyIsPressed[1][0] && ++keyIsPressed[1][0] > 3) kpdTickCounter300mS = 9599;
    ....

    good part it will stop when it rolls around 255 and lands on 0 again, depending on how your Port IRQ uses edges.
    but beware some compilers do everything inside the if (... &&...) and don't exit at first untrue.

    Enable disassemble window and look how complicated the machine code gets, to compare.

     00C01E    4382 0200          clr.w   &kpdTickCounter300mS
     00C022    93C2 0202          tst.b   &keyIsPressed  // [0][0]
     00C026    2408               jeq     0xC038
     00C028    53D2 0202          inc.b   &keyIsPressed
     00C02C    92E2 0202          cmp.b   #0x4,&keyIsPressed
     00C030    2803               jnc     0xC038
     00C032    40B2 257F 0200     mov.w   #0x257F,&kpdTickCounter300mS
    
     00C038    93C2 0203          tst.b   &0x203         // [0][1]
     00C03C    2408               jeq     0xC04E
     00C03E    53D2 0203          inc.b   &0x203
     00C042    92E2 0203          cmp.b   #0x4,&0x203
     00C046    2803               jnc     0xC04E
     00C048    40B2 257F 0200     mov.w   #0x257F,&kpdTickCounter300mS
     00C04E   

  • thanks..i like your unrolling.

    But the real purpose for my original question is "how can it possibly take so long to execute 24 compares yet take the expected nano secs to perform 12 compares" (depending on how the if statements are arranged)???

    On another note, i had spent hours looking ELSEWHERE for the cause until i homed in on the offending if statement...is there any TI tool that would have zeroed in on it faster? And, because it happened in the interrupt my timing count indicated everyhing was all ok (ie 100ms) except my watch told me it was 500mS!!
  • I never seen for loop used like this:
    for(i = 4; i> 0;)
    {
    for(--i, j = 3; j > 0;)
    {

    Try instead
    for(i = 0; i < 4; i++)
    {
    for(j = 0; j < 3; j++)
    {

    Though it's preferred to use pre inc/dec when you do evaluated  ++/--value,
    for() don't have this problem of having to use temp storage to make the evaluation when using post inc/dec as any good compiler gets it.

  • the way you show is the way i've always done the for loop until recently when i started with the texas instrument and they wrote that the for loop should be tested against zero(ULP adviser rule 13.1), which meant i now had to change my original way and decrement rather than increment...but when using arrays and needing to decrement to element zero you cant decrement just before the test as the value wont go to -1 but be 255 (as i'm using unsigned values) and thus the loop will be endless....hence my code.
  • Testing against zero saves a cycle if you program in asm, you can then do: dec R15, jnz loop
    But is not safe if R15 was set to zero (mistake or calculations), C always uses a cmp lower/higher right away as to not get stuck in a loop.
     
    C compiler's have their own mind how to make loops.

    for (int i = 4; i>0; --i){ // errors in that it stay between 4 and 1, when it should been 3 to and including 0)

     00C022    422F               mov.w   #0x4,R15
     00C024    931F               cmp.w   #0x1,R15
     00C026    3BFB               jl      0xC01E
     ...
     00C02C    533F               add.w   #0xFFFF,R15
     00C02E    3FFA               jmp     0xC024

    for (int i = 0; i<4; i++){  // correct 0-to-3 for four elements

     00C022    430F               clr.w   R15
     00C024    922F               cmp.w   #0x4,R15
     00C026    37FB               jge     0xC01E
     ...
     00C02C    531F               inc.w   R15
     00C02E    3FFA               jmp     0xC024

    if you pre-dec the first use of i, it should work but I don't see the speed savings
    for (int i = 4; i>0;) {kpdTick= --i;  //just and example

     00C022    422F               mov.w   #0x4,R15
     00C024    931F               cmp.w   #0x1,R15
     00C026    3BFB               jl      0xC01E
     00C028    533F               add.w   #0xFFFF,R15
     ...       4F82 0200          mov.w   R15,&kpdTick
     00C02E    3FFA               jmp     0xC024

    As you can see they all use same amount of op-words = same speed. 
    pre or post ++/-- no difference how a compiler sees it as a loop and always put the inc/dec last before it jumps back.
    P.S IAR always like to ADD #FFFF instead of SUB #1, not sure why maybe due to C flags get set different,
    SUB is not an emulated instruction, though I guess you could live without a real SUB in C as it compiles it as a overflow-add

  • What is stack size? Perhaps IRQ causes stack overrun. Try increasing stack size ad check results
  • Hi Ilmars, good thought but as the problem is one of which way the if statement is written its not the stack size.

    Interestingly if i do the following:

    Instead of writing the following
    if(keyIsPressed[i][j] < 4) // every element EXCEPT one will be < 4
    {
    if(keyIsPressed[i][j] != 0) // only 1 will be != 0
    {
    if(++keyIsPressed[i][j] == 4)
    kpdTickCounter300mS = 9599;
    else
    kpdTickCounter100mS = 0;
    }
    }
    i write this (effectively the SAME 24 compare operations, same order of evaluation:
    k = keyIsPressed[i][j];
    if(k < 4)
    {
    if(k != 0)
    {
    if(++keyIsPressed[i][j] == 4)
    kpdTickCounter300mS = 9599;
    else
    kpdTickCounter100mS = 0;

    }

    }

    I then have absolutely NO timing issues!!!

    cheers 

  • for (unsigned int i = 0; i<4; i++){
      if(keyIsPressed[i][0] < 4 && keyIsPressed[i][0] && ++keyIsPressed[i][0] == 4) kpdTickCounter300mS = 9599;
      if(keyIsPressed[i][1] < 4 && keyIsPressed[i][1] && ++keyIsPressed[i][1] == 4) kpdTickCounter300mS = 9599;
      if(keyIsPressed[i][2] < 4 && keyIsPressed[i][2] && ++keyIsPressed[i][2] == 4) kpdTickCounter300mS = 9599;
    }

    How about a partial unrolled loop above, below what middle line assembles to (medium optimization).

     00C04E    92EE 0203          cmp.b   #0x4,0x203(R14)
     00C052    2C0D               jc      0xC06E
     00C054    93CE 0203          tst.b   0x203(R14)
     00C058    240A               jeq     0xC06E
     00C05A    4E5D 0203          mov.b   0x203(R14),R13
     00C05E    535D               inc.b   R13
     00C060    4DCE 0203          mov.b   R13,0x203(R14)
     00C064    926D               cmp.b   #0x4,R13
     00C066    2003               jne     0xC06E
     00C068    40B2 257F 0200     mov.w   #0x257F,&kpdTickCounter300mS
     00C06E

    P.S. the first two cmp's could be rolled in to one in this case (as 0 and 4 can both evaluate to 0 with the use of AND 3)
    if(keyIsPressed[i][0]  &3  && ++keyIsPressed[i][0] == 4) kpdTickCounter300mS = 9599;

  • hi Tony
    As they say...there are many ways to skin a cat. But for THIS instance, the skinning is not the issue.

    i found a very strange, unusual, inexplicable, and concerning anomaly....I need an answer to THAT. Why? because NEXT time i might not realise it is happening again and spend fruitless days searching for the problem in all the wrong places.

    cheers

**Attention** This is a public forum