Sporadic floating point division/_cast32uto32f errors

Other Parts Discussed in Thread: MSP430F2350

Hi,

I'm experiencing sporadic calculation failures with one MSP430F2350 processor. I've tried the same code on a different board with another MSP430F2350 and it never fails. My question is whether the problem lies entirely with the specific processor or possibly with inputs from that particular board.

I have reduced the code to nothing but a useless floating-point division, as follows:

 ...
 unsigned long ulTemp = 90000;
 while ( 1 )
 {
     float fTest = (float)ulTemp / 1.0f;

     if ( fTest == 8.10288043E+31 )
     {
         if ( P2IN & 0x04 )
             P2OUT &= ~0x04;
         else
             P2OUT |= 0x04;
     }
 }

The watchdog timer is disabled and the external crystal (14.7456 MHz) is set as main and sub clock.

About 80% of the time fTest is 90000.0, as it should be, but the rest of the time it is 8.10288043E+31. This behaviour shows up in both the debug and the release builds.
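One possible way to narrow this down is to record the raw bit pattern of each result instead of relying on a floating-point == comparison. The sketch below assumes a 32-bit IEEE 754 float and a C99 <stdint.h>. float_bits is a hypothetical helper name and is not part of the original test code.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float.
 * Assumption: float is a 32-bit IEEE 754 single-precision value.
 * memcpy avoids the aliasing problems of a pointer cast. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under that assumption a correct result of 90000.0f has the pattern 0x47AFC800, so any other value returned by float_bits(fTest) marks a failed iteration and can be stored for later inspection.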

If I change ulTemp to a float variable, the result is always correct (also with more complicated divisions ;). On the other hand, if I just cast a variable from unsigned long to float, it never fails either. Only the combination of the floating-point division and the unsigned long to float cast seems to cause the problem.

Can anyone think of a reason for the unusual result, so that I can narrow the problem down? Could it be that a register like R6 or R7 is faulty on that particular processor or is it possible that some other condition on that particular board can cause that behaviour?

Maybe I should also mention that the processor seems to work perfectly otherwise. I have about 9 KB of code running on it, and with the exception of the division/cast part, everything works fine.

I'm thankful for any kind of ideas!

  • The MSP has no floating point calculation unit.

    This means that floating-point operations are very slow. There are other ways of handling fractional calculations, such as Q-numbers (see a different thread I posted in yesterday).

    Anyway, since there is no hardware floating-point unit, it cannot fail. The calculations are done by plain integer CPU work. So either the CPU itself is faulty (and then you should see errors elsewhere too) or it is not a CPU fault.

    Things like a stack overflow shouldn't be the reason either, since you tested on a different board with the same processor, provided you don't have additional code in the application that does different things on the two CPUs. So strip your test code down to just this one function and nothing else.

    What's left is maybe faulty RAM that happens to sit right where the data for the calculation is stored (not very likely), or a hostile environment. Maybe you're getting spikes on VCC or similar, which cause the unit to fail temporarily and undetected.

    Anyway, I don't really understand your LED code. If the comparison matches, you read P2IN and then clear or set the P2.2 pin (mask 0x04) accordingly. Why don't you just toggle the bit? If it is clear, it will be set; if it is set, it will be cleared. The C instruction is P2OUT ^= 0x04. Moreover, you can read back from PxOUT what you last wrote to it. P2IN shows the real pin state, which can differ if the output driver is not strong enough to overcome an external signal. This may be useful sometimes, but in this case, if P2IN doesn't correspond to P2OUT, your pin toggling would have no effect anyway.

    Alex said:
    Could it be that a register like R6 or R7 is faulty

    Not really. Since the code is fixed and the processor does not dynamically assign registers to the code during the loop, it should then fail always or never. On the original SPARC processor there were register windows that were switched on each function call, which made saving registers on the stack unnecessary. There, a function could work with different physical registers on different calls, but not on the MSP.
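The bit toggle suggested in the reply above can be sketched as a small helper. This is an illustrative sketch only: toggle_bits and LED_MASK are hypothetical names, and the helper takes a pointer to the output register so that it can also be exercised off-target.

```c
#include <stdint.h>

#define LED_MASK 0x04u  /* bit 2 of the port, i.e. the 0x04 mask from the test code */

/* Toggle the masked bits of an output register in place.
 * XOR flips set bits to clear and clear bits to set, so no
 * read of the input register and no if/else is needed. */
static inline void toggle_bits(volatile uint8_t *port_out, uint8_t mask)
{
    *port_out ^= mask;
}
```

On the target this would be called as toggle_bits(&P2OUT, LED_MASK), assuming P2OUT is the usual 8-bit register definition from the device header. Writing P2OUT ^= 0x04 directly is equivalent.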

  • Thx for your help!

    It turned out that the output line to the external crystal was the reason for the weird behaviour. We could detect slight fluctuations that shouldn't be there. I guess it was a good thing that I had such a time-consuming operation as the floating-point division, otherwise that fault on the board might not have been detected in time.

    The floating point division happens only once a second and there are no hard real-time requirements, that's why I use it. But thx for the tip with the Q-numbers.
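For readers unfamiliar with the Q-number tip mentioned above, here is a minimal sketch of a fixed-point division done with integer arithmetic only. It assumes an unsigned format with 24 integer and 8 fractional bits, chosen so that a value like 90000 fits. The type and function names (uq24_8_t, uq_from_uint, uq_div, uq_to_uint) are hypothetical and do not come from the thread or from any library.

```c
#include <stdint.h>

#define Q_FRAC_BITS 8            /* number of fractional bits (assumption) */
typedef uint32_t uq24_8_t;       /* unsigned fixed point: 24 integer bits, 8 fractional bits */

/* Convert an integer (must be below 2^24) to fixed point. */
static uq24_8_t uq_from_uint(uint32_t n)
{
    return n << Q_FRAC_BITS;
}

/* Divide two fixed-point values; b must be nonzero.
 * The dividend is pre-shifted in a 64-bit intermediate so that
 * the quotient keeps its 8 fractional bits. */
static uq24_8_t uq_div(uq24_8_t a, uq24_8_t b)
{
    return (uq24_8_t)(((uint64_t)a << Q_FRAC_BITS) / b);
}

/* Convert back to an integer, truncating the fractional part. */
static uint32_t uq_to_uint(uq24_8_t q)
{
    return q >> Q_FRAC_BITS;
}
```

With these helpers, uq_to_uint(uq_div(uq_from_uint(90000), uq_from_uint(1))) yields 90000 without any floating-point library calls. The 64-bit intermediate is itself not cheap on a 16-bit CPU, so whether this is faster than the float division would have to be measured.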

     
