
Modulus Efficiency

Hello,

 

      I am currently writing code on the CC430 to convert 12- and 14-bit integers to characters.  I am doing this by:

                    output[2] = '0'+((var/1000)%10);
                    output[3] = '0'+((var/100)%10);
                    output[4] = '0'+((var/10)%10);
                    output[5] = '0'+((var/1)%10);

      where the array output is just a char array.  I then ship this data off to MATLAB, where it is converted back to integers.  The problem I am having is that the modulo operator seems to take too many clock cycles.  Is there a more efficient way of doing this, or does anyone know of a good MATLAB function that can handle integers instead of chars coming into the serial port?  I am trying the fread function, but to no avail at this point.

 

Regards,

-Eric

  • Hi Eric,

    I would use hexadecimal notation instead of decimal.  Still four characters, but the characters are 0 through F instead of just 0 through 9.

    output[2] = '0' + ((var >> 12) & 0xF);
    output[3] = '0' + ((var >> 8) & 0xF);
    output[4] = '0' + ((var >> 4) & 0xF);
    output[5] = '0' + (var & 0xF);
    for (int i = 2; i < 6; i++) if (output[i] > '9') output[i] += 'A' - ('9' + 1);

    or something like that.

    Surely Matlab can be convinced to parse the data as hex instead of decimal.

    Jeff
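
    A small variation on the same idea, sketched here with a lookup table so the conditional fixup loop is not needed (to_hex4 and hexdigits are names invented for this example, not from the original post):

```c
/* Hexadecimal digit lookup: indexing a 16-character table replaces the
   '0'..'9'/'A'..'F' fixup loop with a single array access per digit. */
static const char hexdigits[] = "0123456789ABCDEF";

/* Writes the four hex digits of a 16-bit value, most significant first.
   No terminating '\0' is written; the caller owns the buffer layout. */
void to_hex4(unsigned int var, char *output)
{
    output[0] = hexdigits[(var >> 12) & 0xF];
    output[1] = hexdigits[(var >>  8) & 0xF];
    output[2] = hexdigits[(var >>  4) & 0xF];
    output[3] = hexdigits[var & 0xF];
}
```

    On the MSP430 this is pure shifts, masks, and indexed loads, with no branches per digit.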

  • Eric Winokur said:
    output[2] = '0'+((var/1000)%10);

    Two divisions in one line. That's in the top 3 of the performance killing league. :)

    If you use the board search function, you might find an older thread where I wrote a complete conversion function based on successive subtraction.
    The code is somewhat larger, but the execution was orders of magnitude faster.

    Basically, it counts how often you can subtract 1000 from your value (and adds this count to '0'), then checks how often you can subtract 100 from what remains (which is <1000 after the previous step), and so on.

    Usually the underlying intrinsic division function that the compiler calls will give both quotient and remainder, so in theory you could reduce the two divisions to one. On Linux and Windows it's supposed to be accessible via the div() function, though lazy implementations define it as a macro around / and % rather than as a low-level intrinsic. As you say, best to avoid the divisions entirely. I implemented your subtraction algorithm as a function. Tested on a PC. Should work on the MSP with some modification for 16 bit int.

    /*-----------------------------------------------------------------------------
    Lookup table for units for each position. Order is from largest to smallest.
    Example below assumes "int" is 32 bits. Last value must always be one.
    -----------------------------------------------------------------------------*/
    static const unsigned int g_units[]=
    {
      1000000000, /* 10^9 */
       100000000, /* 10^8 */
        10000000, /* 10^7 */
         1000000, /* 10^6 */
          100000, /* 10^5 */
           10000, /* 10^4 */
            1000, /* 10^3 */
             100, /* 10^2 */
              10, /* 10^1 */
               1  /* 10^0 */
    };

    /*-----------------------------------------------------------------------------
    Converts signed integer to base 10 string. Caller must allocate enough
    space for result. Returns pointer to beginning of string. Same as passed in.
    Assuming 32 bit signed integers and value is 1999999999, there will be 82
    subtractions.
    -----------------------------------------------------------------------------*/
    char *itod(int v, char *s)
    {
      register char        *p = s;
      register unsigned int vabs;
      register unsigned int unit;
      register int          n;
      register int          i;

      /* Handle 0 as special case. Code below assumes a non-zero value. */
      if(v == 0)
      {
        *p++ = '0';
        *p++ = '\0';
         return(s);
      }

      /* Make the value absolute and emit a '-' if required. */
      if(v < 0)
      {
       *p++  = '-';
        vabs = -(unsigned int)v; /* negate as unsigned: safe even for INT_MIN */
      }
      else
        vabs = v;

      /* Find first unit. Note >=, otherwise an exact power of ten
         (e.g. 1000) skips past its own unit and corrupts the output. */
      i = 0;
      for(;;)
      {
        unit = g_units[i];
        if(vabs >= unit) break;
        i++;
      }

      /* Loop through each unit. */
      for(;;)
      {
        unit = g_units[i];
        if(unit==1) break;

        /* Subtract units until no more of this unit. Note >=, so that an
           exact multiple (e.g. 2000 at unit 1000) is counted correctly. */
        n = 0;
        while(vabs >= unit)
        {
          vabs -= unit;
          n++;
        }

       *p++ = (char)('0' + n); /* Emit character. */
        i++;                   /* Advance index to next unit. */
      }

      /* Last unit is always one. Special case. Finish up. */
     *p++ = (char)('0' + vabs); /* Emit last character. */
     *p++ = '\0';               /* Null terminate.*/

      return(s);
    }
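
    For the MSP430's 16-bit int, the same successive-subtraction idea can be sized down so the unit table stops at 10000. A sketch (utod16 is a name made up for this example; unsigned values only, leading zeros suppressed):

```c
/* Unit table for 16-bit unsigned values (max 65535). The ones digit is
   handled separately, as in the 32-bit version above. */
static const unsigned int units16[] = { 10000, 1000, 100, 10 };

/* Converts an unsigned 16-bit value to a decimal string by successive
   subtraction. Note the >= comparison: an exact multiple of a unit
   (e.g. 1000) must still count one subtraction. */
char *utod16(unsigned int v, char *s)
{
    char *p = s;
    int i, started = 0;
    for (i = 0; i < 4; i++) {
        unsigned int unit = units16[i];
        char n = '0';
        while (v >= unit) { v -= unit; n++; }  /* count subtractions */
        if (n != '0' || started) { *p++ = n; started = 1; }
    }
    *p++ = (char)('0' + v);  /* ones digit, always emitted */
    *p = '\0';
    return s;
}
```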

  • Norman Wong said:
    static const unsigned int g_units[]=
    {
      1000000000, /* 10^9 */

    You don't get a compiler warning?

    Your constants are definitely out of range for an int array, at least on the MSP, where an int is 16 bit.
    Otherwise, nice.

    It would be even nicer if you could define a fixed-point position for fixed-point arithmetic.
    Well, you can add the point later, after the conversion, by inserting it at the correct position.
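
    That point-insertion step might look like this sketch (insert_point is a name made up for this example; it assumes an all-digit string with more than frac digits, and a buffer with room for one extra character):

```c
#include <string.h>

/* Inserts a '.' so that "frac" digits remain to its right, e.g.
   "12345" with frac = 3 becomes "12.345". */
void insert_point(char *s, int frac)
{
    size_t len = strlen(s);
    /* Shift the last frac digits (and the '\0') one place right... */
    memmove(s + len - frac + 1, s + len - frac, (size_t)frac + 1);
    /* ...then drop the point into the gap. */
    s[len - frac] = '.';
}
```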

    I don't actually have an MSP430 platform. Sounds like a nice processor to work with. Stuck doing horribly complicated stuff on Linux. I do expect my code to get an error on any platform where "int" is less than 32 bits, and it would produce incorrect results on a processor where "int" is larger than 32 bits.

    Fixed point conversion would require a bit of tweaking. The function would need to support leading zeroes and fixed width. I don't think it is quite as simple as inserting '.' after conversion because fixed point is base 2 not base 10. Pseudo code would be something like this:

    itod(string, value >> FRACTION_BITS, No_leading_zeroes_flag);
    move to end of string
    add '.' to string
    itod(string, value & FRACTION_MASK, Fixed_width_with_leading_zeroes);
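
    The fractional half of that pseudocode can also be done without a fixed-width itod: repeatedly multiply the fraction by 10 and peel off one integer digit per step. A sketch, with illustrative names (frac_to_digits; FRACTION_BITS = 12 as in the thread's example) that are not from any real code here:

```c
#define FRACTION_BITS 12
#define FRACTION_MASK ((1UL << FRACTION_BITS) - 1)

/* Emits ndigits decimal digits of the binary fraction in "value",
   leading zeros included, truncating (not rounding) the last digit. */
void frac_to_digits(unsigned long value, char *out, int ndigits)
{
    unsigned long f = value & FRACTION_MASK;
    int i;
    for (i = 0; i < ndigits; i++) {
        f *= 10;                                   /* shift one decimal digit up   */
        out[i] = (char)('0' + (f >> FRACTION_BITS)); /* whole part is the digit    */
        f &= FRACTION_MASK;                        /* keep the remaining fraction  */
    }
    out[ndigits] = '\0';
}
```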

     

    You're right, fixed-point arithmetic usually is base 2, but base 10 is also often used.
    I try to convince people to use base-10 fixed point rather than floats on the MSP. Just multiply one of your operands by 1000 (preferably the one that has the fraction), or even by a larger value (e.g. 4096000), then do a >>12 at the end, and you have your result in almost no time, with three decimal digits and without float.
    You only need to insert the decimal point when converting to a string.
    Of course you have to be careful when writing the formula. Base 2 is more comfortable for general use, but for a single formula (as is often the case on microcontrollers)...
    Q-numbers are even better, but for someone who is used to floats and doubles...

    It never occurred to me to use base 10 for fixed point. I guess I've implicitly used it by switching units, e.g. from volts to microvolts. Base 2 has the slight advantage of easily separating the whole part from the fraction. It also avoids a division in a couple of places: first when "casting" from fixed point to integer, and second when normalizing after a multiplication of two fixed-point numbers.

    Not quite with you on the "x 1000" and ">>12". Is that 1000 or 0x1000?

    I'm not familiar with Q-numbers. I assume that's TI's IQ Math library.

     

  • The x1000 is decimal, for three decimal digits. The >>12 belongs to the additional *4096 I used to increase precision. Sorry, I compacted it too much.

    Example:

    float result = 0.0139 * ADC12MEM0 + 7.3222;

    Multiply the two constants by 4096 and by 1000, then add 4096/2 for rounding, and shift the result back by 12 bits. That gives the result with three decimal digits in plain integer arithmetic:

    int result = ((56934UL*ADC12MEM0)+29991731UL+2048)>>12;

    The intermediate multiplication result is 28 bit only (ADC12MEM0 is a 12 bit conversion result), so we could go as well for 4 decimal digits without need to do 64bit multiplication, but the result with 3 digits will conveniently fit into 16 bit.

    The difference in speed and size is huge. The formula only needs to be calculated once, uses the hardware multiplier (if available), doesn't need the float library and gives the same result.

    Norman Wong said:
    I'm not familar with Q-numbers. I assume that's TI's IQ Math library.

    It has nothing to do with TI. It is base-2 fixed-point integer arithmetic, just some 'annotation rules' around the core math: http://en.wikipedia.org/wiki/Q_(number_format)
    The MSP's hardware multiplier has, AFAIK, some support for fractional numbers and might be compatible. I must admit I never tried it. When I worked with the MPY32 hardware multiplier, I didn't care about fractional numbers and skipped this section, and I never came back to it. Maybe if I ever really need them. Currently, all fractional math comes down to a few formulas, and the decimal fixed point performs well enough.
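
    The claim that the integer formula gives the same result as the float version is easy to spot-check on a PC; a sketch (formulas_agree is a made-up test helper; it allows a difference of 1 in the last milli-unit digit, since the >>12 truncates where the float path rounds):

```c
/* Returns 1 if the integer formula from this thread tracks the float
   formula to within one milli-unit over the full 12-bit ADC range. */
int formulas_agree(void)
{
    unsigned int adc;
    for (adc = 0; adc <= 4095; adc++) {
        double f  = 0.0139 * adc + 7.3222;            /* units        */
        long   fi = (long)(f * 1000.0 + 0.5);         /* milli-units  */
        long   ii = (long)(((56934UL * adc) + 29991731UL + 2048) >> 12);
        long   d  = fi - ii;
        if (d < 0) d = -d;
        if (d > 1)
            return 0;   /* formulas disagree by more than 1 LSB */
    }
    return 1;
}
```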

  • Okay...I think I understand.

    Floating point calculation (Units)
    float result = 0.0139 * ADC12MEM0 + 7.3222;

    Floating point calculation (milli-Units)
    float result = 13.9 * ADC12MEM0 + 7322.2;

    Integer conversion of floating point calculation (milli-Units) with rounding. The end result will need to use a 32 bit variable.
    long result = (long)(13.9 * ADC12MEM0 + 7322.2 + 0.5);

    Change to base 2 fixed point of 20 bit whole and 12 bit fraction. A 12 bit fraction means every 4096 is one whole number. Smallest number that can be represented is 1/4096 or 0.000244140625. Multiply above formula. Essentially a <<12.
    long result = (long)(56934UL * ADC12MEM0 + 29991731UL + 2048);
    long result = (long)(0xDE66 * ADC12MEM0 + 0x1C9A333UL + 0x800);

    Rewrite above formula in imaginary hexadecimal with '.' to show the split between whole and fraction.
    long result = (long)(0xD.E66 * ADC12MEM0 + 0x1C9A.333UL + 0x0.800);

    Convert the fixed point result to an integer (milli-Units) by dividing by 4096, i.e. >>12. This is a truncation operation: the fraction is shifted out and the whole part shifted down. The end result will fit into a 16 bit integer?
    int result = ((56934UL * ADC12MEM0) + 29991731UL + 2048) >> 12;

  • Your description of the process is way better than mine. Yes, you got it as I meant it.

    Norman Wong said:
    The end result will fit into a 16 bit integer?

    In this particular case, yes. Because ADC12MEMx is 0x0.FFF at max, so the maximum value is 0xDE59.19A+0x0.800+0x1c9a.333 = 0xFAE8.CCD.
    (quick check: 16 bit * 12 bit = 28 bit, plus a 28 bit constant = 28 or 29 bit; >>12 leaves 16 or 17 bit. 16 bit with these values.)

    It is pure coincidence and I swear I didn't pick the values so it works :) It was an actual formula from a project I was consulted on just a few days ago. (It turned out that the formula itself was wrong, but that's another story.)

    If it doesn't fit for a different formula (different constants), then either only 2 decimal digits (*100) or a long result can be used.

    Checking the value range is of course an important and integral part of any computer math.
    Even with doubles: in a PC project, we used doubles for counting energy pulses. Suddenly the counter stopped counting - we had reached a point where the minimum representable step was >1, so any increment by 1 was simply ignored. Nobody had imagined this limitation beforehand (and it turned out that the counts had been growing imprecise for quite some time before that point).
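
    That failure mode is easy to reproduce; a minimal sketch:

```c
/* The "counter stops counting" effect: a double has a 53-bit significand,
   so from 2^53 upward it can no longer represent every integer. Adding
   1.0 rounds back to the same value and the increment is silently lost. */
int double_increment_lost(void)
{
    double d = 9007199254740992.0;   /* 2^53 */
    return (d + 1.0) == d;           /* 1: the increment vanished */
}
```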

     

    Yeah, avoiding overflow is always a problem with fixed point. Big headache with complicated formulas. Floating point copes with changes in magnitude much better.

    A 32 bit fixed point of 0xFAE8.CCD will become a 16-bit integer of 0xFAE8. Unsigned value has intruded into the sign bit. For those that might be referencing your formula, add "unsigned" to variable declaration:

    unsigned int result = ((56934UL * ADC12MEM0) + 29991731UL + 2048) >> 12;
