DSP library square root

abhishek seth9849

Hi,

I was going through the documentation for square root code used by DSP library. It is written in documentation that DSP library uses the formula y(new) = y(old) - ( y(old)^2 - x )/2 to calculate the square root. IT also mentions that this is Newton's method. But when I searched for Newton's method for calculating square root I came up with the formula y(new) = ( y(old) + x/y(old) )/2 or y(new) = y(old) (3 - x*y(old)^2 )/2.

Can anyone please help me find how Newton's method resulted in formula: y(new) = y(old) - ( y(old)^2 - x )/2 for calculating square root?

Thanks

Abhishek Seth

over 15 years ago

0 Mark M over 15 years ago

TI__Mastermind 30180 points

Abhishek,

Let me give this a shot...

Let's start with the Newton's Method Wikipedia article: http://en.wikipedia.org/wiki/Newton%27s_method#Square_root_of_a_number

Take the square root function. We want to solve for x, which is the square root of INPUT.
x^2 = INPUT

Supply this function to Newton's method:
f(x) = x^2 - INPUT

With derivative:
f'(x) = 2x

Plug-in our prediction y_old for x to get a better prediction y_new:
y_new = y_old - f(y_old) / f'(y_old)

This yields:
y_new = y_old - (y_old^2 - INPUT) / (2 * y_old)

With an apparent trick, the y_old from the denominator can be removed...
y_new = y_old - (y_old^2 - INPUT) / 2

After this "trick", the equation still converges to the correct SQRT result, but it may take more iterations. However, it avoids possible overshoot of the first computed y_new. We don't want y_new to grow larger than 1 because our Q.15 fixed-point numbers cannot hold numbers >1. They are bounded by -1 and +1. Finally, this trick avoids the cost of computing a division, which requires many CPU cycles.

In the assembly code for sqrt_16, the equation is optimally rearranged to take advantage of C55x Complex Instructions like SQSM (Square and Subtract) and automatic bit shifts for divide-by-2:
y_new = INPUT / 2 - (y_old^2) / 2 + y_old

Play around with the attached spreadsheet to observe the convergence behavior with and without the y_old division. And please let me know if you find a better explanation for this "trick". I did not write this code.

4377.SQRT_Newtons_Method_Convergence.xls

Hope this helps,
Mark

0 abhishek seth9849 over 15 years ago in reply to Mark M

Intellectual 560 points

Hi Mark,

Thanks for your explanation. But I guess a better accuracy can be achieved in lesser number of iterations if the eqns y(new) = ( y(old) + x/y(old) )/2 or y(new) = y(old) (3 - x*y(old)^2 )/2 are used instead of eqn. y(new) = y(old) - ( y(old)^2 - x )/2. Is there any reason to use the eqn. y(new) = y(old) - ( y(old)^2 - x )/2 ?

Thanks.

0 TommyG over 15 years ago in reply to abhishek seth9849

TI__Genius 13045 points

The reason for the modified version of the equation being implemented is as Mark stated:

"However, it avoids possible overshoot of the first computed y_new. We don't want y_new to grow larger than 1 because our Q.15 fixed-point numbers cannot hold numbers >1. They are bounded by -1 and +1. Finally, this trick avoids the cost of computing a division, which requires many CPU cycles."

If you can guarantee that intermediate computations won't overflow the numerical limits of the device, it would be ok to use the equation you specify. However, the fact that you need to perform division could actually make your version of the square root calculation slower. Since division takes many cycles to implement on this architecture you could spend more CPU cycles to perform fewer iterations of the equation.

Regards.

0 abhishek seth9849 over 15 years ago in reply to TommyG

Intellectual 560 points

Hi Mark,

In the previous post you stated that:

"With an apparent trick, the y_old from the denominator can be removed...
y_new = y_old - (y_old^2 - INPUT) / 2

After this "trick", the equation still converges to the correct SQRT result, but it may take more iterations."

Why does the equation for y_new without y_old in the denominator works? Is there any proof that the equation without y_old in the denominator converges to correct square root value?

Thanks

Abhishek

Processors

Processors forum

DSP library square root