F AST INVERSE SQUARE ROOT
CHRIS LOMONT
Abstract.Computing reciprocal square roots is necessary in many
applications,such as vector normalization in video games.Often,
some loss of precision is acceptable for a large increa in speed.
This note examines and improves a fast method found in source-
code for veral online libraries,and provides the ideas to derive
similar methods for other functions.1
远程电脑控制
1.Introduction
Reading the math programming forum on [1],I
ran across an interesting method to compute an inver square root.
The C code was esntially(my comments):
float InvSqrt(float x)
{
float xhalf=0.5f*x;
int i=*(int*)&x;//get bits for floating value
i=0x5f3759df-(i>>1);//gives initial guess y0
x=*(float*)&i;//convert bits back to float
x=x*(1.5f-xhalf*x*x);//Newton step,repeating increas accuracy return x;
}
The interesting part was the constant0x5f3759df:where did it come from and why does the code work?Some quick testing in Visual
C++[2]showed the code above to be roughly4times faster than
the naive(float)(1.0/sqrt(x)),and the maximum relative error
吃螃蟹能喝茶吗over allfloating point numbers was0.00175228,so the method ems
very uful.Three immediate questions were:1)how does it work,2)
can it be improved,and3)what bit master designed such an incredible hack?
A quick arch on Google for0x5f3759df returned veral hits,
素描头骨
the most relevant being a reference to a thread about this code aphics.algorithms from Jan9,2002[3],and an(incorrect, 12000Mathematics Subject Classification.Primary65G99,65Y99;Secondary
68W40.
Key words and phras.Inver square root,speed tradeoffs,IEEE754.
1
2CHRIS LOMONT
but clo)explanation by D.Eberly[4].However his explanation is illu-minating.Further digging found no correct explanation of this code.It appears in the sourcecode for Quake3,written by legendary game pro-grammer John Carmack,but someone on the net(I cannot nowfind the reference)credited it to Gary Tarolli who was at Nvidia.Can anyone confirm authorship?It also appears in the Crystal Space sourcecode [5],the Titan Engine[6],and the Fast Code Library,although each ems to derive from Quake3.
The motivation to try such an algorithm is more clearly explained in Eberly[4],where he assumes the shift creates a linear interpolation to the inver square root.Note there are veral ways to speed up this code,but this note will not go into further optimizations.There are also faster methods,perhaps table lookup tricks,but most of them have less accuracy than this method.
This note assumes PC architecture(32bitfloats and ints)or similar. In particular thefloating point reprentation is IEEE754-1985[7]. This C code has been reported to be endian-neutral(suppodly tested it on a Macintosh).I have not verified this.Since the method works on32bit numbers it ems(surprisingly)endian-neutral.It is easy to extend the code to other situations and bit sizes(such as type double) using the ideas in this note.Anyway,on to the problems1),2),and 3).
2.Background
Floating point numbers are stored in the PC as32bit numbers;
s E M
bit3130←bits→2322←bits→0
where s is a1bit sign(1denotes negative),E is an8bit exponent,and M is a23bit mantissa.The exponent is biad by127to accommodate positive and negative exponents,and the mantissa does not store the leading1,so think of M as a binary number with the decimal point to the left,thus M is a value in I=[0,1).The reprented value is
x=(−1)s(1+M)2E−127
The bits can be viewed as thefloating point reprentation of a real number,or thinking only of bits,as an integer.Thus M will be considered a real number in I or as an integer,depending on context. M as a real number is M as an integer divided by223.
FAST INVERSE SQUARE ROOT3
3.The Algorithm
The main idea is Newton approximation,and the magic constant is ud to compute a good initial guess.Given afloating point value
x>0,we want to compute1√
x .Define f(y)=1
y2
−x.Then the value
we ek is the positive root of f(x).Newton’s rootfinding method, given a suitable approximation y n to the root,gives a better one y n+1 using
无法原谅简谱y n+1=y n−f(y n)
f (y n)
,n≥0
For the f(y)given,this simplifies to y n+1=1
2y n(3−xy2
n
)which
corresponds to the line of code x=x*(1.5f-xhalf*x*x),where x is the initial guess,which hereafter will be called y0.
The line of code i=0x5f3759df-(i>>1)computes this initial
guess y0,roughly by multiplying the exponent for x by−1
2,and then
picking bits to minimize error.i>>1is the shift right operator from C,which shifts the bits of an integer right one place,dropping the least significant bit,effectively dividing by2.Eberly’s explanation was that this produced linear approximation,but is incorrect;we’ll e the guess is piecewi linear,and the function being approximated is not the same in all cas.However I would guess his explanation was the inspiration for creating this algorithm.
4.The Magic Number(s)
Thus we are left withfinding an initial guess.Suppo we are given afloating point value x>0,corresponding to the32bits
0E M
as above.Let the exponent e=E−127.Then x=(1+M)2e,and the
desired value y=1√
x =1
√
1+M
魔方六面
2−e/2,treating e and M as real numbers,风景短句
NOT as integers.For the general ca we take a magic constant R, and analyze y0=R−(i>>1),where the subtraction is as integers,i is the bit reprentation of x,but we view y0as a real number.R is the i
nteger reprenting afloating point value with bits
0R1R2
<,R1in the exponentfield and R2in the mantissafield.When we shift i right one bit,we may shift an exponent bit into the mantissa field,so we will analyze two cas.
For the rest of this note,for a real numberαlet α denote the integer less than or equal toα.
4CHRIS LOMONT
4.1.Exponent E Even.In this ca,when we u i>>1,no bits from the exponent E are shifted into the mantissa M,and E/2 =E/2. The true exponent e=E−127is odd,say e=2d+1,d an integer. The bits reprenting the initial guess give the new exponent
R1− E/2 =R1−E/2
=R1−e+127
2
=R1−2d+1+127
2
小说总裁=R1−64−d
We require this to be positive.If it were negative the resulting sign bit would be1,and the method fails to return a positive number.If this result is0,then the mantissa part could not borrow from the exponent, which we will e below is necessary.Since this must be true for any even E∈[0..254],we require R1≥128.
Since the exponent E is even,no bits from it shift into the mantissa under i>>1,so the new mantissa is R2− M/2 (as integers),assuming R2≥ M/2 .The initial guess is then
y0=(1+R2−M/2)2(R1−64−d)−127
=(1+R2−M/2)2R1−191−d
=(2+2R2−M)2R1−192−d
where M/2replaced M/2 ,since the resulting error is at most2−24 in the mantissa(as a real number),which is insignificant compared to other errors in the method.
If R2<M/2,then upon subtracting R−(i>>1)as integers,the bits in the mantissafield will have to borrow one from the exponentfield (this is why we needed the new exponent positive above),thus dividing y0by two.The bits in the mantissafield will then be(1+R2)− M/2 , which are still in I.Thus if R2<M/2
y0=(1+(1+R2−M/2))2(R1−64−d)−127−1
=(2+R2−M/2)2R1−192−d
where we write the exponent the same as in the non-carrying ca.
If we define
βR2 M =
2R2−M:R2≥M/2
R2−M/2:R2<M/2
FAST INVERSE SQUARE ROOT 5
then we can combine the two y 0equations into one:
y 0=(2+βR 2M )2
R 1−192−d Note that βR 2M is continuous,and differentiable except along R =
M/2.Substituting e =2d +1,the value y =1√x we are trying to approximate is 1√
1+M 2−e/2=1√1+M 2−d −1/2=1√2√1+M 2−d
The relative error y −y 0y ,which is how we will measure the quality of the method,is 1√2√1+M 2−d −(2+βR 2M )2R 1−192−d 1√2√1+M 2−d which simplifies to |ε0(M,R )|,ε0(M,R )=1−√2√1+M (2+βR 2M )2
R 1−192Note that this no longer depends on d ,and that M can be any value in I =[0,1).This formula is only valid if E is even,thus has a hidden dependence on E .
Suppo we want R 2so that the relative error max M ∈I |ε0|<1/8.Then
隐性饥饿
18
>max M ∈I |ε0|≥|ε0(0,R 2)|= 1−√2(2+2R 2)2R 1−192 = 1−(1+R 2)2
R 1−190.5 Expanding,
−1/8<1−(1+R 2)2R 1−190.5<1/898≥98(1+R 2)>2R 1−190.5>78(1+R 2)>716
where the last step ud the fact that R 2∈I ⇒(1+R 2)∈[1,2).Taking log 2and adding 190.5gives 190.7>R 1>189.2,so R 1=190=0xbe is the unique integer that has any chance of keeping the relative error below 0.125for even E .In bit positions 24through 30,this gives the leading (0xbe>>1)=0x5f part of the magic constant R (as well as requiring bit 23to be 0).