This Math Trick Makes Game Graphics So Much Faster


Division and square roots are expensive operations for a CPU to compute directly. The fast inverse square root, a trick that circulated in the graphics community in the 1990s and was made famous by the Quake III Arena source code, approximates the inverse square root $\frac{1}{\sqrt{x}}$ roughly four times faster than the direct computation on the hardware of that era. You may wonder what the significance of this is. Well, in a game or physics engine, the CPU churns through an enormous number of mechanics calculations every second. From normalizing vectors for the motion of objects to computing angles of incidence and reflection for lighting and shading, the $\frac{1}{\sqrt{x}}$ calculation is ubiquitous. Therefore, any slight improvement in the speed of this one operation can result in huge benefits!

Newton's method

To understand how this works, we must first understand Newton's method. Newton's method approximates a root of a function, i.e. a solution of $f(x)=0$. We start with an initial guess. Call that $x_0$. Most likely, this quick guess is not going to be the actual root. We then draw the tangent line to the graph at $x_0$. You may notice that the x-intercept of that tangent is usually closer to the root than $x_0$ is. Now, simply use that x-intercept as the new guess and repeat the process. How do you do this mathematically? Well, the slope of the tangent line at any point of a graph is the derivative there. Given the gradient $m$ of a line and a point $(x_0, y_0)$, the line has equation $y=m(x-x_0)+y_0$. In this situation, the tangent has gradient $f'(x_0)$ and passes through $(x_0, f(x_0))$, so it has equation $y=f'(x_0)(x-x_0)+f(x_0)$. Finally, setting $y = 0$, the new x-intercept becomes $x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$. Iterated, Newton's method takes the form $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.

The function

The overarching idea is to use Newton's method to approximate the root of the function above, starting from an unusually good initial guess. The first few lines reinterpret the bits of the 32-bit float as an integer and back, in order to manufacture that guess. The power of this function is that great initial guess.
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                        // reinterpret the float's bits as an integer
    i  = 0x5f3759df - ( i >> 1 );                // magic constant minus half the bits
    y  = * ( float * ) &i;                       // reinterpret the bits back as a float
    y  = y * ( threehalfs - ( x2 * y * y ) );    // 1st Newton iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );    // 2nd iteration, this can be removed

    return y;
}

Bits and floating-point numbers

Computers store floating-point numbers in mantissa-exponent form, which is essentially scientific notation in binary. Instead of explicitly doing division (expensive for the CPU), the code uses another clever hack: it shifts bits. Right-shifting an integer by one position is the same as dividing it by two (you can try this for any power of 2; for other numbers it truncates the remainder). And if you want to negate a number, instead of multiplying by $-1$ (multiplications are expensive), just subtract the number from a constant (subtractions are cheap).

So, the code reinterprets the floating-point number's bits as an integer. It then shifts those bits right by one, which roughly halves the exponent (once we eventually turn the bits back into a float). And lastly, to negate the exponent, it subtracts the shifted value from the magic number 0x5f3759df.

What's 0x5f3759df, the initial guess?

Fun fact: Gary Tarolli, one of the people involved in the algorithm's early history, said of the magic line:

Which actually is doing a floating point computation in integer - it took a long time to figure out how and why this works, and I can't remember the details anymore

I won't pretend to fully understand what this constant is doing. I just know that it minimizes some sort of error introduced by shifting the bits. Here is a quote from an article that explains the process:

When shifting the number's bits to the right by one, there are two possible cases: either the number's exponent is even or it is odd. The magic hex number, 0x5f3759df, was chosen such that it minimizes the error between the two cases. Thus, whether the bit-shifted number has an odd or even exponent, subtracting the number from the magic number yields an accurate approximation of the inverse square root of the original number (once the result, i, has been reinterpreted as a float).

Also, several nearby constants would probably have a similar effect; this one just happens to be the one that was chosen.
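For the curious, here is the standard back-of-the-envelope derivation (following Chris Lomont's 2003 analysis; the numbers below are approximate). Reading a positive float's bits as an integer gives $I_x \approx 2^{23}(\log_2 x + 127 - \mu)$, where $\mu \approx 0.045$ is a small correction for approximating $\log_2(1+m)$ by $m$. We want $y = \frac{1}{\sqrt{x}}$, i.e. $\log_2 y = -\frac{1}{2}\log_2 x$. Substituting into the bit formula gives

$I_y \approx \frac{3}{2} \cdot 2^{23}(127 - \mu) - \frac{1}{2} I_x$

and $\frac{3}{2} \cdot 2^{23}(127 - 0.045) \approx 1597463007 = $ 0x5f3759df. That is precisely the line i = 0x5f3759df - ( i >> 1 ); choosing $\mu$ to minimize the worst-case error across even and odd exponents is what singles out this particular constant.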

Lastly, it is worth seeing for yourself how much faster this method is compared to directly computing the inverse of a square root.
