Resource The best way to compute sin/cos

Discussion in 'Spigot Plugin Development' started by Schottky, Jan 13, 2020.

?

Was this helpful?

  1. Yes

    4 vote(s)
    36.4%
  2. No

    1 vote(s)
    9.1%
  3. I like Bacon

    6 vote(s)
    54.5%
  1. When it comes to cosmetic effects, a lot of people want to render geometric forms like a particle-circle or a particle-sphere. This might look something like this:

    Code (Java):
    for (double rad = 0; rad <= 2 * PI; rad += 2*PI / 360) {
    Location location = player.getLocation().add(
        Math.sin(rad) * radius,
        1,
        Math.cos(rad) * radius);

    Particles.display(ParticleEffect.FLAME, location, 1);
    }
    What this piece of code does is to compute the location of each particle that has to be rendered. In this thread, I do not want to go into detail on the mathematical basics behind this calculation, rather, I want to discuss the computational efficiencies of sin/cos and their alternatives.
    When it comes to the computation of such non-linear functions, there are 3 ways you could approach that problem

    1) Use a lookup-table. This approach is really simple. You don't have enough time to compute complex functions? Simply store pre-computed values in an array and get them whenever they are needed.
    2) Use standard libraries. This approach is (probably) the most common one and in most cases, there is no reason not to do that.
    3) Use an approximation of your function. This, combined with point 1, is what programming languages do behind the scenes.
    In mathematics, there exists Taylor's theorem that says (in a very simplified version) that you can approximate any function at least locally by a polynomial of degree k.
    Take for example the function f(x) := sin(x). You can compute this function in the interval [0...PI], by computing a much simpler function g(x) := a + b * x + c * x^2 - with suitable values a, b and c, and you will get a pretty good result as long as you stay inside a suitable area. This is the concrete example:
    [​IMG]

    You can see the approximation of f(x) by g(x) using a = 1; b = 0; c = -1 / 2 at the point pi/2. In this case, k = 2, but you can increase k to get a better result. Interestingly, when you increase k and let it approach infinity, the function g(x) will fit sin(x) perfectly.

    However, your computer does all of this already using some fancy hardware acceleration so there is no need to do it yourself. Why post this thread then?
    Well, because in most cases you do not need the sin of every value that can be expressed by a double-precision floating-point. In most cases (and especially in the case mentioned above), you only want a discrete number of N points, for example, N=360. The question that arises is whether there exists a more efficient way to compute these 360 discrete steps, without using Math.sin().

    First, I want to summarize the answer to this Stack-Overflow question:
    https://stackoverflow.com/a/35400679/12190301

    The answer to the question of what the best way computing the sin and the cosine of something is can be expressed by the following statements:
    If the quarried angle is inside the bounds of [0...45] degrees (or from 0 to Pi divided by 8), Math.sin() and Math.cos() yield the best results.
    If the quarried angle is inside the bounds of [-360...360] degrees, a lookup-table is around 20 times (!) faster. So already, you should use a lookup-table instead of Math.sin() or Math.cos() inside something that might have a fairly high tick-rate. But can we go faster? Yes, we can.

    In the answer to the StackOverflow-question above, the timesteps are still not as discrete as what we require. The question was an answer to the most efficient way of computing the sin(x) while being able to insert any number and get a valid result.
    However, what happens if we only want to compute 360 or so discrete time-steps?
    Let's test that!

    This is how I test the options. I will note, that this benchmark is not ideal, but it should suitably predict the quality:

    Code (Java):

    static void doMeasure(@NotNull Runnable testingFunc) {
        // warmup
        for (int i = 0; i < 1000; i++) {
            testingFunc.run();
        }
        // actual meassurement
        int N = 1_000_000;
        long start = System.nanoTime();
        for (int m = 0; m < N; m++) {
            testingFunc.run();
        }
        long end = System.nanoTime();

        System.out.println((double) (end - start) / 1_000_000_000D + " seconds elapsed");
    }
     

    An insertion of Math.sin(), evaluated at 0 to 360 degrees results in a time elapsed of approximately 3.3 seconds. Note that I did not convert the degrees to radians "on the go", but pre-computed them.

    I want to do 2 tests, one that takes a Lookup-table that only contains 90 values, and another lookup-table that contains 360 values. My program only handles values in the interval of integers from 0 to 360, because more is not required in this case. If a quarried value is greater than 90, the result will be adjusted properly. This is the code:

    Code (Java):
        static void sinlut90() {
            double sin;
            for (int inDeg = 0; inDeg < 360; inDeg++) {
                if (inDeg < 180) {
                    if (inDeg < 90) {
                        sin = SIN_LUT_90[inDeg];
                    } else {
                        sin = SIN_LUT_90[180 - inDeg];
                    }
                } else {
                    if (inDeg < 270) {
                        sin = -SIN_LUT_90[inDeg - 180];
                    } else {
                        sin = -SIN_LUT_90[360- inDeg];
                    }
                }
            }
        }

    The memory usage here is approximately 8 bytes * 91 entries = 728 bytes (under 1 kB).
    The benchmark gives us a result of 0.5 seconds. Not too bad, we are already six times faster than Math.sin()
    Now what happens, if we simply use a pre-computed table with 360 entries?

    Code (Java):

    static void sinLut360() {
        double sin;
        for (int deg = 0; deg < 360; deg++) {
            sin = SIN_LUT_360[deg];
        }
    }
     

    memory usage (again, approximated): 8 bytes * 361 entries = 2888 bytes (~2.9 kB)
    The results of this test are pretty neat. The test reports that one million times 360 sin-values are "computed" in a time of 0.005 seconds, in other words, we are over 600 times faster than Math.sin()!

    So, to sum all of this up and to answer the first question; should one use Math.sin() or a lookup-table?
    Well, that depends on what you need.
    If you simply want to compute a small number of sin-values every once in a while, use Math.sin(). It is way more flexible and the best general-purpose solution. Many CPU's have hardware-accelerators that java uses. An example in the "Minecraft-world" could be the computation of something using the player's pitch and yaw.

    If you can be more specific, for example, you can guarantee that your values will be in the range of [-360...360], and you want to get maximum efficiency, use a lookup-table.

    However, if you can be even more specific and you can guarantee that your values are inside the interval of [0...45], use Math.sin() instead.

    If you need 360 predefined values to compute something like a "fire-circle" around a player, definitely use a lookup-table. Even if you might complain about the testing-envireĆ³nment, it still shows an increase in performance that is independent of the environment.

    Thank you for reading, please give me all of your feedback and have a nice day!
     
    • Like Like x 1
    • Friendly Friendly x 1
    • Useful Useful x 1
  2. Lookup tables are definitely the way to go if you're calling trig functions frequently within your program, with that, the computations can be significantly faster depending on the implementation you're using. I use, and recommend, Riven's sin/cos lookup table.

    Code (Java):

        public static final class Riven {

            private static final int SIN_BITS, SIN_MASK, SIN_COUNT;
            private static final float radFull, radToIndex;
            private static final float degFull, degToIndex;
            private static final float[] sin, cos;

            static {
                SIN_BITS = 12;
                SIN_MASK = ~(-1 << SIN_BITS);
                SIN_COUNT = SIN_MASK + 1;

                radFull = (float) (Math.PI * 2.0);
                degFull = (float) (360.0);
                radToIndex = SIN_COUNT / radFull;
                degToIndex = SIN_COUNT / degFull;

                sin = new float[SIN_COUNT];
                cos = new float[SIN_COUNT];

                for (int i = 0; i < SIN_COUNT; i++) {
                    sin[i] = (float) Math.sin((i + 0.5f) / SIN_COUNT * radFull);
                    cos[i] = (float) Math.cos((i + 0.5f) / SIN_COUNT * radFull);
                }

                // Four cardinal directions (credits: Nate)
                for (int i = 0; i < 360; i += 90) {
                    sin[(int) (i * degToIndex) & SIN_MASK] = (float) Math.sin(i * Math.PI / 180.0);
                    cos[(int) (i * degToIndex) & SIN_MASK] = (float) Math.cos(i * Math.PI / 180.0);
                }
            }

            public static final float sin(float rad) {
                return sin[(int) (rad * radToIndex) & SIN_MASK];
            }

            public static final float cos(float rad) {
                return cos[(int) (rad * radToIndex) & SIN_MASK];
            }
        }

    Here is where it originates from: http://www.java-gaming.org/topics/extremely-fast-sine-cosine/36469/view.html


    Some benchmarks:
    Code (HTML5):
    math_default_sin   : Average Error 0.00000 / Largest Error 0.00000 - Performance 207.574 ns/op
    math_riven_sin     : Average Error 0.00060 / Largest Error 0.00224 - Performance   7.054 ns/op

    math_default_cos   : Average Error 0.00000 / Largest Error 0.00000 - Performance 206.086 ns/o
    math_riven_cos     : Average Error 0.00060 / Largest Error 0.00224 - Performance   7.306 ns/op
     
    • Useful Useful x 2