Resource Machine Learning Killaura Detection in Minecraft

Discussion in 'Spigot Plugin Development' started by NascentNova, Feb 6, 2018.

  1. This is a detailed explanation of Snow Leopard, an open-source project which recognizes player aim patterns using machine learning. In this article, you will find the most detailed research results on detecting killaura with machine learning in this forum.

    Maybe after this article is released, everything described below will be cracked and become outdated, but I can confirm it all works so far.

    I may never know whether the decision to release this was right, but I'm sure it has a positive impact: it can draw much more attention from the average developer, and it will most likely raise the quality of the ML anti-cheats we have.

    Machine Learning Killaura Detection in Minecraft

    Nascent Nova, 1 February 2018

    Abstract
    Cheating is a real issue on modern Minecraft servers. However, existing anti-cheat solutions mainly rely on hard-coded checks, which are difficult to maintain and update. This thesis illustrates in detail an alternative way to detect cheaters based on the learning vector quantization algorithm, which is more flexible and yields better results.

    Contents

    1. Introduction
    2. Characterization of player’s behavior
    ....2.1 Hitbox
    ....2.2 Movement: Head rotation
    3. Feature Engineering
    ....3.1 Collecting data
    ....3.2 Design of dataset
    4. Artificial Neural Network
    5. Conclusion
    6. Bibliography & Appendix

    Note: This post is extracted from an unpublished paper I wrote previously. It may contain redundant content and overly formal expressions.
    Note: Some items are expanded or deleted to match the average programming level of the forum.
    P.S. Don't forget to leave a rating ;)


    The github link:
    https://github.com/Nova41/SnowLeopard
    Train: /eac train <category-name>
    Test: /eac test <player-name> <seconds> (I prefer 15 seconds)
    Tested on spigot 1.8.8
    There is only one permission in the plugin: encanta.ac
    It is responsible for all sub-commands under /eac. Give yourself op to bypass the permission.


    ** JAR DOWNLOAD LINK **
    https://www.spigotmc.org/resources/snowleopard.55185/
    Download from here only if you do not have a compiler to build the source! This may not be up to date.
     
    #1 NascentNova, Feb 6, 2018
    Last edited: Jun 7, 2018
    • Winner Winner x 8
    • Useful Useful x 8
    • Like Like x 5
  2. 1. Introduction

    The discussion of detecting killaura with machine learning has not stopped since konsolas demonstrated his research results. However, at least in this forum, there are very few published feasible schemes, only sporadic attempts, and some of them are explained unclearly. It is sad that some people have even developed premium plugins based on a public idea and sold them to outsiders. What is more ridiculous is that some of those plugins do not even work...

    In this situation, I started to make my own killaura-detecting plugin. After a long period of debugging, it finally produced results that reached a level I could accept. I have posted several threads about it; they are listed below.

    [1] Machine Learning Anti-Killaura Ideas & Discussion. - Discussion in 'Spigot Plugin Development' started by NascentNova, Apr 9, 2017. https://www.spigotmc.org/threads/machine-learning-anti-killaura-ideas-discussion.231396/

    [2] A feature-extraction algorithm for ML Anti-Killaura. - Discussion in 'Spigot Plugin Development' started by NascentNova, Jan 1, 2018. https://www.spigotmc.org/threads/a-future-selection-algorithm-for-ml-anti-killaura.294033

    [3] Killaura Detection Based on Neural Network - Discussion in 'Spigot Plugin Development' started by NascentNova, Jan 20, 2018. https://www.spigotmc.org/threads/killaura-detection-based-on-neuron-network.298042/
     
    #2 NascentNova, Feb 6, 2018
    Last edited: May 18, 2018
    • Winner Winner x 2
    • Useful Useful x 2
  3. 2. Characterization of player’s behavior

    2.1 Hitbox

    We all know the hitbox in Minecraft. You can find the definition of hitbox on Minecraft Wiki: Hitboxes are regions that describe how much space an entity takes up, which can be shown by pressing the F3+B keys. When they are shown, a white outline will be seen showing the location of the entity and the space it takes up. There is also a flat red rectangle near its "eyes". This is the "line of sight", which is where its in-game eyes are, which could be independent of where they appear to be. Notice how the line of sight wraps all the way around the entity. This explains how it is impossible to sneak up behind a mob without it seeing the player. There is also a straight blue line that extends away from the line of sight. This line points in the direction the mob is facing.

    hitbox_illustrate.png

    The relationship between the hitbox and PVP in Minecraft is that a click landing within the hitbox is registered as a valid hit.

    The picture indicates that the range of the hitbox is large. The detection is based on the fact that, under normal PVP conditions, it is impossible for a human to keep clicking the exact same position within the hitbox.

    Therefore, according to the fact above, the use of killaura can be detected. This is the classical Angle Check. To deal with this detection, developers of hacked clients have come up with a variety of ways to bypass it; for example, they can make the cursor click random positions within the hitbox when using killaura, or move it smoothly when the targeted player moves.

    Analysis of a large amount of data indicates that even in such a situation, where the clicked positions are completely random, killaura is still detectable. Machine learning is introduced to implement this detection.

    When attacking entities, real humans have a tendency in their clicking which differs from the totally random clicks of killaura. Based on pattern recognition, machine learning is able to learn the behavior of the player and abstract it. This abstraction can be expressed by vectors, which makes it possible to construct high-dimensional Euclidean spaces and classify within them.

    2.2 Movement: Head rotation

    The Minecraft server does not provide a native method to access the position of a click within the hitbox. However, the position can be characterized by calculating the player's head movement when attacking.

    For the head movement, the server provides the following attributes:

    serverbound_packetfields.png
    (From wiki.vg: Entity Look and Relative Move)


    The looking direction can be characterized by the player's yaw and pitch. Combining yaw and pitch gives a two-dimensional coordinate system: pitch determines the player's looking direction vertically (how much the player is looking up or down), and yaw determines the player's looking direction horizontally (how much they are looking left and right).


    The yaw and pitch can be converted into a direction vector. However, in order to reduce the difficulty of analysis, I transformed the vector into a scalar: the angle between the player's looking direction and the direction from the player's eyes towards the entity.
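
    For illustration, here is a minimal sketch of converting yaw and pitch (assumed to be in degrees, as in the packet fields above) into a unit look vector. It follows the same convention as Bukkit's Location#getDirection(), which the next snippet uses, so in practice you never need to do this by hand.
    Code (Text):
    // Sketch only: convert yaw/pitch (degrees) into a unit direction vector.
    double yawRad = Math.toRadians(yaw);
    double pitchRad = Math.toRadians(pitch);
    double x = -Math.cos(pitchRad) * Math.sin(yawRad);
    double y = -Math.sin(pitchRad);
    double z =  Math.cos(pitchRad) * Math.cos(yawRad);
    Vector lookDir = new Vector(x, y, z); // org.bukkit.util.Vector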

    The implementation is quite simple using the Bukkit API. The code below assumes it runs inside an EntityDamageByEntityEvent handler whose damager is a player:
    Code (Text):
    Player player = (Player) event.getDamager();  // the player which attacks the entity
    Entity entity = event.getEntity();            // the entity damaged by the player
    Vector playerLookDir = player.getEyeLocation().getDirection();
    Vector playerEyeLoc = player.getEyeLocation().toVector();
    Vector entityLoc = entity.getLocation().toVector();
    // vector from the player's eyes to the entity
    Vector playerEntityVec = entityLoc.subtract(playerEyeLoc);
    // angle (in radians) between the look direction and the player-to-entity vector
    float angle = playerLookDir.angle(playerEntityVec);
    Thus, the definition of the player's head movement is complete.
     
    #3 NascentNova, Feb 6, 2018
    Last edited: Feb 6, 2018
    • Winner Winner x 2
    • Useful Useful x 2
  4. 3. Feature Engineering
    3.1 Collecting data


    The previous section gives a definition of a player's head movement.

    If we keep recording the angles when a player is continuously attacking an entity, we will finally get sequences of angles.
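
    For context, a minimal sketch of such a recorder is below; the listener class and per-player buffer are illustrative, not necessarily how SnowLeopard structures this. It simply appends the angle from section 2.2 to a per-player list on every attack.
    Code (Text):
    import java.util.*;
    import org.bukkit.entity.Entity;
    import org.bukkit.entity.Player;
    import org.bukkit.event.EventHandler;
    import org.bukkit.event.Listener;
    import org.bukkit.event.entity.EntityDamageByEntityEvent;
    import org.bukkit.util.Vector;

    // Illustrative recorder: one angle per attack, appended to a per-player sequence.
    public class AngleRecorder implements Listener {

        private final Map<UUID, List<Float>> sequences = new HashMap<>();

        @EventHandler
        public void onAttack(EntityDamageByEntityEvent event) {
            if (!(event.getDamager() instanceof Player)) return;
            Player player = (Player) event.getDamager();
            Entity entity = event.getEntity();
            Vector look = player.getEyeLocation().getDirection();
            Vector toEntity = entity.getLocation().toVector()
                    .subtract(player.getEyeLocation().toVector());
            float angle = look.angle(toEntity); // same calculation as in section 2.2
            sequences.computeIfAbsent(player.getUniqueId(), id -> new ArrayList<>()).add(angle);
        }
    }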

    Currently, we don't know what kind of tendency the sequences have, so I imported the sequences into Excel and made some line charts:

    anglesequence_vanilla1.png

    It's a time series. We need further experiments to explore whether there is a pattern.

    anglesequence_vanilla2.png anglesequence_vanilla3.png anglesequence_killaura1.png anglesequence_killaura2.png anglesequence_killaura3.png

    It is found that time series of the same category show a certain similarity in trend, while the fluctuation and the average value of the data differ between categories; this is supported by the first three images, which represent the same category. We can draw the conclusion that the series partly match a pattern.
     
    #4 NascentNova, Feb 6, 2018
    Last edited: May 6, 2018
    • Winner Winner x 2
    • Useful Useful x 2
  5. 3. Feature Engineering
    3.2 Design of dataset


    For time series analysis, there are several common similarity measures, such as Dynamic Time Warping (DTW), Euclidean Distance (ED) and Symbolic Fourier Approximation (SFA).

    However, these approaches are not applied in this scenario, since the classification does not mainly rely on the shape of the curve.
    Therefore, before applying a complicated time series prediction model, it is worth adopting some classical statistical methods to measure the other features of the sequence.

    In order to capture the fluctuation and the average value of the data, two fundamental statistical features were selected: the mean and the mean square deviation (variance) of the angle sequence. Moreover, the mean and the mean square deviation of the delta between two adjacent items, Math.abs(array[i] - array[i-1]), representing the derivative of the movement (speed), should be considered as well.

    Thus, our dataset consists of four dimensions.
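
    As a concrete illustration, a minimal sketch of extracting these four features from one recorded angle sequence might look like the following; the method names are my own, not necessarily those used in SnowLeopard. The output order [angle variance, delta variance, angle mean, delta mean] matches the example vector shown in the next section.
    Code (Text):
    // Sketch only: compute the four statistical features of an angle sequence.
    public static double[] extractFeatures(double[] angles) {
        double[] deltas = new double[angles.length - 1];
        for (int i = 1; i < angles.length; i++) {
            deltas[i - 1] = Math.abs(angles[i] - angles[i - 1]); // "speed" of the head movement
        }
        double angleMean = mean(angles);
        double deltaMean = mean(deltas);
        double angleVar = meanSquareDeviation(angles, angleMean);
        double deltaVar = meanSquareDeviation(deltas, deltaMean);
        return new double[] {angleVar, deltaVar, angleMean, deltaMean};
    }

    private static double mean(double[] values) {
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    private static double meanSquareDeviation(double[] values, double mean) {
        double sum = 0;
        for (double v : values) sum += (v - mean) * (v - mean);
        return sum / values.length;
    }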
     
    #5 NascentNova, Feb 6, 2018
    Last edited: May 6, 2018
    • Winner Winner x 1
    • Useful Useful x 1
  6. 4. Artificial Neural Network
    This section may contain unclear descriptions. Please comment below if you think elaborating is required.

    From the previous conclusions, the datasets which express the pattern are a series of scattered points in a high-dimensional space; in other words, they are vectors of length 4. Which classification algorithm should be used depends on the data distribution. Therefore, in this section, I will use Origin, a mathematical graphing software, to plot the vector series in order to help understand the data distribution. The data below is generated by the code provided previously:

    Code (Text):
    Attacking (vanilla):
    Angles = [0.39267087, 0.5936949, 0.2308002, 0.34687847, 0.43413088, 0.44350803, 0.5333573, 0.3297529, 0.7547668, 0.68121463, 0.6634815, 0.6259528, 0.65484965, 0.6027457, 0.3744836, 0.31012928, 0.35455334, 0.40250704, 0.46551096, 0.50967205, 0.5407446, 0.52624315, 0.5052259]
     
    Converted to the dataset of the previous design:
    Code (Text):
    [0.017942158191815917, 0.01234174087304676, 0.4902988937885865, 0.09822073791708265]
    Following this model, I finally collected the data and applied feature scaling, a linear transformation which maps each feature to the range [0, 1]. The formula is below:

    x' = (x - min(x)) / (max(x) - min(x))
    You can find further information here: https://en.wikipedia.org/wiki/Feature_scaling, so I will not elaborate on it here.
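
    For completeness, here is a minimal sketch of applying that min-max scaling column-wise to the whole dataset; this is just the textbook transformation, not necessarily the exact routine used in the plugin.
    Code (Text):
    // Sketch only: rescale every feature (column) of the dataset into [0, 1].
    public static void minMaxScale(double[][] dataset) {
        int dims = dataset[0].length;
        for (int d = 0; d < dims; d++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] row : dataset) {       // find the minimum and maximum of this feature
                min = Math.min(min, row[d]);
                max = Math.max(max, row[d]);
            }
            double range = max - min;
            for (double[] row : dataset) {       // x' = (x - min) / (max - min)
                row[d] = (range == 0) ? 0 : (row[d] - min) / range;
            }
        }
    }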

    The result:
    Code (Text):
    Angle-Dev    Delta-Dev    Angle-Mn.    Delta-Mn.    Cat.
    0.26132    0.202    0.51776    0.06388    1
    0.54116    0.39396    0.56992    0.09275    1
    0        0        0.48289    0        1
    0.5274    0.34419    0.73548    0.18434    1
    0.4186    0.28227    0.68327    0.15277    1
    0.29168    0.18672    0.68625    0.13386    1
    0.3764    0.19483    0.89253    0.12497    1
    0.39363    0.18878    0.63966    0.12141    1
    0.90618    0.59891    1        0.25338    2
    0.61731    0.3373    0.82288    0.16811    2
    0.46318    0.3419    0.98292    0.20877    2
    0.86285    0.43888    0.97375    0.15722    2
    0.92229    0.41521    0.81605    0.20533    2
    0.5648    0.58676    0.88434    0.30708    2
    0.62043    0.29958    0.6869    0.16525    2
    1        0.59892    0.89155    0.25784    2
    0.6894    0.48014    0.62448    0.18107    2
    (1 stands for killaura and 2 stands for vanilla)
    I selected the first three dimensions and made a graph. It looks like this:
    3D2.png
    (Black-Killaura, Red-Vanilla)

    Choosing another three dimensions and drawing the graph again gives the same picture. The final result shows that the attempt is effective and efficient, as even a small amount of data already forms clusters, depicting the outline of the data distribution. Several algorithms could classify them, e.g. kNN, LVQ, SVM or BP. Among these options, LVQ is the most appropriate, as it is not sensitive to outliers and it is capable of updating the network in real time without massive calculation.

    LVQ (Learning Vector Quantization)
    LVQ is a prototype-based supervised classification algorithm. It updates the cluster centers in an incremental manner, leading to better recognition accuracy. The algorithm is below:
    (descriptions are quoted from Data Clustering and Pattern Recognition, Roger Jang)
    1. Set representative centers for each class. Suppose that we have 3 clusters per class for a 4-class problem; then there are 12 centers in total. These cluster centers can be obtained by k-means clustering over each individual class. (In practice, the number of clusters for each class can be set to be proportional to the size of the class.)
    2. For each data point x, find the nearest center yk. Based on the class labels of x and yk, update the center as follows:
      • If the class labels are the same, then move yk toward x:
    yk = yk + α [x - yk]
      • If the class labels are different, then move yk away from x:
    yk = yk - α [x - yk]
    3. Update the learning rate α.
    4. Go back to step 2 until all the centers converge.
    Once the cluster centers have converged, kNN classification can be applied to categorize new data points.
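
    To make the steps concrete, here is a minimal sketch of a single LVQ training pass over the dataset; names and structure are illustrative, and the actual implementation is in the LVQNeuralNetwork class linked below.
    Code (Text):
    // Sketch only: one LVQ pass. centers[c] is a center, centerLabels[c] its class,
    // data[i] a feature vector, dataLabels[i] its class, alpha the learning rate.
    public static void lvqPass(double[][] centers, int[] centerLabels,
                               double[][] data, int[] dataLabels, double alpha) {
        for (int i = 0; i < data.length; i++) {
            // Step 2: find the center nearest to this data point (squared Euclidean distance)
            int nearest = 0;
            double best = Double.POSITIVE_INFINITY;
            for (int c = 0; c < centers.length; c++) {
                double dist = 0;
                for (int d = 0; d < data[i].length; d++) {
                    double diff = data[i][d] - centers[c][d];
                    dist += diff * diff;
                }
                if (dist < best) { best = dist; nearest = c; }
            }
            // Move the center toward the point if the labels match, away from it otherwise
            double sign = (centerLabels[nearest] == dataLabels[i]) ? 1.0 : -1.0;
            for (int d = 0; d < data[i].length; d++) {
                centers[nearest][d] += sign * alpha * (data[i][d] - centers[nearest][d]);
            }
        }
        // Steps 3 and 4 (decaying alpha and checking convergence) are left to the caller here.
    }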

    The Java implementation could be found here: https://github.com/Nova41/SnowLeopard/blob/master/src/org/encanta/mc/ac/LVQNeuralNetwork.java
     
    #6 NascentNova, Feb 6, 2018
    Last edited: May 18, 2018
    • Winner Winner x 3
    • Useful Useful x 3
  7. 5. Conclusion
    Not completed

    Final result

    under combo


     
    #7 NascentNova, Feb 6, 2018
    Last edited: Feb 8, 2018
    • Winner Winner x 3
    • Useful Useful x 3
  8. MiniDigger

    Supporter

    * waits patiently for the github link *


    That's the idea of open source: come together, work together, make it better.
     
    #9 MiniDigger, Feb 6, 2018
    Last edited: Feb 6, 2018
    • Like Like x 3
  9. Finally, an open source ML project. What a time to be alive.
     
    • Like Like x 6
  10. :p Yeah, the whole project is pretty much done except some integration with spigot. The github link is coming soon.
     
    • Like Like x 3
  11. maldahleh

    Wiki Team

    From my brief time experimenting with the theory, I think the issue with detecting Kill Aura especially is that the advanced hacked clients are able to somewhat mimic a vanilla client. Every user has a unique PVP style, and I believe that clients are able to at least mimic the edge cases of a normal client, so while detecting aura it's very hard to tell whether it's a hacked client or a player's unique PVP style. If you end up classifying all "un-normal" behaviour as a hacked client, you risk lots of unhappy players complaining they were banned without hacking.

    You may have already accounted for this though so I'm curious to see how your final product turns out.
     
    • Like Like x 1
  12. MiniDigger

    Supporter

    Well, if an aura is nerfed to the point that another player could play similarly, there is really not much you can do about it (and well, it's not really worth it either, since the player doesn't really gain a clear advantage).
     
    • Agree Agree x 1
    The graphs for killaura and human are basically identical, and you also never mentioned any kind of data normalization.

    The way you collect data doesn't make sense: you are collecting data only when an entity is hit. A player could hit an entity, spin 10 times and hit another entity, and you would only collect the relative move between the hits; the 10 spins would be ignored.
     
    • Winner Winner x 1
    Thank you for your feedback! I haven't mentioned the normalization in the feature engineering section because it isn't needed when designing the dataset; data without feature scaling would only interfere with the algorithm stage, which is the next section.

    Yes, they are actually identical, and that's why I am inclined not to use a time series prediction model.

    The 10 spins make no sense in combat. Why would a hacked client do that? The player has to attack the entity in the end, and that is the point where my analysis begins. :)

    And you are misrepresenting the concept. Let the spins be ignored, as I don't utilize them.
     
    #15 NascentNova, Feb 8, 2018
    Last edited: May 18, 2018
    The ten spins weren't meant to be a realistic situation; they were meant to explain how your data collection is flawed. The majority of the data that allows for classification happens before the hit on the entity, which is why only collecting data for the one tick the entity is hit won't yield any success.
     
    • Like Like x 1
    Yeah, I agree with you. I certainly record the data over a long period and analyze its tendency instead of just collecting data for the one tick.
     
    • Funny Funny x 1
    Oh, misrepresenting the concept and dismissing other people's work without taking a deep look at it.

    Anyway, your work makes no sense at all.
    • The KillAura you utilize is EXTREMELY old. I don't see a modern hacked client moving the player's head in a straight line.
    • You packed the data into images and used a convolutional neural network to classify them, which costs substantial system resources and is not effective, as it loses information (e.g. the moving speed of the mouse). Relatively speaking, my classification algorithm requires little system resources because it is based on an LVQ neural network.
    • Furthermore, my classifier is capable of updating the network in real time.
    • ...
     
    #19 NascentNova, Feb 8, 2018
    Last edited: May 18, 2018
  18. All of this looks pretty promising and with the help of the open source community, this can become a great project. How's open sourcing going?
     
    • Like Like x 1
