Resource Machine Learning Killaura Detection in Minecraft

Discussion in 'Spigot Plugin Development' started by NascentNova, Feb 6, 2018.

  1. This is a detailed explanation of Snow Leopard, an open-source project that recognizes players' aim patterns using machine learning. In this article, you will find the most exhaustive research on machine-learning based killaura detection on this forum.

    After this article is released, the mechanisms described below may be cracked by hackers and become outdated, but I promise they really work so far (Feb 6, 2018).

    I may never know whether the decision to release this is right, but I'm sure it has a positive impact: it can draw a lot more attention from the average developer, and it will most likely increase the quality of the ML anti-cheats we have.

    Edit (3/5/2019)
    People are using my code without giving credits... I don't know what to say

    Machine Learning Killaura Detection in Minecraft

    Nascent Nova, 1 February 2018

    Abstract
    Cheating is a real issue on modern Minecraft servers. However, existing anti-cheat solutions mainly rely on hard-coded checks, which are difficult to maintain and update. This thesis illustrates an alternative approach to detecting cheaters based on a supervised learning algorithm called Learning Vector Quantization, which is more flexible and yields better results.

    Catalog

    1. Introduction
    2. Characterization of player’s movement
    ....2.1 Hitbox, Hit Registration Area, and their relationship with Hit Registration
    ........2.1.1 Hitbox and Hit Registration
    ........2.1.2 Hit Registration Area
    ........2.1.3 Exploitation
    ....2.2 Movement: Head rotation
    3. Feature Engineering
    ....3.1 Collecting data
    ....3.2 Designing dataset
    4. Artificial Neural Network
    5. Conclusion
    6. Bibliography & Appendix

    Note: This post may contain redundant content and overly formal wording.
    Note: Some items are expanded or deleted due to the average programming level of the forum.
    Ps. Don't forget to leave a rating ;)


    The github link:
    https://github.com/Nova41/SnowLeopard
    Train: /eac train <category-name>
    Test: /eac test <player-name> <seconds> (I prefer 15 seconds)
    Tested on spigot 1.8.8
    There is only one permission in the plugin: encanta.ac
    It covers all subcommands under /eac. Give yourself op to bypass the permission


    ** JAR DOWNLOAD LINK **
    https://www.spigotmc.org/resources/snowleopard.55185/
    Download from here only if you do not have a compiler to build the source! This may not be up to date.

    ** Edits **
    3/5/2019: Improved the wording and phrasing and elaborated on some concepts
    I am now a college student! XD (received offers from Purdue University and Ohio State University, and I am confused where to go)
    I read a lot of material these past months, and when I read this report again, I had no idea what I was writing a year ago lol. So I rewrote some parts of it. Hopefully this helps you understand machine-learning based killaura detection.
     
    #1 NascentNova, Feb 6, 2018
    Last edited: Mar 5, 2019
    • Winner Winner x 15
    • Useful Useful x 11
    • Like Like x 6
  2. 1. Introduction

    The discussion of machine-learning based killaura detection has not stopped since konsolas demonstrated his research result. However, at least on this forum, very few feasible schemes have been published in detail. Besides, it is sad that some people developed premium plugins based on a public idea and sold them to outsiders. What is more ridiculous is that some of these plugins do not even work, which makes further exploration harder.

    Under the circumstances described above, I started to explore the possibilities of machine-learning based killaura detection. I designed a series of mechanisms and debugged them for a long time, until they started to yield acceptable results. I have posted several threads here about my exploration, which are listed below.

    [1] Machine Learning Anti-Killaura Ideas & Discussion. - Discussion in 'Spigot Plugin Development' started by NascentNova, Apr 9, 2017. https://www.spigotmc.org/threads/machine-learning-anti-killaura-ideas-discussion.231396/

    [2] A future-extraction algorithm for ML Anti-Killaura. - Discussion in 'Spigot Plugin Development' started by NascentNova, Jan 1, 2018. https://www.spigotmc.org/threads/a-future-selection-algorithm-for-ml-anti-killaura.294033

    [3] Killaura Detection Based on Neural Network - Discussion in 'Spigot Plugin Development' started by NascentNova, Jan 20, 2018. https://www.spigotmc.org/threads/killaura-detection-based-on-neuron-network.298042/
     
    #2 NascentNova, Feb 6, 2018
    Last edited: Mar 4, 2019
    • Winner Winner x 6
    • Useful Useful x 2
    • Like Like x 1
  3. 2. Characterization of player’s movement

    2.1 Hitbox, Hit Registration Area, and their relationship with Hit Registration

    2.1.1 Hitbox and Hit Registration

    We all know the Hitbox in Minecraft. You can find its definition on the Minecraft Wiki: Hitboxes are regions that describe how much space an entity takes up, and they can be shown by pressing the F3+B keys. The Minecraft client displays the Hitbox by drawing a white outline around the entity to show its location and the space it takes up. There is also a flat red rectangle near its "eyes", which is called the "line of sight". It marks where the entity's eyes are, which can be independent of where they appear to be. Moreover, a straight blue line extends away from the line of sight, pointing in the direction the mob is facing.

    hitbox_illustrate.png
    Notice how the line of sight wraps all the way around the entity

    A common misconception about the Hitbox is that if a player lands a hit within the Hitbox, the hit is confirmed valid and the entity may take damage; otherwise it is not. This process is known as "Hit Registration". Yet this is only half right once you take the Hit Registration Area into account.

    2.1.2 Hit Registration Area

    You have probably already found that if you hit an entity slightly outside of its Hitbox, your hit is still registered. In fact, there is something I call the Hit Registration Area: any hit located inside it is registered. It can be understood as another Hitbox located at the same place as the regular Hitbox, but slightly bigger.

    2.1.3 Exploitation

    The picture above shows that the Hitbox covers a large area. The detection is based on the fact that, in common PVP conditions, it is nearly impossible for a human to keep hitting the same spot of the Hit Registration Area.

    Therefore, a player using killaura can be detected by checking whether he/she keeps hitting the same spot on the target. This is the classical Angle Check. To bypass the check, developers of hacked clients have come up with a variety of ways. For example, they can intentionally make the cursor hit random positions within the Hit Registration Area when using killaura, or make it move smoothly when the targeted player moves.

    However, analysis based on mass data indicates that even in such a situation, where the hit spots are completely random, the use of killaura is still detectable. When attacking, real humans have a specific tendency in where they click that differs from the totally random clicks produced by KillAura. To automate the detection, machine learning is introduced.

    Based on pattern recognition, machine learning can learn the player's movements and abstract them. The outcome of the abstraction can be expressed as a set of vectors recording the magnitudes of different aspects of the player's movements. These vectors make it possible to construct a high-dimensional Euclidean space and classify different patterns in it.

    2.2 Movement: Head rotation

    The Minecraft server does not provide a native method to access the hit spots produced by players. However, the packets sent by players contain some useful data, which can be used to compute the player's head movement and the hit spots.

    For the head movement, the server provides the following attributes:

    serverbound_packetfields.png
    (From wiki.vg: Entity Look and Relative Move)
    The looking direction of a player can be characterized by his/her yaw and pitch. Combined, yaw and pitch form a two-dimensional Cartesian coordinate system (pitch, yaw), where pitch determines the vertical looking direction (how much a player is looking up or down) and yaw determines the horizontal looking direction (how much a player is looking left or right).

    The individual exact values of yaw and pitch are not very useful. What matters is the magnitude of the change (delta) in yaw and pitch as the player moves, and the angle between the player's looking direction and the center of the Hitbox of his/her current target.

    With the Bukkit API, the angle is easy to compute. The code is provided below:
    Code (Text):
    import org.bukkit.util.Vector;

    // player: the Player who attacks; entity: the Entity damaged by that player
    Vector playerLookDir = player.getEyeLocation().getDirection();
    Vector playerEyeLoc = player.getEyeLocation().toVector();
    // Note: getLocation() is the entity's feet position, not the center of its hitbox
    Vector entityLoc = entity.getLocation().toVector();
    Vector playerEntityVec = entityLoc.subtract(playerEyeLoc);
    // Angle in radians between the look direction and the direction to the entity
    float angle = playerLookDir.angle(playerEntityVec);
    The dataset in the next section is based on the data from computation above.
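For readers who want to try the geometry without a running server, here is a minimal plain-Java sketch of the same computation. The class and method names (LookAngle, direction, angleTo) are hypothetical and not part of Snow Leopard. It assumes the yaw/pitch convention that, as far as I know, Bukkit's Location.getDirection() uses (yaw 0 faces +Z, negative pitch looks up), so treat it as an illustration rather than the plugin's actual code.

```java
// Minimal sketch: look direction from yaw/pitch and the angle to a target.
// Class and method names are hypothetical; there is no Bukkit dependency.
final class LookAngle {

    /** Unit look vector {x, y, z} for Minecraft-style yaw and pitch in degrees. */
    static double[] direction(double yawDeg, double pitchDeg) {
        double yaw = Math.toRadians(yawDeg);
        double pitch = Math.toRadians(pitchDeg);
        double xz = Math.cos(pitch);
        return new double[] { -xz * Math.sin(yaw), -Math.sin(pitch), xz * Math.cos(yaw) };
    }

    /** Angle in radians between the look vector and the eye-to-target vector. */
    static double angleTo(double[] eye, double yawDeg, double pitchDeg, double[] target) {
        double[] look = direction(yawDeg, pitchDeg);
        double[] toTarget = { target[0] - eye[0], target[1] - eye[1], target[2] - eye[2] };
        double dot = 0, lenSq = 0;
        for (int i = 0; i < 3; i++) {
            dot += look[i] * toTarget[i];
            lenSq += toTarget[i] * toTarget[i];
        }
        // The look vector has unit length, so only the target vector is normalized.
        double cos = dot / Math.sqrt(lenSq);
        // Clamp against rounding error before calling acos.
        return Math.acos(Math.max(-1.0, Math.min(1.0, cos)));
    }

    public static void main(String[] args) {
        double[] eye = { 0, 1.62, 0 };
        double[] target = { 0, 1.62, 5 };                  // straight ahead along +Z
        System.out.println(angleTo(eye, 0, 0, target));    // facing the target
        System.out.println(angleTo(eye, 90, 0, target));   // turned a quarter circle away
    }
}
```

The sketch does not handle the case where the eye and the target coincide (the angle is undefined there).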
     
    #3 NascentNova, Feb 6, 2018
    Last edited: Mar 5, 2019
    • Winner Winner x 4
    • Useful Useful x 2
  4. 3. Feature Engineering
    3.1 Collecting data


    The previous sections define the head movement of a player.

    If we keep recording the angles while a player is continuously attacking an entity, we end up with sequences of angles.

    At this point we don't know what kind of tendency the sequences have, so I imported them into Excel and made some line charts:

    anglesequence_vanilla1.png

    It's a time series. We need further experiments to explore whether it follows a rule.

    anglesequence_vanilla2.png anglesequence_vanilla3.png anglesequence_killaura1.png anglesequence_killaura2.png anglesequence_killaura3.png

    Time series of the same category show a certain similarity in trend: for example, the fluctuation and the average value of the data are similar across the first three images, which represent the same category. We can conclude that the series partly follow a pattern.
     
    #4 NascentNova, Feb 6, 2018
    Last edited: May 6, 2018
    • Winner Winner x 4
    • Useful Useful x 2
  5. 3. Feature Engineering
    3.2 Designing dataset


    For time series analysis, there are several common similarity measures, such as Dynamic Time Warping (DTW), Euclidean Distance (ED) and Symbolic Fourier Approximation (SFA).

    However, these approaches are not applied in this scenario, because they mainly rely on the shape of the curve to yield effective results, and the curves of the different categories do not differ clearly enough in shape.
    Therefore, it is worth using some classical statistical methods to measure other features of the sequence before applying a complicated time series prediction model.

    To capture the fluctuation and the average value of the data, two fundamental statistical features were selected: the mean and the mean square deviation (variance) of the items in the sequence. Moreover, the mean and the mean square deviation of the delta between two adjacent items, Math.abs(array[i] - array[i-1]), which represents the derivative of the movement (speed), are considered as well.

    Thus, our dataset consists of four dimensions.
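A minimal sketch of how the four features could be computed from one angle sequence. The class and method names are hypothetical, and reading "mean square" as the population variance (dividing by n) is an assumption on my part. The output order {angle variance, delta variance, angle mean, delta mean} is also only a choice made for this example.

```java
// Minimal sketch of the four-dimensional feature vector.
// Names are hypothetical; population variance (divide by n) is an assumption.
final class AngleFeatures {

    static double mean(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    /** Population variance: mean squared deviation from the mean. */
    static double variance(double[] v) {
        double m = mean(v), sum = 0;
        for (double x : v) sum += (x - m) * (x - m);
        return sum / v.length;
    }

    /** Absolute differences between adjacent items: |a[i] - a[i-1]|. */
    static double[] deltas(double[] a) {
        double[] d = new double[a.length - 1];
        for (int i = 1; i < a.length; i++) d[i - 1] = Math.abs(a[i] - a[i - 1]);
        return d;
    }

    /** Returns {angle variance, delta variance, angle mean, delta mean}. */
    static double[] extract(double[] angles) {
        double[] d = deltas(angles);
        return new double[] { variance(angles), variance(d), mean(angles), mean(d) };
    }

    public static void main(String[] args) {
        double[] angles = { 0.39, 0.59, 0.23, 0.35, 0.43 };  // made-up sample sequence
        System.out.println(java.util.Arrays.toString(extract(angles)));
    }
}
```

The sequence needs at least two angles, otherwise there is no delta to measure. How the plugin itself treats the first item or very short sequences is not specified in this thread, so its delta features may differ from this sketch.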
     
    #5 NascentNova, Feb 6, 2018
    Last edited: Mar 5, 2019
    • Winner Winner x 3
    • Useful Useful x 1
  6. 4. Artificial Neural Network
    This section may contain unclear descriptions. Please comment below if you think elaborating is required.

    From the previous conclusions, the datasets that capture the pattern are a series of scattered points in a high-dimensional space. In other words, they are a set of four-dimensional vectors. Which classification algorithm should be used depends on the data distribution. Therefore, in this section, I use Origin, a mathematical graphing program, to plot the vectors and help understand the data distribution. The data below was generated with the code provided previously:

    Code (Text):
    Attacking (vanilla):
    Angles = [0.39267087, 0.5936949, 0.2308002, 0.34687847, 0.43413088, 0.44350803, 0.5333573, 0.3297529, 0.7547668, 0.68121463, 0.6634815, 0.6259528, 0.65484965, 0.6027457, 0.3744836, 0.31012928, 0.35455334, 0.40250704, 0.46551096, 0.50967205, 0.5407446, 0.52624315, 0.5052259]
     
    Converted to the dataset format designed previously:
    Code (Text):
    [0.017942158191815917, 0.01234174087304676, 0.4902988937885865, 0.09822073791708265]
    Following this model, I collected more data and applied feature scaling, a linear transformation that maps each result to the range [0,1]. The formula is below:

    Feature-scaling.PNG
    You can find further information here: https://en.wikipedia.org/wiki/Feature_scaling so I won't repeat it here.
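Since the formula image is not reproduced here, the sketch below assumes the usual min-max normalization from the linked Wikipedia article, x' = (x - min) / (max - min), applied to each feature column separately. That choice is an assumption; the class name MinMaxScaler is hypothetical and this is not the plugin's code.

```java
// Minimal sketch of min-max feature scaling, applied column by column.
// Assumption: x' = (x - min) / (max - min); the class name is hypothetical.
final class MinMaxScaler {

    /** Rescales every column of rows to [0, 1]; a constant column maps to 0. */
    static double[][] scale(double[][] rows) {
        int n = rows.length, dims = rows[0].length;
        double[][] out = new double[n][dims];
        for (int j = 0; j < dims; j++) {
            double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
            for (double[] row : rows) {
                min = Math.min(min, row[j]);
                max = Math.max(max, row[j]);
            }
            double range = max - min;
            for (int i = 0; i < n; i++) {
                // Avoid dividing by zero when every value in the column is equal.
                out[i][j] = range == 0 ? 0 : (rows[i][j] - min) / range;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] raw = { { 1, 10 }, { 2, 20 }, { 3, 30 } };  // made-up values
        System.out.println(java.util.Arrays.deepToString(scale(raw)));
    }
}
```

Scaling each column on its own keeps a feature with a large numeric range from dominating the Euclidean distances used by the classifier.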

    The result:
    Code (Text):
    Angle-Dev  Delta-Dev  Angle-Mn.  Delta-Mn.  Cat.
    0.26132    0.202      0.51776    0.06388    1
    0.54116    0.39396    0.56992    0.09275    1
    0          0          0.48289    0          1
    0.5274     0.34419    0.73548    0.18434    1
    0.4186     0.28227    0.68327    0.15277    1
    0.29168    0.18672    0.68625    0.13386    1
    0.3764     0.19483    0.89253    0.12497    1
    0.39363    0.18878    0.63966    0.12141    1
    0.90618    0.59891    1          0.25338    2
    0.61731    0.3373     0.82288    0.16811    2
    0.46318    0.3419     0.98292    0.20877    2
    0.86285    0.43888    0.97375    0.15722    2
    0.92229    0.41521    0.81605    0.20533    2
    0.5648     0.58676    0.88434    0.30708    2
    0.62043    0.29958    0.6869     0.16525    2
    1          0.59892    0.89155    0.25784    2
    0.6894     0.48014    0.62448    0.18107    2
    (1 stands for killaura and 2 stands for vanilla)
    I selected the first three dimensions and made a graph. It looks like this:
    3D2.png
    (Black-Killaura, Red-Vanilla)

    After choosing another three dimensions and drawing the graph again, the result shows that the approach is effective and efficient: even a small amount of data already forms visible clusters that outline the data distribution. Several algorithms could classify them, e.g. kNN, LVQ [2], SVM or BP. Among these options, LVQ is more appropriate because it is not sensitive to outliers and it can update the network in real time without massive calculation.

    LVQ (Learning Vector Quantization)
    LVQ is a prototype-based supervised classification algorithm. It updates the cluster centers incrementally, which leads to better recognition accuracy. The algorithm is as follows:
    (descriptions are quoted from Data Clustering and Pattern Recognition, Roger Jang)
    1. Set representative centers for each class. Suppose that we have 3 clusters per class for a 4-class problem; then there are 12 centers in total. These cluster centers can be obtained by k-means clustering over each individual class. (In practice, the number of clusters for each class can be set to be proportional to the size of the class.)
    2. For each data point x, find the nearest center yk. Based on the class labels of x and yk, update the center as follows:
      • If the class labels are the same, then move yk toward x:
        yk = yk + α [x - yk]
      • If the class labels are different, then move yk away from x:
        yk = yk - α [x - yk]
    3. Update the learning rate α.
    4. Back to step 2 until all the centers converge.
    Once the cluster centers have converged, the kNN algorithm [3] can be applied to classify the data.

    The Java implementation can be found here: https://github.com/Nova41/SnowLeopard/blob/master/src/org/encanta/mc/ac/LVQNeuralNetwork.java
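Independently of the linked implementation, the update rule described above can be sketched in a few lines of plain Java. This is a simplified illustration under stated assumptions, not the plugin's code: the class name Lvq is hypothetical, the prototypes are passed in already initialized (the quoted description uses k-means for that), training runs for a fixed number of epochs instead of testing for convergence, and classification returns the label of the single nearest center.

```java
// Minimal LVQ1 sketch. This is not the linked SnowLeopard implementation.
// Assumptions: prototypes are supplied by the caller, a fixed epoch count
// replaces the convergence test, and the learning rate decays geometrically.
final class Lvq {
    final double[][] centers;  // one prototype per row
    final int[] labels;        // class label of each prototype

    Lvq(double[][] centers, int[] labels) {
        this.centers = centers;
        this.labels = labels;
    }

    /** Index of the prototype nearest to x (squared Euclidean distance). */
    int nearest(double[] x) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centers.length; k++) {
            double d = 0;
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - centers[k][j];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = k; }
        }
        return best;
    }

    /** Applies the update rule to every data point, once per epoch. */
    void train(double[][] data, int[] classes, double alpha, double decay, int epochs) {
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < data.length; i++) {
                int k = nearest(data[i]);
                // Same label: move toward x. Different label: move away from x.
                double sign = labels[k] == classes[i] ? 1.0 : -1.0;
                for (int j = 0; j < data[i].length; j++) {
                    centers[k][j] += sign * alpha * (data[i][j] - centers[k][j]);
                }
            }
            alpha *= decay;  // update the learning rate after each pass
        }
    }

    /** Label of the nearest prototype. */
    int classify(double[] x) {
        return labels[nearest(x)];
    }

    public static void main(String[] args) {
        // Made-up two-dimensional toy data with two well-separated classes.
        double[][] data = { { 0.1, 0.2 }, { 0.2, 0.1 }, { 0.8, 0.9 }, { 0.9, 0.8 } };
        int[] classes = { 1, 1, 2, 2 };
        // Prototypes start at the class means of the toy data.
        Lvq lvq = new Lvq(new double[][] { { 0.15, 0.15 }, { 0.85, 0.85 } }, new int[] { 1, 2 });
        lvq.train(data, classes, 0.1, 0.9, 20);
        System.out.println(lvq.classify(new double[] { 0.0, 0.1 }));
        System.out.println(lvq.classify(new double[] { 1.0, 0.9 }));
    }
}
```

Because each update touches only one prototype, a newly labelled sample can be fed in at any time without retraining from scratch, which is the real-time property mentioned above.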
     
    #6 NascentNova, Feb 6, 2018
    Last edited: Mar 5, 2019
    • Winner Winner x 5
    • Useful Useful x 3
  7. 5. Conclusion
    Not completed

    Final result

    under combo


     
    #7 NascentNova, Feb 6, 2018
    Last edited: Feb 8, 2018
    • Winner Winner x 6
    • Useful Useful x 3
    • Winner Winner x 4
    • Informative Informative x 1
    • Useful Useful x 1
  8. MiniDigger

    Supporter

    * waits patiently for the github link *


    That's the idea of open source: come together, work together, make it better.
     
    #9 MiniDigger, Feb 6, 2018
    Last edited: Feb 6, 2018
    • Like Like x 4
  9. Finally, an open source ML project. What a time to be alive.
     
    • Like Like x 6
  10. :p Yeah, the whole project is pretty much done except some integration with spigot. The github link is coming soon.
     
    • Like Like x 4
  11. maldahleh

    Wiki Team

    With my brief time experimenting with the theory, I think the issue with detecting Kill Aura especially is that any of the advanced hack clients are able to somewhat mimic a client. Every user has a unique PVP style, and I believe that clients are able to at least mimic the edge cases of a normal client, therefore while detecting aura it's very hard to tell whether it's a hacked client or a player's unique PVP style. If you end up classifying all "un-normal" behaviour as a hacked client, you risk lots of unhappy players complaining they were banned without hacking.

    You may have already accounted for this though so I'm curious to see how your final product turns out.
     
    • Like Like x 1
  12. MiniDigger

    Supporter

    Well, if an aura is nerfed to the point that another player could play similarly, there is really not much you can do about it (and it's not really worth it either, since the player doesn't really gain a clear advantage).
     
    • Agree Agree x 1
  13. The graphs for killaura and human are basically identical, you also never mentioned any kind of data normalization.

    The way you collect data doesn't make sense, you are collecting data only when an entity is hit. A player could hit an entity, spin 10 times and hit another entity and you would only collect the relative move between the hits and the 10 spins would be ignored.
     
    • Winner Winner x 1
  14. Thank you for your feedback! I didn't mention normalization in the feature engineering section because I think it's unnecessary when designing the dataset. Unscaled data would interfere with the algorithm stage, so feature scaling is covered in the next section.

    Yes, they are actually identical, and that's why I am not inclined to use a time series prediction model.

    The 10 spins seem to make no sense in combat. Why would a hacked client do that? The player has to attack the entity in the end, and that is where my discussion starts. :)
     
    #15 NascentNova, Feb 8, 2018
    Last edited: Mar 4, 2019
  15. The ten spins weren't meant to be a realistic situation; they were meant to explain how your data collection is flawed. The majority of the data that allows for classification happens before the entity is hit, which is why only collecting data for the one tick in which the entity is hit won't yield any success.
     
    • Like Like x 1
  16. Yeah, I agree with you. I do record the data over a long period and analyze its tendency instead of just collecting data for the one tick.
     
    • Funny Funny x 1
  17. Oh, dismissing others' work without taking a deep look at it... And sorry, I have some criticisms of your work.
    • The KillAura you used is extremely old. I don't see modern hacked clients moving players' heads in a straight line.
    • You packed the data into an image and used a convolutional neural network to classify it. It works, but it costs substantial system resources and is not effective because it loses information (e.g. the moving speed of the mouse).
    • Relatively speaking, my classification algorithm requires few system resources and is capable of updating the network in real time because it is based on an LVQ neural network.
    Edit:
    That's an approach, but not all hacked clients will dance like that, I mean shaking their heads as if drunk, so I think my approach has a wider application.
     
    #19 NascentNova, Feb 8, 2018
    Last edited: Mar 4, 2019
  18. All of this looks pretty promising and with the help of the open source community, this can become a great project. How's open sourcing going?
     
    • Like Like x 1
