Questions about handling mysql data

Discussion in 'Spigot Plugin Development' started by hotpatooky, Jun 30, 2018.

Thread Status:
Not open for further replies.
  1. When my plugin is enabled, I load all player data from my MySQL DB into corresponding data structures (e.g. player kill counts go into a HashMap). I am concerned that this practice may have unwanted effects on the memory/speed of my plugin/server, especially as the numbers go up and I end up with HashMaps holding thousands of key mappings.

    Should I just pull player data whenever I need it (make a query every time I need to access player info), or should I load all of a player's data once they join?

    Thanks.
     
  2. You can pull the data asynchronously whenever you need it, and if you know the data isn't going to change, cache it so subsequent reads come from the cache.
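    The "fetch async, then cache" idea above could look something like this minimal stdlib sketch. All names are hypothetical and the database call is simulated by a stub; in a real plugin it would be a JDBC `SELECT` run off the main thread:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: async fetch with a cache in front of the "database".
class KillsCache {
    // Stand-in for a real MySQL query; in a plugin this would run a SELECT over JDBC.
    static int queryKillsFromDatabase(UUID player) {
        return 42; // simulated result
    }

    private final Map<UUID, Integer> cache = new ConcurrentHashMap<>();
    // Daemon threads so the pool never keeps the JVM alive on its own.
    private final ExecutorService dbPool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "db-pool");
        t.setDaemon(true);
        return t;
    });

    // First call hits the "database" off-thread; later calls return the cached value.
    public CompletableFuture<Integer> getKills(UUID player) {
        Integer cached = cache.get(player);
        if (cached != null) return CompletableFuture.completedFuture(cached);
        return CompletableFuture.supplyAsync(() -> {
            int kills = queryKillsFromDatabase(player);
            cache.put(player, kills);
            return kills;
        }, dbPool);
    }
}
```

    In a plugin you would attach a callback (e.g. `thenAccept`) instead of blocking on the future from the main thread.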
     
  3. What you're talking about is caching; retrieving cached data is far faster than retrieving from a database each time you need it. Your way is fine, Java can handle way, way more than that; in fact I believe the limit is something like 3.1 x 10^56 or similar? Anyway, it's a massive number, you'll be right.


     
  4. If it’s player data, then grab it when the player joins and cache it whilst they are online. That way you can modify values. Then when the player leaves you can put all the data back into your database.

    If you're making something that needs stats to update live for a website or across multiple servers, then you could have a method that uploads stats/data at an interval, or run it at important times, say at the end of a round (as long as it's done async you shouldn't see any performance drop).
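    The load-on-join / save-on-quit lifecycle described above could be sketched like this. Names are hypothetical and the "database" is just a map standing in for MySQL; in a plugin, `onJoin`/`onQuit` would be called from `PlayerJoinEvent`/`PlayerQuitEvent` and the save would run async:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cache player data while online, write back on quit.
class PlayerDataManager {
    static final Map<UUID, Integer> database = new ConcurrentHashMap<>(); // fake MySQL table
    private final Map<UUID, Integer> online = new ConcurrentHashMap<>();  // cache of online players

    public void onJoin(UUID player) {                 // call from PlayerJoinEvent
        online.put(player, database.getOrDefault(player, 0));
    }

    public void addKill(UUID player) {                // cheap in-memory update, no query
        online.merge(player, 1, Integer::sum);
    }

    public int kills(UUID player) {
        return online.getOrDefault(player, database.getOrDefault(player, 0));
    }

    public void onQuit(UUID player) {                 // call from PlayerQuitEvent
        Integer kills = online.remove(player);
        if (kills != null) database.put(player, kills); // would be an async UPDATE in practice
    }
}
```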
     
  5. Just because you can doesn't mean you should
     
  6. Personally I would load the player's data on join and save it when they quit, which, if you ran a survey, seems to be the general practice on these forums.

    The pros of this setup are:
    • Speed: Accessing an object (of an online player) is way faster than doing a query. (Also, loading every single entry up front could take a very, very long time once you reach high player counts.)
    • Memory: Loading all players (even those that are not online) would take up a lot of memory you could be using for other things.
    As far as I can tell there are only 2 small cons:
    • You will have to find a way to modify offline players' data without hanging the main thread. This however is very minor, and if you program it right, should not be a problem at all.
    • Live data: If you need live data, this approach can still work, but you need to make sure you save the data from time to time at important intervals (score update for example).
    You are completely right about your concern, and it would be bad practice to load everything if you can avoid it. As far as I can tell you know where the issue is, and I am sure you will figure out the best way to handle the data you need.
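    The first con above (touching offline players' data without blocking the main thread) can be handled by queueing writes on a single worker thread, so updates stay ordered and the caller returns immediately. A minimal stdlib sketch, with hypothetical names and the database simulated as a map:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: serialize offline-player writes on one background thread.
class OfflineUpdater {
    static final Map<UUID, Integer> database = new ConcurrentHashMap<>(); // stands in for MySQL
    // Single daemon thread: updates run off the main thread, in submission order.
    private final ExecutorService dbThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "offline-db");
        t.setDaemon(true);
        return t;
    });

    // Returns immediately; the read-modify-write happens on the DB thread.
    public void addOfflineKills(UUID player, int amount) {
        dbThread.submit(() -> database.merge(player, amount, Integer::sum));
    }

    // Blocks until all previously queued updates have run (useful for shutdown/tests).
    public void awaitQuiescence() {
        try {
            dbThread.submit(() -> { }).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```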
     
  7. I really love how MassiveCore manages its database connection. Though it could use a tiny improvement, it basically works like this (note that this is done with MongoDB instead of MySQL):

    On startup, the plugin starts two additional threads. Not runnables, actual normal threads. One thread constantly polls the database for any updated collections, while the other constantly compares the loaded data (within the Java process) to the database to see whether it's up to date. Instead of constantly loading or saving data, they synchronize: if data is modified locally (by the server), it's uploaded by thread 1; if the data is changed directly in the database (by an admin, for example), it's automatically loaded into the server by thread 2, without any commands having to be run.

    Reason why I love this:
    • When using MassiveCore as an API, you're never directly interacting with the database to load data. The data is constantly synchronized asynchronously. When accessing data, you're not querying the database directly but using the locally loaded copy within the Java process. This means you can literally manage your data fully on the main thread, without having to worry about any blocking actions.
    • It allows for super stable and easy data management. Change something on the remote database directly? All servers connected to the database instantly load and apply the change, so they are all up to date without you having to run any commands.
    • It avoids the overhead of dozens and dozens of runnables or callbacks to constantly load data. You wouldn't have to worry about threading anymore (at least not for data fetching).
    As mentioned, the above is implemented for MongoDB, but I'm sure you could build something similar for MySQL (are there any queries to efficiently check when a row was last updated, or anything similar?). It may not solve this issue directly, but it may inspire you to see how other people have achieved some great things.
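    On the parenthetical question: MySQL can timestamp rows automatically with a `TIMESTAMP ... ON UPDATE CURRENT_TIMESTAMP` column, which a sync thread could poll. The table and column names below are made up for illustration:

```sql
-- Hypothetical table: MySQL refreshes last_modified on every UPDATE automatically.
CREATE TABLE player_stats (
    uuid          CHAR(36) PRIMARY KEY,
    kills         INT NOT NULL DEFAULT 0,
    last_modified TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                  ON UPDATE CURRENT_TIMESTAMP
);

-- A sync thread could then fetch only rows changed since its last check:
SELECT uuid, kills FROM player_stats WHERE last_modified > ?;
```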

    If you just want something quick and simple, I'd load the data when the player joins and save it when they leave (or at certain actions, such as completing a minigame).
     
  8. Actually, it means you should. Why ping the database all the time when you don't need to, just to retrieve data? Save and upload when necessary, but keep a local cache.
     
  9. There is no "max" Java can handle. The resources Java has are given by the computer running the VM. If you have a beast of a computer, with piles of GPUs, CPUs and RAM, your limit will be far beyond anything you'd ever hit. Moreover, if the machine runs out of RAM, it can always fall back on regular HDDs and SSDs (swap) to store information, limited only by the capacity of those drives.

    And now my opinion on the question: when loading the plugin, cache the stuff you need a lot, and only cache the rest when you need it for the first time. While some believe that async threads don't hold up the main thread, that is only partially true. Yes, the main thread can still run, but you can take a real performance hit, as the CPUs we have nowadays normally only have 4 or 8 hardware threads available, and every computer is already running dozens of processes, including several per open Chrome tab. On the other hand, keeping useless information on the server is also bad. That is why I almost always choose this solution: everything you know for sure you will need gets loaded on startup, as no one cares about a slower startup time; everything you are not certain you are going to need before a restart, you keep locked away in your database until you actually need it.

    While this may be a solid solution for some, there is much more to it! Do you need to save the data again? What types of data are you pulling? Is the MySQL database hosted on the same machine, or somewhere on the other side of the world? What type of server do you have: a gaming PC with way too much memory, or a serious machine with a 16-core CPU running at 4 GHz? If you run out of RAM, do you have an SSD or an HDD? Do you need the query features when fetching data?

    Keep in mind that you are using Java, a language made with security and stability in mind at the cost of some performance and memory usage. Generally, you should not obsess over memory and performance in Java; there is just not that much to squeeze out of it. Whatever solution you choose, as long as you keep things in proportion and don't open a new connection every second, you should be fine for most use cases.
     
  10. I must have confused that with the max value of a 64-bit floating point number, my bad. I was suggesting that the data be cached once, with all loads and saves going to that cache, and then periodically written in chunks to the database, or at certain milestones/events. If the data load is too heavy and would slow the server on the main thread, portion it out into batches and save it that way, or asynchronous saving would be fine too. Personally, that's how I would go about it, especially considering the user's use case: loading player data, which is nowhere near as resource-intensive as, say, Google's databases where every minute detail matters. Caching with periodic saving sounds fine to me.
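    The periodic/batched saving suggested above amounts to tracking which entries changed ("dirty" entries) and flushing only those on an interval. A minimal stdlib sketch with hypothetical names and the database simulated as a map; in a plugin, `flush()` would run on an async task and issue a batched `UPDATE` (e.g. via `PreparedStatement.addBatch()`):

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: dirty-tracking cache with periodic batched flushes.
class PeriodicSaver {
    static final Map<UUID, Integer> database = new ConcurrentHashMap<>(); // fake MySQL table
    private final Map<UUID, Integer> cache = new ConcurrentHashMap<>();
    private final Set<UUID> dirty = ConcurrentHashMap.newKeySet();

    public void setKills(UUID player, int kills) {
        cache.put(player, kills);
        dirty.add(player);          // mark as changed since the last flush
    }

    // Write only the changed entries, clearing the dirty set; returns rows written.
    // Schedule this on an interval, off the main thread.
    public int flush() {
        int written = 0;
        for (Iterator<UUID> it = dirty.iterator(); it.hasNext(); ) {
            UUID id = it.next();
            it.remove();
            database.put(id, cache.get(id));
            written++;
        }
        return written;
    }
}
```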
     
  11. You are right that, at the current state of computer technology, there is a max because of 64-bit addresses. This indeed means that the computer can only address locations up to 2^64. However, there is a good chance that, in the far future, chips will become available with 128-bit or 256-bit addresses. With nanotubes and other new technologies - there is even technology to store a bit in a single atom, compared to the thousands of atoms we use per bit now - the need for wider architectures will probably grow enormously. And that is only for normal computers. With new servers that use light, we can spread big chunks of data across multiple computers while keeping great performance, as light is much faster than the electricity running through our current PCs. And this technology is not far off either: TU/e (Eindhoven University of Technology) has a lot of research and design going on into those chips, together with ASML.

    And yes, for most servers, this is way too much. They would probably spend more performance on the algorithms to make it all work than the server itself costs. But keep in mind that there are some big networks out there that have to pull information every millisecond. Networks with thousands of players every day, and with new minigames being added every month, those databases grow gigantic far too fast. And keep in mind that Java is not kind to your memory... the amount of extra data it stores is insane, at least compared to C or C++, where only the data you specified gets stored.

    Another thing I want to add is that you must not forget the optimization that went into those database programs. That is optimization you can never reach yourself. If you have HashMaps containing millions of entries, it may well turn out that a database is faster, even counting the cost of creating the connection and such.

    Edit:
    I was looking around, and apparently there are already chips that can perform 512-bit integer calculations. Some graphics cards also use a 512-bit memory bus for faster transfer speeds, and you may know from gaming experience that that is something you hardly ever need.
     
    #11 Remceau, Jul 1, 2018
    Last edited: Jul 1, 2018
  12. Good to know. And yeah, in terms of memory, C/C++, or really any of the C-family languages (the languages I believe Microsoft uses?), are way better, and Minecraft would've been a more efficient game overall if it had been developed in one of them. Also, regardless of efficiency when it comes to maps vs database connections, we're talking milliseconds of difference; either way this most likely goes beyond the scope of what the OP intended. In other words, do what you want, but don't be stupid @hotpatooky
     
  13. I now know what approach I will take. Thanks for the advice and the computer thought adventure.
     