1.15.2 Storage. Please give insights!

Discussion in 'Spigot Plugin Development' started by Joeldesante, Feb 8, 2020.


What is the better option?

  1. Attempting to load from disk.

    0 vote(s)
  2. Loading everything into memory and then saving changes to disk.

    5 vote(s)
  1. I have been working on a small side project called RegionLib for about a week now. Essentially, it is a plugin I am developing for my server that is in charge of creating WorldEdit-like regions, saving them, managing them, etc.

    I have reached a point where I need to save created regions to file, and I've chosen SQLite as my database. My question is:

    When dealing with regions, should I go through the extra work of setting up an intricate system for detecting which region the user is referencing or standing in (and then loading that one region from the disk), or am I better off loading every single saved region into memory at boot up and then saving any changes to disk?

    I ask this because I want to make sure that my plugin is lightweight, and I know, "Premature optimization is the root of all evil", but I want to make sure that I am in line with best practices when making these plugins. So please, bestow your advice upon me.
  2. Depends on what you’re aiming for. Generally, memory and CPU time trade off against each other. Optimizing for low memory usually means keeping few regions in memory at the cost of essentially brute-force searching, while optimizing for low latency comes at the cost of a large cache. That is as sure an answer as you’ll get.

    Hypothetically, memory is cheaper than CPU time. Servers are designed to have a lot of memory to store state, and CPU usually comes second, especially if resources are being artificially constrained by containerization or virtualization.

    In practice, it honestly doesn’t matter all that much. You can get away with a brute-force style search probably 99% of the time, and even big servers (1000+ concurrent) have few issues under the right circumstances. As long as you are searching on a periodic timer rather than running over an N>100 collection on every PlayerMoveEvent, the optimization effort is far from being worth it. The strength of abstraction is that you can swap the implementation out if you realize that your use cases have changed.

    The thing with SQLite is that it requires a blocking operation to disk basically every time you make a call. If you don’t load the entire database into memory, the performance of looking something up in SQLite depends heavily on indexing. I’m not sure if there is some sort of stateful row prioritization going on, but I wouldn’t be surprised if there wasn’t, considering that SQLite prioritizes consistency above pretty much everything else. If you aren’t running an SSD, you can get something like a 200 msec delay on a simple SELECT, so it is really important for you to cover all your bases here.

    If low-latency reads take precedence over everything else, you NEED to preload, whether that is through heuristic analysis, batch loading, or preloading the entire database into memory. Otherwise, if you need true, random on-demand loading, you must sacrifice availability (i.e. trigger a lookup and then use a callback or manually recheck). From a practical standpoint, you shouldn’t have any issues whatsoever loading up to hundreds of thousands of regions, depending on what other data is present in each region. Even on large servers, remember that the thousands of chunks being loaded concurrently probably store vastly more data than you’ll ever need for your regions, so it is crucial that you actually measure whether this is an issue.
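
For the on-demand case, the "trigger a lookup and do a callback" idea might look roughly like the sketch below. `LazyRegionCache`, `slowLoader`, and `ioExecutor` are hypothetical names that are not from RegionLib, and the loader merely stands in for a blocking SQLite query. It uses only `java.util.concurrent`, not any Bukkit API.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

/**
 * Hypothetical on-demand cache: a miss triggers one background load,
 * and every caller receives a future instead of blocking on the disk.
 */
final class LazyRegionCache<V> {
    private final Map<Long, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<Long, V> slowLoader; // assumption: wraps a blocking database read
    private final Executor ioExecutor;          // assumption: a thread pool separate from the server thread

    LazyRegionCache(Function<Long, V> slowLoader, Executor ioExecutor) {
        this.slowLoader = slowLoader;
        this.ioExecutor = ioExecutor;
    }

    /** Returns the cached future for this key, starting the load only on the first request. */
    CompletableFuture<V> get(long key) {
        return cache.computeIfAbsent(key,
                k -> CompletableFuture.supplyAsync(() -> slowLoader.apply(k), ioExecutor));
    }
}
```

A caller would attach its callback with something like `cache.get(key).thenAccept(region -> ...)`. On a real server the callback would presumably need to hop back onto the main thread before touching world state, which is the availability cost mentioned above.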

    The most efficient way to look up values is to key them by X/Z locations or chunks (or even “chunks” of regions greater than 16 blocks on a side; MCA files are 32*16 blocks on a side, if I recall), depending on how much memory and how much processing time you are willing to compromise on. You can pack the integer coordinates into a 64-bit long key or use a Pair key. Again, it makes no practical difference, but if you can’t compromise, go with long keys in a primitive-key hash map like Long2ObjectMap.
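
A minimal sketch of the coordinate packing described above, assuming 16-block chunks. `ChunkKeys` and its method names are made up for illustration; the bit layout (X in the high 32 bits, Z in the low 32 bits) is one reasonable choice, not the only one.

```java
/** Hypothetical helper that packs chunk X/Z coordinates into one 64-bit key. */
final class ChunkKeys {
    private ChunkKeys() {}

    /** Block coordinate to chunk coordinate; the arithmetic shift floors negatives correctly. */
    static int toChunk(int blockCoord) {
        return blockCoord >> 4;
    }

    /** X goes in the high 32 bits; Z is masked so its sign bits do not overwrite X. */
    static long pack(int chunkX, int chunkZ) {
        return ((long) chunkX << 32) | (chunkZ & 0xFFFFFFFFL);
    }

    /** Recovers the signed chunk X from a packed key. */
    static int unpackX(long key) {
        return (int) (key >> 32);
    }

    /** Recovers the signed chunk Z from a packed key. */
    static int unpackZ(long key) {
        return (int) key;
    }
}
```

The mask on Z matters: without it, a negative Z would sign-extend to 64 bits and clobber the X half of the key.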

    TL;DR there are a lot of factors at play here. If you are really pressed for low latency ONLY, use a Long2ObjectMap with some collection of regions as values, mapped to keys of the X/Z chunks that they occupy. Preload what you can; the more you load at initialization without compromising other plugins, the better.

    Otherwise, everything else requires you to measure to ensure you’re working within your parameters. There’s no way for me to tell how much memory you have, how much you want to trade one off for the other, or whether you even want to put the effort in for something you might not think is worth the hassle.
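
The chunk-keyed lookup in the TL;DR could be sketched as follows. To stay self-contained this uses a plain `java.util.HashMap<Long, List<...>>` rather than fastutil's `Long2ObjectMap`, and `IndexedRegion` and `RegionIndex` are hypothetical classes that ignore the Y axis for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical axis-aligned region; X/Z bounds are inclusive block coordinates. */
final class IndexedRegion {
    final String name;
    final int minX, minZ, maxX, maxZ;

    IndexedRegion(String name, int minX, int minZ, int maxX, int maxZ) {
        this.name = name;
        this.minX = minX;
        this.minZ = minZ;
        this.maxX = maxX;
        this.maxZ = maxZ;
    }

    boolean contains(int x, int z) {
        return x >= minX && x <= maxX && z >= minZ && z <= maxZ;
    }
}

/** Maps every chunk a region overlaps to that region, so a lookup only scans one bucket. */
final class RegionIndex {
    private final Map<Long, List<IndexedRegion>> byChunk = new HashMap<>();

    private static long key(int chunkX, int chunkZ) {
        return ((long) chunkX << 32) | (chunkZ & 0xFFFFFFFFL);
    }

    /** Registers the region under each 16x16 chunk it touches. */
    void add(IndexedRegion r) {
        for (int cx = (r.minX >> 4); cx <= (r.maxX >> 4); cx++) {
            for (int cz = (r.minZ >> 4); cz <= (r.maxZ >> 4); cz++) {
                byChunk.computeIfAbsent(key(cx, cz), k -> new ArrayList<>()).add(r);
            }
        }
    }

    /** Returns all regions containing the block position; only one chunk bucket is checked. */
    List<IndexedRegion> at(int x, int z) {
        List<IndexedRegion> hits = new ArrayList<>();
        for (IndexedRegion r : byChunk.getOrDefault(key(x >> 4, z >> 4), Collections.emptyList())) {
            if (r.contains(x, z)) {
                hits.add(r);
            }
        }
        return hits;
    }
}
```

Very large regions occupy many buckets, which is the memory side of the trade-off; using coarser cells (a larger shift than 4) would reduce that at the cost of longer buckets.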
    #2 xTrollxDudex, Feb 8, 2020
    Last edited: Feb 8, 2020
  3. Wow! This is one of the best responses I have ever received on this forum. Thank you very much, this answered my question and then some. It never really clicked in my head that thousands of chunks were already being stored in memory as is, and I feel as if that point really made it clear that I will opt for the preload route.

    Once again, thank you!
  4. md_5

    Administrator Developer

    Keep it in memory. Even 100,000 regions would only take a couple of megabytes and would be several orders of magnitude faster than reading off disk (especially on hard drives).