Weird problem! Urgent

Discussion in 'Performance Tweaking' started by matgsan, May 22, 2017.

  1. Hey guys,
    I own a network with around 250 players. I have a dedicated server with an i7-4790K and 32 GB of RAM.
    On this dedicated server I host a BungeeCord proxy, a hub server, a minigame hub server, a PvP server and 20 minigame servers. I have been doing this for 2 months and never had a problem.
    I always monitor the server's CPU and memory usage with htop, and it is always around 70% CPU and 50% RAM.
    This weekend I went on a trip with my gf, and when I got back my staff and my players said the server had been lagging the whole time.
    I ran some tests, and the conclusion is this:
    When I join any server, it takes around 10 seconds before I can use, for example, the teleport compass (interact event), before my scoreboard starts updating, and so on.
    After those 10 seconds everything works without any problem.
    When I choose to go to my PvP server, it takes around 30 seconds to download terrain, and after I join it takes around 10 seconds again before things start working for me.

    The TPS of every server is between 19.90 and 20, and the ping is the same as always.
    So I think the problem is in the dedicated server and not in Spigot or BungeeCord.

    Does anyone know why this is happening?
    Any suggestions for what kind of diagnostics to run?

    Sorry for my bad English.
  2. electronicboy

    IRC Staff

    TPS is generally a rough measurement of performance, especially if the tps command happens to be provided by a plugin (some plugins loved to provide their own tps command "back in the day", which isn't really efficient). Beyond that, timings is generally nice but won't cover CPU usage as a whole, and RAM isn't something you should really concern yourself with unless you understand how the JVM works.

    Beyond that, it's really about looking into things, e.g. the connection between your player and the server. Is BungeeCord on the same machine? Are you using the loopback interface (e.g. servers on localhost)?
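
    For the loopback question, here is a minimal sketch that checks whether a list of backend addresses points at the loopback interface. The BACKENDS list is hypothetical; in practice you would copy the host:port values from the servers section of your BungeeCord config.yml. It only handles IPv4 literals and hostnames, not bracketed IPv6.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if a "host:port" backend address points at loopback."""
    host, sep, _port = address.rpartition(":")
    if not sep or not host:
        raise ValueError("expected host:port, got %r" % address)
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP, so resolve the hostname first.
        ip = ipaddress.ip_address(socket.gethostbyname(host))
    return ip.is_loopback


# Hypothetical backend list: replace with the addresses from your own config.
BACKENDS = {
    "hub": "127.0.0.1:25566",
    "pvp": "127.0.0.1:25567",
}

if __name__ == "__main__":
    for name, address in BACKENDS.items():
        kind = "loopback" if is_loopback(address) else "NOT loopback"
        print("%-10s %-22s %s" % (name, address, kind))
```

    If a backend shows up as not loopback even though everything runs on one box, the proxy is talking to it over the external interface, which is worth ruling out.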

    Diagnosing performance issues can be a bit complicated and long-winded.
  3. Well, I ran more tests since I created the thread, and it is probably a machine problem: the connection is the same as it has been since I bought this dedicated server, the TPS is Spigot's own TPS and it is 19.90~20, and Bungee and the servers are on the same machine. I just created a Bungee instance without plugins and a server without plugins, and I am still getting the lag on join. It seems that the player only really becomes a player after 10 seconds.
  4. If you have that many servers and run things like MySQL data storage for leaderboards, rollback stuff, etc., maybe run optimize and repair on the database tables. There might be too much overhead, and it may be causing corruption.

    In phpMyAdmin you can select all the tables in a database; at the bottom there is a dropdown with an optimize option. Once that has finished running, do the same with repair.

    Try again after that.

    It could also be that something like a /tmp directory quota has been reached, or that sectors on the drive are being written and read far too often and the drive needs a bit of partitioning (far-fetched, I know). But don't forget to check the quotas on your dirs and partitions, etc.

    Perhaps also check the plugins directories: see how big each directory is. Maybe one of them has an issue, or just a huge cache that takes way too long to read in.
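
    A rough sketch of the disk checks above: it reports how full a partition is and lists the largest subdirectories of a folder. The default folder name "plugins" and the /tmp path are only assumptions, so point it at your own server directories. Note that shutil.disk_usage reports partition usage, not per-user quotas, so a quota limit would still need checking with your system's own tools.

```python
import os
import shutil
import sys


def dir_size_bytes(path):
    """Return the total size of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked data is not counted.
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_subdirs(parent, top=10):
    """Return (name, size_in_bytes) for the largest direct subdirectories."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


def percent_used(path):
    """Return how full the partition holding path is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Assumed default; pass your real plugins directory as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "plugins"
    print("/tmp partition is %.1f%% full" % percent_used("/tmp"))
    for name, size in largest_subdirs(target):
        print("%10.1f MiB  %s" % (size / (1024 * 1024), name))
```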

    Lag could also be a connection issue: maybe the communication between the servers and their services/daemons is on hold, timing out, or slow for some reason. Double-check that the services and daemons are running, that their ports are open and reachable internally, that the servers can use them, and that data is actually being stored and retrieved.
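
    To check the ports from the machine itself, something like this sketch could help. It tries a plain TCP connect to each service and times it. The SERVICES table is hypothetical (25565 and 3306 are just the usual Minecraft and MySQL defaults), so swap in the ports your network really uses. A successful connect only shows the port is open, not that the service behind it is healthy.

```python
import socket
import time


def check_port(host, port, timeout=3.0):
    """Try a TCP connect; return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        reachable = False
    return reachable, time.monotonic() - start


# Hypothetical service list: replace with your own hosts and ports.
SERVICES = {
    "bungeecord": ("127.0.0.1", 25565),
    "mysql": ("127.0.0.1", 3306),
}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        ok, took = check_port(host, port)
        state = "open" if ok else "UNREACHABLE"
        print("%-12s %s:%d  %s  (%.3f s)" % (name, host, port, state, took))
```

    A local connect that is open but takes seconds rather than milliseconds would point at the machine rather than at the Minecraft servers.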

    I dunno, maybe a few of these things will by chance make you find something you didn't know about before and help solve it. Sorry if it was useless.
  5. Did the SQL thing and it's still the same.
    "some plugins directory": the server with no plugins has the same problem.

    Do you know about traceroute? I think it is a connection issue but I can't be sure.
  6. Yeah, I understand networking, the TCP/IP stack, etc.

    A traceroute is tough, because usually an internet exchange point lets packets through and returns a response, but a lot of the time you get a timeout or a * once the trace is about to hit the transit in front of the hosting provider you're on, and the hops show up again once it gets inside the provider's network and can respond. So you get a false positive, as if the endpoint were not responsive when it actually is. There can also be confusion the other way, like Cloudflare letting the trace through while the endpoint isn't actually there.
  7. So how can I find out if it is a connection problem?
    Btw, can you add me on Skype?
  8. You can try a triangle approach: get a connection from various places within the same country as well as from, say, Australia and Europe, or other continents, then do a short ping with big packets that takes some time (just ping a few times, not 100 times haha) and a traceroute from the different locations.
    You might notice where it chokes. Use a verbose mode if you can, so you can see the route (where possible).

    Sometimes switching to a VPN that lets you change location might also help. Sometimes the routing to the EU is what's screwing things up: if you are in the US, pick an Amsterdam connection and check the Amsterdam->EU connection, and it might not lag.

    A bottleneck in a choking or congested route might be the cause, but if everybody experiences this and it's only on join, it does sound like the issue is with the server itself, not the route to it.
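
    If you want numbers you can compare between locations, here is a small sketch. It is not an ICMP ping or a traceroute: it swaps in timed TCP connects to the game port, which works without special privileges. Run it from each location (or VPN exit) and compare the summaries. The host and port come from the command line, and the fallback port 25565 is just the usual Minecraft default, so treat it as an assumption.

```python
import socket
import statistics
import sys
import time


def connect_times(host, port, count=5, timeout=5.0):
    """Time `count` TCP connects in ms; failed attempts are stored as None."""
    samples = []
    for _ in range(count):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.monotonic() - start) * 1000.0)
        except OSError:
            samples.append(None)
        time.sleep(0.2)  # short pause so attempts are not back to back
    return samples


def summarize(samples):
    """Reduce a list of samples to sent/failed counts and min/avg/max."""
    good = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "failed": len(samples) - len(good),
        "min_ms": min(good) if good else None,
        "avg_ms": statistics.mean(good) if good else None,
        "max_ms": max(good) if good else None,
    }


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: python connect_times.py HOST [PORT]")
    host = sys.argv[1]
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 25565  # assumed default
    print(summarize(connect_times(host, port)))
```

    If every location shows similar, low connect times while the 10 second delay on join stays, that supports the idea that the route is fine and the problem sits on the server.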
  9. Well, please add me on Skype so we can talk more regularly, and I will let you join my server so you can see what is happening.
    When you say "it does sound like the issue is with the server itself", do you mean the Minecraft server or the dedicated server? Because it isn't the Minecraft server: I am also getting lag when connecting over SSH.
  10. I am sorry, I do not add people on Skype.
  11. Discord then, or can you join my TeamSpeak?
  12. I sent a PM on here, and I have to get on a bus soon. I can't be everybody's personal IT guy, but I don't mind taking a quick look if I have the server host.