Running Folia on a 96-Core EPYC CPU

Recently, I had the privilege of deploying Folia, a Minecraft server software, on a 96-core EPYC CPU. Check out my experience and observations in this blog post.

Introduction

On July 14th, I hosted a Minecraft Bingo Event using Folia for SpeedSilver's community. Unlike the last test I conducted, this was by no means a realistic test. However, I hoped to push Folia to its limits and see what's possible.

Configuration

Warning: Use the configurations shown in this article at your own risk. Conduct your own research before applying them!

Basic overview of the infrastructure.

UltraServers kindly provided me with the hardware and infrastructure for this test. Three dedicated servers were set up for the test.

Disclaimer: UltraServers has no association or affiliation whatsoever with either myself or SpeedSilver.

Main Server

Hardware

Software

Minecraft

config/paper-global.yml:

chunk-loading-basic:
  player-max-chunk-generate-rate: 40.0
  player-max-chunk-load-rate: 40.0
  player-max-chunk-send-rate: 40.0
chunk-system:
  io-threads: 30
  worker-threads: 10
misc:
  region-file-cache-size: 512
proxies:
  velocity: # 'secret' is redacted
    enabled: true
    online-mode: true
thread-regions:
  threads: 70

config/paper-world-defaults.yml:

environment:
  treasure-maps:
    enabled: false

spigot.yml:

settings:
  netty-threads: 50

bukkit.yml:

settings:
  connection-throttle: -1
spawn-limits:
  monsters: 9
  animals: 7
  water-animals: 4
  water-ambient: 7
  water-underground-creature: 3
  axolotls: 3
  ambient: 4
ticks-per:
  monster-spawns: 30
  water-spawns: 30
  water-ambient-spawns: 30
  water-underground-creature-spawns: 30
  axolotl-spawns: 30
  ambient-spawns: 30

server.properties:

hide-online-players=true
max-players=5000
network-compression-threshold=-1
spawn-protection=0
simulation-distance=5
view-distance=8
online-mode=false

Network compression was disabled on the Folia server since Velocity was used.
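
To get a feel for what the view-distance, simulation-distance and chunk send rate settings above mean per player, here is a rough back-of-the-envelope sketch. It assumes a full square of chunks around each player, which is a simplification (the area the server actually loads may be smaller), and the function name is purely illustrative.

```python
# Values copied from server.properties and config/paper-global.yml above.
VIEW_DISTANCE = 8
SIMULATION_DISTANCE = 5
CHUNK_SEND_RATE = 40.0  # player-max-chunk-send-rate, in chunks per second


def chunks_in_square(radius: int) -> int:
    """Chunks in a (2r + 1) x (2r + 1) square around the player's chunk.

    Assumption: a full square is loaded; the real area may be smaller.
    """
    side = 2 * radius + 1
    return side * side


visible_chunks = chunks_in_square(VIEW_DISTANCE)        # 17 * 17 = 289
ticking_chunks = chunks_in_square(SIMULATION_DISTANCE)  # 11 * 11 = 121

# Time to send a complete view at the configured maximum send rate.
seconds_to_fill_view = visible_chunks / CHUNK_SEND_RATE

print(f"visible chunks per player: {visible_chunks}")
print(f"ticking chunks per player: {ticking_chunks}")
print(f"seconds to fill the view:  {seconds_to_fill_view:.1f}")
```

Under these assumptions, a freshly joined player needs a little over seven seconds to receive their full view, and fewer than half of the visible chunks are actually ticked.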

JVM Flags

-Xms500G
-Xmx500G
-XX:+AlwaysPreTouch
-XX:+UnlockDiagnosticVMOptions
-XX:+UnlockExperimentalVMOptions
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseLargePages
-XX:LargePageSizeInBytes=2M
-XX:+UseShenandoahGC
-XX:ShenandoahGCMode=generational
-XX:-ShenandoahPacing
-XX:+ParallelRefProcEnabled
-XX:ShenandoahGCHeuristics=adaptive
-XX:ShenandoahInitFreeThreshold=55
-XX:ShenandoahGarbageThreshold=30
-XX:ShenandoahMinFreeThreshold=20
-XX:ShenandoahAllocSpikeFactor=10
-XX:ParallelGCThreads=10
-XX:ConcGCThreads=5
-Xlog:gc*:logs/gc.log:time,uptime:filecount=15,filesize=1M
-Dchunky.maxWorkingCount=3600
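
As a quick sanity check, the thread pools configured above can be added up and compared against the 192 hardware threads of the 96-core CPU. This is only a sketch: it treats every pool as if it were fully busy at the same time, which is a worst-case assumption, and it ignores any other threads the JVM or operating system may run.

```python
# Thread counts copied from the configuration files and JVM flags above.
configured_threads = {
    "chunk-system io-threads": 30,
    "chunk-system worker-threads": 10,
    "thread-regions threads": 70,
    "netty-threads": 50,
    "ParallelGCThreads": 10,
    "ConcGCThreads": 5,
}

# 96 cores with SMT enabled expose 192 hardware threads.
HARDWARE_THREADS = 192

total_configured = sum(configured_threads.values())
headroom = HARDWARE_THREADS - total_configured

for name, count in configured_threads.items():
    print(f"{name:30s} {count:4d}")
print(f"{'total':30s} {total_configured:4d}")
print(f"{'headroom':30s} {headroom:4d}")
```

With these numbers the pools add up to 175 threads, leaving 17 hardware threads of headroom even in the worst case.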

2x Proxy Servers

Hardware

Software

Velocity

show-max-players = 5000
player-info-forwarding-mode = "modern"
kick-existing-players = true

[advanced]
compression-threshold = -1

Network compression was disabled to save CPU resources. The proxy nodes had ample network capacity relative to their CPU capacity, so trading bandwidth for CPU time made sense. Players connected through TCPShield, which handled the compression.

Methodology

The server was prepared with a pre-generated world of 200k x 200k blocks, a fourfold increase from the last test. This was done to accommodate more players, but unfortunately that turned out not to matter.
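
To put the size of that world into perspective, the sketch below converts it into chunks and region files. It relies on chunks being 16 x 16 blocks and region files covering 32 x 32 chunks, and it assumes the world is a square aligned to region boundaries, so the real number of region files may be slightly higher.

```python
import math

WORLD_SIDE_BLOCKS = 200_000
CHUNK_SIDE_BLOCKS = 16     # a chunk is 16 x 16 blocks
REGION_SIDE_CHUNKS = 32    # a region file covers 32 x 32 chunks

chunks_per_side = WORLD_SIDE_BLOCKS // CHUNK_SIDE_BLOCKS
total_chunks = chunks_per_side ** 2

# Assumption: the world edge lines up with region boundaries.
regions_per_side = math.ceil(chunks_per_side / REGION_SIDE_CHUNKS)
total_region_files = regions_per_side ** 2

print(f"chunks per side:    {chunks_per_side:,}")
print(f"total chunks:       {total_chunks:,}")
print(f"region files/side:  {regions_per_side:,}")
print(f"total region files: {total_region_files:,}")
```

That works out to roughly 156 million chunks spread over about 150 thousand region files.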

Diagram showing the 81 spawn points.

The players were spread into 81 different spawn areas. The premise of the event was an achievement bingo, allowing all players to compete against one another. The server's difficulty was set to easy.
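
One simple way to lay out 81 spawn areas is an evenly spaced 9 x 9 grid. The sketch below is a hypothetical illustration of that idea, not the layout used for the event: it assumes the 200k x 200k world is centered on the origin, and the `spawn_grid` function is made up for this example.

```python
def spawn_grid(world_size: int = 200_000, per_side: int = 9) -> list[tuple[int, int]]:
    """Return (x, z) centers of a per_side x per_side grid of equal cells.

    Assumption: the world is a square centered on (0, 0).
    """
    cell = world_size / per_side
    half = world_size / 2
    return [
        (round(-half + cell * (col + 0.5)), round(-half + cell * (row + 0.5)))
        for row in range(per_side)
        for col in range(per_side)
    ]


points = spawn_grid()
print(len(points))   # number of spawn points
print(points[0])     # one corner cell
print(points[40])    # the middle cell of the 9 x 9 grid
```

With these assumptions, neighboring spawn points end up about 22,000 blocks apart, far enough that players from different areas rarely share chunks early on.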

Results

The event started around 17:00 UTC. The Folia server peaked at 501 players, approximately half as many as in the previous test. As expected, the server handled them with no lag.

A Grafana screenshot showing basic server performance and player flow.

Folia used around 37 threads at its peak. Given that the AMD EPYC™ 9654 has 96 cores and 192 threads, the server should be able to handle at least 1336 players if performance scales linearly, and up to 2582 players if Folia can take full advantage of SMT. These figures are hypothetical, and real-world results will most likely differ.
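
The linear extrapolation can be sketched as follows. This version uses the rounded peak of 37 threads, so it lands on slightly different numbers than the ones quoted above, which presumably come from the unrounded metrics. The function name is illustrative only.

```python
PEAK_PLAYERS = 501
PEAK_THREADS = 37        # rounded peak thread usage
PHYSICAL_CORES = 96
HARDWARE_THREADS = 192   # with SMT


def extrapolate_players(players: int, threads_used: float, threads_available: int) -> float:
    """Scale the player count linearly with the number of available threads."""
    return players / threads_used * threads_available


# One busy thread per physical core (no benefit from SMT assumed).
players_cores_only = extrapolate_players(PEAK_PLAYERS, PEAK_THREADS, PHYSICAL_CORES)

# Every hardware thread counted as a full core (ideal SMT scaling assumed).
players_with_smt = extrapolate_players(PEAK_PLAYERS, PEAK_THREADS, HARDWARE_THREADS)

print(round(players_cores_only))  # 1300
print(round(players_with_smt))    # 2600
```

Both results depend entirely on the linear-scaling assumption, which ignores contention for memory bandwidth, cache and I/O as the player count grows.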

A Grafana screenshot showing CPU usage across the servers.

Chunk IO and Netty threads were mostly idle. Unsurprisingly, tick threads consumed the most resources, peaking at around 20 threads. Chunk workers followed, peaking at 10 threads, despite the world being pre-generated. I speculate that this was mainly caused by region saving.

A Grafana screenshot showing detailed CPU usage for the Folia server.

Folia's network usage peaked at around 1.21 Gb/s outbound and 38.12 Mb/s inbound. Packet rates peaked at 153.77 kp/s outbound and 73.19 kp/s inbound.
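
These peaks can be broken down into rough per-player and per-packet figures. The sketch below assumes decimal units (1 Gb = 10^9 bits) and treats the throughput, packet rate and player peaks as if they occurred at the same moment, which may not be exactly true.

```python
PEAK_PLAYERS = 501
OUT_BITS_PER_S = 1.21e9       # 1.21 Gb/s outbound
IN_BITS_PER_S = 38.12e6       # 38.12 Mb/s inbound
OUT_PACKETS_PER_S = 153.77e3  # 153.77 kp/s outbound
IN_PACKETS_PER_S = 73.19e3    # 73.19 kp/s inbound


def avg_packet_bytes(bits_per_s: float, packets_per_s: float) -> float:
    """Average packet size in bytes for a given throughput and packet rate."""
    return bits_per_s / 8 / packets_per_s


out_mbps_per_player = OUT_BITS_PER_S / PEAK_PLAYERS / 1e6
avg_out_bytes = avg_packet_bytes(OUT_BITS_PER_S, OUT_PACKETS_PER_S)
avg_in_bytes = avg_packet_bytes(IN_BITS_PER_S, IN_PACKETS_PER_S)

print(f"outbound per player: {out_mbps_per_player:.2f} Mb/s")
print(f"avg outbound packet: {avg_out_bytes:.0f} bytes")
print(f"avg inbound packet:  {avg_in_bytes:.0f} bytes")
```

Keep in mind that this traffic was measured with network compression disabled, so the outbound figures are higher than what players would see on their own connections.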

A Grafana screenshot showing network consumption.

Using the metrics gathered throughout this test, the estimated minimum thread counts are calculated as follows:

Be aware that these numbers are specific to this test and the hardware used. This is not a general recommendation and your mileage may vary.

A Grafana screenshot showing the correlation between player count and thread usage.

Conclusion

Everything ran smoothly without any hiccups, as expected. Whether the estimated figures hold up in practice is hard to determine, and the point of diminishing returns for Folia remains uncertain. Further testing is required to pin down both.

Want to learn more? Take a look at the previous test.

Links

Supporting Folia & PaperMC

Interested in supporting the development of Folia and PaperMC software? See sponsors.

Written by Cubxity

Full-stack developer
