What’s Important for Kubernetes Performance
The way Kubernetes works “under the hood” dictates which components are more important for improving performance and which are less important. So let’s talk about the Kubernetes internals first.
The most important thing you need to know about Kubernetes is that it follows a “hub-and-spoke” architecture. There’s a control plane, which includes an API server and runs on the master nodes, and there are worker nodes constantly communicating with that API. But that’s not all.
Kubernetes is much more than just an API server. To keep track of all the resources and configurations, it uses (by default) an etcd server. It also has a few more components: a scheduler, which needs to constantly keep track of pods and nodes; kubelet, which acts as a “node manager”; and kube-proxy, which acts as a “router” for the nodes. We don’t need to cover all of them here, but the point is that Kubernetes needs to perform a large number of small tasks.
Improving Kubernetes Together With Your Application
Improving the performance of your cluster must include changes to the way you build your containers. Otherwise, you won’t see much of an improvement by only focusing on Kubernetes itself. Kubernetes is a distributed system, so if you have it perfectly optimized but your container image is very heavy, for example, the deployment will take a long time anyway.
Now that you get the general idea of Kubernetes internals, let’s start thinking about how to improve its performance.
Disk IOPS and Networking
The first thing to consider is the server type for Kubernetes nodes. Of course, the number of CPU cores and the amount of RAM are important, but you shouldn’t neglect storage and networking, especially when running Kubernetes in the cloud. Cloud providers often default to “basic” storage options unless you specify otherwise.
“Everything is a file” in the Linux world, even some network operations. And since Kubernetes needs to keep track of a large number of pods, nodes, iptables rules, and configurations, it performs a lot of I/O operations. Some cloud providers scale disk IOPS with the disk size, for example. Therefore, for Kubernetes performance, it can pay off to use bigger disks even if you don’t need the space. At the very least, don’t use the smallest disks possible.
The same applies to networking. Whenever possible, try to use upgraded/premium networking options. The less latency between the nodes and the API server, the better, especially in bigger clusters.
Node Operating System
You may be thinking the operating system (OS) doesn’t have much impact on Kubernetes performance. And in fact, for day-to-day Kubernetes operations, it doesn’t. But running Kubernetes also includes adding and removing nodes (due to the cluster autoscaler) or restarting them (to perform an upgrade, for example). If we’re talking about clusters with dozens of nodes, the time needed for the nodes to boot up can quickly add up.
If you want to react quickly to spikes in traffic, you should pick a minimalistic container-optimized OS (for example, k3OS or Flatcar Container Linux). These are significantly smaller than traditional operating systems. Since all you care about is running containers, you don’t need anything from your OS other than the ability to run them, and these operating systems are optimized to do exactly that.
Fewer But Larger or More But Smaller
The next thing to consider is whether to choose fewer, bigger servers or more, smaller ones. One of the advantages of running a containerized application is that you can deploy more than one pod per application. By default, Kubernetes will try to spread these pods evenly across all nodes. So if a node failure occurs, your application shouldn’t experience downtime (because another pod of the application should be running on a different node). This suggests that many small nodes are better, because they reduce the number of pods affected by a single node’s downtime. If you have, for example, 300 pods in total and you spread them across three big nodes, you’ll lose around 100 pods on a node failure. If you spread them across 30 nodes instead, you’ll only lose about 10.
This calculation, however, is oversimplified and leaves out something important. Remember, each node uses some amount of RAM and CPU to run its OS and the Kubernetes components. Therefore, the more nodes you have, the more CPU and RAM you “waste” on overhead. To summarize: for high availability and reliability, it’s better to run more, smaller nodes rather than fewer, larger ones, but you shouldn’t take this rule to the extreme.
On top of the tips above about cloud optimization (premium disks and networking), you should look for options specific to your cloud. For example, if you’re using other services from the same cloud provider, it’s often possible to avoid connecting to them over the internet and use your cloud provider’s backbone network instead. This is useful for container registry connections, for example (if you’re using one from your cloud provider).
Now that we have the infrastructure level covered, let’s focus on Kubernetes.
Optimized Docker Images
Optimizing Docker images is a topic of its own. But the most significant improvement comes from using slim base images for your containers. For example, you can use ruby:alpine instead of ruby. This makes all deployments faster, which can matter when coping with a sudden increase in load. Also, make sure the application inside the container properly handles the SIGTERM signal. If it doesn’t, your pod will get stuck in the “Terminating” state until the grace period expires (30 seconds by default). In some cases, this can block your deployments.
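As a minimal sketch, both points can be addressed in the Dockerfile itself. This assumes a hypothetical Ruby application served by Puma; the image tag and file names are illustrative:

```dockerfile
# Slim Alpine-based base image instead of the full ruby image
FROM ruby:3.2-alpine

WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle install
COPY . .

# Exec form: the app runs as PID 1 and receives SIGTERM directly.
# The shell form (CMD bundle exec puma) would wrap it in /bin/sh,
# which doesn't forward the signal, leaving the pod stuck in "Terminating".
CMD ["bundle", "exec", "puma"]
```

The exec-form CMD is what lets the grace period work as intended: Kubernetes sends SIGTERM, the app shuts down cleanly, and the pod terminates without waiting for the 30-second timeout.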
Requests and Limits
You may be wondering what setting requests and limits has to do with performance. Quite a lot, actually. The main job of Kubernetes is to find an appropriate node for your pod, instruct that node to run the pod, and keep track of it (for example, to restart it when it crashes). Without requests and limits set, the Kubernetes scheduler is “blind” and can only place pods on nodes more or less arbitrarily.

Imagine the following example. You have a pod that requires a lot of RAM, but it takes a while to allocate all of it. Without a request set, Kubernetes won’t know how much RAM the pod will need and may schedule it on a node with little free RAM left, causing the pod to crash shortly after. In general, without requests, Kubernetes will shuffle your pods between nodes much more often, which is something you want to avoid. If you set requests, on the other hand, Kubernetes knows exactly how to distribute your pods across the nodes in the most efficient way. Additionally, without limits set, one pod with a memory leak can take down all the other pods on its node.
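A minimal sketch of what this looks like in a pod spec (names and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-hungry-app        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      resources:
        requests:
          cpu: "500m"       # what the scheduler uses to pick a node
          memory: "2Gi"
        limits:
          cpu: "1"
          memory: "4Gi"     # a leaking container is OOM-killed at this cap
```

The requests drive scheduling decisions; the limits cap what the container can consume at runtime, so a misbehaving pod can’t starve its neighbors.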
Separate the Most Resource-Consuming Pods
Setting requests and limits is a great first step, but it doesn’t solve all the problems the Kubernetes scheduler may run into. What if you have a few important and resource-consuming applications and a lot of smaller microservices? The best strategy is to separate these two. You can do this with node affinities, pod priorities, and pod affinities.
With node affinities, you can instruct the Kubernetes scheduler to run resource-hungry pods on specific nodes. This helps you avoid situations where these pods cause the eviction of other pods. You can even add different servers to your cluster (with faster CPUs, for example) to use for these applications.
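Sketched in a pod spec, this might look as follows; the node-class label and its value are hypothetical and would need to be applied to your nodes first:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: heavy-worker               # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-class          # hypothetical label on the fast nodes
                operator: In
                values: ["high-cpu"]
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # placeholder image
```

With requiredDuringSchedulingIgnoredDuringExecution, the pod will only ever land on nodes carrying the label; a preferred... variant would make it a soft preference instead.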
Similarly, pod priorities allow you to specify the “importance” of the pods. So when Kubernetes has to evict some pods from the node, it will first try to do so with pods that have a lower priority.
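A sketch of a priority class (the name, value, and description are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services       # hypothetical name
value: 100000                   # higher value = evicted later
globalDefault: false
description: "For the important, resource-hungry applications."
```

Pods then opt in by setting priorityClassName: critical-services in their spec; under resource pressure, the scheduler preempts lower-priority pods first.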
What if you have two (or more) pods needing to talk to each other often? You can use pod affinity, which allows you to specify that some pods should be scheduled close together (in the same zone or group of nodes, for example). Putting them closer together helps decrease the network latency between them.
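As a sketch, a pod that prefers to land in the same zone as the pods it talks to could carry an affinity rule like this (the app: cache label is a hypothetical label on the peer pods):

```yaml
spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: cache          # hypothetical label of the peer pods
            topologyKey: topology.kubernetes.io/zone   # "close" = same zone
```

The topologyKey defines what “close” means; using a hostname key instead of a zone key would co-locate the pods on the very same node.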
Taints and Tolerations
Imagine you have some important pods running long-term bulk operations. You don’t want any disruptions to these pods; therefore, it would be best to keep the nodes where they’re running exclusive to them. Sounds like node affinity? Not quite. With node affinity, you make sure specific pods run on specific nodes, but it doesn’t prevent other pods from being scheduled on the same nodes (if there are enough free resources). Taints, on the other hand, prevent pods from being scheduled on a node unless they have a toleration for the specific taint.
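As a sketch, you could taint the dedicated nodes with a command like kubectl taint nodes node-7 dedicated=bulk-jobs:NoSchedule (node name and key are hypothetical), and then let only the bulk-operation pods tolerate it:

```yaml
spec:
  tolerations:
    - key: "dedicated"         # must match the taint key
      operator: "Equal"
      value: "bulk-jobs"       # must match the taint value
      effect: "NoSchedule"
```

Every other pod lacks this toleration, so the scheduler keeps the tainted nodes free for the bulk jobs. Combining the taint with a node affinity rule ensures the bulk jobs also don’t run anywhere else.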
percentageOfNodesToScore

Every time a pod needs to be scheduled, Kubernetes checks the nodes for available resources. In a cluster with hundreds of nodes, this can become a bottleneck. Imagine you have 100 nodes, and you want to schedule a pod needing one CPU and 1GB of RAM. Even if the first node checked has enough resources available, the scheduler won’t pick it before scoring all the other feasible nodes. With percentageOfNodesToScore, you can instruct the scheduler to only consider a certain percentage of nodes (for example, 25% of all nodes) before picking the best one.
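This is a kube-scheduler setting, so it mostly applies to self-managed clusters where you control the scheduler’s configuration file (passed via the scheduler’s --config flag). A minimal sketch:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
percentageOfNodesToScore: 25   # stop searching once 25% of nodes are scored
```

The trade-off is speed versus placement quality: the scheduler decides faster, but it may miss a slightly better node outside the sampled subset.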
You now know how to improve the performance of your Kubernetes cluster. To actually see the improvement, though, you need to have a good monitoring system. You’ll see decreased deployment times, quicker reactions to load spikes, fewer pod restarts, etc.
But it may be a bit difficult to see all the improvements because some of them are on the infrastructure level, some are on Kubernetes, and some are on the application level. Traditionally, this would require you to gather data from different monitoring systems. Here’s an extra performance tip for your productivity: invest in a system capable of monitoring all the layers. SolarWinds® Observability can do this for you. It offers infrastructure and application performance monitoring in one place. Due to the correlation possibilities, it can help you find out where your performance is lacking and where it’s good.
Kubernetes drives innovation and helps you be more productive. To innovate faster and be more productive, you should make sure there are no bottlenecks in your Kubernetes cluster performance. An easy way to find these bottlenecks is to use SolarWinds Observability, a monitoring solution designed to combine all the layers of your cluster in one system.
This post was written by Dawid Ziolkowski. Dawid has 10 years of experience as a network/system engineer, has DevOps experience, and has recently worked as a cloud-native engineer. He’s worked for an IT outsourcing company, a research institute, a telco, a hosting company, and a consultancy company, so he’s gathered a lot of knowledge from different perspectives. Nowadays, he’s helping companies move to the cloud and/or redesign their infrastructure for a more cloud-native approach.