Scaling in Kubernetes
Scalability is the ability of an application to handle a growing amount of load without degraded performance and without requiring any architectural or software redesign. This load typically comes from many users using the application simultaneously, from transaction volumes beyond what the application was sized to handle, and from network latency or congestion.
Coming back to the topic, Kubernetes plays a major role when it comes to scaling applications.
Applications can be scaled both horizontally and vertically.
Vertical scaling means increasing the capacity of the application by adding more resources or upgrading the existing resources.
Horizontal scaling, on the other hand, means growing the infrastructure by adding more instances and distributing the workload evenly among them.
If the application is stateless, the application can be horizontally scaled.
Stateless applications are the ones that don’t have any state: they don’t write local files and don’t keep local sessions. For more on stateless vs. stateful applications, you can read one of my previous blogs, here.
Traditional databases are stateful: they have database files that can’t simply be split over multiple instances.
Most web applications can be made stateless.
Session management needs to be done outside the container.
Any files that need to be kept shouldn’t be saved locally on the container.
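To make the point about keeping sessions outside the container concrete, here is a minimal sketch in Python. The names `ExternalSessionStore` and `AppInstance` are hypothetical, and the in-memory dictionary only stands in for a real shared service; the idea is simply that any instance can serve any request because no instance holds the session itself.

```python
class ExternalSessionStore:
    """Stand-in for a shared store that lives outside the containers."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return the stored session, or an empty one for a new visitor.
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class AppInstance:
    """A stateless application instance (hypothetical, for illustration)."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # no session data is kept on the instance

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["visits"] = session.get("visits", 0) + 1
        self.store.put(session_id, session)
        return session["visits"]


store = ExternalSessionStore()
pod_a = AppInstance("pod-a", store)
pod_b = AppInstance("pod-b", store)

first = pod_a.handle("user-42")   # request lands on one instance
second = pod_b.handle("user-42")  # next request lands on another
print(first, second)
```

Because both instances read and write the same store, the visit counter keeps increasing no matter which instance receives the request, which is what makes adding more replicas safe.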
Scaling in K8s can be done using a Replication Controller.
The replication controller (kind: ReplicationController) ensures that a specified number of pod replicas is running at any time.
Pods created by a replication controller are automatically replaced if they fail, get deleted or are terminated.
Using the replication controller is recommended even if you just want to make sure 1 pod is always running, even after reboots.
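The behaviour described above can be pictured as a loop that keeps comparing the desired replica count with what is actually running. The following Python sketch is only a toy model of that idea, not the Kubernetes implementation; the function `reconcile` and the `web-N` pod names are made up for illustration.

```python
def reconcile(running_pods, desired_replicas, name_prefix="web"):
    """Return a pod list that contains exactly desired_replicas pods."""
    pods = list(running_pods)
    counter = 0
    # Too few pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        candidate = f"{name_prefix}-{counter}"
        counter += 1
        if candidate not in pods:
            pods.append(candidate)
    # Too many pods: drop the surplus.
    return pods[:desired_replicas]


pods = reconcile([], desired_replicas=3)     # initial creation
pods.remove("web-1")                         # simulate a pod failing
pods = reconcile(pods, desired_replicas=3)   # the missing pod is replaced
print(sorted(pods))

scaled_down = reconcile(pods, desired_replicas=1)
print(scaled_down)
```

Changing `desired_replicas` is all that scaling amounts to in this model: the same comparison that replaces a failed pod also adds or removes pods when the target changes.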
kubectl scale --replicas=4 -f <yaml_file>
The above command changes the number of pod replicas managed by the resource defined in the YAML file.
To emphasise again, pods can only be horizontally scaled this way if they are stateless.
If they are stateful, they can’t simply be scaled by adding replicas.
The desired number of replicas is part of the cluster state, which K8s stores in etcd, its backend datastore.
ReplicaSet is the next-generation Replication Controller.
It supports a new, set-based selector that selects pods whose label value belongs to a given set of values.
e.g. “environment” is either “dev” or “qa”
ReplicaSets are used by Deployment objects.
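To illustrate what set-based selection means, here is a small Python sketch that evaluates a single requirement against a pod's labels. The function `matches` and the sample pods are hypothetical; only the `In` and `NotIn` operators are modelled, under the assumption that `NotIn` also matches pods that lack the label key.

```python
def matches(labels, key, operator, values):
    """Evaluate one set-based requirement against a pod's labels."""
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # Pods without the key are also selected by NotIn.
        return key not in labels or labels[key] not in values
    raise ValueError(f"unsupported operator: {operator}")


# Hypothetical pods and their labels.
pods = {
    "api-1": {"environment": "dev"},
    "api-2": {"environment": "qa"},
    "api-3": {"environment": "prod"},
}

selected = [
    name
    for name, labels in pods.items()
    if matches(labels, "environment", "In", {"dev", "qa"})
]
print(selected)
```

An equality-based selector can only express "environment equals dev"; the set-based form covers "dev or qa" with a single requirement.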
K8s can automatically scale pods based on metrics.
K8s can automatically scale a Deployment, Replication Controller or ReplicaSet.
Autoscaling will periodically query the utilization for the targeted pods.
The interval is 30 seconds by default in older releases (newer releases default to 15 seconds) and can be changed using the “--horizontal-pod-autoscaler-sync-period” flag when launching the controller manager.
Autoscaling will use a monitoring tool to gather its metrics and make scaling decisions.
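The scaling decision itself boils down to a ratio between the observed metric and the target. The Python sketch below follows the formula given in the Kubernetes documentation for the Horizontal Pod Autoscaler, desired = ceil(current replicas × current metric / target metric); the function name, the replica limits and the sample numbers are assumptions for illustration, and the real controller additionally applies a tolerance and stabilization that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified autoscaling calculation (tolerance handling omitted)."""
    ratio = current_metric / target_metric
    desired = math.ceil(current_replicas * ratio)
    # Keep the result within the configured bounds (assumed values).
    return max(min_replicas, min(max_replicas, desired))


# 4 pods at 90% average CPU with a 50% target: scale up.
print(desired_replicas(4, current_metric=90, target_metric=50))

# 4 pods at 20% average CPU with a 50% target: scale down.
print(desired_replicas(4, current_metric=20, target_metric=50))
```

Rounding up means the autoscaler prefers slightly too many pods over too few, and the minimum and maximum bounds stop a metric spike from scaling the workload without limit.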