Restart a Kubernetes Cluster in a Practical Way
As cloud-native technologies continue to gain momentum, developers are focusing more on transforming conventional applications into cloud-native applications, hoping to take advantage of the flexibility and scalability that cloud-native technologies like Kubernetes offer.
Powerful as Kubernetes is, it can still be difficult to operate in practice. Restarting a Kubernetes cluster is one example: the nodes have to be shut down and powered back on in a sensible order. In this article, we'll look at how to restart a Kubernetes cluster in a practical way.
What is a Kubernetes Cluster
A Kubernetes cluster is a combination of nodes that run containerized applications. These nodes can be virtual machines if the cluster is deployed in a cloud environment, or physical machines if the cluster is running in an on-premises environment. A Kubernetes cluster includes at least one control plane and a number of worker nodes. The control plane exposes the Kubernetes API so that the worker nodes can communicate with the control plane.
While the control plane oversees the state of a Kubernetes cluster, worker nodes handle the tasks it assigns and actually run containerized applications in pods. Moreover, pods are not tied to any specific worker node. Kubernetes can schedule them across the cluster according to the declarative YAML manifests to improve stability and efficiency. To learn more about the concept of a Kubernetes cluster, see Cluster Architecture.
Restart a Kubernetes Cluster
This article assumes that you set up your Kubernetes cluster through kubeadm or KubeKey.
Before restarting your cluster, make sure that you have at least backed up etcd, which protects you against the loss of critical data. Next, let's go through the process of restarting a Kubernetes cluster in detail.
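The etcd backup mentioned above can be sketched with etcdctl snapshot save. This is only an illustration under stated assumptions: the endpoint and certificate paths are the kubeadm defaults, the function name backup_etcd and the DRY_RUN switch are hypothetical helpers, and the backup location in the example is made up. Adjust all of them to your own setup.

```shell
#!/bin/sh
# Sketch: take an etcd snapshot before shutting the cluster down.
# Assumptions (not from the article): kubeadm default certificate paths
# and a local etcd endpoint. Override them through the environment.

ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"
ETCD_PKI_DIR="${ETCD_PKI_DIR:-/etc/kubernetes/pki/etcd}"

# backup_etcd <snapshot file>
# With DRY_RUN=1 the etcdctl command is printed instead of executed.
backup_etcd() {
    snapshot_file="$1"
    # Build the etcdctl command line in the positional parameters.
    set -- etcdctl snapshot save "$snapshot_file" \
        --endpoints="$ETCD_ENDPOINT" \
        --cacert="$ETCD_PKI_DIR/ca.crt" \
        --cert="$ETCD_PKI_DIR/server.crt" \
        --key="$ETCD_PKI_DIR/server.key"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        ETCDCTL_API=3 "$@"
    fi
}

# Example (hypothetical backup location):
#   backup_etcd "/var/backups/etcd-$(date +%Y%m%d-%H%M%S).db"
```

Running the function once with DRY_RUN=1 is a cheap way to review the exact command before touching etcd.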
Shut down worker nodes
Connect to a worker node through SSH.
Run the following commands to stop pod scheduling and drain existing pods on the node.
kubectl cordon <worker node name>
kubectl drain <worker node name> --ignore-daemonsets --delete-emptydir-data
Run the following command to stop kubelet.
sudo systemctl stop kubelet
Run the following command to stop Docker.
sudo systemctl stop docker
Run the following command to shut down the worker node.
sudo shutdown now
Perform the same operations on other worker nodes (if any) to shut them down.
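If there are many worker nodes, the cordon and drain step can be looped from any machine where kubectl is already configured. The following is a minimal sketch, not a complete shutdown script: drain_nodes and DRY_RUN are hypothetical names, and the node names in the example are placeholders. Stopping kubelet and Docker and powering off still happen on each node as described above.

```shell
#!/bin/sh
# Sketch: cordon and drain several worker nodes in one go.
# Assumes kubectl on this machine can reach the cluster.

# drain_nodes <node>...
# With DRY_RUN=1 the kubectl commands are printed instead of executed.
drain_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'kubectl cordon %s\n' "$node"
            printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data\n' "$node"
        else
            # Stop at the first failure so a stuck drain is noticed.
            kubectl cordon "$node" || return 1
            kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
        fi
    done
}

# Example with hypothetical node names:
#   drain_nodes worker-1 worker-2
```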
Shut down control planes
Connect to a control plane through SSH.
Run the following commands to stop pod scheduling and drain existing pods on the node.
kubectl cordon <control plane name>
kubectl drain <control plane name> --ignore-daemonsets --delete-emptydir-data
Run the following command to stop kubelet.
sudo systemctl stop kubelet
Run the following command to stop Docker.
sudo systemctl stop docker
(Optional) If etcd is deployed on the control plane as a system service, run the following command to stop the etcd service. If etcd runs as a pod in your Kubernetes cluster, you can skip this step.
sudo systemctl stop etcd
Run the following command to shut down the control plane.
sudo shutdown now
Perform the same operations on other control planes (if any) to shut them down.
(Optional) Shut down etcd nodes
For a Kubernetes cluster deployed by kubeadm, etcd runs as a pod in the cluster and you can skip this step. If you set up your Kubernetes cluster through other methods, you may need to perform the following steps.
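If you are not sure which case applies to a node, a quick check like the one below may help. It is a rough sketch under assumptions: it looks for the static pod manifest that kubeadm writes to /etc/kubernetes/manifests/etcd.yaml by default, and otherwise asks systemd whether a unit named etcd is active. The function name detect_etcd_mode is hypothetical.

```shell
#!/bin/sh
# Sketch: guess how etcd runs on the current node.
# Assumes the kubeadm default manifest path and a systemd unit called "etcd".

# detect_etcd_mode [manifest path]
# Prints "static-pod", "systemd", or "unknown".
detect_etcd_mode() {
    manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
    if [ -f "$manifest" ]; then
        # kubelet runs etcd as a static pod; no separate etcd service to stop.
        echo "static-pod"
    elif command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet etcd; then
        # etcd runs as a system service and must be stopped with systemctl.
        echo "systemd"
    else
        echo "unknown"
    fi
}

# Example:
#   detect_etcd_mode
```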
Connect to an etcd node through SSH.
Run the following command to stop kubelet.
sudo systemctl stop kubelet
Run the following command to stop etcd.
sudo systemctl stop etcd
Run the following command to stop Docker.
sudo systemctl stop docker
Run the following command to shut down the etcd node.
sudo shutdown now
Perform the same operations on other etcd nodes (if any) to shut them down.
Shut down storage
When all the worker nodes, control planes, and separate etcd nodes (if any) are shut down, you can shut down the persistent storage devices (if any).
Restart the Kubernetes cluster
- Power on the persistent storage devices (if any).
- Power on the instances for etcd nodes. You can log in to the etcd nodes and run docker ps (or systemctl status etcd if etcd runs as a system service) to ensure that etcd is up and running.
- Power on the instances for control planes. You can log in to the control planes and run docker ps to ensure that kube-apiserver, kube-controller-manager, and kube-scheduler are up and running.
- Power on the instances for worker nodes. You can log in to the worker nodes and run docker ps to ensure that kube-proxy is up and running. Because kubelet runs as a system service rather than a container, check it with systemctl status kubelet.
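One thing to keep in mind after power-on: nodes that were cordoned before the shutdown stay marked as unschedulable until they are uncordoned, so new pods will not be placed on them. A small sketch of how this could be checked and undone from a machine with kubectl access follows. The helper names list_cordoned and uncordon_all are hypothetical.

```shell
#!/bin/sh
# Sketch: find nodes that are still cordoned after the restart and
# make them schedulable again. Assumes kubectl can reach the cluster.

# list_cordoned: reads `kubectl get nodes --no-headers` output on stdin
# and prints the names of nodes whose STATUS contains SchedulingDisabled.
list_cordoned() {
    awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}

# uncordon_all: uncordon every node that is still marked unschedulable.
uncordon_all() {
    kubectl get nodes --no-headers | list_cordoned | while read -r node; do
        kubectl uncordon "$node"
    done
}

# Example:
#   kubectl get nodes      # all nodes should eventually report Ready
#   uncordon_all
```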
Conclusion
This article aims to give you a practical idea of how to restart a Kubernetes cluster. Restarting a cluster still requires caution, because downtime can occur during the process, especially when an application runs as a single replica. Always review the backup, the shutdown order, and the replica counts of your workloads before restarting any Kubernetes cluster.
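To see which workloads are most exposed to downtime, you could list the Deployments that run a single replica before you start. The following is only a sketch: single_replica is a hypothetical helper that filters "namespace name replicas" lines, and the kubectl command in the example feeds it through custom columns.

```shell
#!/bin/sh
# Sketch: report Deployments that run with exactly one replica.

# single_replica: reads "namespace name replicas" lines on stdin and
# prints "namespace/name" for every line whose replica count is 1.
single_replica() {
    awk '$3 == 1 { print $1 "/" $2 }'
}

# Example (assumes kubectl can reach the cluster):
#   kubectl get deployments --all-namespaces --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas \
#     | single_replica
```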