Kubernetes Architecture: Explained


In my previous article I explained what Kubernetes is and its various components. In this article I will try to explain its architecture in as simple words as I can.

To summarize my previous article: Kubernetes is a container orchestration platform designed for running distributed applications made up of a large number of containers. Think of it as the operating system for cloud-native applications, just as Windows, macOS, and Linux are the operating systems that desktop applications run on.

Architecture

A Kubernetes cluster is made up of two types of nodes: master and worker. The master node is also called the control plane, and the worker nodes are also called worker machines or compute machines. I will go through the role each of them plays inside the Kubernetes cluster.

architecture1.png

Processes running inside Master Node (Control Plane)

The master node acts as the controlling node and the point of contact for the cluster. A small cluster often has a single master node, but it's possible to run a multi-master setup. Four processes run on every master node and control the cluster state and the worker nodes.

  • API Server
  • Scheduler
  • Controller-Manager
  • etcd

API Server

The API server is the frontend of Kubernetes. When you want to interact with the cluster, you first talk to the API server, which authenticates and validates your request and, if everything is fine, records the change so that the other control plane processes can act on it. You talk to the API server through a client: a UI (Kubernetes Dashboard), a CLI (kubectl), or direct calls to the Kubernetes API. So the API server is the cluster gateway that receives the initial request, and it also acts as the gatekeeper for authentication.
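
The gatekeeping order described above (authenticate first, then validate, then accept) can be sketched as a toy function. This is only an illustration and not real Kubernetes code: the token table, the required fields, the status codes chosen, and the name handle_request are all assumptions made for the example.

```python
# Toy model of the API server's gatekeeping steps; not real Kubernetes code.
VALID_TOKENS = {"secret-token": "alice"}  # hypothetical token -> user table
REQUIRED_FIELDS = ("kind", "name")        # hypothetical minimal schema


def handle_request(token, manifest, store):
    """Authenticate, validate, then persist a request; return (status, message)."""
    # Step 1: authentication. Unknown tokens are rejected before anything else.
    user = VALID_TOKENS.get(token)
    if user is None:
        return 401, "unauthorized"

    # Step 2: validation. The request must carry every required field.
    missing = [field for field in REQUIRED_FIELDS if field not in manifest]
    if missing:
        return 422, "missing fields: " + ", ".join(missing)

    # Step 3: persist the object so other components can pick it up.
    key = f"/{manifest['kind'].lower()}s/{manifest['name']}"
    store[key] = manifest
    return 201, f"{key} created by {user}"


store = {}
print(handle_request("bad-token", {"kind": "Pod", "name": "web"}, store))
print(handle_request("secret-token", {"kind": "Pod"}, store))
print(handle_request("secret-token", {"kind": "Pod", "name": "web"}, store))
```

Only the third call changes the store; the first two are turned away at the gate, which is the point of putting every request through a single entry point.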

Scheduler

When a request to create a new pod reaches the API server, the API server validates it and records the pod, which at this point has no node assigned. The scheduler watches the API server for such unassigned pods and picks one of the worker nodes for each of them. The scheduler will not just assign the pod to any node; it has an intelligent way of deciding where the pod should go. It considers how much CPU and RAM your application pod requests, filters out the worker nodes that cannot fit it, and then ranks the remaining nodes, for example by how much free capacity they have. The scheduler only decides which node the new pod should run on; a different process, the kubelet, actually starts the pod on that node (discussed further below).
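
The placement decision can be sketched as a filter step followed by a scoring step. This is a simplified sketch, not the real scheduler: the node dictionaries, the units (CPU in millicores, memory in MiB), and the "most free capacity left" scoring rule are assumptions chosen for illustration.

```python
def schedule(pod, nodes):
    """Return the name of the node chosen for the pod, or None if nothing fits."""
    # Filter: keep only the nodes with enough free CPU and memory for the pod.
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None

    # Score: prefer the node that would have the largest share of its
    # capacity still free after the pod is placed on it.
    def score(n):
        cpu_left = (n["free_cpu"] - pod["cpu"]) / n["cpu"]
        mem_left = (n["free_mem"] - pod["mem"]) / n["mem"]
        return cpu_left + mem_left

    return max(feasible, key=score)["name"]


nodes = [
    {"name": "node-a", "cpu": 4000, "mem": 8192, "free_cpu": 1000, "free_mem": 2048},
    {"name": "node-b", "cpu": 4000, "mem": 8192, "free_cpu": 3000, "free_mem": 6144},
    {"name": "node-c", "cpu": 8000, "mem": 16384, "free_cpu": 2500, "free_mem": 4096},
]
pod = {"name": "web", "cpu": 2000, "mem": 4096}
print(schedule(pod, nodes))  # node-b
```

Here node-a is filtered out because it cannot fit the pod, and node-b beats node-c because it keeps more headroom after placement.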

Controller-Manager

The controller-manager detects state changes in the cluster and works to bring the cluster back to its desired state. For example, when a pod that belongs to a Deployment dies, the controller-manager detects the change and creates a replacement pod through the API server as soon as possible; the scheduler then assigns the new pod to a node. The controller-manager is also responsible for other tasks, like connecting services to pods so that requests go to the right endpoints, and creating service accounts and API access tokens.
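
The idea of detecting a change and correcting it is usually described as a control loop: compare the desired state with the actual state and act on the difference. Below is a minimal sketch of one pass of such a loop. The function name reconcile, the pod dictionaries, and the action tuples are hypothetical and only serve the example.

```python
def reconcile(desired_replicas, pods):
    """One pass of a control loop: return the actions needed to reach the desired state."""
    running = [p for p in pods if p["phase"] == "Running"]
    diff = desired_replicas - len(running)

    if diff > 0:
        # Too few pods are running: ask for replacements to be created.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods are running: remove the surplus ones.
        return [("delete", p["name"]) for p in running[:-diff]]
    # Actual state already matches desired state: nothing to do.
    return []


pods = [
    {"name": "web-1", "phase": "Running"},
    {"name": "web-2", "phase": "Failed"},   # this pod has died
    {"name": "web-3", "phase": "Running"},
]
print(reconcile(3, pods))  # [('create', None)]
```

Running the same function again after the replacement is up returns an empty list, which is why such loops can safely run over and over.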

etcd

etcd is the cluster brain: a key-value store that holds the state of the cluster. Every change in the cluster, like the creation or deletion of a pod, gets stored or updated in etcd. The scheduler gets its information about node resources from this stored state, and the controller-manager likewise learns about changes in cluster state, like a dying pod, from it. Both read it through the API server, which is the only component that talks to etcd directly. Note that application data is not stored in etcd. It only stores the cluster state information that the master processes and the worker processes use to coordinate with each other.
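
To make the "key-value store that others learn about changes from" idea concrete, here is a toy in-memory store with a watch mechanism. It is not the etcd API: the class ToyStore, its methods, and the key names are assumptions for illustration, loosely modeled on the idea of watching a key prefix.

```python
class ToyStore:
    """In-memory key-value store that notifies watchers about changes."""

    def __init__(self):
        self._data = {}
        self._watchers = []  # list of (prefix, callback) pairs
        self.revision = 0    # counts every change made to the store

    def watch(self, prefix, callback):
        """Call callback(event, key, value) for changes to keys under prefix."""
        self._watchers.append((prefix, callback))

    def put(self, key, value):
        self.revision += 1
        self._data[key] = value
        self._notify("PUT", key, value)

    def delete(self, key):
        if key in self._data:
            self.revision += 1
            del self._data[key]
            self._notify("DELETE", key, None)

    def get(self, key):
        return self._data.get(key)

    def _notify(self, event, key, value):
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(event, key, value)


store = ToyStore()
events = []
# A watcher that only cares about pods, the way a controller might.
store.watch("/registry/pods/", lambda event, key, value: events.append((event, key)))

store.put("/registry/pods/default/web-1", {"node": "node-1", "phase": "Running"})
store.put("/registry/nodes/node-1", {"cpu": "4"})  # not under the watched prefix
store.delete("/registry/pods/default/web-1")
print(events)
print(store.revision)  # 3
```

The watcher sees the pod being created and deleted but not the node update, which mirrors how each control plane process only reacts to the part of the cluster state it is responsible for.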

In a multi-master cluster, every master node runs its own set of master processes. The API servers are load balanced, and etcd forms a distributed store across all master nodes.
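
One practical consequence of etcd being distributed: etcd uses a majority-based consensus protocol (Raft), so the store stays available only while more than half of its members can reach each other. The arithmetic is small enough to sketch; the function names below are just for this example.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Going from 3 to 4 members does not let the cluster survive any additional failure, which is why multi-master setups are typically built with an odd number of master nodes.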

Processes running inside Worker Node

Worker nodes are one of the most important components of a Kubernetes cluster. Each node runs multiple application pods with containers inside them. These are the nodes that do the actual work, hence the name worker nodes. Every worker node has three processes running on it.

  • Container runtime
  • kubelet
  • kube-proxy

Container runtime

The container runtime is the software that is responsible for running containers. Docker used to be the usual choice, but since Kubernetes 1.24 removed its built-in Docker integration (dockershim), runtimes such as containerd and CRI-O are the common ones. Every application pod has containers running inside it, so every node must have a container runtime installed.

kubelet

The kubelet is the Kubernetes process that interacts with both the container runtime and the node. It is responsible for starting a pod with its containers inside and assigning resources like CPU and RAM to it. When the control plane, i.e. the master node, wants something to happen on a worker node, the kubelet running on that worker node executes the action. The kubelet also ensures that pods and their containers are healthy and running in their desired state, and it reports the health of the host it runs on back to the master.
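
The kubelet's job of keeping a node in its desired state can be sketched as a comparison between the pods assigned to the node and the pods actually running on it. This is a rough sketch under assumed data shapes: sync_node, the pod spec dictionaries, and the set of running pod names are hypothetical, and a real kubelet would go on to call the container runtime for each action.

```python
def sync_node(node_name, pod_specs, running):
    """Return (to_start, to_stop): pod names to start and to stop on this node."""
    # Desired state: every pod the control plane has assigned to this node.
    desired = {p["name"] for p in pod_specs if p["node"] == node_name}
    to_start = sorted(desired - running)  # assigned here but not running yet
    to_stop = sorted(running - desired)   # running here but no longer assigned
    return to_start, to_stop


pod_specs = [
    {"name": "web-1", "node": "node-1"},
    {"name": "web-2", "node": "node-1"},
    {"name": "db-0", "node": "node-2"},
]
running_on_node_1 = {"web-1", "old-job"}
print(sync_node("node-1", pod_specs, running_on_node_1))  # (['web-2'], ['old-job'])
```

Each kubelet only looks at the pods assigned to its own node, so db-0 is ignored here even though it is part of the cluster state.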

kube-proxy

kube-proxy is a network proxy that runs on each worker node in the cluster. It maintains network rules on the node (for example iptables or IPVS rules) that forward requests sent to a service to one of the pods behind it, so that communication from inside or outside of your cluster reaches the right pod in a performant way.
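
The forwarding idea can be illustrated with a toy table that maps a service address to the pods behind it. Keep in mind that this is only a model: the real kube-proxy programs rules on the node rather than handling each request in code like this, and the round-robin choice, the class ToyServiceProxy, and the IP addresses are all assumptions for the example.

```python
import itertools


class ToyServiceProxy:
    """Maps a service IP to its backend pod IPs and picks one per request."""

    def __init__(self):
        self._backends = {}  # service IP -> list of pod IPs
        self._cursors = {}   # service IP -> round-robin iterator

    def set_endpoints(self, service_ip, pod_ips):
        """Replace the backends of a service, e.g. after pods come and go."""
        self._backends[service_ip] = list(pod_ips)
        self._cursors[service_ip] = itertools.cycle(self._backends[service_ip])

    def route(self, service_ip):
        """Return the pod IP that should receive the next request, or None."""
        if not self._backends.get(service_ip):
            return None  # unknown service, or a service with no pods behind it
        return next(self._cursors[service_ip])


proxy = ToyServiceProxy()
proxy.set_endpoints("10.96.0.10", ["10.244.1.5", "10.244.2.7"])
print([proxy.route("10.96.0.10") for _ in range(3)])
```

The caller only ever needs the stable service address; which pod answers is decided at forwarding time, so pods can be replaced without clients noticing.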

So that was all from my side. There is of course a lot more to it that I can't cover yet, but this much should be enough to get you started with the architecture of Kubernetes.