Isolate critical pods in your SKS cluster

In Kubernetes clusters, some components are more critical than others. These components should be isolated from user workloads to ensure the stability and security of your cluster.

Here you will learn how to set up this isolation using Kubernetes and Exoscale features. We will focus on SKS critical components, but the same principles can be applied to your own critical workloads (e.g. your monitoring stack, your logging stack, and so on).

Prerequisites

  • Knowledge of SKS - how to create and/or update an SKS Cluster

Exoscale Critical Components Isolation Overview

To isolate critical components in your SKS cluster, you need to configure the appropriate Kubernetes resources. This typically involves:

  • Creating dedicated node pools for critical components.
  • Using Kubernetes taints and tolerations to ensure that critical components are scheduled on the appropriate nodes.

Exoscale critical components include the managed addons running in the kube-system namespace of your SKS cluster (such as Calico, CoreDNS, konnectivity-agent, and metrics-server).

Our managed addons use Kubernetes tolerations and affinities to ensure they are scheduled on dedicated nodes.

Here is an extract of the pod spec we apply to our critical managed addons:

spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
            - key: exoscale.com/node-role
              operator: In
              values:
                - system
  tolerations:
    - operator: "Exists"
      key: CriticalAddonsOnly

With this configuration, our critical components will be scheduled (when possible) on nodes labeled with exoscale.com/node-role=system and tainted with CriticalAddonsOnly=true:NoSchedule. Note that we deliberately do not use a nodeSelector: this lets our critical pods fall back to any SKS cluster node when no dedicated node is available, which keeps these components running even on clusters that do not set up this more advanced node isolation.
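
If you apply the same pattern to your own critical workloads and prefer a hard guarantee over this soft preference, a minimal sketch could combine a nodeSelector with the matching toleration (the Deployment name and image below are purely illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-critical-app        # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-critical-app
  template:
    metadata:
      labels:
        app: my-critical-app
    spec:
      # Hard requirement: only run on the dedicated nodes.
      nodeSelector:
        exoscale.com/node-role: system
      # Allow scheduling despite the CriticalAddonsOnly taint.
      tolerations:
        - key: CriticalAddonsOnly
          operator: "Exists"
      containers:
        - name: app
          image: nginx         # placeholder image

Keep in mind that, unlike the Exoscale addons, such pods will stay Pending if no dedicated node is available.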

Configuration

In the example below, we will create a dedicated node pool for critical components in an SKS cluster and configure the taints and tolerations accordingly.

First, create an anti-affinity group so that the nodes of your dedicated node pool are spread across different underlying hosts:

exo compute anti-affinity-group create my-cluster-sks-critical-workloads
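
You can quickly confirm the group was created by listing your anti-affinity groups (assuming the list subcommand of your exo CLI version):

exo compute anti-affinity-group list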

Now you can create a small dedicated node pool for critical components with the appropriate labels and taints:

exo compute sks nodepool add my-sks-cluster critical-nodepool \
  --size 2 \
  --label exoscale.com/node-role=system \
  --taint CriticalAddonsOnly=true:NoSchedule \
  --instance-type small \
  --disk-size 20 \
  --anti-affinity-group my-cluster-sks-critical-workloads
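
If you do not already have kubectl access to the cluster, you can generate a kubeconfig before running the commands below (the user name and group here follow the usual SKS admin example; adjust them to your needs):

exo compute sks kubeconfig my-sks-cluster kube-admin \
  --group system:masters > my-sks-cluster.kubeconfig
export KUBECONFIG=$PWD/my-sks-cluster.kubeconfig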

Now we have our nodes ready in our SKS cluster:

❯ kubectl get node
NAME               STATUS   ROLES    AGE   VERSION
pool-a2437-hctjt   Ready    <none>   53s   v1.34.1
pool-a2437-weezc   Ready    <none>   47s   v1.34.1

You can verify that the nodes have the correct labels and taints:

❯ kubectl get node pool-a2437-hctjt -o yaml
apiVersion: v1
kind: Node
metadata:
  ...
  labels:
    ...
    exoscale.com/node-role: system
    ...
  name: pool-a2437-hctjt
  resourceVersion: "3457021066"
  uid: afeeec78-9aac-458c-993d-838eee510d96
spec:
  ...
  taints:
  - effect: NoSchedule
    key: CriticalAddonsOnly
    value: "true"
  ...
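
For a quicker check, you can also print the label and taints directly instead of dumping the full node object (node name taken from the example above):

❯ kubectl get node -L exoscale.com/node-role
❯ kubectl describe node pool-a2437-hctjt | grep Taints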

You can see that our pods are correctly scheduled on our dedicated nodes:

❯ kubectl get pod -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP                NODE               NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-59556d9b4c-zhqfr   1/1     Running   0          7m57s   192.168.218.117   pool-a2437-hctjt   <none>           <none>
kube-system   calico-node-sksc9                          1/1     Running   0          3m29s   <redacted>        pool-a2437-hctjt   <none>           <none>
kube-system   calico-node-vdlcj                          1/1     Running   0          3m23s   <redacted>        pool-a2437-weezc   <none>           <none>
kube-system   coredns-7d78fc9666-9sk8d                   1/1     Running   0          7m51s   192.168.218.130   pool-a2437-hctjt   <none>           <none>
kube-system   coredns-7d78fc9666-jlcsg                   1/1     Running   0          7m51s   192.168.153.66    pool-a2437-weezc   <none>           <none>
kube-system   konnectivity-agent-8468bff8f8-c4dmm        1/1     Running   0          7m50s   192.168.153.65    pool-a2437-weezc   <none>           <none>
kube-system   konnectivity-agent-8468bff8f8-l6x8k        1/1     Running   0          7m50s   192.168.218.129   pool-a2437-hctjt   <none>           <none>
kube-system   kube-proxy-kpkzt                           1/1     Running   0          3m29s   <redacted>        pool-a2437-hctjt   <none>           <none>
kube-system   kube-proxy-lj22f                           1/1     Running   0          3m23s   <redacted>        pool-a2437-weezc   <none>           <none>
kube-system   metrics-server-6654fc5ff6-qn4dk            1/1     Running   0          7m48s   192.168.218.131   pool-a2437-hctjt   <none>           <none>

With this setup, your SKS critical components are now isolated on dedicated nodes, enhancing the stability and security of your Kubernetes cluster.
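
As a quick sanity check, you can deploy a pod that does not carry the toleration and confirm the taint keeps it away from the dedicated nodes (with only the tainted node pool in the cluster, as in this minimal example, the pod will simply stay Pending):

❯ kubectl run taint-test --image=nginx
❯ kubectl get pod taint-test -o wide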

Conclusion

Isolating critical components in your SKS cluster is essential for maintaining the stability and security of your Kubernetes environment.

By leveraging dedicated node pools, taints, and tolerations, you can ensure that these vital components are protected from potential disruptions caused by user workloads. This approach not only enhances the reliability of your cluster but also provides a robust framework for managing critical workloads effectively.

This isolation strategy can be extended to other critical workloads within your cluster, allowing you to maintain high availability and performance across your Kubernetes infrastructure. You may consider creating dedicated node pools for your CI/CD runners, your monitoring stack, your logging stack, ArgoCD/FluxCD, and so on.
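
For example, a node pool dedicated to a monitoring stack could be created the same way (the label and taint names below are purely illustrative):

exo compute sks nodepool add my-sks-cluster monitoring-nodepool \
  --size 2 \
  --label example.com/node-role=monitoring \
  --taint dedicated=monitoring:NoSchedule \
  --instance-type small \
  --disk-size 20

Your monitoring pods would then need the matching toleration and a node affinity (or nodeSelector) on the new label, as shown earlier in this guide.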