Karpenter node controller

Karpenter is a strong alternative to our long-standing SKS Cluster Autoscaler, providing a more flexible and efficient way to manage node provisioning and scaling in your SKS clusters. Karpenter dynamically launches just the right compute resources to handle your cluster’s workloads based on real-time demand, and can scale from zero nodes up to whatever your workloads require.

Karpenter is currently in Public Preview; please be aware that some features may change before it is Generally Available.

Prerequisites

  • An Exoscale SKS cluster on the Pro plan.
  • Karpenter add-on enabled on your SKS cluster. You can enable it via the Exoscale Console, CLI or Terraform.

You don’t have to install Karpenter yourself, as it is provided as a managed add-on on Exoscale SKS clusters.

Considerations

Karpenter requires dedicated IAM permissions to manage compute instances on your behalf. To this end, when you enable the Karpenter add-on, an IAM role with the required permissions is automatically created.

Make sure not to delete or modify this role, as it is essential for Karpenter’s functionality.
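
If you want to check that the role is present, you can list your organization’s IAM roles with the Exoscale CLI; the exact role name is assigned automatically when the add-on is enabled, so the command below is only a quick way to spot it:

# List IAM roles and look for the role created for the Karpenter add-on
exo iam role list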

Karpenter further requires certain CRDs to be installed in your cluster. When you enable the Karpenter add-on, these CRDs are installed automatically:

❯ kubectl get crds | grep karpenter
exoscalenodeclasses.karpenter.exoscale.com              2025-11-04T10:13:23Z
nodeclaims.karpenter.sh                                 2025-11-04T10:13:26Z
nodeoverlays.karpenter.sh                               2025-11-04T10:13:26Z
nodepools.karpenter.sh                                  2025-11-04T10:13:27Z
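
If you want to explore the fields these CRDs accept, kubectl explain can display their schema directly from the cluster (assuming the CRDs publish an OpenAPI schema, which is standard for apiextensions v1 CRDs):

kubectl explain exoscalenodeclasses.spec
kubectl explain nodepools.spec.template.spec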

Configuration

If you have not previously created any nodes or configured the cluster-autoscaler, all pods in your cluster, including the system pods deployed by SKS, will be in a Pending state; this is expected, as no Node is available to schedule them.

If you add Karpenter to an existing cluster that already uses the cluster-autoscaler, the two controllers will not conflict over node ownership. However, if pending pods need additional nodes and both a cluster-autoscaler nodepool and a Karpenter nodepool can satisfy the request, both controllers may provision new nodes, which can lead to overprovisioning in your cluster. Pods already scheduled on cluster-autoscaler nodes will not be rescheduled onto Karpenter nodes.

❯ kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-59556d9b4c-dgjdj   0/1     Pending   0          7m2s
kube-system   coredns-7d78fc9666-9rrqf                   0/1     Pending   0          6m58s
kube-system   coredns-7d78fc9666-kgclq                   0/1     Pending   0          6m58s
kube-system   konnectivity-agent-5b79d466d4-99jtg        0/1     Pending   0          6m57s
kube-system   konnectivity-agent-5b79d466d4-stbq5        0/1     Pending   0          6m57s
kube-system   metrics-server-6654fc5ff6-cjs46            0/1     Pending   0          6m55s
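
If both controllers are in use, you can tell Karpenter-managed nodes apart from cluster-autoscaler nodes by the karpenter.sh/nodepool label that Karpenter sets on the nodes it provisions (a quick check, assuming the upstream labeling behaviour):

# The NODEPOOL column is empty for nodes not managed by Karpenter
kubectl get nodes -L karpenter.sh/nodepool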

The main advantage of Karpenter is the fine-grained control it gives you over node provisioning, so the best-fitting nodes are launched for your workloads.

Karpenter relies on two Kubernetes CRDs to automatically scale the cluster’s Nodes:

  • ExoscaleNodeClass: Exoscale SKS-specific implementation details for Karpenter
  • NodePool: Karpenter generic nodepool definition

ExoscaleNodeClass

apiVersion: karpenter.exoscale.com/v1
kind: ExoscaleNodeClass
metadata:
  name: standard
spec:
  imageTemplateSelector:
    # version: Kubernetes version (semver format like "1.34.1")
    # If omitted (or if you use imageTemplateSelector: {}), the control plane's
    # current Kubernetes version will be auto-detected at runtime
    version: "1.34.1"

    # variant: Template variant (optional, defaults to "standard")
    # Options: "standard" for regular workloads, "nvidia" for GPU-enabled nodes
    variant: "standard"
  
  # Disk size in GB (default: 50)
  diskSize: 100
  
  # Security groups (optional)
  # List the security group IDs to attach to instances
  # Make sure the security group allows the traffic your cluster needs,
  # such as CNI or ingress traffic
  securityGroups:
    - "<setme>"
  
  antiAffinityGroups: []
  privateNetworks: []

This ExoscaleNodeClass defines a node class named standard that can be referenced by any NodePool you define later. It determines which instance template (image) the nodes will use. If you want GPU-enabled nodes, make sure to set the variant to nvidia. For most workloads, imageTemplateSelector can be left empty: Karpenter then defaults to the control plane’s Kubernetes version and the standard variant. Thanks to Karpenter’s drift mechanism, when you upgrade your SKS cluster, Karpenter automatically uses the new version for new nodes.
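
For instance, a node class for GPU workloads only needs the variant to be set; the name gpu below is illustrative, and the security group placeholder must be replaced just like in the standard node class:

apiVersion: karpenter.exoscale.com/v1
kind: ExoscaleNodeClass
metadata:
  name: gpu
spec:
  imageTemplateSelector:
    # Kubernetes version is auto-detected from the control plane when omitted
    variant: "nvidia"
  securityGroups:
    - "<setme>"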

Do not forget to set the securityGroups attribute, as your Kubernetes cluster needs a working network mesh. See this documentation for more information about security groups in SKS.
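
To find the ID of the security group used by your cluster, you can list your security groups with the Exoscale CLI (a quick lookup, assuming you have already created a group that allows the required traffic):

# List security groups and note the ID of the one allowing your cluster
# traffic (for example CNI and ingress)
exo compute security-group list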

NodePool

Here we use the previously configured ExoscaleNodeClass to create a NodePool with a basic configuration.

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: standard-main
spec:
  template:
    metadata:
      labels:
        nodepool: standard-main
    spec:
      nodeClassRef:
        group: karpenter.exoscale.com
        kind: ExoscaleNodeClass
        name: standard
      
      requirements:
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values:
            - "standard.small"
            - "standard.medium"
            - "standard.large"
            - "standard.extra-large"

      expireAfter: 24h
  
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    
    consolidateAfter: 30m
    
    budgets:
    - nodes: "10%" 

  limits:
    cpu: 1000
    memory: 4000Gi

This NodePool, named standard-main, provisions nodes using the standard ExoscaleNodeClass. Consolidation may shrink the nodepool after 30 minutes of stabilization, but only 10% of the nodes can be disrupted at once. Node lifetime is 24 hours; after that, nodes are terminated and replaced by new ones if still needed.

This nodepool can only provision nodes with the standard.small, standard.medium, standard.large, and standard.extra-large instance types. The maximum resources this nodepool can provision are 1000 CPUs and 4000Gi of memory in total.
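
To check how close the nodepool is to these limits, you can inspect its status, which reports the resources it currently manages:

# The status section lists the CPU and memory currently provisioned
kubectl describe nodepool standard-main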

To see what other attributes you can customize, refer to the official Karpenter documentation.
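
For example, you could dedicate a nodepool to specific workloads with taints, or make it preferred over other nodepools with a weight; the snippet below is a sketch using upstream karpenter.sh/v1 fields, with an illustrative name:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: dedicated
spec:
  # Higher weight means this nodepool is preferred when several can satisfy a pod
  weight: 10
  template:
    spec:
      nodeClassRef:
        group: karpenter.exoscale.com
        kind: ExoscaleNodeClass
        name: standard
      requirements:
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values:
            - "standard.medium"
      # Only pods that tolerate this taint will be scheduled on these nodes
      taints:
        - key: dedicated
          value: "true"
          effect: NoSchedule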

Deploying the configuration

Now that we have our ExoscaleNodeClass and NodePool definitions, we can deploy them to our SKS cluster.

kubectl apply -f exoscale-nodeclass.yaml
kubectl apply -f nodepool.yaml

After a few seconds, you should see new nodes and nodeclaims being provisioned by Karpenter:

❯ kubectl get nodes
NAME                           STATUS   ROLES    AGE    VERSION
k-main-5hw57   Ready    <none>   92s    v1.34.1
k-main-x65sj   Ready    <none>   107s   v1.34.1
❯ kubectl get nodepool 
NAME            NODECLASS   NODES   READY   AGE
standard-main   standard    2       True    3m38s
❯ kubectl get nodeclaim
NAME                  TYPE             CAPACITY    ZONE       NODE                           READY   AGE
standard-main-5hw57   standard.small   on-demand   ch-gva-2   k-main-5hw57   True    3m19s
standard-main-x65sj   standard.small   on-demand   ch-gva-2   k-main-x65sj   True    3m19s

We see our pods scheduled on the newly created nodes:

❯ kubectl get pod -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE     IP                NODE           NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-59556d9b4c-dgjdj   1/1     Running   0          3h6m    192.168.86.193    k-main-x65sj   <none>           <none>
kube-system   calico-node-87lkx                          1/1     Running   0          6m2s    <redacted>        k-main-x65sj   <none>           <none>
kube-system   calico-node-wz6bx                          1/1     Running   0          5m47s   <redacted>        k-main-5hw57   <none>           <none>
kube-system   coredns-7d78fc9666-9rrqf                   1/1     Running   0          3h6m    192.168.86.195    k-main-x65sj   <none>           <none>
kube-system   coredns-7d78fc9666-kgclq                   1/1     Running   0          3h6m    192.168.115.130   k-main-5hw57   <none>           <none>
kube-system   konnectivity-agent-5b79d466d4-99jtg        1/1     Running   0          3h6m    192.168.86.194    k-main-x65sj   <none>           <none>
kube-system   konnectivity-agent-5b79d466d4-stbq5        1/1     Running   0          3h6m    192.168.115.129   k-main-5hw57   <none>           <none>
kube-system   kube-proxy-cb59l                           1/1     Running   0          6m2s    <redacted>        k-main-x65sj   <none>           <none>
kube-system   kube-proxy-xrbnm                           1/1     Running   0          5m47s   <redacted>        k-main-5hw57   <none>           <none>
kube-system   metrics-server-6654fc5ff6-cjs46            1/1     Running   0          3h6m    192.168.86.196    k-main-x65sj   <none>           <none>

Cleanup

If Karpenter is enabled on your SKS cluster and you want to disable it, or remove your SKS cluster, make sure to first delete all Karpenter NodePools. Our API will prevent you from disabling Karpenter or deleting the cluster if any NodePool is still present.

kubectl delete nodepools.karpenter.sh --all
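
Before disabling the add-on or deleting the cluster, you can double-check that nothing is left; both lists should be empty once the deletion has completed:

kubectl get nodepools.karpenter.sh
kubectl get nodeclaims.karpenter.sh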

FAQ

What happens if I delete a Karpenter compute instance outside of Kubernetes?

Karpenter automatically detects the missing node and provisions a new one to replace it when needed. Any workloads that were running on the deleted node are rescheduled to other available nodes, so you may experience a brief disruption in your services after abruptly removing a node.

What does Karpenter do when downscaling a nodepool?

When Karpenter decides to downscale a nodepool, it first cordons the node to prevent new pods from being scheduled on it. It then evicts all pods running on the node, respecting Pod Disruption Budgets (PDBs) if they are defined. Once the node is drained, Karpenter terminates the compute instance associated with it. This process helps keep the cluster stable and avoids disrupting workloads more than necessary. You can prevent Karpenter from voluntarily disrupting specific workloads by setting the karpenter.sh/do-not-disrupt annotation to "true" on the Pod or on the Node.
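
For example, to protect a specific pod from voluntary disruption (a minimal sketch, assuming the upstream karpenter.sh/do-not-disrupt annotation):

apiVersion: v1
kind: Pod
metadata:
  name: important-job
  annotations:
    # Karpenter will not voluntarily disrupt the node while this pod is running on it
    karpenter.sh/do-not-disrupt: "true"
spec:
  containers:
    - name: job
      image: alpine:3.20
      command: ["sleep", "3600"]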

What happens if I remove a Karpenter NodeClaim object directly?

The NodeClaim is tied to the node. At Exoscale we chose to be conservative: the pods running on the node are evicted first. If all pods have not been evicted after 15 minutes, Karpenter terminates the compute instance associated with the node anyway. Removing this object is not recommended; only do it for testing purposes or in an emergency.