Enabling GPU Support in SKS Nodes
Exoscale SKS allows you to run GPU-accelerated workloads, such as Machine Learning (ML), data analytics, and video transcoding, on your cluster. In this documentation, we will guide you through the steps to enable GPU support in Exoscale SKS nodes.
Prerequisites
As a prerequisite for the following documentation, you need:
- An Exoscale SKS cluster on the Pro plan.
- An organization with at least one GPU instance type authorized.
- Access to your cluster via kubectl.
- Basic Linux knowledge.
If you do not have access to an SKS cluster, follow the Quick Start Guide.
Enabling GPU Support in SKS
NOTE: If you are running nodes with Kubernetes version 1.31.12, 1.32.8, 1.33.4, 1.34.0, or more recent, this section is no longer relevant and you do not need to apply it. On these versions of our SKS node OS, the device plugin is pre-installed as a system service communicating directly with the kubelet process. If you previously deployed the NVIDIA Device Plugin DaemonSet, you can remove it once you have upgraded all GPU nodes of your cluster to such versions.
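If the plugin was previously deployed using the static manifest shown below, it can be removed by deleting that same manifest:
kubectl delete -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/main/deployments/static/nvidia-device-plugin.yml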
To use GPUs in Kubernetes, the NVIDIA Device Plugin is required. The NVIDIA Device Plugin is a DaemonSet that automatically exposes the number of GPUs on each node of the cluster and allows Pods to consume GPU resources.
To enable GPU support in Exoscale SKS nodes, you need to deploy the following DaemonSet:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/main/deployments/static/nvidia-device-plugin.yml
NOTE
This is a simple static DaemonSet meant to demonstrate the basic features of the nvidia-device-plugin.
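To confirm that the plugin Pods are running and that GPUs are advertised as an allocatable resource, you can inspect the DaemonSet (the static manifest above deploys it in the kube-system namespace) and the node description. The node name below is a placeholder:
kubectl -n kube-system get daemonsets | grep nvidia
kubectl describe node <your-gpu-node-name> | grep nvidia.com/gpu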
Running and Testing GPU Jobs
Once GPU nodes are ready, NVIDIA GPUs can be requested by a container using the nvidia.com/gpu resource type:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
EOF
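The Pod runs the CUDA vectorAdd sample once and should reach the Completed status after a short while. You can watch its status with:
kubectl get pod gpu-pod --watch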
Once the Pod has completed, check its logs:
kubectl logs gpu-pod
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
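Once you have verified the output, you can delete the test Pod:
kubectl delete pod gpu-pod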