Partitioning GPU instances using MIG
Multi-Instance GPU (MIG) allows you to partition a physical GPU into multiple isolated instances, each with dedicated memory and compute resources. This is particularly useful when you need to run multiple workloads that don’t require a full GPU.
Prerequisites
- A GPU instance with a MIG-capable GPU (on Exoscale, for example the A30)
- A supported NVIDIA datacenter driver installed in the guest (R460 or later)
Install NVIDIA Driver
Install a supported NVIDIA datacenter driver in the guest; MIG requires driver release R460 or later. We recommend the server driver variant for stability:
```shell
# Add NVIDIA repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install driver (requires a reboot)
sudo apt update
sudo apt install -y nvidia-driver-570-server
sudo reboot
```

Check MIG Capability
Verify that your GPU supports MIG:
```shell
nvidia-smi -L
nvidia-smi -q | less
```

Look for the MIG Mode section in the nvidia-smi -q output: a MIG-capable GPU reports its Current and Pending MIG mode states there.
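Since nvidia-smi -q prints very long output, a small filter can jump straight to the relevant section. A minimal sketch (mig_mode_section is a hypothetical helper name chosen here, not an nvidia-smi feature):

```shell
# Hypothetical helper: print only the "MIG Mode" section from
# `nvidia-smi -q`-style text read on stdin.
mig_mode_section() {
  grep -A 2 'MIG Mode'
}

# On a live system you would run:
#   nvidia-smi -q | mig_mode_section
```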
Enable MIG Mode
Enable MIG mode on the GPU:
```shell
sudo nvidia-smi -i 0 -mig 1
```

Replace 0 with the appropriate GPU index if you have multiple GPUs. If the mode change is reported as pending, reset the GPU or reboot the instance for it to take effect.
Create GPU Partitions
List available MIG profiles:
```shell
sudo nvidia-smi mig -lgip
```

Create a GPU partition using the profile ID:

```shell
sudo nvidia-smi mig --create-gpu-instance <profile_id> --default-compute-instance -i 0
```

Replace <profile_id> with the ID of the desired MIG profile from the previous step. The --default-compute-instance flag creates a compute instance along with the GPU instance.
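The -lgip listing is a wide table; if you script partition creation, profile names and IDs can be extracted from it. A minimal sketch, assuming the table layout of recent driver releases (list_profiles is a hypothetical helper, not an nvidia-smi subcommand):

```shell
# Hypothetical helper: print "name id" pairs from `nvidia-smi mig -lgip`-style
# output read on stdin. The table layout is an assumption and may vary
# between driver releases.
list_profiles() {
  awk '/MIG [0-9]+g\./ {for (i = 1; i <= NF; i++) if ($i ~ /^MIG$/) print $(i+1), $(i+2)}'
}

# On a live system you would run:
#   sudo nvidia-smi mig -lgip | list_profiles
```

The second column of each pair is the profile ID you pass to --create-gpu-instance.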
On the A30, MIG configurations persist beyond the lifecycle of the instance.
Make sure to clean up the configuration before stopping or deleting the VM to avoid impacting future workloads.
Verify MIG Devices
Confirm the created MIG devices:
```shell
nvidia-smi -L
```

You should see multiple MIG devices listed, each with its own UUID.
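To check programmatically that partitioning produced the expected number of devices, you can count the MIG UUID lines in that listing. A small sketch (count_mig_devices is a hypothetical helper name):

```shell
# Hypothetical helper: count MIG devices in `nvidia-smi -L`-style
# output read on stdin.
count_mig_devices() {
  grep -c 'UUID: MIG-'
}

# On a live system you would run:
#   nvidia-smi -L | count_mig_devices
```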
Clean Up MIG Configuration (applicable for A30 only)
On the A30, MIG configurations persist on the GPU even after a virtual machine is stopped or deleted. If the GPU is reused by another instance, previously created MIG partitions may still be present and affect expected behavior.
To avoid conflicts or unexpected errors, remove any MIG configuration before stopping or deleting the instance. Run the same cleanup if your instance starts with an unwanted MIG configuration left behind by a previous workload, so that you regain access to the full performance of the A30.
Example:
```shell
# Destroy compute instances first, then the GPU instances
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
```

Run Workloads on MIG Devices
Using CUDA
Set the CUDA_VISIBLE_DEVICES environment variable to target a specific MIG device:
```shell
export CUDA_VISIBLE_DEVICES=MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```

Replace the UUID with the MIG device you want to use. You can get the UUID from the nvidia-smi -L output.
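If you want to script this instead of pasting a UUID by hand, the first MIG UUID can be pulled out of the nvidia-smi -L listing. A sketch (first_mig_uuid is a hypothetical helper, and the UUID is assumed to consist of lowercase hex digits and dashes):

```shell
# Hypothetical helper: print the first MIG UUID found in
# `nvidia-smi -L`-style output read on stdin.
first_mig_uuid() {
  grep -o 'MIG-[0-9a-f][0-9a-f-]*' | head -n 1
}

# On a live system you would run:
#   export CUDA_VISIBLE_DEVICES=$(nvidia-smi -L | first_mig_uuid)
```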
Using Docker
First, install Docker and the NVIDIA container toolkit:
```shell
# Install Docker
curl -fsSL https://get.docker.com | sh

# Add NVIDIA container toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo systemctl restart docker
```

You can run Docker containers against specific MIG devices using the NVIDIA runtime:
```shell
docker run --gpus '"device=MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"' nvidia/cuda:12.9.1-base-ubuntu24.04 nvidia-smi
```

To run two Docker containers, each using a different MIG partition:
```shell
# Get MIG device UUIDs
MIG_UUIDS=$(nvidia-smi -L | grep MIG | awk -F'MIG-' '{print "MIG-" $2}' | awk -F')' '{print $1}')

# Extract individual UUIDs
MIG1=$(echo $MIG_UUIDS | cut -d' ' -f1)
MIG2=$(echo $MIG_UUIDS | cut -d' ' -f2)

# Run a container on the first MIG device
docker run --gpus "device=$MIG1" -it --name container1 nvidia/cuda:12.9.1-base-ubuntu24.04 nvidia-smi

# Run a container on the second MIG device
docker run --gpus "device=$MIG2" -it --name container2 nvidia/cuda:12.9.1-base-ubuntu24.04 nvidia-smi
```

Considerations
- MIG partitions are fixed in size and cannot be resized after creation
- Each MIG instance has dedicated memory that cannot be oversubscribed
- Not all GPUs support MIG; check NVIDIA's documentation for supported hardware
- MIG mode must be enabled before creating partitions and cannot be disabled without rebooting
- On some GPUs (such as A30), MIG configurations persist until explicitly removed. On Blackwell generation, MIG configurations do not persist across reboots.