Azure Kubernetes Cluster with GPU node on demand

CISEL · Apr 1, 2022

As you may have noticed, the cost of using GPUs on public clouds is high, even very high.

So, if your application only needs GPU nodes temporarily, you'll want to make sure you are only billed while those workloads are actually running!

We'll show you how to do this using a Kubernetes Azure AKS cluster and the nodepool concept.

What we will achieve here:

  • Create an AKS cluster
  • Get the prerequisites to use GPU nodes
  • Create an additional nodepool with a GPU node
  • Create a Kubernetes job that requests a GPU

Let's do it!

We will assume that you have an account with sufficient rights to create and manage resource groups in your subscription.

Create the AKS cluster

We will login into the Azure subscription and create a single node AKS cluster in the Switzerland North location.

az login

$resourcegrouplocation="switzerlandnorth"
$resourcegroupname="aks-gpu-demo-01"
$aksclustername="aks-gpu-demo-01"


az group create --name $resourcegroupname --location $resourcegrouplocation
az aks create -g $resourcegroupname -n $aksclustername --node-count 1 --generate-ssh-keys

And voilà, your AKS cluster is being deployed in your subscription. Once the cluster is up and running, you can connect to it with kubectl:

az aks get-credentials -g $resourcegroupname -n $aksclustername
kubectl get nodes -o wide

Get the prerequisites to use GPU nodes

There is an administrative step to complete before you can continue.

You have to apply for a quota increase for NCSv3-series (NC6s_v3) VMs. This request doesn't imply any additional cost; it's purely administrative.

[Screenshot: quota increase request for the NCSv3 family in the Azure portal]
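Before filing the request, you can also check your current quota for this VM family from the CLI. A quick sketch (matching on `NCSv3` in the family's display name is an assumption based on Azure's standard naming):

```shell
# List current vCPU usage and limits for the NCSv3 family in the region.
az vm list-usage --location switzerlandnorth -o table --query "[?contains(localName, 'NCSv3')]"
```

If the limit shown is 0, you need the quota increase before the nodepool can scale up.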

Once your quota request is approved, you can proceed with the GPU preview feature registration.

az feature register --name GPUDedicatedVHDPreview --namespace Microsoft.ContainerService
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/GPUDedicatedVHDPreview')].{Name:name,State:properties.state}"
az provider register --namespace Microsoft.ContainerService
az extension add --name aks-preview
az extension update --name aks-preview

Here we will create the new GPU nodepool with the Standard_NC6s_v3 VM size, with a maximum of 1 node and a minimum of 0. This lets the cluster autoscaler scale the GPU pool down to 0 by default, so no GPU node is running (and billed) all the time.

$gpunodepoolname="aksgpudemo01"
$gpunodesize="standard_nc6s_v3"
$mingpunodecount="0"
$maxgpunodecount="1"

az aks nodepool add --resource-group $resourcegroupname --cluster-name $aksclustername --name $gpunodepoolname --node-count 0 --node-vm-size $gpunodesize --node-taints sku=gpu:NoSchedule --aks-custom-headers UseGPUDedicatedVHD=true --enable-cluster-autoscaler --min-count $mingpunodecount --max-count $maxgpunodecount
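To double-check the autoscaler bounds before deploying anything, you can inspect the new pool; a small sketch using a JMESPath query over the `az aks nodepool show` output:

```shell
# Show the GPU pool's current size, autoscaler bounds and VM size.
az aks nodepool show --resource-group $resourcegroupname --cluster-name $aksclustername --name $gpunodepoolname --query "{count:count, min:minCount, max:maxCount, vmSize:vmSize}" -o table
```

You should see a count of 0 with min 0 and max 1, confirming that no GPU node is billed while nothing is scheduled.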

To use the GPU node, schedule a GPU-enabled workload with the appropriate resource request. Below we run a TensorFlow job against the MNIST dataset. Create a file named aks-gpu-demo-01.yaml and paste the following YAML manifest. The important part is the nvidia.com/gpu: 1 resource limit, which requests scheduling on a GPU worker node.

apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: aks-gpu-demo-01
  name: aks-gpu-demo-01
spec:
  template:
    metadata:
      labels:
        app: aks-gpu-demo-01
    spec:
      containers:
      - name: aks-gpu-demo-01
        image: mcr.microsoft.com/azuredocs/samples-tf-mnist-demo:gpu
        args: ["--max_steps", "500"]
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            nvidia.com/gpu: 1
      restartPolicy: OnFailure
      tolerations:
      - key: "sku"
        operator: "Equal"
        value: "gpu"
        effect: "NoSchedule"

Deploy the job and watch the nodes:

kubectl apply -f aks-gpu-demo-01.yaml
watch kubectl get nodes -o wide

You can now watch the autoscaler create the GPU node for the job, and then delete it once the job has finished.
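To follow the run itself rather than the nodes, you can query the job directly. A minimal sketch (the pod may sit in Pending for a few minutes while the GPU node boots and the image is pulled):

```shell
# COMPLETIONS flips to 1/1 once the training run succeeds.
kubectl get job aks-gpu-demo-01 -w

# Stream the TensorFlow training output from the job's pod.
kubectl logs job/aks-gpu-demo-01 -f
```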

With this method we optimize and control the costs generated by the use of GPU nodes ;-)
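When you're done with the demo, remember to clean up. A short sketch reusing the variables defined above:

```shell
# Remove the demo job; the autoscaler then scales the GPU pool back to 0.
kubectl delete -f aks-gpu-demo-01.yaml

# Or tear down the whole demo environment, cluster included.
az group delete --name $resourcegroupname --yes --no-wait
```

Deleting the resource group removes everything created in this walkthrough, so nothing keeps generating costs.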

Feel free to comment on this article if you have questions.

cisel.ch

 