How-to: Set up GPU resources (v1.3)
Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See /edb-postgres-ai/1.3/hybrid-manager/ai-factory/.
Use this guide to prepare GPU resources in your Kubernetes cluster (Hybrid Manager or compatible) to support Model Serving with KServe.
Goal
Prepare your cluster to run GPU-based Model Serving workloads using KServe.
Estimated time
20–40 minutes (provisioning depends on your cloud provider).
What you accomplish
- Provision GPU node groups/pools in your cluster.
- Label and taint GPU nodes correctly.
- Deploy the NVIDIA device plugin DaemonSet.
- Store your NVIDIA API key as a Kubernetes secret.
- Enable your cluster to run NIM model containers in KServe.
Prerequisites
- Access to a Kubernetes cluster with appropriate permissions.
- Administrative access to provision node groups (AWS EKS / GCP GKE / RHOS).
- NVIDIA API key for accessing NIM models.
- Familiarity with `kubectl`.
Provision GPU nodes
Provision GPU node groups (EKS) or node pools (GKE/RHOS):
- Use instances with L40S or A100 GPUs (for example, `g6e.12xlarge` on AWS or `a2-highgpu-4g` on GCP).
- Recommended: at least one node with four GPUs for large models.
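As one concrete sketch of this step on AWS EKS, a GPU node group can be created with `eksctl`. The cluster name, node group name, and region below are placeholders; adjust the instance type and node counts to your capacity needs:

```shell
# Sketch: create a GPU node group on EKS with eksctl.
# <your-cluster> and <your-region> are placeholders for your environment.
eksctl create nodegroup \
  --cluster <your-cluster> \
  --name gpu-nodes \
  --node-type g6e.12xlarge \
  --nodes 1 --nodes-min 1 --nodes-max 2 \
  --region <your-region>
```

On GKE or RHOS, use the equivalent node pool creation workflow for your platform.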
Label and taint GPU nodes
```shell
kubectl label node <gpu-node-name> nvidia.com/gpu=true
kubectl taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
```
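To confirm the label and taint took effect, you can query the nodes back (a quick sanity check, not a required step):

```shell
# List nodes carrying the GPU label.
kubectl get nodes -l nvidia.com/gpu=true

# Inspect the taints on a specific GPU node.
kubectl describe node <gpu-node-name> | grep -A1 Taints
```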
Deploy the NVIDIA device plugin
```shell
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
kubectl get ds -n kube-system nvidia-device-plugin-daemonset
```
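Once the device plugin pods are Running, each GPU node should advertise `nvidia.com/gpu` in its capacity and allocatable resources. A quick way to check:

```shell
# Each GPU node should report a nonzero nvidia.com/gpu count
# under Capacity and Allocatable once the device plugin is up.
kubectl describe node <gpu-node-name> | grep nvidia.com/gpu
```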
Store NVIDIA API key as Kubernetes secret
```shell
kubectl create secret generic nvidia-nim-secrets --from-literal=NGC_API_KEY=<your_NVIDIA_API_KEY>
```
This secret is used by the `ClusterServingRuntime` for NIM models.
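To confirm the secret was stored correctly, you can read the key back and decode it (this prints the key to your terminal, so run it only in a trusted session):

```shell
# Decode the stored API key to verify it matches what you provided.
kubectl get secret nvidia-nim-secrets -o jsonpath='{.data.NGC_API_KEY}' | base64 -d
```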