How-to: Set up GPU resources (v1.3)

Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See /edb-postgres-ai/1.3/hybrid-manager/ai-factory/.

Use this guide to prepare GPU resources in your Kubernetes cluster (Hybrid Manager or compatible) to support Model Serving with KServe.

Goal

Prepare your cluster to run GPU-based Model Serving workloads using KServe.

Estimated time

20–40 minutes (provisioning depends on your cloud provider).

What you accomplish

  • Provision GPU node groups/pools in your cluster.
  • Label and taint GPU nodes correctly.
  • Deploy the NVIDIA device plugin DaemonSet.
  • Store your NVIDIA API key as a Kubernetes secret.
  • Enable your cluster to run NIM model containers in KServe.

Prerequisites

  • Access to a Kubernetes cluster with appropriate permissions.
  • Administrative access to provision node groups (AWS EKS / GCP GKE / RHOS).
  • NVIDIA API key for accessing NIM models.
  • Familiarity with kubectl.

Provision GPU nodes

Provision GPU node groups (EKS) or node pools (GKE/RHOS):

  • Use instances with L40S or A100 GPUs (for example, g6e.12xlarge on AWS or a2-highgpu-4g on GCP).
  • Recommended: at least one node with four GPUs for large models.
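On AWS, for example, a GPU node group could be sketched with an eksctl configuration along these lines (the cluster name, region, and sizing below are illustrative assumptions, not values from this guide):

```yaml
# Illustrative eksctl ClusterConfig fragment -- cluster name, region,
# and sizing are assumptions; adjust for your environment.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-hm-cluster        # assumed cluster name
  region: us-east-1          # assumed region
managedNodeGroups:
  - name: gpu-nodes
    instanceType: g6e.12xlarge   # 4x L40S GPUs
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
```

You would apply such a config with eksctl's create nodegroup command; GKE and OpenShift have equivalent node pool / machine set mechanisms.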

Label and taint GPU nodes

Label GPU nodes so that workloads can target them, and taint them so that only workloads that tolerate the taint are scheduled there:

kubectl label node <gpu-node-name> nvidia.com/gpu=true
kubectl taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
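GPU workloads can then be scheduled onto these nodes by matching the label and tolerating the taint. A minimal illustrative pod spec fragment (not a manifest shipped with this product):

```yaml
# Illustrative fragment: match the label and tolerate the taint
# applied in the commands above.
spec:
  nodeSelector:
    nvidia.com/gpu: "true"         # matches the node label
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "true"
      effect: NoSchedule           # tolerates the node taint
```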

Deploy the NVIDIA device plugin

Deploy the NVIDIA device plugin DaemonSet so that Kubernetes discovers the GPUs and advertises them as schedulable nvidia.com/gpu resources, then confirm the DaemonSet is running:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
kubectl get ds -n kube-system nvidia-device-plugin-daemonset
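Once the device plugin is running, you can sanity-check GPU exposure by scheduling a pod that requests a GPU. The following is a hypothetical test pod, not part of the product; the image tag is an assumption:

```yaml
# Illustrative test pod: requests one GPU and runs nvidia-smi.
# If it completes successfully, the device plugin is exposing GPUs.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test             # hypothetical name
spec:
  restartPolicy: Never
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```

Check the pod's logs afterward; nvidia-smi output listing the node's GPUs indicates the plugin is working.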

Store the NVIDIA API key as a Kubernetes secret

kubectl create secret generic nvidia-nim-secrets --from-literal=NGC_API_KEY=<your_NVIDIA_API_KEY>

This secret is used by the ClusterServingRuntime for NIM models to authenticate with NVIDIA NGC when pulling and running NIM model containers.
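As an illustrative sketch (the runtime name and container layout below are assumptions, not the exact runtime shipped with AI Factory), a KServe serving runtime can surface the key to NIM containers via an environment variable sourced from this secret:

```yaml
# Illustrative fragment: expose the NGC API key to a NIM container
# via the secret created above. Runtime and container names are assumptions.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nvidia-nim-runtime         # hypothetical name
spec:
  containers:
    - name: kserve-container
      env:
        - name: NGC_API_KEY
          valueFrom:
            secretKeyRef:
              name: nvidia-nim-secrets   # the secret created above
              key: NGC_API_KEY
```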

Next steps