How-to: Set up GPU resources (v1.3)
Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See /edb-postgres-ai/1.3/hybrid-manager/ai-factory/.
Use this guide to prepare GPU resources in your Kubernetes cluster (Hybrid Manager or compatible) to support Model Serving with KServe.
Goal
Prepare your cluster to run GPU-based Model Serving workloads using KServe.
Estimated time
20–40 minutes (provisioning depends on your cloud provider).
What you accomplish
- Provision GPU node groups/pools in your cluster.
- Label and taint GPU nodes correctly.
- Deploy the NVIDIA device plugin DaemonSet.
- Store your NVIDIA API key as a Kubernetes secret.
- Enable your cluster to run NIM model containers in KServe.
Prerequisites
- Access to a Kubernetes cluster with appropriate permissions.
- Administrative access to provision node groups (AWS EKS / GCP GKE / RHOS).
- NVIDIA API key for accessing NIM models.
- Familiarity with `kubectl`.
Provision GPU nodes
Provision GPU node groups (EKS) or node pools (GKE/RHOS):
- Use instances with L40S or A100 GPUs (for example, `g6e.12xlarge` on AWS or `a2-highgpu-4g` on GCP).
- Recommended: at least one node with four GPUs for large models.
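As one concrete sketch of this step on AWS EKS, a GPU node group can be created with `eksctl`. The cluster name, node group name, and region below are placeholders; adjust the instance type and node counts to your capacity needs:

```shell
# Sketch: create a GPU node group on EKS with eksctl.
# <your-cluster> and <your-region> are placeholders for your environment.
eksctl create nodegroup \
  --cluster <your-cluster> \
  --name gpu-nodes \
  --node-type g6e.12xlarge \
  --nodes 1 --nodes-min 1 --nodes-max 2 \
  --region <your-region>
```

On GKE or RHOS, use the equivalent node pool creation workflow for your platform.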
Label and taint GPU nodes
```shell
kubectl label node <gpu-node-name> nvidia.com/gpu=true
kubectl taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
```
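To confirm the label and taint took effect, you can query the nodes back (a quick sanity check, not a required step):

```shell
# List nodes carrying the GPU label.
kubectl get nodes -l nvidia.com/gpu=true

# Inspect the taints on a specific GPU node.
kubectl describe node <gpu-node-name> | grep -A1 Taints
```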
Deploy the NVIDIA device plugin
```shell
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
kubectl get ds -n kube-system nvidia-device-plugin-daemonset
```
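Once the device plugin pods are Running, each GPU node should advertise `nvidia.com/gpu` in its capacity and allocatable resources. A quick way to check:

```shell
# Each GPU node should report a nonzero nvidia.com/gpu count
# under Capacity and Allocatable once the device plugin is up.
kubectl describe node <gpu-node-name> | grep nvidia.com/gpu
```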
Store NVIDIA API key as Kubernetes secret
```shell
kubectl create secret generic nvidia-nim-secrets --from-literal=NGC_API_KEY=<your_NVIDIA_API_KEY>
```
This secret is used by the `ClusterServingRuntime` for NIM models.
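To confirm the secret was stored correctly, you can read the key back and decode it (this prints the key to your terminal, so run it only in a trusted session):

```shell
# Decode the stored API key to verify it matches what you provided.
kubectl get secret nvidia-nim-secrets -o jsonpath='{.data.NGC_API_KEY}' | base64 -d
```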