How-to: Verify InferenceServices and GPU usage (v1.3)

Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See /edb-postgres-ai/1.3/hybrid-manager/ai-factory/.

Use this guide to confirm that your InferenceServices are deployed correctly, running, and consuming GPU resources as expected.

Goal

Ensure your deployed InferenceServices are correctly utilizing GPU resources.

Estimated time

5–10 minutes.

Steps

Check InferenceService status

kubectl get inferenceservice -n <namespace>

Check that the READY column reports True, which indicates the service is up and accepting requests.
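If you are scripting this check, the READY state can be parsed from the command's output. A minimal sketch, run here against hypothetical sample output (the service name, URL, and age are placeholders, not from your cluster); in a live cluster you would pipe the real `kubectl get inferenceservice` output into the same awk step:

```shell
# Parse the name and READY state from sample `kubectl get inferenceservice` output.
# Real usage: kubectl get inferenceservice -n <namespace> | awk 'NR > 1 { print $1, $3 }'
sample='NAME       URL                                   READY   AGE
my-model   http://my-model.default.example.com   True    5m'
echo "$sample" | awk 'NR > 1 { print $1, $3 }'
```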

Confirm GPU resource usage

kubectl describe nodes | grep nvidia.com/gpu
kubectl exec -n <namespace> -it <pod-name> -- nvidia-smi

The first command shows each node's nvidia.com/gpu capacity and allocation; the second runs nvidia-smi inside the serving pod to confirm the container can actually see a GPU.
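To turn the node description into a quick capacity total, you can sum the reported GPU counts. A hedged sketch, shown against canned sample lines so the parsing step is self-contained; note that real `kubectl describe nodes` output repeats nvidia.com/gpu under Capacity, Allocatable, and Allocated resources, so filter to the section you care about before summing:

```shell
# Sum nvidia.com/gpu counts from sample `kubectl describe nodes` output lines.
# Real usage (sums every matching line, so grep the section you want first):
#   kubectl describe nodes | grep 'nvidia.com/gpu:' | awk '{ sum += $2 } END { print sum }'
sample='nvidia.com/gpu:  2
nvidia.com/gpu:  4'
echo "$sample" | awk '{ sum += $2 } END { print sum }'
```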

Troubleshoot common issues

kubectl get ds -n kube-system nvidia-device-plugin-daemonset
kubectl describe pods -n <namespace>

The first command checks that the NVIDIA device plugin DaemonSet, which advertises GPUs to the Kubernetes scheduler, is running on every GPU node; the second surfaces pod events such as scheduling failures for insufficient nvidia.com/gpu.
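When the device plugin DaemonSet is unhealthy, GPUs are not advertised to the scheduler and GPU pods stay Pending. A small sketch for comparing the DESIRED and READY counts, run here on hypothetical sample output (pipe the real `kubectl get ds` output through the same awk step):

```shell
# Flag a DaemonSet whose READY count (column 4) lags DESIRED (column 2).
# Real usage: kubectl get ds -n kube-system nvidia-device-plugin-daemonset | awk 'NR > 1 { ... }'
sample='NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   AGE
nvidia-device-plugin-daemonset   3         3         2       3            2           30d'
echo "$sample" | awk 'NR > 1 { if ($2 == $4) print $1 ": healthy"; else print $1 ": " $4 "/" $2 " ready" }'
```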