Observability (AI Factory on HM) v1.3

Hub quick links: Model Serving observability | Gen AI hub


Observability in Hybrid Manager

When you deploy AI Factory components on Hybrid Manager, the platform gives you integrated observability for both Model Serving (KServe) and Gen AI Builder workloads.

Model Serving (KServe)

  • Use the platform dashboards and logs to monitor InferenceServices and GPU workloads (see the quick checks after this list).
  • Metrics include request latency, throughput, GPU/CPU utilization, and error rates.
  • See: Monitor InferenceService
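
For a quick check outside the dashboards, you can inspect the KServe resources directly with kubectl. The namespace and service names below are placeholders; substitute your own:

# List InferenceServices and their readiness
kubectl get inferenceservices -n <project-namespace>

# Inspect conditions and events for a single model service
kubectl describe inferenceservice <model-service-name> -n <project-namespace>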

Gen AI Builder (Assistants & threads)

  • Each assistant thread is recorded in the system so you can trace inputs, retrieval context, and outputs.
  • Thread data can be viewed in the Gen AI Builder UI inside Hybrid Manager.
  • Usage metrics (number of runs, latency, errors) are integrated into the platform’s observability stack (see the pod check after this list).
  • See: View threads (hub)
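
To correlate a thread run with backend activity, you can check the Gen AI Builder pods directly. This is a minimal sketch: the label selector is an assumption, so adjust it to match the labels your deployment actually uses:

# List Gen AI Builder pods (label selector is an assumed example)
kubectl get pods -n <project-namespace> -l app.kubernetes.io/name=genai-builder

# Tail the last few minutes of logs from one pod
kubectl logs -n <project-namespace> <genai-builder-pod> --since=10m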

Accessing metrics and logs

  • Dashboards. Hybrid Manager surfaces AI Factory metrics through the built-in observability dashboards (Grafana).
  • Logs. Use kubectl logs or the HM UI log viewer to inspect Gen AI Builder pods, KServe InferenceServices, and retriever jobs.
  • Tracing. OpenTelemetry drivers are included in Gen AI Builder for deeper tracing; see Gen AI observability drivers (hub).
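
As a sketch of how tracing export is typically wired up, the standard OpenTelemetry environment variables can point the SDK at a collector. The deployment and collector names below are assumptions; the settings actually supported are documented in Gen AI observability drivers (hub):

# Point the OpenTelemetry SDK at an OTLP collector (names are assumed examples)
kubectl set env deployment/<genai-builder-deployment> -n <project-namespace> \
  OTEL_EXPORTER_OTLP_ENDPOINT=http://<otel-collector>.observability:4317 \
  OTEL_SERVICE_NAME=genai-builder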

Typical commands

For cluster-level troubleshooting or custom dashboards, you can also pull metrics and logs directly:

# Check logs for a running InferenceService
kubectl logs -n <project-namespace> svc/<model-service-name>

# Port-forward Grafana for local access to the dashboards
kubectl port-forward -n observability svc/grafana 3000:3000
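
For custom dashboards, you can also query the metrics store directly over the Prometheus HTTP API. The service name and port below are assumptions and may differ in your installation:

# Port-forward the Prometheus service (name/namespace assumed) and run a test query
kubectl port-forward -n observability svc/prometheus 9090:9090
curl -s 'http://localhost:9090/api/v1/query?query=up'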

Key takeaway

  • Model Serving → monitored as KServe services with full metrics.
  • Gen AI Builder → assistants and threads tracked for usage and auditing.
  • Hybrid Manager provides a unified observability layer — no separate monitoring stack to configure.