Using models in Model Clusters & AIDB v1.3
Hub quick links:
- Model Serving: Access KServe endpoints
From applications
Applications can call deployed model endpoints directly via KServe InferenceServices.
- Quickstart (Python): Quickstart using inference endpoint
These endpoints provide a standard REST API for sending requests and retrieving predictions or generations from your deployed models.
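Below is a minimal sketch of such a request. The service URL, model name, and payload shape are placeholders, not values from this documentation; use the endpoint address shown in the Model Serving details page, and the request format your model's runtime actually expects (the KServe v1 `:predict` route is shown here as one common case).

```python
# Minimal sketch: send a prediction request to a deployed KServe endpoint.
# The URL, model name, and payload below are placeholders -- take the real
# endpoint address from the Model Serving details page, and match the
# protocol your serving runtime expects.
import requests

ENDPOINT = "http://my-model.my-project.svc.cluster.local"  # hypothetical internal service URL
MODEL_NAME = "my-model"                                     # hypothetical model name


def predict(instances):
    """Call the KServe v1 ':predict' route and return the parsed JSON response."""
    url = f"{ENDPOINT}/v1/models/{MODEL_NAME}:predict"
    resp = requests.post(url, json={"instances": instances}, timeout=30)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # Example input only; the expected shape depends entirely on the deployed model.
    print(predict([[1.0, 2.0, 3.0]]))
```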
From AIDB (SQL patterns)
AIDB lets you call models from SQL, making them available directly inside Postgres. This can be useful for embedding model calls in data pipelines or running inference in the database itself (see the sketch after the links below).
- OpenAI API compatibility: Using models via OpenAI-compatible SQL calls
- NVIDIA NIM: Using NVIDIA NIM in AIDB
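The sketch below shows the general shape of an in-database call issued from Python. The connection string, model name, and the `aidb.decode_text` function call are illustrative assumptions; the exact SQL interface for registering and invoking models is covered in the AIDB pages linked above.

```python
# Minimal sketch: run an in-database model call through AIDB from Python.
# The connection string and the SQL function name/arguments are illustrative
# only -- see the linked AIDB documentation for the actual SQL interface.
import psycopg2

conn = psycopg2.connect("host=localhost dbname=mydb user=myuser")  # hypothetical connection string

with conn, conn.cursor() as cur:
    # Hypothetical AIDB call: ask a registered model for a completion, inline in SQL.
    cur.execute(
        "SELECT aidb.decode_text(%s, %s);",  # illustrative function name
        ("my_model", "Summarize the latest orders table in one sentence."),
    )
    print(cur.fetchone()[0])
```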
Hybrid Manager specifics
When models are deployed through Hybrid Manager:
- Service URLs. Each model is exposed as an internal KServe endpoint within your HM project. The URL is visible in the Model Library or the Model Serving details page.
- Authentication. Endpoints are protected by the platform. Applications running inside the same project can reach them directly. If you connect from outside the project, configure authentication through the HM ingress using project-scoped credentials (see the sketch after this list).
- Observability. Requests and logs from these endpoints flow into HM observability, giving you usage metrics, latency, and error tracking.
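The following sketch illustrates an authenticated call from outside the project. The ingress hostname, route, token source, and bearer-token header are all assumptions for illustration; take the real external URL from the Model Serving details page and use credentials scoped to your project.

```python
# Minimal sketch: call a model endpoint from outside the project through the
# HM ingress. The hostname, route, and bearer-token mechanism are assumptions;
# substitute the URL from the Model Serving details page and a credential
# scoped to your project.
import os

import requests

INGRESS_URL = "https://hm.example.com/my-project/models/my-model"  # hypothetical external route
TOKEN = os.environ["HM_PROJECT_TOKEN"]                             # hypothetical project-scoped credential

resp = requests.post(
    f"{INGRESS_URL}/v1/models/my-model:predict",
    headers={"Authorization": f"Bearer {TOKEN}"},  # assumed auth header format
    json={"instances": [[1.0, 2.0, 3.0]]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```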
Key takeaway
- Applications → call KServe inference endpoints over REST.
- AIDB → use SQL patterns to call models inline, including OpenAI-style calls and NVIDIA NIM.
- Hybrid Manager handles service discovery, auth, and observability so you can focus on building.