Using models in Model Clusters & AIDB v1.3


From applications

Applications can call deployed model endpoints directly via KServe InferenceServices.

These endpoints provide a standard REST API for sending requests and retrieving predictions or generations from your deployed models.
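As a minimal sketch, here is what a request against the KServe v1 prediction protocol looks like from Python. The endpoint URL, model name, and payload shape are placeholders; substitute the values from your own deployment.

```python
# Minimal sketch: call a KServe InferenceService over REST (v1 protocol).
import requests

ENDPOINT = "http://my-model.my-project.svc.cluster.local"  # placeholder internal URL
MODEL_NAME = "my-model"  # placeholder model name

# KServe v1 protocol: POST /v1/models/<name>:predict with an "instances" payload.
payload = {"instances": [[1.0, 2.0, 3.0]]}

resp = requests.post(
    f"{ENDPOINT}/v1/models/{MODEL_NAME}:predict",
    json=payload,
    timeout=30,
)
resp.raise_for_status()

# The v1 protocol returns results under the "predictions" key.
print(resp.json()["predictions"])
```

Models served under the Open Inference (v2) protocol expose `/v2/models/<name>/infer` instead; check the Model Serving details page for which protocol your endpoint speaks.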


From AIDB (SQL patterns)

AIDB lets you call models from SQL, making them available directly inside Postgres. This is useful for building model calls into data pipelines or running inference without leaving the database.
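The sketch below shows the shape of this pattern from Python. The function `aidb.encode_text` and the model name are illustrative assumptions, not a confirmed API; consult the AIDB reference for the exact functions your version exposes.

```python
# Illustrative sketch of calling a model inline from SQL via AIDB.
# Function and model names are assumptions for illustration only.
import psycopg  # pip install "psycopg[binary]"

SQL = """
SELECT aidb.encode_text(%s, %s);  -- hypothetical: embed text with a registered model
"""

with psycopg.connect("dbname=app user=app") as conn:  # placeholder connection string
    with conn.cursor() as cur:
        cur.execute(SQL, ("my-embedding-model", "Hello from inside Postgres"))
        embedding = cur.fetchone()[0]
        print(embedding)
```

Because the call is plain SQL, the same pattern works from triggers, views, or batch jobs with no application-side glue.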


Hybrid Manager specifics

When models are deployed through Hybrid Manager:

  • Service URLs. Each model is exposed as an internal KServe endpoint within your HM project. The URL is visible in the Model Library or the Model Serving details page.
  • Authentication. Endpoints are protected by the platform. Applications running inside the same project can reach them directly. If you connect from outside the project, configure authentication through the HM ingress with project-scoped credentials (see the sketch after this list).
  • Observability. Requests and logs from these endpoints flow into HM observability, giving you usage metrics, latency, and error tracking.
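For external access, the request shape is the same as the in-cluster call, plus credentials. In this hedged sketch, the ingress hostname, path, and bearer-token scheme are all assumptions; use whatever URL and credential format your HM project actually issues.

```python
# Hedged sketch: call a model endpoint from outside the project via the HM ingress.
# Hostname, path, and auth scheme below are placeholders, not a confirmed API.
import os

import requests

INGRESS_URL = "https://hm.example.com/my-project/models"  # placeholder ingress URL
TOKEN = os.environ["HM_PROJECT_TOKEN"]  # placeholder project-scoped credential

resp = requests.post(
    f"{INGRESS_URL}/v1/models/my-model:predict",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"instances": [[0.5, 1.5]]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```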

Key takeaway

  • Applications → call KServe inference endpoints over REST.
  • AIDB → use SQL patterns to call models inline, including OpenAI-style calls and NVIDIA NIM.
  • Hybrid Manager handles service discovery, auth, and observability so you can focus on building.