AI Factory Quickstart v1.3
What you’ll build
This quickstart gets you from zero to a working Gen AI assistant that answers questions from a Knowledge Base, plus a deployed model endpoint you can call from applications.
Prerequisites
- GPUs prepared for your deployment: Setup GPU
- Familiarity with Gen AI concepts and Pipelines: Gen AI, Pipelines
Step 1 — Create a Knowledge Base
Create a Knowledge Base and ingest your content so the assistant can retrieve context:
- Create a Knowledge Base
- (Optional) Review Knowledge Base concepts: Knowledge Bases (explained)
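Conceptually, a Knowledge Base ingests documents as chunks and retrieves the chunks most relevant to a question. The platform handles this for you; the sketch below is purely illustrative (the chunking and keyword-overlap ranking are assumptions, not the product's actual retrieval method):

```python
def chunk(text: str, size: int = 200) -> list[str]:
    """Split a document into fixed-size character chunks (illustrative only)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    return sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )[:k]

docs = [
    "KServe serves models on Kubernetes.",
    "A Knowledge Base stores chunks for retrieval.",
]
all_chunks = [c for d in docs for c in chunk(d)]
print(retrieve("Which component stores chunks?", all_chunks, k=1))
```

Real Knowledge Bases use embedding-based similarity rather than keyword overlap, but the ingest-then-retrieve flow is the same.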
Step 2 — Create an Assistant
Build a conversational assistant and attach your Knowledge Base:
- Create an Assistant
- (Optional) Review assistants: Assistants
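When a Knowledge Base is attached, the assistant grounds each reply by inserting retrieved chunks into the model prompt. A minimal sketch of that assembly step, assuming a generic template (the platform's actual prompt format will differ):

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved context and the user question into one grounded prompt.

    The template below is an assumption for illustration only.
    """
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

print(build_prompt("What is KServe?", ["KServe serves models on Kubernetes."]))
```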
Step 3 — Deploy a model endpoint
Deploy a model using KServe so your applications can call it directly:
- Deploy NIM containers
- Create an InferenceService
- (Optional) Customize runtime: Configure ServingRuntime
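An InferenceService is declared as a Kubernetes resource. A minimal sketch is below; the name, model format, and `storageUri` are placeholders you must replace with values for your own model and ServingRuntime:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-llm                          # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface               # assumption; match your ServingRuntime
      storageUri: pvc://models/my-llm   # placeholder model location
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Apply it with `kubectl apply -f`, then wait for the service to report Ready.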
Step 4 — Call the endpoint from an app
Test the deployed model with a simple HTTP client to confirm your applications can reach the endpoint.
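A minimal stdlib-only client sketch. The endpoint URL and the request body are assumptions based on the open inference protocol (v2); substitute your InferenceService address and your model's actual input schema:

```python
import json
import urllib.request

# Placeholder: replace with the URL of your deployed InferenceService.
ENDPOINT = "http://my-llm.example.com/v2/models/my-llm/infer"

def build_payload(prompt: str) -> dict:
    """Build a v2-protocol inference request body (assumed input schema)."""
    return {
        "inputs": [
            {"name": "prompt", "shape": [1], "datatype": "BYTES", "data": [prompt]}
        ]
    }

def query(prompt: str) -> dict:
    """POST the prompt to the endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires a live endpoint
        return json.load(resp)
```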
Step 5 — Monitor and iterate
Keep an eye on model health and performance:
- Monitor InferenceService
- (Optional) Tune Knowledge Base performance: Knowledge Base performance
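For a quick liveness signal between dashboard checks, you can poll the model's readiness path. The sketch below assumes a v2-protocol endpoint (`/v2/models/{name}/ready`); the base URL is a placeholder:

```python
import urllib.request

def ready_url(base: str, model: str) -> str:
    """Readiness path for a v2-protocol model endpoint (assumed layout)."""
    return f"{base}/v2/models/{model}/ready"

def is_ready(base: str, model: str) -> bool:
    """True when the endpoint answers HTTP 200 on the readiness path."""
    try:
        with urllib.request.urlopen(ready_url(base, model), timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False
```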
Next steps
- Learn progressively: Learning paths
- Explore patterns: Use cases and Solutions
- Deep dive into the APIs and reference material via the Gen AI and Model Serving hubs