Gen AI Architecture (Entities and Flows) v1.3
This page describes how Gen AI entities work together at runtime and where configuration and governance live. Use it as a companion to concepts, how‑tos, and quickstarts.
Entities and relationships
- Assistant → orchestrates retrieval and tool use to fulfill the directive.
- Ruleset → constrains Assistant behavior (policy, tone, safety).
- Knowledge → knowledge bases (embedded content) + retrievers (hybrid search) + data lake inputs.
- Tools → enable actions and integrations via authenticated, typed interfaces.
- Structures → reusable building blocks (tasks, pipelines, workflows) called by Assistants.
- Threads → capture state and execution trace (prompts, retrievals, tool calls, outputs).
Entity docs:
Control and data flows
User/App → Assistant → (Retriever → Knowledge Base) → Model endpoint └→ (Tool call → API/service) ↳ Thread (state, trace)
Flow
- User or app calls the Assistant with a directive/input.
- Assistant retrieves context via the configured retriever and knowledge base(s).
- Assistant optionally invokes Tools to gather data or perform actions.
- Assistant calls the model endpoint for generation, grounded on retrieved context.
- Thread captures prompts, retrieved chunks, tool I/O, and responses for audit and debugging.
Endpoints and models
- Use private Model Serving endpoints for Sovereign AI.
- Internal calls use cluster‑local DNS; external calls use portal + access keys.
- Operation paths: chat (
/v1/chat/completions
), embeddings (/v1/embeddings
), rerank (/v1/ranking
). - Client patterns: OpenAI‑compatible or raw HTTP; see access guide and Python client quickstart.
Governance and observability
- Governing inputs: knowledge curation, retriever filters, tool scopes, rulesets.
- Observability: use Threads and platform metrics/logs to debug and monitor.
- Auditability: Threads provide a durable record of decisions and data use.