Gen AI Architecture (Entities and Flows) v1.3

This page describes how Gen AI entities work together at runtime and where configuration and governance live. Use it as a companion to concepts, how‑tos, and quickstarts.

Entities and relationships

  • Assistant → orchestrates retrieval and tool use to fulfill the directive.
  • Ruleset → constrains Assistant behavior (policy, tone, safety).
  • Knowledge → knowledge bases (embedded content) + retrievers (hybrid search) + data lake inputs.
  • Tools → enable actions and integrations via authenticated, typed interfaces.
  • Structures → reusable building blocks (tasks, pipelines, workflows) called by Assistants.
  • Threads → capture state and execution trace (prompts, retrievals, tool calls, outputs).

Entity docs:

Control and data flows

User/App → Assistant → (Retriever → Knowledge Base) → Model endpoint
                          └→ (Tool call → API/service)
                           ↳ Thread (state, trace)

Flow

  1. User or app calls the Assistant with a directive/input.
  2. Assistant retrieves context via the configured retriever and knowledge base(s).
  3. Assistant optionally invokes Tools to gather data or perform actions.
  4. Assistant calls the model endpoint for generation, grounded on retrieved context.
  5. Thread captures prompts, retrieved chunks, tool I/O, and responses for audit and debugging.

Endpoints and models

  • Use private Model Serving endpoints for Sovereign AI.
  • Internal calls use cluster‑local DNS; external calls use portal + access keys.
  • Operation paths: chat (/v1/chat/completions), embeddings (/v1/embeddings), rerank (/v1/ranking).
  • Client patterns: OpenAI‑compatible or raw HTTP; see access guide and Python client quickstart.

Governance and observability

  • Governing inputs: knowledge curation, retriever filters, tool scopes, rulesets.
  • Observability: use Threads and platform metrics/logs to debug and monitor.
  • Auditability: Threads provide a durable record of decisions and data use.

Where to start