Gen AI Builder FAQ v1.3

This FAQ focuses on Gen AI Builder inside Hybrid Manager’s AI Factory. It covers common questions about models, knowledge bases, assistants, tools, SDK usage, security, performance, and operations.

Which models/drivers are supported?

Gen AI Builder is driver‑based. You can use private endpoints or hosted APIs via Prompt Drivers, Embedding Drivers, and Vector Store Drivers.

  • Prompt Drivers: OpenAI, Anthropic, Google, and more (see reference: reference/sdk/drivers/index.mdx and per‑provider pages)
  • Embedding Drivers: OpenAI, Google, NVIDIA NIM, Hugging Face, etc. (reference/sdk/drivers/embedding-drivers.mdx)
  • Vector Store Drivers: pgvector (Postgres), local, Pinecone, Qdrant, Redis, and others (reference/sdk/drivers/vector-store-drivers.mdx)
  • NVIDIA NIM: deploy as private endpoints via Model Serving

See also: AI Factory Models and deploy NIM containers.

When to use RAG vs fine‑tuning?

Prefer RAG for frequently changing knowledge and governed answers; use fine‑tuning for tone, format, or narrow tasks. Many production assistants combine both: retrieve context from Knowledge Bases and guide responses with Rulesets. For modular pipelines, explore RAG Engines.

How do I create and update Knowledge Bases?

  • Create: Create a Knowledge Base and configure Data Sources (Confluence, Google Drive, S3, Web Page, Data Lake, Custom)
  • Update: Manage a Knowledge Base; re‑sync when source Libraries change
  • Storage: embeddings live in Postgres (pgvector) or a configured vector store
  • Tuning: choose a chunk size and metadata fields that match how users will query the content; validate with golden questions

SDK references: Data Loaders, Embedding Drivers, Vector Store Drivers.
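
As an illustration of the chunking and metadata choices mentioned under Tuning, here is a minimal sketch in plain Python. It does not use the Gen AI Builder SDK; the `Chunk` class, `chunk_words` function, and the `max_words`/`overlap` parameters are hypothetical names chosen for this example, and real loaders may split on tokens or document structure instead of words.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_words(text: str, max_words: int = 50, overlap: int = 10,
                metadata: Optional[dict] = None) -> List[Chunk]:
    """Split text into overlapping word windows and tag each with metadata.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks: List[Chunk] = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        # Copy the shared metadata and record where the chunk begins.
        chunks.append(Chunk(" ".join(window), {**(metadata or {}), "start_word": start}))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(12))
    for chunk in chunk_words(sample, max_words=5, overlap=2, metadata={"source": "doc-1"}):
        print(chunk.metadata, chunk.text)
```

Larger overlap keeps more context across chunk boundaries at the cost of storing more vectors; the metadata attached here (such as `source`) is what later makes filtered retrieval possible.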

What are Retrievers and how do I tune them?

Retrievers control how assistants fetch context: target KBs, max tokens, filters, re‑ranking.

  • Create: Create a Retriever
  • Tune: similarity thresholds, top‑K, metadata filters, rerankers
  • Advanced: modular RAG Engines with retrieval/rerank/response stages

SDK references: RAG Engines, Rerank Drivers.
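
To make the tuning knobs above concrete, the following sketch shows how a similarity threshold, top‑K, and metadata filters interact in a retrieval step. It is plain Python over in-memory vectors, not the product's Retriever implementation; `Entry`, `retrieve`, `min_score`, and `filters` are assumed names for illustration only.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    text: str
    vector: List[float]
    metadata: dict = field(default_factory=dict)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector: List[float], entries: List[Entry], top_k: int = 3,
             min_score: float = 0.0,
             filters: Optional[Dict[str, object]] = None) -> List[Tuple[float, Entry]]:
    """Hypothetical retrieval step: filter by metadata, score, threshold, keep top_k."""
    filters = filters or {}
    scored = []
    for entry in entries:
        # Metadata filter: every requested key must match exactly.
        if any(entry.metadata.get(k) != v for k, v in filters.items()):
            continue
        score = cosine(query_vector, entry.vector)
        if score >= min_score:  # similarity threshold
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    kb = [
        Entry("refund policy", [1.0, 0.0], {"team": "support"}),
        Entry("vacation policy", [0.0, 1.0], {"team": "hr"}),
        Entry("return window", [0.8, 0.6], {"team": "support"}),
    ]
    for score, entry in retrieve([1.0, 0.0], kb, top_k=2, min_score=0.5,
                                 filters={"team": "support"}):
        print(round(score, 2), entry.text)
```

Filtering before scoring narrows the candidate set cheaply; a reranker, if used, would typically run on the `top_k` results returned here.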

How do Assistants call Tools and Structures?

Assistants orchestrate retrieval + generation + actions. Use Tools for external systems, or promote Structures (pipelines/workflows/agents) as callable Tools.

SDK references: Structures, Tools, Assistant Drivers, Conversation Memory Drivers.
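
The idea of promoting a pipeline to a callable Tool can be sketched in plain Python as follows. This is only a conceptual illustration and does not reflect the SDK's actual Structure or Tool classes; `Tool`, `pipeline`, and `dispatch` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    description: str  # what an assistant would read to decide when to call it
    run: Callable[[str], str]


def pipeline(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Compose single-argument steps into one callable, run left to right."""
    def run(value: str) -> str:
        for step in steps:
            value = step(value)
        return value
    return run


def dispatch(tools: Dict[str, Tool], name: str, argument: str) -> str:
    """Look up a tool by name and invoke it; unknown names are rejected."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name].run(argument)


if __name__ == "__main__":
    # A two-step pipeline exposed under a tool name.
    normalize = pipeline(str.strip, str.lower)
    registry = {
        "normalize_ticket": Tool("normalize_ticket",
                                 "Trim and lowercase a ticket title", normalize),
    }
    print(dispatch(registry, "normalize_ticket", "  Login FAILS "))
```

Keeping an explicit registry also gives a natural place to restrict which tools a given project or assistant may call.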

Where does data live and how is access governed?

  • Documents: ingest via Data Sources; store in governed Data Lake or Postgres
  • Embeddings/vectors: Postgres pgvector (recommended) or another configured vector store
  • Governance: enforce permissions in source systems; restrict tool usage per project; audit retrieval and tool calls with threads/logs

See: Configure Data Lake, Vector Engine.

How do I improve latency and control cost?

  • Retrieval: reduce top‑K, improve chunking/metadata, use re‑rank selectively
  • Generation: choose the right model per route; stream responses; batch where safe
  • Caching: memoize embeddings, hot retrievals, and tool outputs when possible
  • Infra: colocate models and KBs; scale with Model Serving autoscaling

See SDK: Engines and Drivers. See Models: Model Serving.
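
The caching advice above can be illustrated with a small memoization sketch using Python's standard `functools.lru_cache`. The `embed_uncached` function is a stand-in that derives a fake vector from a hash; in practice it would be replaced by a call to whichever Embedding Driver or endpoint you use, which is not shown here.

```python
import hashlib
from functools import lru_cache

# Counts how often the (simulated) expensive call actually runs.
CALLS = {"embed": 0}


def embed_uncached(text: str) -> tuple:
    """Stand-in for a real embedding call (assumption for this example)."""
    CALLS["embed"] += 1
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return tuple(b / 255 for b in digest[:4])  # small deterministic fake vector


@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    """Memoized wrapper: identical inputs are served from an in-process cache."""
    return embed_uncached(text)


if __name__ == "__main__":
    embed("How do I reset my password?")
    embed("How do I reset my password?")  # served from cache
    print(CALLS["embed"], embed.cache_info())
```

An in-process cache like this is lost on restart and is not shared across replicas; it should also be cleared whenever the embedding model changes, since vectors from different models are not comparable.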

How do I test, observe, and troubleshoot?

  • Testing: golden sets, conversation playbooks, SDK unit tests
  • Observability: thread logs, assistant/structure run events; optionally export to your observability stack
  • Debugging: verify retrieval set first; inspect ruleset changes; re‑run the same structure/assistant with stored inputs

Docs: Threads, SDK Structures/observability.
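
A golden set for retrieval can be as simple as a list of questions paired with the document each one should surface. The sketch below, in plain Python, computes a hit rate over such a set; `evaluate_golden_set`, the `question`/`expected_id` keys, and the `retrieve_ids` callable are assumptions for this example, with `retrieve_ids` standing in for whatever retrieval call you are testing.

```python
from typing import Callable, Dict, List, Tuple


def evaluate_golden_set(golden: List[Dict[str, str]],
                        retrieve_ids: Callable[[str], List[str]],
                        k: int = 3) -> Tuple[float, List[str]]:
    """Return (hit rate, missed questions) for a retrieval function.

    A case counts as a hit when its expected document id appears in the
    first k ids returned for its question.
    """
    misses = []
    for case in golden:
        top = retrieve_ids(case["question"])[:k]
        if case["expected_id"] not in top:
            misses.append(case["question"])
    rate = 1 - len(misses) / len(golden) if golden else 0.0
    return rate, misses


if __name__ == "__main__":
    # Canned results standing in for a real retriever.
    canned = {
        "How long is the refund window?": ["doc-refunds", "doc-shipping"],
        "Who approves travel?": ["doc-shipping", "doc-expenses"],
    }
    golden = [
        {"question": "How long is the refund window?", "expected_id": "doc-refunds"},
        {"question": "Who approves travel?", "expected_id": "doc-travel"},
    ]
    rate, misses = evaluate_golden_set(golden, lambda q: canned.get(q, []), k=2)
    print(rate, misses)
```

Running this before and after a chunking, filter, or ruleset change gives a quick signal on whether retrieval regressed, and the list of missed questions points at which retrieval sets to inspect first.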

Typical use cases and patterns

  • Enterprise knowledge assistants: KB + Retriever + Assistant + Tooling (tickets/CRM)
  • Customer support copilots: policy/FAQ KBs + routing + guardrails via Rulesets
  • Workflow bots: Structures + Tools for approvals, data enrichment, and reporting
  • RAG for analytics: pgvector + Pipelines + Assistant for guided exploration

Explore: Hybrid KB best practices, Quickstart UI.

Where to find SDK references and examples?

See also the product guides: assistants, knowledge bases, retrievers, rulesets, structures, tools, threads.