AI Factory Pipelines v1.3
AI Factory Pipelines is a core capability of EDB Postgres® AI (EDB PG AI), enabling intelligent, automated AI data workflows directly in your Postgres clusters — including Retrieval-Augmented Generation (RAG), Knowledge Bases, and vector-powered applications.
Pipelines abstracts away complexity in preparing, managing, and serving AI data for Postgres — integrating with Vector Engine and enabling Sovereign AI patterns where your data stays inside your database or trusted object storage.
Why Pipelines?
Modern AI workloads — especially RAG and Gen AI — require:
- Clean, preprocessed data
- Consistent embeddings
- Efficient vector indexes
- Real-time updates when source data changes
Manually managing these steps is complex and error-prone. Pipelines automates this lifecycle:
- Prepares and cleans data
- Runs embedding generation
- Maintains vector indexes
- Supports auto-processing and Knowledge Bases
Pipelines helps reduce incorrect or outdated answers ("hallucinations") caused by stale embeddings, which makes reliable Gen AI applications easier to build.
What You Can Do
With Pipelines, you can:
- Automate vector embedding pipelines for text and image data
- Maintain Knowledge Bases with real-time or batch updates
- Perform semantic and similarity search using Vector Engine
- Integrate with Gen AI Builder and other AI Factory components
- Build fully auditable, controlled Sovereign AI systems on your data
How It Works
Preparers
Preparers define how source data is cleaned, chunked, and prepared for embedding. You can create preparers for:
- Postgres tables (aidb.create_table_preparer())
- Object storage (S3-compatible) (aidb.create_volume_preparer())
Once created, you can run:
- Bulk processing (aidb.bulk_data_preparation())
- Auto-processing (aidb.set_auto_preparer()) to keep embeddings up to date automatically.
See Preparer Concepts to learn more.
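To make "chunked" concrete, here is a minimal Python sketch that splits text into fixed-size character windows that overlap. It only illustrates the general idea and is not how aidb preparers are implemented. The function name chunk_text and its size and overlap parameters are hypothetical and do not correspond to any aidb option.

```python
def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` characters.

    Each window after the first starts `size - overlap` characters after
    the previous one, so neighbouring chunks share `overlap` characters.
    Hypothetical helper for illustration; not part of aidb.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text, so no
        # trailing chunk consists only of already-covered characters.
        if start + size >= len(text):
            break
    return chunks
```

For example, chunk_text("abcdefghij", 4, 1) yields three chunks in which the last character of each chunk is repeated as the first character of the next.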
Knowledge Bases
Knowledge Bases provide a simple abstraction for RAG:
- Define a Knowledge Base (aidb.create_table_knowledge_base(), aidb.create_volume_knowledge_base())
- Automatically manage embeddings and vector indexes
- Use aidb.retrieve_key() and aidb.retrieve_text() for semantic search
Knowledge Bases support:
- Text and image data
- Multi-modal models (e.g., CLIP)
- Integration with Gen AI Builder’s retrieval flows
Auto-Processing
Pipelines supports flexible auto-processing modes:
- Sync or async
- Table or volume sources
- Configurable batch sizes and triggers
See Auto-Processing for details.
Who Should Use Pipelines?
Pipelines is ideal for:
- Gen AI Builders creating enterprise Knowledge Bases
- Data Engineers building semantic search pipelines
- AI teams needing automated RAG pipelines
- Architects implementing Sovereign AI in regulated environments
When to Use Pipelines
Use Pipelines when you need to:
- Keep embeddings current with changing data
- Reduce RAG errors caused by stale or incomplete embeddings
- Power multi-modal Gen AI search across text and images
- Maintain full auditability and governance of embedding processes
Configuration values explained
- Chunk size and overlap: Smaller chunks improve grounding; overlap helps preserve context across chunks. Tune based on document structure.
- Embedding model selection: Use the same model for ingestion and query so that stored and query vectors are comparable. Choose a model suited to your domain and language.
- Auto‑processing cadence: Align with data change frequency and SLA. Consider sync vs. async modes.
- Batch sizes and parallelism: Increase them for throughput, but monitor storage, network, and memory usage.
- Similarity threshold and top‑K: Higher thresholds reduce noise; tune top‑K for recall vs. latency.
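The interaction between a similarity threshold and top-K can be sketched in plain Python. This is an illustration under stated assumptions, not the retrieval logic used by aidb or Vector Engine: it assumes cosine similarity as the metric, and the names cosine_similarity and top_k are hypothetical helpers.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], items: dict[str, list[float]],
          k: int, threshold: float) -> list[tuple[str, float]]:
    """Return up to k (key, score) pairs with score >= threshold, best first.

    Hypothetical helper for illustration; not part of aidb.
    """
    scored = [(key, cosine_similarity(query, vec)) for key, vec in items.items()]
    # The threshold removes weak matches before the top-K cut is applied.
    kept = [(key, score) for key, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


if __name__ == "__main__":
    vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], vectors, k=2, threshold=0.5))
```

Raising the threshold can return fewer than K results, which trades recall for less noise. Raising K returns more candidates at the cost of more work for whatever consumes them.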
Learn More
AI Accelerator Pipelines helps you build trusted, scalable Sovereign AI solutions on top of your Postgres data — keeping AI pipelines clean, current, and production-ready.
Start here
Pipelines overview
Where to start with AI Accelerator Pipelines.
Compatibility
Compatibility information for the EDB Postgres AI - AI Accelerator Pipelines.
Limitations
Limitations of the EDB Postgres AI - AI Accelerator Pipelines.
Get started
Getting started
How to get started with AI Accelerator Pipelines.
Installing
How to install (or upgrade) AI Accelerator Pipelines.
Core concepts
Knowledge bases
Creating and using knowledge bases in AI Accelerator Pipelines.
Preparers
Creating and using preparers in AI Accelerator Pipelines.
Capabilities
Capabilities of EDB Postgres AI - AI Accelerator Pipelines.
GPU vector build
Accelerate vector index builds using NVIDIA GPUs for large vector tables.
Models
How to work with models in AI Accelerator Pipelines.
Storage and access
Reference
Reference
Reference documentation for AI Accelerator Pipelines.
Release notes
Release notes
Release notes for EDB Postgres AI - AI Accelerator.
Legal
Licenses
Open Source Licenses for software incorporated in EDB Postgres AI - AI Accelerator - Pipelines, by package and license.