Data preparation SQL functions

This documentation covers the current Innovation Release of EDB Postgres AI. See also:

Hybrid Manager dual release strategy
Documentation for the current Long-term support release

Each pipeline step operation is also available as a standalone SQL function. You can call these functions directly in queries for one-off transformations, for testing and exploration, or to build custom workflows outside of the pipeline framework.

| Function | Operation | Description |
| --- | --- | --- |
| `aidb.chunk_text()` | ChunkText | Divides long text into smaller, semantically coherent segments. |
| `aidb.parse_html()` | ParseHtml | Extracts readable text from HTML, stripping tags while preserving structure. |
| `aidb.parse_pdf()` | ParsePdf | Extracts text from binary PDF data, with page-level `part_id` output. |
| `aidb.perform_ocr()` | PerformOcr | Extracts text from images using an OCR-capable AI model. |
| `aidb.summarize_text()` | SummarizeText | Generates concise summaries of long text passages using an AI model. |
| `aidb.summarize_text_aggregate()` | SummarizeText | Summarizes text across multiple rows using a SQL aggregate pattern. |
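
As a rough illustration of calling one of these functions directly in a query, the sketch below chunks a literal string with `aidb.chunk_text()`. The named arguments (`input`, `options`) and the option keys (`desired_length`, `max_length`) are assumptions that this page does not state. Check the Text chunking reference for the exact signature before relying on them.

```sql
-- Sketch only: the argument names and JSON option keys are assumed,
-- not confirmed by this page. Requires the aidb extension to be installed.
SELECT *
FROM aidb.chunk_text(
    input   => 'Long text that needs to be split into smaller segments before embedding. Each segment should stay within the configured length limit.',
    options => '{"desired_length": 60, "max_length": 80}'
);
```

If the signature matches, the call behaves like any other set-returning function, so its output can be filtered, joined, or inserted into a table for one-off testing outside a pipeline.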

Note

To use these operations as steps inside a pipeline, see Pipeline steps.

Text chunking

The chunking step divides long text into smaller segments based on configurable parameters, optimizing it for processing by LLMs and embedding in knowledge bases.

Data parsing

The parsing step extracts structured text from various formats (like HTML and PDF) using AI models, preparing it for downstream processing in the pipeline.

Performing OCR

The OCR step extracts text from images using AI models, enabling the conversion of visual data into searchable text for indexing in knowledge bases.

Text summarizing

The summarizing step generates concise summaries of long text passages using AI models, improving retrieval accuracy in RAG applications.

Embedding

The embedding step transforms processed text or image data into vector representations using AI models, creating a searchable knowledge base for semantic retrieval in RAG applications.
