Analytics Accelerator v1.3
Use the Analytics Accelerator (PGAA) to explore the analytical capabilities built on EDB Postgres®. This accelerator helps you understand core concepts, explore key technologies such as EDB Postgres® Lakehouse, and learn how to implement analytics with EDB Hybrid Manager (HM).
We integrate modern data architectures and open standards with the reliability and flexibility of Postgres to help you unlock valuable insights.
Navigating the Analytics Accelerator
The accelerator organizes content into four areas:
Conceptual foundations: Build your understanding of analytics principles and EDB’s approach.
EDB core analytics technologies: Learn about the EDB solutions and technologies that power our analytics offerings.
Practical guidance and solutions: Find use cases, persona-based guides, how-to articles, and tutorials.
Product-specific implementations: Access documentation for how these analytics capabilities surface and are managed in EDB products, such as EDB Hybrid Manager.
Conceptual foundations
Understand the principles and strategies behind modern data analytics and EDB’s approach.
Generic analytics concepts: Learn about data architectures (Data Warehouse, Data Lake, Lakehouse) and foundational technologies (columnar storage, vectorized engines, and others).
EDB analytics concepts: Explore EDB’s vision for Postgres® analytics and how EDB leverages core technologies.
Review in-depth explanations of EDB analytical features, design choices, and advanced topics across the sections below.
EDB core analytics technologies
Learn about EDB’s analytics technologies and how they extend Postgres®.
Lakehouse clusters (EDB Postgres Lakehouse overview): Review the EDB Postgres® Lakehouse solution and its components for enabling analytics on object storage.
Apache Iceberg with EDB solutions: Understand how EDB solutions use Apache Iceberg to manage large analytical datasets.
Delta Lake with EDB solutions: Learn how EDB Postgres® interacts with Delta tables to enable reliable data lakes.
Tiered tables with EDB Postgres: Manage data across storage tiers using EDB Postgres Distributed (PGD) and Lakehouse capabilities to optimize cost and performance.
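As a rough sketch of the tiered-tables idea, the example below keeps a recent partition of a range-partitioned table on the PGD nodes while an older partition becomes a candidate for offload to object storage. The offload statement is a commented-out placeholder, not actual PGD or PGAA syntax; the Tiered Tables documentation describes the real mechanism.

```sql
-- Sketch only: the offload option below is a hypothetical placeholder.
CREATE TABLE measurements (
    reading_ts timestamptz NOT NULL,
    device_id  bigint      NOT NULL,
    value      double precision
) PARTITION BY RANGE (reading_ts);

-- Hot partition: stays on the PGD nodes for transactional access.
CREATE TABLE measurements_2025_06 PARTITION OF measurements
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01');

-- Cold partition: a candidate for offload to columnar files in object storage.
CREATE TABLE measurements_2024_01 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Hypothetical offload step (placeholder syntax; see Tiered Tables):
-- ALTER TABLE measurements_2024_01 SET (offload_to = 'analytics_lake');

-- Queries keep addressing the parent table; hot partitions are read
-- locally and offloaded partitions from object storage.
SELECT device_id, avg(value)
FROM   measurements
WHERE  reading_ts >= '2024-01-01'
GROUP  BY device_id;
```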
Use Anywhere (Manual/Reference)
- Lakehouse overview: Lakehouse clusters
- Open formats: Apache Iceberg, Delta Lake
- Storage locations: see Reference for functions and configuration
- Reference: Functions, PGAA functions, Direct scan, Datasets
Use With PGD (Manual/Reference)
- Concepts: Tiered Tables
Use In Hybrid Manager (Manual/Reference)
- Getting ready: Getting setup
- Provision: Create a Lakehouse cluster
- Catalogs: Configure an Iceberg catalog connection
- Interop: Read with or without a catalog
How-Tos (Runbook-Aligned)
These guides mirror the runbook flows and code examples; a brief sketch of the no-catalog and catalog paths follows the list.
Core How-Tos
- How-to: Read with or without a catalog
- How-to: Read and write without a catalog
- How-to: Read and write with a catalog
- How-to: Integrate a third-party catalog
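The sketch below contrasts the two paths at a high level. Every PGAA identifier in it (the pgaa access method, its storage_location, path, format, and catalog options, and the pgaa.attach_catalog call) is an assumption made for illustration; the how-tos above show the exact functions and syntax.

```sql
-- Sketch only: identifiers are placeholders, not the documented PGAA API.

-- Without a catalog: point a table directly at a Delta or Iceberg path
-- inside a registered storage location.
CREATE TABLE trips_no_catalog ()
    USING pgaa
    WITH (storage_location = 'demo_lake',
          path   = 'nyc_taxi/trips',
          format = 'iceberg');

-- With a catalog: register the Iceberg catalog once, then reference
-- tables by namespace and name instead of raw paths.
SELECT pgaa.attach_catalog(                       -- hypothetical function
           'lakekeeper',                          -- logical catalog name
           'https://catalog.example.com/iceberg', -- REST endpoint (assumed)
           'demo_warehouse');                     -- warehouse identifier (assumed)

CREATE TABLE trips_cataloged ()
    USING pgaa
    WITH (catalog = 'lakekeeper',
          catalog_namespace = 'nyc',
          catalog_table     = 'trips');

SELECT count(*) FROM trips_cataloged;
```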
Where to start
- Start with Generic analytics concepts and Lakehouse overview to understand core ideas.
- If you’re experimenting with external data, use the No Catalog how-tos.
- If you’re integrating with PGD/Tiered Tables or catalogs, follow the PGD and Catalog how-tos.
Postgres Lakehouse is built using a number of technologies:
- PostgreSQL
- Seafowl, an analytical database
- Apache DataFusion, the query engine used by Seafowl
- Delta Lake (specifically delta-rs), which implements the storage and retrieval layer for Delta Tables
Level 100
The most important thing to understand about Postgres Lakehouse is that it separates storage from compute. This design allows you to scale them independently, which is ideal for analytical workloads where queries can be unpredictable and spiky. You wouldn't want to keep a machine mostly idle just to hold data on its attached hard drives. Instead, you can keep data in object storage (and also in highly compressible formats), and only provision the compute needed to query it when necessary.
On the compute side, a vectorized query engine is optimized for querying Lakehouse tables, but it can still fall back to Postgres execution for full compatibility.
On the storage side, Lakehouse tables are stored using highly compressible columnar storage formats optimized for analytics.
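To make the separation of storage and compute concrete, here is a minimal sketch of querying data that lives entirely in object storage. It assumes a storage location named sales_lake has already been registered, and the pgaa access method and its options are illustrative placeholders rather than documented syntax; the Reference section lists the actual PGAA functions and options.

```sql
-- Sketch only: the 'pgaa' access method and option names are placeholders.
-- The data files stay in the object-storage bucket behind the assumed
-- 'sales_lake' location; only the query itself consumes compute on the
-- Lakehouse node.
CREATE TABLE sales_analytics ()
    USING pgaa
    WITH (storage_location = 'sales_lake',
          path   = 'sales/orders',
          format = 'delta');

-- Columnar files are scanned by the vectorized engine; constructs it
-- cannot handle fall back to regular Postgres execution.
SELECT date_trunc('month', order_date) AS month,
       sum(amount)                     AS revenue
FROM   sales_analytics
GROUP  BY 1
ORDER  BY 1;
```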
Level 200
Here's a slightly more comprehensive diagram of how these services fit together:
Level 300
Here's the more detailed, zoomed-in view of "what's in the box":
Getting Started
Generic Analytics Concepts
General industry concepts that underpin the Analytics Accelerator and modern data analytics architectures.
Concepts
EDB’s vision, strategy, and technologies for delivering Analytics Accelerator capabilities on Postgres.
Terminology
Glossary of key terms used in the Analytics Accelerator and Hybrid Manager analytics features.
Architecture
Understanding the Analytics Accelerator architecture for unified transactional and analytical processing
Quick Start
Launch a Lakehouse node and query sample data.
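As a taste of what the Quick Start walks through, the query below runs a simple aggregate over a benchmark-style sample table. The tpch_sf1.lineitem name is an assumption; the Quick Start lists the sample datasets actually available on a Lakehouse node.

```sql
-- Assumed sample table name; substitute one of the datasets the
-- Quick Start provisions on the Lakehouse node.
SELECT l_returnflag,
       l_linestatus,
       sum(l_extendedprice) AS gross_revenue,
       count(*)             AS line_count
FROM   tpch_sf1.lineitem
GROUP  BY l_returnflag, l_linestatus
ORDER  BY l_returnflag, l_linestatus;
```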
Reference
Things to know about EDB Postgres® AI Lakehouse
Use Anywhere
Postgres Lakehouse
Understanding the EDB Postgres Lakehouse architecture for scalable analytics on modern data lake storage
Apache Iceberg Integration
Understanding Apache Iceberg's architecture and implementation within Analytics Accelerator for scalable data lake operations
Delta Lake Integration
Understanding Delta Lake's role in the Analytics Accelerator lakehouse architecture and its practical applications
Use With PGD
Tiered Tables
Understanding how Tiered Tables enable cost-efficient data lifecycle management through automated offloading to object storage
Learning
Learning Resources
Navigate Analytics Accelerator documentation with explanations, tutorials, how-to guides, use cases, persona-based guidance, and structured learning paths.
Learning Paths
Structured learning paths for the Analytics Accelerator, from foundational concepts to advanced techniques and official training.
Level 101
Learn the foundational concepts of modern data analytics and the EDB Analytics Accelerator.
Level 201
Learn how to practically apply core Analytics Accelerator technologies and use cases.
Level 301
Advanced techniques and architecture patterns for scaling and optimizing Analytics Accelerator implementations.
Analytics Accelerator for your role: a persona-based guide
Guidance for DBAs, DevOps engineers, data scientists, and developers to use Analytics Accelerator effectively.
How-To Hybrid Manager
Create Lakehouse Cluster
Step-by-step guide to create a Lakehouse cluster in Hybrid Manager for fast analytics on object storage.
How-To Playbooks
Getting Setup
Prepare your EDB Postgres AI Hybrid Manager Lakehouse cluster for read-only analytics on Delta Lake and Iceberg datasets.
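In outline, preparation comes down to enabling the analytics extension and registering the bucket that holds the Delta Lake or Iceberg datasets. The extension name and the pgaa.create_storage_location call below are assumptions made for illustration; the playbook documents the actual steps and credential handling.

```sql
-- Sketch only: extension and function names are placeholders.

-- Enable the analytics extension on the Lakehouse cluster.
CREATE EXTENSION IF NOT EXISTS pgaa;

-- Register the object-storage location that holds the read-only
-- Delta Lake / Iceberg datasets.
SELECT pgaa.create_storage_location(      -- hypothetical function
           'analytics_lake',              -- logical name referenced by tables
           's3://example-bucket/datasets/');
```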
Integrate with Third-Party Iceberg Catalogs
Configure the Analytics Accelerator (PGAA) to work with external Apache Iceberg catalogs for seamless data lake integration.
Lakehouse Read With/Without A Catalog
Read Delta Lake and Iceberg data from a Lakehouse cluster, with or without an Iceberg catalog connection.
Analytics Storage Configuration
Complete guide to implementing the Analytics Accelerator (PGAA) with PGD High Availability for automated data tiering, catalog integration, and lakehouse architectures.