VectorChord Innovation Release

VectorChord is a high-performance Postgres extension for dense vector search. It extends pgvector with index types that are faster and more memory-efficient at scale, making it well-suited for knowledge bases containing millions or billions of embeddings.

For installation and full technical reference, see VectorChord in pg_extensions.

When to use VectorChord

Standard pgvector HNSW and IVFFlat indexes work well for most workloads. VectorChord becomes the better choice when:

  • Your knowledge base contains millions or billions of embeddings and HNSW indexes are consuming too much RAM.
  • You need lower query latency at high throughput.
  • Your embeddings have high dimensionality (above the 2000-dimension limit supported by pgvector's HNSW index).
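
Before switching, it can help to measure the conditions listed above. The following is a minimal sketch that uses standard Postgres catalog functions and pgvector's vector_dims(). The table name my_embeddings and the column name embedding are hypothetical placeholders for your own vector table.

```sql
-- Hypothetical names: replace my_embeddings / embedding with your own table and column.

-- 1. List each index on the table with its access method and on-disk size,
--    to compare against the memory available to Postgres.
SELECT c.relname                               AS index_name,
       am.amname                               AS access_method,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am   ON am.oid = c.relam
WHERE i.indrelid = 'my_embeddings'::regclass;

-- 2. Show the configured shared buffer size for comparison.
SHOW shared_buffers;

-- 3. Check embedding dimensionality and row count
--    (pgvector's HNSW index supports up to 2000 dimensions).
SELECT vector_dims(embedding) AS dims,
       count(*)               AS row_count
FROM my_embeddings
GROUP BY 1;
```

If the HNSW index size is close to or larger than the memory you can dedicate to it, or dims exceeds 2000, the conditions above likely apply.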

Index types

VectorChord provides two index types, both configurable through the vector index config helpers.

chord_hnsw

A VectorChord implementation of HNSW with granular control over graph connectivity and search depth. Use this as a drop-in upgrade from pgvector's HNSW when you need more tuning control or better performance at scale.

Configure with aidb.vector_index_chord_hnsw_config():

Parameter          Description
m                  Maximum connections per node in the graph (default: 16).
ef_construction    Search depth during index build (default: 64).
max_connections    Hard cap on graph connections.
ml                 Level multiplier controlling the graph layer structure.
vector_data_type   Vector storage type (e.g., 'float32').

chord_vchordq

A quantization-based index (IVF-RaBitQ) designed for maximum throughput at very large scale. It trades a small amount of recall accuracy for significantly faster searches and lower memory usage than HNSW.

Configure with aidb.vector_index_chord_vchordq_config():

Parameter             Description
lists                 Number of IVF clusters. Higher values increase accuracy at scale.
spherical_centroids   Use normalized centroids; recommended for cosine similarity.
vector_data_type      Vector storage type (e.g., 'float32').
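
To see how these parameters fit together, the helper can be called with named arguments. This is a sketch only: it assumes that all three parameters in the table can be combined in a single call and that the helper can be invoked on its own, outside aidb.knowledge_base_config(). The values are illustrative, taken from the examples on this page rather than tuned recommendations.

```sql
-- Illustrative values only, not tuned recommendations.
SELECT aidb.vector_index_chord_vchordq_config(
           lists               => '1000',     -- number of IVF clusters
           spherical_centroids => true,       -- normalized centroids, for cosine similarity
           vector_data_type    => 'float32'   -- vector storage type
       );
```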

Using VectorChord with pipelines

VectorChord index types are configured through the vector_index parameter of aidb.knowledge_base_config(), which is passed as step_N_options when creating a pipeline.

Configuring a pipeline with chord_hnsw

SELECT aidb.create_pipeline(
    name               => 'my_semantic_pipeline',
    source             => 'source_docs',
    source_key_column  => 'id',
    source_data_column => 'content',
    destination        => 'my_knowledge_base',
    step_1             => 'KnowledgeBase',
    step_1_options     => aidb.knowledge_base_config(
                              'my_embedding_model',
                              'Text',
                              distance_operator => 'Cosine',
                              vector_index      => aidb.vector_index_chord_hnsw_config(
                                                      m               => 16,
                                                      ef_construction => 64,
                                                      max_connections => 32
                                                  )
                          ),
    auto_processing    => 'Live'
);

Configuring a pipeline with chord_vchordq

SELECT aidb.create_pipeline(
    name               => 'my_large_scale_pipeline',
    source             => 'billion_scale_docs',
    source_key_column  => 'id',
    source_data_column => 'content',
    destination        => 'my_large_kb',
    step_1             => 'KnowledgeBase',
    step_1_options     => aidb.knowledge_base_config(
                              'my_embedding_model',
                              'Text',
                              distance_operator => 'Cosine',
                              vector_index      => aidb.vector_index_chord_vchordq_config(
                                                      lists               => '1000',
                                                      spherical_centroids => true
                                                  )
                          ),
    auto_processing    => 'Background',
    background_sync_interval => '60 seconds'
);

Multi-step pipeline with chunking and VectorChord

For document pipelines, chunk text first and then embed into a VectorChord-indexed knowledge base:

SELECT aidb.create_pipeline(
    name               => 'doc_pipeline_vchord',
    source             => 'raw_documents',
    source_key_column  => 'id',
    source_data_column => 'body',
    destination        => 'doc_knowledge_base',
    step_1             => 'ChunkText',
    -- Positional arguments, assumed here to be: desired length, maximum length, overlap, unit.
    step_1_options     => aidb.chunk_text_config(200, 250, 25, 'words'),
    step_2             => 'KnowledgeBase',
    step_2_options     => aidb.knowledge_base_config(
                              'my_embedding_model',
                              'Text',
                              vector_index => aidb.vector_index_chord_hnsw_config(
                                                 m => 32,
                                                 ef_construction => 128
                                             )
                          ),
    auto_processing    => 'Live'
);

Choosing between index types

Scenario                                    Recommended index
Up to ~10M vectors, general use             chord_hnsw
100M+ vectors, memory-constrained           chord_vchordq
High-dimensional embeddings (>2000 dims)    chord_vchordq
Cosine similarity with normalized vectors   chord_vchordq with spherical_centroids => true
Drop-in pgvector HNSW replacement           chord_hnsw
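
The size and dimensionality rows of this guidance can be sketched as a plain SQL CASE expression. This is an illustration, not part of the extension. The workload rows are made-up examples, checking dimensionality before vector count is an assumption, and workloads the guidance doesn't address (such as the range between 10M and 100M vectors) are reported as not covered rather than guessed.

```sql
-- Made-up example workloads: (vector count, embedding dimensions, memory-constrained?)
WITH workload(n_vectors, dims, memory_constrained) AS (
    VALUES (5000000::bigint,   1536, false),
           (250000000::bigint,  768, true),
           (2000000::bigint,   3072, false),
           (50000000::bigint,   768, false)
)
SELECT n_vectors,
       dims,
       memory_constrained,
       CASE
           -- Above the 2000-dimension limit of pgvector's HNSW index
           WHEN dims > 2000 THEN 'chord_vchordq'
           -- Very large and memory-constrained
           WHEN n_vectors >= 100000000 AND memory_constrained THEN 'chord_vchordq'
           -- General use up to roughly 10M vectors
           WHEN n_vectors <= 10000000 THEN 'chord_hnsw'
           -- Not addressed by the guidance above
           ELSE 'not covered; benchmark both'
       END AS suggested_index
FROM workload;
```

The two remaining rows (cosine similarity with normalized vectors, and drop-in replacement of pgvector HNSW) depend on how you use the index rather than on its size, so they are left out of the expression.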

Further reading