AI Accelerator - Pipelines 3.0.1 release notes

Released: 3 April 2025

In this release, we're adding auto-processing support to the new Preparer Pipeline and make a breaking change to the return type of two batch processing functions.

As always, we also include enhancements to existing functionality.

Highlights

  • The new Data Preparation Pipeline now supports Live auto processing when the data source is a table.
  • The retriever pipeline now automatically creates vector indexes for faster lookups.
  • Model encode/decode batch functions now return rows, making them fit more naturally into SQL queries.
  • Ubuntu 24.04LTS builds are now available

Enhancements

DescriptionAddresses
Live auto processing for the Preparer Pipeline

The preparer pipeline now support live auto processing for data sources that are postgres tables. When enabled, aidb will create triggers on the source table to automatically run the preparer task.

Model encode/decode batch functions now return rows instead of an array of results.

The two functions aidb.decode_text_batch() and aidb.encode_text_batch() now return a set of rows as their result, instead of a single row containing an array. This change was made based on customer feedback. In the case of the encode function, the return type is a list of vector embeddings. These are arrays themselves (arrays of floating point numbers). By returning this set of vectors as an array, we end up with a two-dimensional array. These kinds of arrays are not well supported in Postgres and get flattened in the return type of the function. Complicated parsing of the flattened array was required to work with the return type. This was less of an issue with the decode function but we decided to change both to retain consistency in the return types. Now, the batch functions return a set of rows, one row for each input value.

Auto-processing (auto-embedding) is now configured via new function.

The two functions aidb.enable_auto_embedding_for_table() and aidb.disable_auto_embedding_for_table() have been marked as deprecated. The new function aidb.set_retriever_auto_processing() accepts a "processing mode" argument: Live and Disabled mode are used to turn auto-embedding on and off. For the Preparer Pipeline, the function aidb.set_preparer_auto_processing() serves the same purpose.

The retriever pipeline now automatically creates vector indexes for faster lookups.

Vector indexes are now automatically created when creating retrievers in the pipeline. These indexes can be configured in the aidb.create_retriever_for_table() and aidb.create_retriever_for_volume() functions which now accept an optional index_type argument. This allows you to specify the index type to be used for the retriever. The default value is vector.

Added support for pgfs.allowed_local_paths GUC

A new GUC pgfs.allowed_local_paths has been added to control the local file access in PGFS. By default, local file access is now restricted to /tmp/pgfs. This GUC can be used to allow access to other directories. See PGFS Settings for more information.


Could this page be better? Report a problem or suggest an addition!