Known issues v1.7

The following known issues and limitations apply to this Analytics Accelerator release. Workarounds are included where available. These issues are tracked and planned for resolution in a future release.

  • The function pgaa.launch_task() supports multiple maintenance operations for Delta Lake tables (such as z-order, vacuum, and purge), but for Iceberg tables it currently supports only the compaction operation. For other Iceberg maintenance routines, use pgaa.spark_sql().
  • The function pgaa.execute_compaction() does not support the settings parameter. Use pgaa.spark_sql() or pgaa.launch_task() to run Spark procedures directly in your cluster or to execute compaction for Iceberg tables.
  • Integration with Apache Spark is supported for:
    • Read-only queries on Parquet files in S3-compatible object storage or a shared POSIX filesystem.
    • Read-only queries for Iceberg tables in Iceberg REST catalogs.
  • There is a known issue in the Iceberg Spark Library version 1.9.2 when using Spark to read from Iceberg tables. Under certain conditions, equality deletes may be skipped during concurrent executions. The current workaround is to disable the Spark application setting spark.sql.iceberg.executor-cache.enabled in your spark-defaults.conf file. Disabling this cache ensures data consistency by correctly processing all deletes, but it may reduce performance for high-volume read workloads.
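
The workaround in the last item could be applied with an entry like the one below. This is a minimal sketch, assuming the usual spark-defaults.conf layout of one property per line, with the property name and its value separated by whitespace. The location of the file depends on your Spark installation and is not shown here.

```conf
# Assumed spark-defaults.conf entry: turn off the Iceberg executor cache
# so that equality deletes are processed on every read.
spark.sql.iceberg.executor-cache.enabled   false
```

Because spark-defaults.conf is read when a Spark application starts, the change likely takes effect only for applications launched after the file is updated.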