Optimizing migration performance

Suggest edits

Notice: Data Migration Service capabilities are only available as part of the EDB Postgres AI Hybrid Manager, which is currently in tech preview.

You can leverage the EDB Data Migration Service (DMS), a service included with the Hybrid Manager (HM), to migrate your data from existing database(s) (Oracle, self-managed Postgres, etc.) to a HM-managed database cluster. EDB DMS supports two migration methods: snapshot and snapshot + streaming, allowing you to migrate your data with minimal to no downtime.

Understanding migration workloads in HM

The EDB Data Migration Service runs within the same Kubernetes environment where your EDB Hybrid Manager is deployed (on-premises, AWS, Red Hat OpenShift, and Google Cloud). This means that most of the workloads required for migration operations are managed and optimized by your Kubernetes cluster.

Generally, you don't need to extensively fine-tune resources, as your Kubernetes environment meeting the general requirements for running the Hybrid Manager should ensure a smooth migration. However, for large datasets, optimizing specific components can significantly accelerate the process.

The key components to consider for resource allocation are:

EDB DMS Reader: This component runs on your source database's host machine.
Destination database cluster: This can be the database managed by HM or a different external self-managed database, however here we'll focus on an HM-managed database. This is where your data will reside.
EDB DMS Writer: This component is only required if your migration target is an external (self-managed) database cluster, and it runs on that database's host machine.

Whether you perform a snapshot or snapshot + streaming migration also affects performance differently:

Snapshot migrations: They primarily focus on efficiently transferring a large volume of existing data from a source database to the target. Performance for these is heavily influenced by total data size and sustained disk I/O.
Snapshot + streaming migrations: They first transfer a snapshot of existing data and then continuously stream ongoing changes (transactions) from the source to the destination. Performance here is driven by the sustained transaction rate (TPS) and the ability to keep up with the source's write activity.

Optimizing migration performance for large datasets

The HM and DMS are designed to handle resource usage automatically. For datasets smaller than 100 GB, they are configured to perform fast migrations without further tuning. However, for datasets larger than 100 GB, a few targeted alterations can lead to significantly better performance and quicker migrations.

Tuning destination database performance

Optimizing the destination database cluster (whether an HM-managed database or a self-managed database) provides the most significant performance benefits from resource tuning.

Note

These recommendations are based on current testing and are subject to change. EDB continuously works to improve migration performance and reduce migration times.

Storage recommendations

During a migration, storage IOPS can peak at 2000, with an average throughput of roughly 300 MB/s and peaks up to 350 MB/s. To accommodate these demands:

Separate WAL and data volumes: Aim for storage volumes that can handle a minimum of 2500-3000 IOPS and at least 200-250 MB/s of throughput for both your data and Write-Ahead Log (WAL) volumes. Using separate volumes for WAL and data is highly recommended for optimal performance.
Unified storage and WAL volume: A unified storage and WAL volume can adversely affect performance. For example, using a single volume with 3000 IOPS and 125 MB/s throughput can increase migration time by over 40%. If you must use a unified storage and WAL volume, we recommend a throughput of 300-400 MB/s. Keep in mind that if the WAL becomes full, the migration will fail.

How to adjust this?

For HM-managed database clusters:

Adjust WAL and data volume allocation by editing your cluster's Storage.
By default, HM-managed clusters use separate volumes for data and WAL volumes, but if you unselected the Use a separate storage volume for Write-Ahead Logs option during cluster creation, you won't be able to enable this setting post creation.
Volume speeds (IOPS or throughput) however, are controlled by the infrastructure provider, see Request Amazon EBS volume notifications or Google Cloud - About Persistent Disk performance for more information.

Memory recommendations

While 5 GB of memory is sufficient, a larger amount will facilitate optimal performance during a migration. Use at least 10 GB of memory for the destination database.

How to adjust this?

For HM-managed database clusters:

Adjust memory allocation by editing your cluster's Cluster Settings > Instance size.
Additionally, you can also adjust the shared_buffers, effective_cache_size and wal_buffers parameters in the DB Configuration.

CPU recommendations

The destination database cluster should have at least 4 CPU cores. A lower core count will negatively impact migration performance.

How to adjust this?

For HM-managed database clusters, adjust CPU allocation by editing your cluster's Instance size.

Note

Apply these optimizations for the duration of the migration process. After the migration is complete, you can adjust these values as needed based on your ongoing workload requirements.

Extensive testing has shown that the biggest factor affecting migration performance is volume speed (IOPS and throughput). While decreasing CPU below recommendations can reduce performance, it's not nearly to the degree that low IOPS and throughput for storage will. Halving the storage throughput can increase migration time by over 40%, whereas halving the CPU count to 2 cores might increase it by 10-15%. These recommendations are designed to yield the best migration performance; using lower values may still result in successful migrations but at a slower speed.

Tuning EBD DMS Reader and Writer performance

Optimal performance for the EDB DMS Reader and Writer is achieved when the host machines where these components are installed follow these specifications.

EDB DMS Reader

CPU: 3 CPU cores

Memory: 4 GB

EDB DMS Writer

CPU: 2 CPU cores

Memory: 2.5 GB

Note

If both the EDB DMS Reader and EDB DMS Writer are installed on the same physical or virtual machine, ensure that the host machine has sufficient combined resources. For example, a machine hosting both would ideally have at least 5 CPU cores and 6.5 GB of memory to provide optimal performance for both components.

Performance benchmarking for migrations

To provide a clear understanding of expected migration performance, EDB conducted a series of tests to benchmark migration speeds under various conditions.

While the previous recommendations are derived from comprehensive testing across various migration configurations (including different table sizes, multiple tables, and streaming scenarios), this section presents results from a single 100 GB snapshot migration. Additional detailed performance benchmarks will be incorporated into future releases to further assist in planning your data migrations.

Approach

All migration performance tests were conducted using an automated test suite. The testing infrastructure was hosted on an EC2 t4i.4xlarge instance. Resources on this instance were shared across these Docker containers:

EDB DMS Reader container: 4 CPU, 16 GB memory

Oracle database: 8 CPU, 32 GB memory

All tests originated from an Oracle database and targeted an HM-managed EDB Postgres Advanced Server (EPAS) cluster. The HM-managed database was hosted on a Hybrid Manager deployment running on an m6a.2xlarge instance.

EPAS database cluster configuration

Volumes: Minimum 3,000 IOPS and 500 MB/s throughput.

Storage: Separate storage and WAL volumes were used for each test on the EPAS cluster, with volumes being approximately the same size.

GUC Settings: Default GUC settings were used on the PostgreSQL cluster. Further testing with increased memory allocations and fine-tuned GUC values did not show a significant impact on migration speed.

Migration data

The table contained two columns: one serving as the primary key and the other populated with random integers. A snapshot-only migration was performed several times to account for variability and specify tolerance ranges.

Migration procedure

Populated the Oracle database with the specified amount of tables.
Initiated and completed the migration process.

Results

Nr. of tables	Snapshot size (GB)	Total data size (GB)	Streaming enabled	Streaming TPS	Total streaming data size (GB)	EPAS resources (CPU/memory in GB)	Time to finish (mins)	Average transfer speed (GB/min)	Max. target CPU/ memory/ IOPs usage
1	100	100	False	N/A	N/A	4/10	56.84±0.455	1.76	80%/ 91%/ 1.2k

← Prev

Known issues, limitations, and notes

↑ Up

EDB Data Migration Service

Could this page be better? Report a problem or suggest an addition!