How Oracle Active Data Guard provides better Disaster Recovery than AWS RDS Read Replicas

Introduction

Recently I have been asked whether AWS RDS for Oracle provides a solid disaster recovery solution using Read Replicas based on Oracle Active Data Guard.

Oracle (Active) Data Guard is undoubtedly Oracle’s most comprehensive data protection and disaster recovery solution for Oracle Database. However, whenever I read or hear “RDS”, the next word that comes to my mind is “limitations”. So I decided to review the AWS documentation and see what’s there. Quotes and screenshots are taken from the following AWS documentation:

Zero Data Loss (RPO=0)

AWS RDS Read Replicas support only asynchronous replication, which cannot guarantee zero data loss. AWS claims an RPO in single-digit minutes, which might be realistic depending on the workload:

Data Guard offers both asynchronous and synchronous replication. Data Guard Maximum Protection Mode guarantees not to lose any data, not even a single transaction, under any circumstances. Active Data Guard Far Sync provides zero data loss at any distance. Network encryption and redo compression can also be offloaded to the Far Sync instance.
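To make the difference concrete, here is a minimal sketch of why asynchronous shipping exposes committed transactions to loss. It is a toy model, not Data Guard or RDS code: the function name, the commit timestamps, and the lag value are all hypothetical.

```python
def lost_commits(commit_times, failure_time, transport_lag):
    """Return commits acknowledged on the primary that never reached the standby.

    commit_times  -- seconds at which each transaction committed on the primary
    failure_time  -- second at which the primary is lost
    transport_lag -- seconds the standby trails the primary (0 models synchronous transport)
    """
    # The standby holds everything committed up to (failure_time - transport_lag).
    received_up_to = failure_time - transport_lag
    return [t for t in commit_times if received_up_to < t <= failure_time]


commits = [10, 20, 30, 40, 50]           # hypothetical commit timestamps in seconds
print(lost_commits(commits, 55, 30))     # asynchronous with a 30 s lag -> [30, 40, 50]
print(lost_commits(commits, 55, 0))      # synchronous, no lag -> []
```

With any non-zero lag, the size of the loss depends on how busy the primary was in the last few seconds before the failure, which is why an asynchronous RPO can only ever be an estimate.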

Automatic Failover (lower RTO)

AWS RDS Read Replicas only support manual failover:

AWS claims an RTO in single-digit minutes:

However, before promoting a Read Replica:

These wait times all count towards RTO, and that is before adding the time until the DBA is notified about the failure, logs in to the system, does some analysis, and decides to promote a Read Replica. The real RTO will very probably be more than just “single-digit minutes”.
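The arithmetic is simple enough to sketch. The phase names and durations below are purely hypothetical placeholders, not AWS figures; plug in the numbers from your own runbook.

```python
def manual_rto_minutes(phases):
    """Sum every phase between the failure and the application being back online."""
    return sum(phases.values())


# All durations are hypothetical, in minutes.
phases = {
    "alert reaches the DBA": 5,
    "DBA logs in and analyses the failure": 10,
    "decision to promote": 5,
    "wait times before promotion": 5,
    "promotion of the Read Replica": 5,
    "application restart against the new primary": 5,
}

print(manual_rto_minutes(phases))  # -> 35
```

Even with optimistic values for each phase, the human steps dominate the total, and they are exactly the steps that an automatic failover removes.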

Data Guard offers automatic failover via the Fast-Start Failover (FSFO) feature providing a recovery time of 2 minutes or even less without the need for human intervention. Additionally, FSFO helps avoid split-brain situations and data loss.

Application Failover

When it comes to failures, you should consider the database and the application. With AWS Read Replicas, you need to stop the application and restart it, pointing to the new primary database:

Oracle (Transparent) Application Continuity can be used with Active Data Guard to provide transparent application failover and even replay in-flight transactions on the new primary database.
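A building block on the client side is a connect descriptor that lists both sites, so the application does not have to be reconfigured after a role change. The helper below is a minimal sketch that only assembles such a string: the function name, host names, service name, and timeout values are illustrative assumptions, and Application Continuity itself additionally requires a suitably configured database service and a supporting driver or connection pool.

```python
def connect_descriptor(hosts, service_name, port=1521):
    """Build an Oracle Net connect descriptor with one ADDRESS_LIST per site."""
    if not hosts:
        raise ValueError("at least one host is required")
    address_lists = "".join(
        f"(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        for host in hosts
    )
    # Retry and timeout values are illustrative, not tuned recommendations.
    return (
        "(DESCRIPTION=(CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)"
        "(TRANSPORT_CONNECT_TIMEOUT=3)"
        f"{address_lists}"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


# Hypothetical SCAN names and service name.
print(connect_descriptor(["primary-scan", "standby-scan"], "sales_rw"))
```

Because the service is assumed to run only on whichever database currently holds the primary role, the client retries across both address lists until it finds it.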

Oracle Global Data Services (GDS) additionally provides region-based workload routing, load balancing, role-based global services, replication lag-based workload routing for Active Data Guard, and more.
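The idea behind lag-based routing can be illustrated with a toy model. This is not how GDS is implemented; the function name, standby names, and lag values are hypothetical, and falling back to the primary is just one possible policy.

```python
def pick_read_target(standby_lag_seconds, max_lag_seconds, primary="primary"):
    """Choose the least-lagging standby within tolerance, else fall back to the primary."""
    eligible = {
        name: lag
        for name, lag in standby_lag_seconds.items()
        if lag <= max_lag_seconds
    }
    if not eligible:
        return primary
    return min(eligible, key=eligible.get)


# Hypothetical standbys and their current apply lag in seconds.
print(pick_read_target({"stby_a": 12, "stby_b": 3}, max_lag_seconds=5))  # -> stby_b
print(pick_read_target({"stby_a": 12}, max_lag_seconds=5))               # -> primary
```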

Recreating Standbys after Failover

As explained before, there are no automatic failover capabilities, so after a disaster you need to manually promote one of the Read Replicas to be the new primary. On top of that:

After a “failover”, you do not have a new primary and one or more standby databases. You just have a new standalone database. All former standbys need to be deleted and new ones created. Until that is done, the production database is not protected against subsequent failures, quite apart from the manual effort involved.

Oracle Data Guard provides a failover capability without rebuilding the additional healthy standby databases in your configuration, providing continuous protection of your production data. Only the failed former primary needs to be reinstated. This can even be done automatically after a fast-start failover has occurred by setting the FastStartFailoverAutoReinstate broker configuration property to TRUE.
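As a rough illustration, the small helper below renders the DGMGRL commands that turn on fast-start failover and automatic reinstatement. It is a sketch under assumptions: the helper name and the threshold value are hypothetical, and it presumes a broker configuration that already meets the FSFO prerequisites (for example a running observer and Flashback Database), which vary by release.

```python
def fsfo_commands(threshold_seconds=30, auto_reinstate=True):
    """Render DGMGRL commands that enable fast-start failover.

    threshold_seconds -- how long the observer waits before failing over (illustrative)
    auto_reinstate    -- whether the broker reinstates the failed former primary
    """
    if threshold_seconds <= 0:
        raise ValueError("threshold_seconds must be positive")
    reinstate = "TRUE" if auto_reinstate else "FALSE"
    return [
        f"EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = {threshold_seconds};",
        f"EDIT CONFIGURATION SET PROPERTY FastStartFailoverAutoReinstate = {reinstate};",
        "ENABLE FAST_START FAILOVER;",
    ]


# Print a script that could be reviewed and then run in a DGMGRL session.
print("\n".join(fsfo_commands()))
```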

Data Guard is not only about Disaster Recovery

So far, we have considered only disaster recovery. However, (Active) Data Guard provides much more:

  • Intensive workload scale-out.
  • Easier development cycles.
  • Online upgrades and migrations.
  • Backup and maintenance offload.

Multitenant Architecture

The AWS page on considerations for RDS for Oracle replicas states that only the deprecated non-CDB architecture is supported:

This means no consolidation at the PDB level, which increases cost and the effort of maintaining replicas for multiple RDS instances.

Rolling Upgrade and Standby-First Patch Apply

Keep in mind that planned maintenance in AWS RDS means downtime! Whether applying Release Updates (RUs) or upgrading to a major release, the primary database is not available:

Oracle Data Guard Standby-First Patch Apply allows you, as the name suggests, to patch your standby databases first, convert one to a snapshot standby for testing, and then convert it back to a physical standby. When ready, switch over (near-zero downtime) and patch the former primary.
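The sequence can be written down as a small runbook generator. This is a sketch only: the function and database names are hypothetical, the patching steps are placeholders for whatever tooling you use, and the DGMGRL commands assume an existing broker configuration.

```python
def standby_first_runbook(primary, standby):
    """Return (description, DGMGRL command or None) pairs in execution order."""
    return [
        (f"Patch the Oracle home of {standby}", None),  # done with your patching tool
        ("Open the patched standby for testing",
         f"CONVERT DATABASE {standby} TO SNAPSHOT STANDBY;"),
        ("Run application tests against the snapshot standby", None),
        ("Discard test changes and resume redo apply",
         f"CONVERT DATABASE {standby} TO PHYSICAL STANDBY;"),
        ("Move production to the patched site",
         f"SWITCHOVER TO {standby};"),
        (f"Patch the Oracle home of {primary}, now a standby", None),
    ]


# Hypothetical database names.
for number, (description, command) in enumerate(standby_first_runbook("prod", "stby"), 1):
    print(f"{number}. {description}" + (f": {command}" if command else ""))
```

Only the switchover step touches the production workload, which is where the near-zero downtime comes from.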

Near-zero downtime is also applicable for major version upgrades via Data Guard’s rolling upgrade capabilities. Active Data Guard simplifies this process using the DBMS_ROLLING PL/SQL package.

Offload Backups to Physical Standby

AWS RDS Read Replicas do not support backups, so backups must always be taken on the primary, generating additional load and requiring high bandwidth:

In Oracle Data Guard, you can offload RMAN backups to a physical standby, freeing up resources on the primary for the production workload. Active Data Guard provides Fast Incremental Backups on Physical Standby to consume even fewer resources on the standby database system.

Oracle replicas for RDS Custom for Oracle

AWS also offers a customized RDS solution that enables access and customization of the operating system and database. Maintenance, backup, scaling, and high availability are shared responsibilities. While it supports the creation of CDBs, it comes with additional limitations regarding disaster recovery by not supporting cross-region standbys.

Conclusion

When discussing disaster recovery, there are two main things to consider: zero data loss and automatic failover to reduce RPO and RTO. AWS RDS Read Replicas do not provide either.

Data Guard (customer-managed, not via RDS) ensures zero data loss, at any distance via Far Sync (which requires an Active Data Guard license), and provides automatic failover via Fast-Start Failover (FSFO). The result is RPO=0 and an RTO of around 2 minutes (or even less if tuned properly) without the need for human intervention. Additionally, Application Continuity and Global Data Services ensure transparent application failover for end users.

When implementing high availability, do not forget planned maintenance. With AWS RDS, patching and upgrades mean downtime for the primary database. Oracle Data Guard Rolling Upgrade and Standby-First Patch Apply ensure near-zero downtime for your database.

Further Reading
