Oracle Data Guard and Active Data Guard ensure high availability and disaster recovery during maintenance operations and in case of outages. A Recovery Point Objective (RPO) of zero (zero data loss) can be achieved with SYNC or FASTSYNC replication and, by using Far Sync, even at any distance without compromising performance. Recovery Time Objective (RTO) is the time within which a service should be restored.
After a primary database fails, a notification will be sent to the monitoring team (hopefully). Monitoring contacts the responsible DBA. The DBA picks up their phone (hopefully). The DBA requests access to the production system, waits for approval, logs in and investigates the error, creates a change request, waits for approval, and, finally, initiates the failover. ALL THIS counts towards your RTO! And the DBA is not even done yet. They still need to reinstate the failed primary.
This is where automatic failover, or Fast-Start Failover (FSFO), comes into play. The Observer monitors the entire Data Guard configuration. In the case of an unhealthy primary, it initiates an automatic failover, reducing the recovery time tremendously. Additionally, the failed primary is automatically reinstated and brought back into the Data Guard configuration as a standby.
Oracle Autonomous Database on Shared Infrastructure provides an automatic failover out of the box when zero data loss can be guaranteed. Oracle Autonomous Database on Dedicated Infrastructure enables you to configure FSFO at the click of a button.
This blog post takes you through the steps to configure FSFO for Base Database or Exadata Database Service in Oracle Cloud. The environment used here consists of:
- VM DB System with a Data Guard configuration. Primary and Standby in different Availability Domains (AD1 and AD3).
- IaaS Compute VM running Oracle Linux 8 on a separate Availability Domain (AD2), which will be used for the Observer.
Data Guard Broker
Oracle Data Guard broker automates and centralizes the creation, maintenance, and monitoring of Oracle Data Guard configurations. It allows you to perform all management operations locally or remotely through the broker’s client interfaces:
- Oracle Enterprise Manager Cloud Control, or
- Oracle Data Guard command-line interface, DGMGRL
The Oracle Data Guard monitor is the server-side broker component that is integrated with the Oracle database.
The DGMGRL command-line interface is part of the Oracle Database software installation and is available at $ORACLE_HOME/bin/dgmgrl.
When you enable Data Guard via Cloud Tooling, the Data Guard broker configuration is already created. You can connect to DGMGRL from either the primary or standby site.
dgmgrl SYS@CDB01_fraad1
...
Welcome to DGMGRL, type "help" for information.
Password:
Connected to "CDB01_fraad1"
Connected as SYSDBA.
DGMGRL> show configuration;
The Observer is a low-footprint OCI (Oracle Call Interface) client built into the DGMGRL CLI and, like any other client, may be run on a different hardware platform than the database servers. DGMGRL includes commands to create the observer process for fast-start failover:
DGMGRL> start observer;
Only the Observer can initiate an automatic failover. The Observer’s secondary task is to automatically reinstate a failed primary as a standby if that feature is enabled.
Ideally, the Observer should run in a separate location from the primary and standby systems. For example, if your primary and standby are in AD1 and AD3, start your Observer on a compute VM in AD2.
The question is, where to get the DGMGRL command-line interface on that VM if you don’t have any Oracle Database software installed on it? Well, DGMGRL is also included in the Oracle Database Client. Install the Oracle Database Client by choosing the Administrator option from Oracle Universal Installer.
Fast-Start Failover (FSFO)
You can enable fast-start failover to allow the broker to determine if a failover is necessary and to automatically initiate a failover to a standby database:
DGMGRL> enable fast_start failover;
The observe-only mode for fast-start failover enables you to test how fast-start failover will work in your environment with no impact on your current configuration or on applications:
DGMGRL> enable fast_start failover observe only;
Installation and Configuration
Step 1: Install Oracle Database Client with Administrator option
Install Oracle Database Client on the Observer VM. For a graphical interface, enable X11 forwarding on Linux systems.
For the installer to successfully run, install the following packages beforehand:
sudo yum install xdpyinfo -y
sudo yum install make -y
sudo yum install libnsl.x86_64 -y
sudo yum install gcc.x86_64 -y
Download the Oracle Database Client LINUX.X64_193000_client.zip file. Unzip and install the client on your compute VM where the Observer should run:
unzip LINUX.X64_193000_client.zip
cd client
./runInstaller
Choose Administrator type, specify a location for storing the Oracle software files (Oracle Home directory), check prerequisites, and click Install. The dgmgrl binary will be within the bin directory in the Oracle Home. Add the bin directory to your PATH environment variable.
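The last step can be sketched as follows. This is a minimal example, assuming a hypothetical Oracle Home of /home/opc/client_home (use whatever location you chose in the installer). The path_prepend helper is my own illustration, not part of any Oracle tooling; it only avoids duplicate PATH entries if the snippet is sourced more than once.

```shell
# Hypothetical Oracle Home; replace with the location chosen in the installer.
ORACLE_HOME="${ORACLE_HOME:-/home/opc/client_home}"

# Prepend a directory to PATH only if it is not already listed.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

path_prepend "$ORACLE_HOME/bin"
export ORACLE_HOME PATH

# Report whether dgmgrl can now be resolved.
if command -v dgmgrl >/dev/null 2>&1; then
  echo "dgmgrl found at $(command -v dgmgrl)"
else
  echo "dgmgrl not found; check ORACLE_HOME=$ORACLE_HOME"
fi
```

Putting these lines into the shell profile of the user who runs the Observer makes the setting survive new sessions.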
Step 2: Connecting to primary and standby
On the Observer VM, create a tnsnames.ora file and enter the connection strings for the primary and standby databases:
mkdir /home/opc/network
vi /home/opc/network/tnsnames.ora

CDB01_fraad1=(DESCRIPTION=(SDU=65535)(SEND_BUF_SIZE=10485760)(RECV_BUF_SIZE=10485760)(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.0.58)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=CDB01_fraad1.subnetpublic.vcnfra.oraclevcn.com)(UR=A)))
CDB01_fraad3=(DESCRIPTION=(SDU=65535)(SEND_BUF_SIZE=10485760)(RECV_BUF_SIZE=10485760)(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.0.36)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=CDB01_fraad3.subnetpublic.vcnfra.oraclevcn.com)(UR=A)))

export TNS_ADMIN=/home/opc/network
dgmgrl SYS@CDB01_fraad1
...
Welcome to DGMGRL, type "help" for information.
Password:
Connected to "CDB01_fraad1"
Connected as SYSDBA.
DGMGRL> show configuration;

Configuration - CDB01_fraad1_CDB01_fraad3

  Protection Mode: MaxAvailability
  Members:
  CDB01_fraad1 - Primary database
    CDB01_fraad3 - Physical standby database

Fast-Start Failover: Disabled
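If you want to check the FSFO state from a script rather than by eye, the "Fast-Start Failover:" line of the show configuration output can be picked out with a small filter. The sketch below is only an illustration: fsfo_state is a hypothetical helper name, and the sample text is copied from the session above instead of coming from a live DGMGRL call.

```shell
# Extract the Fast-Start Failover state from "show configuration" output on stdin.
fsfo_state() {
  sed -n 's/^[[:space:]]*Fast-Start Failover:[[:space:]]*//p' | head -n 1
}

# Sample input copied from the session above; in practice, feed in the
# "show configuration" output captured from DGMGRL.
sample='Configuration - CDB01_fraad1_CDB01_fraad3
  Protection Mode: MaxAvailability
  Members:
  CDB01_fraad1 - Primary database
    CDB01_fraad3 - Physical standby database
Fast-Start Failover: Disabled'

state=$(printf '%s\n' "$sample" | fsfo_state)
echo "FSFO state: $state"
```

The same filter should also work on the output once FSFO is enabled, since only the text after the colon changes.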
Step 3: Set FSFO targets and start the Observer
Set the FSFO targets:
DGMGRL> show database CDB01_fraad1 FastStartFailoverTarget;
  FastStartFailoverTarget = ''
DGMGRL> edit database CDB01_fraad1 set property FastStartFailoverTarget='CDB01_fraad3';
Property "faststartfailovertarget" updated
DGMGRL> show database CDB01_fraad1 FastStartFailoverTarget;
  FastStartFailoverTarget = 'CDB01_fraad3'

# then, the other way around in case of a role switch
DGMGRL> edit database CDB01_fraad3 set property FastStartFailoverTarget='CDB01_fraad1';
Property "faststartfailovertarget" updated
Start the Observer process on the Observer VM:
DGMGRL> start observer obsad2 file is /home/opc/fsfo.dat logfile is /home/opc/obsad2.log;
Observer 'obsad2' started
The command does not return the prompt. Open a new session to continue. To run the Observer in the background:
dgmgrl SYS/MySecretSYS_PW@CDB01_fraad1 "start observer obsad2 file is /home/opc/fsfo.dat logfile is /home/opc/obsad2.log;" &
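Because the Observer's data file and log file are long-lived, it helps to pass them as absolute paths so they do not depend on the directory the command is started from. A small sketch of how the statement could be assembled with that check in place; observer_stmt is a hypothetical helper, and rejecting relative paths is a choice made for this example, not something DGMGRL enforces.

```shell
# Build the DGMGRL "start observer" statement, refusing relative paths.
# $1 = observer name, $2 = observer data file, $3 = observer log file
observer_stmt() {
  for p in "$2" "$3"; do
    case "$p" in
      /*) ;;                                              # absolute path, accepted
      *) echo "not an absolute path: $p" >&2; return 1 ;;
    esac
  done
  printf 'start observer %s file is %s logfile is %s;' "$1" "$2" "$3"
}

stmt=$(observer_stmt obsad2 /home/opc/fsfo.dat /home/opc/obsad2.log) && echo "$stmt"

# To launch the Observer, pass the statement to dgmgrl as in the command above:
#   dgmgrl SYS@CDB01_fraad1 "$stmt" &
```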
Check the observer:
DGMGRL> show observer

Configuration - CDB01_fraad1_CDB01_fraad3

  Primary: CDB01_fraad1
  Active Target: CDB01_fraad3

Observer "obsad2"(18.104.22.168.0) - Master

  Host Name: observervm
  Last Ping to Primary: 2 seconds ago
  Last Ping to Target: 3 seconds ago
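The two "Last Ping" lines are a convenient health signal for the Observer. A rough sketch of how a monitoring script might evaluate them, assuming the output format shown above; max_ping_age and the 15-second limit are hypothetical choices for this example, and the sample text stands in for real show observer output.

```shell
# Print the largest "Last Ping" age in seconds found in "show observer" output on stdin.
max_ping_age() {
  awk '/Last Ping to (Primary|Target):/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^[0-9]+$/ && $i + 0 > max) max = $i + 0
       }
       END { print max + 0 }'
}

# Sample lines copied from the output above.
sample='  Host Name: observervm
  Last Ping to Primary: 2 seconds ago
  Last Ping to Target: 3 seconds ago'

limit=15   # hypothetical alert limit in seconds
age=$(printf '%s\n' "$sample" | max_ping_age)
if [ "$age" -gt "$limit" ]; then
  echo "WARNING: observer last pinged ${age}s ago"
else
  echo "Observer pings look healthy (oldest: ${age}s)"
fi
```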
Step 4: Enable Fast-Start Failover
Now we are ready to enable FSFO:
DGMGRL> enable fast_start failover;
Enabled in Zero Data Loss Mode.

DGMGRL> show configuration

Configuration - CDB01_fraad1_CDB01_fraad3

  Protection Mode: MaxAvailability
  Members:
  CDB01_fraad1 - Primary database
    CDB01_fraad3 - (*) Physical standby database

Fast-Start Failover: Enabled in Zero Data Loss Mode

DGMGRL> show fast_start failover

Fast-Start Failover: Enabled in Zero Data Loss Mode

  Protection Mode: MaxAvailability
  Lag Limit: 30 seconds
  Threshold: 30 seconds
  Active Target: CDB01_fraad3
  Potential Targets: "CDB01_fraad3"
  Observer: obsad2
  Shutdown Primary: TRUE
  Auto-reinstate: TRUE
...

DGMGRL> show configuration FastStartFailoverAutoReinstate
  FastStartFailoverAutoReinstate = 'TRUE'
DGMGRL> show configuration FastStartFailoverThreshold
  FastStartFailoverThreshold = '30'
DGMGRL> edit configuration set property FastStartFailoverThreshold=15;
Property "faststartfailoverthreshold" updated
Step 5: Test Fast-Start Failover
To test FSFO, I will kill the PMON process on the primary (NOT on production!) and monitor the Observer log file:
tail -f /home/opc/obsad2.log
...
Initiating Fast-Start Failover to database "CDB01_fraad3"...
Initiating Fast-start Failover.
Performing failover NOW, please wait...
Failover succeeded, new primary is "CDB01_fraad3"
...
Initiating reinstatement for database "CDB01_fraad1"...
Reinstating database "CDB01_fraad1", please wait...
The standby CDB01_fraad1 is ready to be a FSFO target
Reinstatement of database "CDB01_fraad1" succeeded
...
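Instead of watching the log by hand, a script can look for the "Failover succeeded" message and report the new primary. The following is a sketch that assumes the log wording shown above; new_primary_from_log is a hypothetical helper, and the sample lines are copied from that excerpt.

```shell
# Print the database named in the most recent "Failover succeeded" line.
# Reads the files given as arguments, or stdin if none are given.
new_primary_from_log() {
  sed -n 's/.*Failover succeeded, new primary is "\([^"]*\)".*/\1/p' "$@" | tail -n 1
}

# Sample lines copied from the log excerpt above.
sample='Initiating Fast-Start Failover to database "CDB01_fraad3"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "CDB01_fraad3"
Reinstatement of database "CDB01_fraad1" succeeded'

new_primary=$(printf '%s\n' "$sample" | new_primary_from_log)
if [ -n "$new_primary" ]; then
  echo "Failover detected, new primary: $new_primary"
else
  echo "No failover recorded"
fi
```

Against the real log, the call would be new_primary_from_log /home/opc/obsad2.log.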
If you don’t have Oracle Restart in place, you will need to start the failed primary manually in MOUNT mode. The Observer will notice the state change and begin reinstating as soon as the database starts.
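A minimal sketch of that manual start, assuming you are logged in as the oracle OS user on the failed primary host with ORACLE_HOME and ORACLE_SID set for that instance. The function name mount_failed_primary is just a label for this example.

```shell
# Start the failed primary in MOUNT mode so the Observer can reinstate it.
# Assumes OS authentication as SYSDBA works on this host.
mount_failed_primary() {
  sqlplus -S / as sysdba <<'EOF'
startup mount
exit
EOF
}

# Call the function on the failed primary host once the server is back:
#   mount_failed_primary
```

After the instance is mounted, the reinstatement messages should appear in the Observer log.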
Oracle Data Guard is a powerful technology with hundreds of configuration options. This blog post shows just one example of a possible configuration. Read the Oracle documentation for the full set of available options and details.
Cloud Tooling or Not Cloud Tooling
The Cloud Tooling enables you to set up Data Guard at the click of a button. Currently, Cloud Tooling does not provide automation to set up FSFO for VM DB Systems and Exadata Database Service. If you enable FSFO for a Data Guard configuration built by Cloud Tooling and FSFO initiates a failover, the role switch is subsequently synchronized to the Cloud Control Plane. Until that synchronization completes, you cannot use some of the automation provided by Cloud Tooling. Once the Cloud Control Plane is in sync with the current primary and standby roles, Cloud Tooling becomes available again. For example, you can then initiate a switchover via Cloud Tooling.
However, Oracle does not test the impact on Cloud Tooling when additional manual configurations are in place. If FSFO is needed, you might want to configure Data Guard manually (not via Cloud Tooling) to be on the safe side.
Oracle Data Guard Fast-Start Failover (FSFO) monitors your Data Guard environments and initiates an automatic failover in the case of an outage, reducing the recovery time enormously. The manual effort of reinstating the failed primary is also eliminated. The Observer is a lightweight client that needs only minimal resources. Enabling Fast-Start Failover does not require the license for Active Data Guard. In summary, FSFO requires no additional licenses and only minimal resources, is easy to enable, and provides enormous benefits.
- Oracle Data Guard Broker Concepts
- Oracle Data Guard Concepts and Administration
- Database Client Installation Guide for Linux