How to create Oracle Database Services for High Availability

Introduction

To use High Availability features like Fast Application Notification (FAN), Draining, and Application Continuity, you need to create a custom database service and set the high availability attributes accordingly. This is not possible when using the default database service, which is the service with the same name as the PDB. Additionally, custom database services provide location transparency for the application.

FAN and Draining

Service status changes like stopping and starting the service can be propagated to Oracle client drivers and Oracle connection pools using FAN, a high-availability notification mechanism. FAN events are published using Oracle Notification Service (ONS).

The clients take immediate action accordingly, e.g., to balance connections or to establish new connections and route work to available database instances or where the service is started.

Draining means allowing connected sessions to complete their current work in a specific duration (drain timeout) before maintenance is started.

Set the database service attribute notification to true to enable FAN and drain_timeout to time_in_seconds to enable draining:

srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> 
-preferred <instance_name>
-available <instance_name>
-stopoption IMMEDIATE
-notification true
-drain_timeout 300 

#example
srvctl add service -db RACCDB_fra -pdb MOVIEDB -service drainsrv -preferred RACCDB1,RACCDB2 -stopoption IMMEDIATE -notification true -drain_timeout 300

#start the service
srvctl start service -db RACCDB_fra -service drainsrv

#check status
srvctl status service -db RACCDB_fra -service drainsrv
Service drainsrv is running on instance(s) RACCDB1,RACCDB2

When you stop the service on instance 1, RACCDB1:

#stop service on instance 1
srvctl stop service -db RACCDB_fra -service drainsrv -instance RACCDB1

#check status
srvctl status service -db RACCDB_fra -service drainsrv
Service drainsrv is running on instance(s) RACCDB2

You can specify a different drain_timeout when stopping the service:

srvctl stop service -db RACCDB_fra -service drainsrv -instance RACCDB1 -drain_timeout 10

New connections will immediately be directed to instance 2.

Existing sessions to instance 1 remain connected to instance 1 for the duration of drain_timeout and can finish their work (commit) during this time without any interruption.

Ideally, all sessions drain (=finish their work) within the drain_timeout specified. You can set the drain_timeout as high as needed for your application, e.g., 30 minutes or 1 hour. After that, the database node will be ready for maintenance, and the application will not encounter any interruption if all sessions drain.

Application Continuity

What happens to sessions with in-flight work that do not complete their work within drain_timeout?

This is where Application Continuity comes into play! Application Continuity will restore the session state and replay the in-flight work on a servicing instance transparently for the application.

The database service needs to have the Application Continuity attributes set:

srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> 
-preferred <instance_name>
-available <instance_name>
-stopoption IMMEDIATE
-notification true
-drain_timeout 300
-commit_outcome true
-retention 86400
-replay_init_time 600
-session_state DYNAMIC
-failover_restore LEVEL1
-failovertype TRANSACTION

For Transparent Application Continuity – an operation mode of Application Continuity – set failover_restore and failovertype to AUTO:

srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> 
...
#-session_state defaults to AUTO if failover_type is set to AUTO 
-failover_restore AUTO
-failovertype AUTO

A key difference is that Application Continuity requires the application to use a Connection Pool, while Transparent Application Continuity works with or without Connection Pools. I will talk about the differences in more detail in a future post.

Explanation of the following attributes:

  • commit_outcome set to true enables Transaction Guard, which is required by Application Continuity and Transparent Application Continuity.
  • retention determines the amount of time (in seconds) that the commit outcome is retained in the database.
  • replay_init_time specifies the difference between the time, in seconds, of the original execution of the first operation of a request and the time that the replay is ready to start after a successful reconnect. Application Continuity will not replay after the specified amount of time has passed. This parameter is intended to avoid the unintentional execution of a transaction when a system is recovered after a long period.
  • session_state describes how the non-transactional session state is changed by the application within a request. Examples of the session states are NLS settings, optimizer preferences, event settings, PL/SQL global variables, and temporary tables.

Services for Active Data Guard

Create role-based services when using Active Data Guard. Set the role attribute to primary for services to be used for the primary site (read/write) and standby for services to be open on the standby site (read-only). The services start and stop automatically at a site based on their role.

srvctl add service -db <db_unique_nam> -pdb <pdb_name> -service <service_name> 
...
-failover_restore AUTO
-failovertype AUTO
-role primary

Create the same service name with the same role on both primary and standby sites, e.g.:

#on primary and standby sites
srvctl add service -db RACCDB_fra -pdb MOVIEDB -service tacsrv -preferred RACCDB1,RACCDB2 -stopoption IMMEDIATE -notification true -drain_timeout 300 -commit_outcome true -retention 86400 -replay_init_time 600 -session_state AUTO -failover_restore AUTO -failovertype AUTO 
-role primary

#on primary site
srvctl start service -db RACCDB_fra -service tacsrv 

Note that we didn’t start the service on the standby site. The service will automatically start after a switchover or failover operation.

Note 1: When using Active Data Guard with single-instance databases, remove the preferred and available attributes that are related to Oracle RAC databases.

Note 2: To create the same service name on both sites in Oracle Cloud, include the domain name in the service, e.g., tacsrv.oracle.com, as described in this blog post.

Draining in Data Guard Environments

To allow services to drain during switchover operations, use the WAIT option of the switchover Data Gaurd broker command:

DGMGRL> switchover to <standby_db_unique_name> wait;

--example
DGMGRL> switchover to CDB01_lhrad2 wait;
Stopping services and waiting for sessions to drain...

The broker determines the maximum drain_timeout value for all active services and waits for up to that amount of time for all current client requests to be processed, before proceeding with the switchover. Alternatively, you can specify the number of seconds to wait for sessions to drain.

DGMGRL> switchover to <standby_db_unique_name> wait <timeout_in_seconds>;

Services in Autonomous Database

Autonomous Database provides predefined services with different characteristics to be used for applications depending on their requirements. The service names differ depending on your workload and deployment option:

(Transparent) Application Continuity can be enabled as follows:

Connection String

The application should use the recommended connection string configured for high availability as described here on page 5:

Alias = (DESCRIPTION = (CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)
	(ADDRESS_LIST = (LOAD_BALANCE=ON)
		(ADDRESS = (PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521)))
	(ADDRESS_LIST = (LOAD_BALANCE=ON)
		(ADDRESS = (PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521)))
	(CONNECT_DATA = (SERVICE_NAME=gold-cloud)))

More on connection strings in a following blog post. Stay tuned!

Conclusion

Always create and use custom database services for your application, as these allow you to set high availability attributes for Fast Application Notification (FAN), Draining, and Application Continuity.

In Data Guard environments, use role-based database services and create a service with the same name on all sites. Specify the WAIT option in the Data Guard broker switchover command to allow services to drain during switchover operations.

Further Reading

Would you like to get notified when the next post is published?