How Application Continuity Works for Unplanned Interruption in Oracle RAC Environments

Introduction

In a previous blog post, we discussed how Draining and Application Continuity work for maintenance events like rolling patching with Oracle RAC. For planned maintenance, draining allows active sessions to finish their work (drain) before the database instance is shut down. In an unplanned event like an instance failure, however, all sessions on that instance are terminated immediately. In this case, you need to configure Application Continuity to transparently replay interrupted in-flight requests on a surviving database instance, so end-users and applications do not encounter any failures.

The Environment

Let’s take a 2-node Oracle RAC as an example environment and walk through the sequence of events to understand how Application Continuity makes failure events transparent to the end user. The application uses a connection pool that is configured to hold 30 sessions.

Normal Operation

During normal operation, both RAC nodes are up and running and serving the application. Depending on your load-balancing strategy, the sessions may be distributed evenly or unevenly across the two nodes. To make the diagram below easier to follow, let’s assume 10 sessions on node 1 and 20 sessions on node 2:

Unplanned Event (Instance Failure)

In an unplanned interruption such as an instance failure, the database instance is gone and all of its sessions are terminated. Unlike planned maintenance, where we stop a service intentionally, there is no time for draining:

Application Continuity / Transparent Application Continuity

After instance #2 fails, Fast Application Notification (FAN) sends a DOWN event that immediately clears the idle sessions of the failed instance from the connection pool. New connections are established to the surviving instance #1 when requested by the application.
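The pool's reaction to the DOWN event can be pictured with a small toy model. This is only a sketch of the idea, not how a real connection pool is implemented: `Session`, `ToyPool` and `on_fan_down` are hypothetical names, and the assumption that 5 of the 20 sessions on instance #2 are busy at the time of the failure is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    instance: int        # RAC instance the session is connected to
    busy: bool = False   # True while the application has borrowed it


@dataclass
class ToyPool:
    sessions: list = field(default_factory=list)

    def on_fan_down(self, instance):
        """Remove every session of the failed instance from the pool.

        Idle sessions are simply dropped. Busy sessions are returned so the
        caller can decide what to do with their interrupted requests.
        """
        interrupted = [s for s in self.sessions
                       if s.instance == instance and s.busy]
        self.sessions = [s for s in self.sessions if s.instance != instance]
        return interrupted

    def count(self, instance):
        """Number of pooled sessions connected to the given instance."""
        return sum(1 for s in self.sessions if s.instance == instance)


# 10 sessions on instance 1, 20 on instance 2 (5 of them busy: an assumption).
pool = ToyPool(
    [Session(1) for _ in range(10)]
    + [Session(2, busy=(i < 5)) for i in range(20)]
)

interrupted = pool.on_fan_down(2)
print(pool.count(1), pool.count(2), len(interrupted))
```

In this model the 15 idle sessions on instance #2 vanish without the application noticing, while the 5 busy ones are the sessions whose in-flight requests need recovery.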

Because Application Continuity is configured, the active sessions are re-established on instance #1 and their in-flight requests are replayed there, masking the outage from end-users and applications:

Without Application Continuity enabled, active sessions would receive an error that needs to be handled by the application.
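To make the difference concrete, here is a deliberately simplified sketch of the replay idea: the calls of a request are remembered and, if the instance fails mid-request, re-executed on a new connection. Everything in it (`ToyConnection`, `InstanceDown`, `run_request`, the `max_replays` parameter) is hypothetical and exists only for illustration. Real Application Continuity does considerably more, for example restoring session state and making sure a transaction that already committed is not submitted again.

```python
class InstanceDown(Exception):
    """Stands in for the recoverable error raised when an instance fails."""


class ToyConnection:
    def __init__(self, instance, fail_after=None):
        self.instance = instance
        self.fail_after = fail_after  # fail on the Nth call; never if None
        self.calls = 0

    def execute(self, statement):
        self.calls += 1
        if self.fail_after is not None and self.calls >= self.fail_after:
            raise InstanceDown(f"instance {self.instance} failed")
        return f"{statement} ok on instance {self.instance}"


def run_request(statements, connect, max_replays=1):
    """Run all statements as one request; on failure, replay from the start."""
    for attempt in range(max_replays + 1):
        conn = connect(attempt)
        try:
            return [conn.execute(s) for s in statements]
        except InstanceDown:
            if attempt == max_replays:
                raise  # no replay left: the application sees the error


def connect(attempt):
    # The first attempt lands on instance 2, which fails at the second call.
    # The replay attempt lands on the surviving instance 1.
    return ToyConnection(2, fail_after=2) if attempt == 0 else ToyConnection(1)


results = run_request(["SELECT 1", "UPDATE t"], connect)
print(results)
```

Calling `run_request` with `max_replays=0` corresponds to the case without Application Continuity: the `InstanceDown` error propagates and the application has to handle it.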

Normal Operation, Again

After instance #2 is repaired and available again, FAN sends an UP event to inform the connection pool that a new instance is available for use, allowing sessions to be created on this instance at the next request. Assuming the same distribution of sessions as before, we end up with 10 sessions on node 1 and 20 on node 2 again:

Conclusion

Oracle RAC provides high availability for the Oracle database. After a database instance failure, Fast Application Notification (FAN) events keep applications from waiting on TCP timeouts. New connections are established to available instances. With Application Continuity enabled, interrupted active sessions are recovered and their requests replayed on an available instance, transparently to end-users and applications.
