How to enable Data Plane Events for VM DB Systems on Oracle Cloud

Introduction

The Oracle Cloud Events Service enables you to execute actions automatically based on state changes of your cloud resources. For example, a state change could be the completion of a database creation. An action could be sending an email notification or executing a function that implements post-creation tasks like creating users. If the database becomes unavailable, you could notify the DBAs so they can investigate the issue and fail over to the standby manually when you don’t have Fast-Start Failover (FSFO) in place.

The documentation provides a complete list of database events. This blog post offers a step-by-step guide to enabling the critical data plane events for VM DB Systems at the database, database node, and DB System level. These include:

  • The status of the database being down, hanging, or running.
  • The archiver process hangs.
  • Database and ASM ORA-600 and ORA-7445.
  • CRS status.
  • Data Guard status.
  • and more.

The Environment

  • VM DB System running a single instance database of version 19.12.

Preparation

Step 1: Enable Oracle AHF telemetry service

To receive critical events for VM DB Systems, you must enable the Oracle Autonomous Health Framework (AHF) telemetry service on the database host using the dbcli utility. It is also recommended to update the dbcli utility to the latest version first. As the root user:

#update dbcli
cliadm update-dbcli

#enable AHF telemetry
export DEVMODE=true
dbcli manage-ahftelemetry -a start

For RAC DB systems, execute the commands on each node in the cluster.

Step 2: Create a Topic and a Subscription

Create a topic and a subscription in the notification service to trigger an action (e.g., send email) should an event (e.g., a database is down) occur.

In the Oracle Cloud Console, search for Notifications and click on the Notifications service under Application Integration:

On the Notifications page, click Create Topic, enter a Name and Description, and click Create:

The topic will immediately be created and visible in the topics list.

Click on Subscriptions, then Create Subscription. Choose the Topic created earlier and a Protocol, which is the action that will be triggered in case of an event. Depending on your choice, further information will be required. In this case, I will keep it simple and choose email.

The email subscription will be in the Pending status. Oracle sends you an email for confirmation. Once you click on Confirm Subscription in the email you received, the subscription status becomes Active.

You can create multiple subscriptions for the same topic, e.g., to notify several email addresses, or to receive an email, post a Slack message, and trigger a function all at once.

Create Rules for Events

Step 3: Create Rule

Create a rule to filter what events you want to get notifications for.

In the Oracle Cloud Console, search for Rules and click on the Rules service under Events Service:

Click Create Rule. Enter a Display Name and a Description. Choose Event Type for Condition, Database for Service Name, and Database – Critical for Event Type.

Under Actions, choose Notifications for Action Type, the compartment where you have created the topic, and finally, the Topic to be used. Click Create Rule to create the rule.

The rule will immediately be created, and the details page will be displayed.
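Behind the console form, the rule's matching logic is, as far as I understand the format, a small JSON condition keyed on the event type. The sketch below builds such a condition in Python and checks an event against it locally. The event type string is the one that appears in the notification in the Testing section; `build_condition` and `matches` are illustrative helpers of my own, not part of any Oracle SDK, and the local check is far simpler than what the Events Service really does.

```python
import json

# Event type string as it appears in the notification in the Testing section.
DATABASE_CRITICAL = "com.oraclecloud.databaseservice.database.critical"


def build_condition(event_types):
    """Return a rule condition as a JSON string listing the event types."""
    return json.dumps({"eventType": list(event_types)})


def matches(condition, event):
    """Simplified local check: is the event's type listed in the condition?"""
    return event.get("eventType") in json.loads(condition)["eventType"]


condition = build_condition([DATABASE_CRITICAL])
print(condition)
print(matches(condition, {"eventType": DATABASE_CRITICAL}))
```

Further event types would simply be appended to the list passed to `build_condition`.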

On the left-hand side, under Resources, you can click on Event Matching to add further event types, e.g., DB Node – Critical or DB System – Critical:

Combining several event types in one rule makes sense in this case, as my action is just sending an email. If you are executing a function, it might be better to create a separate rule with its own topic for each event type. This keeps the function code simple, because it does not have to parse the event and work out which kind of outage occurred before executing the corresponding action.
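To illustrate the trade-off, here is a rough sketch of the routing a single function would need if one rule delivered all three event types to it. Only the database event type string comes from the notification shown in the Testing section; the DB node and DB System strings are assumed by analogy and should be verified against the documentation. The handler functions are hypothetical placeholders, and the function framework itself is left out.

```python
import json

DATABASE_CRITICAL = "com.oraclecloud.databaseservice.database.critical"
# Assumed by analogy with the string above; verify before relying on them.
DBNODE_CRITICAL = "com.oraclecloud.databaseservice.dbnode.critical"
DBSYSTEM_CRITICAL = "com.oraclecloud.databaseservice.dbsystem.critical"


# Hypothetical handlers: a real one would open a ticket, call a runbook, etc.
def on_database(details):
    return "database {} is {}".format(details.get("dbName"), details.get("status"))


def on_dbnode(details):
    return "node event on host {}".format(details.get("hostName"))


def on_dbsystem(details):
    return "DB System event: {}".format(details.get("eventName"))


HANDLERS = {
    DATABASE_CRITICAL: on_database,
    DBNODE_CRITICAL: on_dbnode,
    DBSYSTEM_CRITICAL: on_dbsystem,
}


def dispatch(raw_event):
    """Parse the event JSON and route it to the handler for its event type."""
    event = json.loads(raw_event)
    handler = HANDLERS.get(event.get("eventType"))
    if handler is None:
        return "ignored"
    details = event.get("data", {}).get("additionalDetails", {})
    return handler(details)
```

With one rule and one topic per event type, each function would contain a single handler and the `HANDLERS` lookup would disappear.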

Testing

Step 4: Test the Event Service

To test the configuration, you could cause one of your test databases to crash by terminating the PMON process. As oracle user:

#get the process ID of the PMON process (named ora_pmon_<SID>)
ps -ef | grep pmon

#terminate PMON; replace 41017 with the PID returned above
kill -9 41017

Check your mailbox. You will receive an email with the following content:

{
  "eventType" : "com.oraclecloud.databaseservice.database.critical",
  "cloudEventsVersion" : "0.1",
  "eventTypeVersion" : "2.0",
  "source" : "DataPlane",
  "eventTime" : "2022-01-17T14:03:20Z",
  "contentType" : "application/json",
  "data" : {
    "compartmentId" : "ocid1.compartment.oc1..aaaaaaaa...",
    "compartmentName" : "spetrus",
    "resourceName" : "CDB01_fra1t2",
    "resourceId" : "ocid1.database.oc1.eu-frankfurt-1.antheljr...",
    "availabilityDomain" : "AAef:EU-FRANKFURT-1-AD-1",
    "additionalDetails" : {
      "serviceType" : "dbcs",
      "hostName" : "host01",
      "component" : "cdb",
      "instanceName" : "cdb01",
      "dbName" : "cdb01_fra1t2",
      "description" : "Database: CDB01_fra1t2 Instance: CDB01, status is offline",
      "eventName" : "AVAILABILITY.DB_STATUS",
      "dbSystemId" : "ocid1.dbsystem.oc1.eu-frankfurt-1.antheljr...",
      "status" : "offline"
    }
  },
  "eventID" : "2a727136-d59a-47a4-bf85-413aa09da8d0",
  "extensions" : {
    "compartmentId" : "ocid1.compartment.oc1..aaaaaaaa..."
  }
}
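If you forward the message to a script rather than reading it yourself, the interesting fields are easy to pull out. The following is a minimal sketch that turns such a payload into a one-line alert subject; `alert_subject` is a name I made up, and the sample is a trimmed copy of the message above that leaves out the truncated OCIDs.

```python
import json


def alert_subject(raw_message):
    """Build a short, human-readable subject line from an event message."""
    event = json.loads(raw_message)
    data = event["data"]
    details = data["additionalDetails"]
    return "[{status}] {name} on {resource} (host {host}) at {time}".format(
        status=details["status"].upper(),
        name=details["eventName"],
        resource=data["resourceName"],
        host=details["hostName"],
        time=event["eventTime"],
    )


# Trimmed copy of the notification shown above.
sample = json.dumps({
    "eventType": "com.oraclecloud.databaseservice.database.critical",
    "eventTime": "2022-01-17T14:03:20Z",
    "data": {
        "resourceName": "CDB01_fra1t2",
        "additionalDetails": {
            "hostName": "host01",
            "eventName": "AVAILABILITY.DB_STATUS",
            "status": "offline",
        },
    },
})

print(alert_subject(sample))
# prints: [OFFLINE] AVAILABILITY.DB_STATUS on CDB01_fra1t2 (host host01) at 2022-01-17T14:03:20Z
```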

A short time later, I get a second notification about the database being online again, because my VM DB System is configured with Oracle Restart, which automatically restarts the database in this case:

...
      "description" : "Database: CDB01_fra1t2 Instance: CDB01, status is online",
      "eventName" : "AVAILABILITY.DB_STATUS",
      "dbSystemId" : "ocid1.dbsystem.oc1.eu-frankfurt-1.antheljr...",
      "status" : "online"
...
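Since every outage produces an offline and an online message, the pair can be used to estimate how long the database was down. The sketch below is one possible way to do that, assuming both messages carry `eventTime` in the format shown earlier. The function name `outage_durations` is my own, and the timestamp of the online event in the example is invented for illustration, as the second message is only shown in part here.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def outage_durations(events):
    """Pair each offline event with the next online event of the same database.

    Returns a list of (dbName, seconds_down) tuples.
    """
    went_down = {}
    durations = []
    # Timestamps share one UTC format, so sorting the strings sorts by time.
    for event in sorted(events, key=lambda e: e["eventTime"]):
        details = event["data"]["additionalDetails"]
        db = details["dbName"]
        when = datetime.strptime(event["eventTime"], TIME_FORMAT)
        if details["status"] == "offline":
            # Keep the first offline time if several arrive in a row.
            went_down.setdefault(db, when)
        elif details["status"] == "online" and db in went_down:
            durations.append((db, (when - went_down.pop(db)).total_seconds()))
    return durations


def make_event(time, status):
    """Build a trimmed event with only the fields used above."""
    return {
        "eventTime": time,
        "data": {"additionalDetails": {"dbName": "cdb01_fra1t2", "status": status}},
    }


events = [
    make_event("2022-01-17T14:03:20Z", "offline"),  # time from the first message
    make_event("2022-01-17T14:04:05Z", "online"),   # hypothetical time
]
print(outage_durations(events))
# prints: [('cdb01_fra1t2', 45.0)]
```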

Conclusion

The Oracle Cloud Events Service extends the automation capabilities of the Oracle Cloud. It enables you to automatically get notifications and execute further tasks based on status changes of your cloud resources or when critical database events occur. This reduces your operations team’s reaction time and helps improve the availability of your databases.
