Hello Upgrade! Do you run faster in the Cloud?

Introduction

A customer has been experiencing long upgrade times on-premises. As they are moving to the Cloud, they asked me whether their upgrades would run faster there.

First of all, when it comes to patching, upgrade, and migration, I can only encourage you to read the blogs by Mike Dietrich and Daniel Overby Hansen. They already explain how long an upgrade takes and which factors have an impact on the upgrade duration.

In this blog post I will:

  • Summarize the factors that do and do not impact the database upgrade duration
  • Discuss these factors to answer the question of whether there is any speed gain in the cloud, and
  • Present different upgrade test results on Database Cloud Service virtual machines.

Factors that impact Upgrade Duration

Huge Impact

  • Number of Installed Components: more components installed in the database means more time to upgrade, especially as most component upgrade scripts run serially. This is the biggest time consumer in the upgrade process.
  • Statistics: gathering the dictionary and fixed objects statistics upfront speeds up the upgrade and could have a big impact on the upgrade duration.
  • Database Architecture: a non-CDB is just one database to upgrade, while in multitenant we need to upgrade the cdb$root, the pdb$seed, and all user-created PDBs (in parallel, in bunches).
  • CPU Count: has a huge impact if you are upgrading a CDB with multiple PDBs, as the PDBs will be upgraded in bunches depending on the CPU count, one bunch after another. In a non-CDB or “Single-Tenant” environment, the CPU count wouldn’t have an impact on a correctly sized system.
  • Complexity of Database Dictionary: if you have a huge database dictionary, then it will obviously take longer to get it updated.
  • Database Version Combination: due to the differences in the data dictionaries, upgrading from 11.2.0.4 to 19c involves more steps and scripts to be run than an upgrade from 18c to 19c.
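
If you want a quick look at the first two factors on your own system, a minimal sketch is to count the installed components per container and to check the CPU count that the upgrade parallelism is derived from. This assumes you are connected to the cdb$root; note that cdb_registry hides the pdb$seed unless "_exclude_seed_cdb_view" is set to false, as shown in Test 1 below:

SQL> select con_id, count(*) as components
from cdb_registry
group by con_id
order by con_id;

SQL> show parameter cpu_count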

Little Impact

  • Memory, CPU and Disk Speed: little impact unless you are running your database on really outdated hardware. A database upgrade is not I/O bound, but disk speed could have an impact while upgrading a very large data dictionary.

No Impact

  • Database Size: as the upgrade does not modify user data, the database size has no impact on the upgrade duration. Upgrading the time zone data is an exception (see the sketch after this list).
  • Upgrade Tools: independent of the tool used for the upgrade (AutoUpgrade, DBUA, dbupgrade), the same catctl.pl script will be used in the background.
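
To see whether the time zone exception applies to you, a hedged starting point is to check the time zone file version in use and how much TIMESTAMP WITH TIME ZONE data would have to be converted; the two queries below are only a sketch:

SQL> select version from v$timezone_file;

SQL> select count(*)
from cdb_tab_cols
where data_type like '%WITH TIME ZONE%';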

Is it faster in the Cloud?

Well, when we look at the factors that impact the upgrade duration most, we realize that they are properties of the database itself and would not change by moving the database to another environment. The one exception is the CPU count.

For sure, if you are running on really outdated hardware and move to the cloud, you’ll obviously get much better CPU performance. But this is not cloud-specific: you could migrate your database to newer hardware on-premises too (and CPU speed has little impact anyway).

However, having more CPUs available to upgrade more PDBs in parallel in a multitenant environment is something you can benefit from in the Cloud. Just scale up the number of OCPUs enabled (an online operation on Bare Metal and Exadata systems) before upgrade and scale down again after the upgrade finishes.

Even if you are using the Bring Your Own License (BYOL) model and don’t have enough licenses for more OCPUs, you can switch to the License Included model and scale up before the upgrade, then scale down and move back to BYOL after the upgrade.

On-premises, you’d need to license the maximum number of cores you ever use. However, if you have the capacity and the licenses needed anyway, then scaling up on-premises will give you the same benefits.

How fast is it in the Cloud?

As there is no query or tool that gives you an estimate, the only way to measure it is to test the upgrade upfront on a copy of your database, under comparable hardware and configuration.

I did the following tests on Database Cloud Service virtual machines, upgrading a single instance EE CDB with 15 PDBs (16 PDBs including the pdb$seed) from version 18.12 to 19.9:

Test 1

On a VM with 4 OCPUs, 15 components are installed by default, and the statistics have NOT been gathered:

SQL> alter system set "_exclude_seed_cdb_view"=false scope=both;

System altered.

SQL> select con_id, comp_id, comp_name, status 
from cdb_registry
order by con_id, comp_id; 

    CON_ID COMP_ID    COMP_NAME                                STATUS
---------- ---------- ---------------------------------------- -----------
         1 APS        OLAP Analytic Workspace                  VALID
         1 CATALOG    Oracle Database Catalog Views            VALID
         1 CATJAVA    Oracle Database Java Packages            VALID
         1 CATPROC    Oracle Database Packages and Types       VALID
         1 CONTEXT    Oracle Text                              VALID
         1 DV         Oracle Database Vault                    VALID
         1 JAVAVM     JServer JAVA Virtual Machine             VALID
         1 OLS        Oracle Label Security                    VALID
         1 ORDIM      Oracle Multimedia                        VALID
         1 OWM        Oracle Workspace Manager                 VALID
         1 RAC        Oracle Real Application Clusters         OPTION OFF
         1 SDO        Spatial                                  VALID
         1 XDB        Oracle XML Database                      VALID
         1 XML        Oracle XDK                               VALID
         1 XOQ        Oracle OLAP API                          VALID

The same components are installed in pdb$seed and in every user-created PDB. I reduced the output to con_id=1 for the sake of space and simplicity.

Check whether the statistics have been refreshed within the last 7 days:

SQL> select con_id, operation, target, end_time
from cdb_optstat_operations
where 
     (   (operation = 'gather_fixed_objects_stats')
      or (operation = 'gather_dictionary_stats' and (target is NULL or target in ('SYS', 'SYSTEM')))
      or (operation = 'gather_schema_stats' and target in ('SYS', 'SYSTEM'))
     )
     and end_time > sysdate - 7 
order by con_id, end_time;

no rows selected

After hitting the “upgrade” button in the cloud console, you can check the progress and the DBUA log on the local system. As user root:

dbcli list-jobs
ID                                       Description                                                                
---------------------------------------- -------------------
...                                      ...
5a350bc2-4def-41e2-ad8c-05332a3f25ee     Database upgrade ...

The DBUA log file is located at /u01/app/oracle/cfgtoollogs/dbua/<job_id>/trace.log_<timestamp>. When we have a look into it, we get valuable information:

Upgrade parallelism level configured to: 8
Using default parallel upgrade options, this CDB with 16 PDBs will first upgrade the CDB$ROOT, and then upgrade at most 4 PDBs at a time using 2 parallel processes per PDB.

Database: PDB001
Cause: Dictionary statistics do not exist or are stale (not up-to-date).
Action: Gather stale data dictionary statistics prior to database upgrade in off-peak time using: EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;

Database: PDB001
Cause: None of the fixed object tables have had stats collected.
Action: Gather statistics on fixed objects prior the upgrade.

## you'll see these messages for all PDBs

So, having 4 OCPUs means 8 threads. As each PDB is upgraded using two threads, 4 PDBs will be upgraded at a time. This means the 16 PDBs will be upgraded in 4 bunches: pdb$seed and the first 3 user-created PDBs, then PDBs 4 to 7, and so on.

As a result, I ended up with an upgrade time of 3h 41m:

dbcli describe-job -i 5a350bc2-4def-41e2-ad8c-05332a3f25ee
Task Name                 Start Time                          End Time                            Status
------------------------- ----------------------------------- ----------------------------------- ----------
Database Upgrade          February 15, 2021 2:02:30 PM UTC    February 15, 2021 5:43:46 PM UTC    Success

Test 2

Same as the previous test, except that this time I gathered the statistics upfront. Execute the following in cdb$root, pdb$seed, and all user-created PDBs:

SQL> EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS; 
SQL> EXECUTE DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

SQL> select con_id, operation, to_char(max(end_time),'DD-MON-YYYY hh24:mi') AS latest
from cdb_optstat_operations
where operation in ('gather_dictionary_stats', 'gather_fixed_objects_stats')
group by con_id, operation
order by con_id;

    CON_ID OPERATION                      LATEST
---------- ------------------------------ --------------------------
         1 gather_dictionary_stats        15-FEB-2021 19:03
         1 gather_fixed_objects_stats     15-FEB-2021 19:05
         ...

Is it exhausting to switch between all the PDBs one by one? Yes, it is! Save the two statements in a script (gather_stats.sql) and run it with catcon.pl, which executes it in all containers for you.
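
A minimal gather_stats.sql contains just the two statements from above:

-- gather_stats.sql
EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;
EXECUTE DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

Then start catcon.pl, pointing it at the script: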

$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -n 1 -e -b gather_stats -d '''.''' gather_stats.sql &

## Use uppercase -C to exclude specific container(s) and a lowercase -c to include specific container(s).

The total upgrade time went down to 3h 18m. Gathering statistics saved 23 minutes, a 10% time saving compared to Test 1.

Test 3

This time I gathered the statistics and scaled up to 8 OCPUs instead of 4. Having a look at the DBUA log file:

Upgrade parallelism level configured to: 16
Using default parallel upgrade options, this CDB with 16 PDBs will first upgrade the CDB$ROOT, and then upgrade at most 8 PDBs at a time using 2 parallel processes per PDB.

Now the PDBs are upgraded in two bunches of 8 PDBs each. So it is little surprise that the upgrade time went down to 2h 1m. The 4 additional OCPUs saved a further 1h 17m. In total, a 45% time saving compared to Test 1.

Test 4

By scaling up to 16 OCPUs, I was able to reduce the upgrade time to 1h 30m. The additional 8 OCPUs saved a further 31m. In total, a 59% time saving compared to Test 1.

Test 5

Even though removing components is not supported in the cloud, I was excited to get the numbers. I followed the steps here to remove OWM, ORDIM, SDO, CONTEXT, XOQ, and APS. Components left for this test:

    CON_ID COMP_ID    COMP_NAME                                STATUS
---------- ---------- ---------------------------------------- -----------
         1 CATALOG    Oracle Database Catalog Views            VALID
         1 CATJAVA    Oracle Database Java Packages            VALID
         1 CATPROC    Oracle Database Packages and Types       VALID
         1 DV         Oracle Database Vault                    VALID
         1 JAVAVM     JServer JAVA Virtual Machine             VALID
         1 OLS        Oracle Label Security                    VALID
         1 RAC        Oracle Real Application Clusters         OPTION OFF
         1 XDB        Oracle XML Database                      VALID
         1 XML        Oracle XDK                               VALID

With this, the upgrade time went down to 1h 8m. Removing the six components above saved 22m. In total, a 69% time saving compared to Test 1.

Removal of Components

Be aware that removing components is a major change in the database and you have to be careful with it:

  • Some components are mandatory, e.g. XDB since 12.1.0.1
  • Removal of some components is not supported, e.g. OLS since 12.1.0.2
  • Removal of some components requires downtime
  • There are some dependencies between components, e.g. SDO (Spatial) depends on Multimedia
  • Are you really sure the component is not being used? And that it will not be used in the future by developers? (The sketch below can give a first indication.)
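
To get a first indication of whether a component has ever been used, one hedged starting point is the feature usage view; the filter on 'spatial' is just an example, and this sketch is no substitute for checking with your developers:

SQL> select name, version, detected_usages, currently_used, last_usage_date
from dba_feature_usage_statistics
where lower(name) like '%spatial%'
order by last_usage_date;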

It is a much better idea to think about which components are not needed before creating the database, and then not install them at database creation in the first place.

In the cloud, all components listed above in Test 1 are installed by default. It is not supported to remove any of them, as Oracle tests and guarantees all cloud automation functionality only with the full set of components installed.

Conclusion

The most impactful factors on upgrade duration relate to the structure and architecture of the database. These are the same whether on-premises or in the cloud. CPU is a vital resource, especially in the multitenant architecture when a CDB hosts multiple PDBs.

Fewer components installed in the database mean a faster upgrade. Consider installing only the components you need at database creation time, as removing components afterwards is a major change with many things to consider. In the Cloud, all components are installed and it is not supported to remove any.

In the cloud, you benefit from the flexibility to scale your OCPUs up and down as needed for just the duration of the upgrade, which speeds up the upgrade significantly, especially when you have many PDBs in the same CDB.

Following are the results of the different upgrade tests on Oracle Database Cloud Service virtual machines, upgrading a single instance EE CDB with 15 PDBs (16 PDBs including the pdb$seed) from version 18.12 to 19.9:

Test  OCPUs  Statistics gathered  Components  Upgrade time  Saving vs. Test 1
----  -----  -------------------  ----------  ------------  -----------------
1     4      no                   15          3h 41m        -
2     4      yes                  15          3h 18m        10%
3     8      yes                  15          2h 01m        45%
4     16     yes                  15          1h 30m        59%
5     16     yes                  9           1h 08m        69%

To minimize the upgrade time for single PDBs, consider upgrading via unplug/plug or (refreshable) PDB clones.
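
As a rough illustration of the unplug/plug approach (PDB name and XML path are hypothetical; the PDB is plugged into a CDB that already runs the target release and is upgraded there):

SQL> -- In the source CDB:
SQL> alter pluggable database PDB001 close immediate;
SQL> alter pluggable database PDB001 unplug into '/u01/app/oracle/PDB001.xml';

SQL> -- In the target-release CDB:
SQL> create pluggable database PDB001 using '/u01/app/oracle/PDB001.xml';
SQL> alter pluggable database PDB001 open upgrade;

Afterwards, only this one PDB has to be upgraded, for example by running dbupgrade with the -c 'PDB001' inclusion option.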
