A customer is experiencing long upgrade times on-premises. As they are moving to the Cloud, I got the question whether their upgrades in the Cloud would be faster.
First of all, when it comes to patching, upgrade, and migration, I can only encourage you to read the blogs by Mike Dietrich and Daniel Overby Hansen. The questions about how long an upgrade will take and what factors have an impact on the upgrade duration are already explained in:
- How long will my database upgrade take?
- Virtual Classroom 8 – Upgrade Internals – Feb 10, 2020: Slides and Recording
In this blog post I will:
- Summarize the factors that do and do not impact the database upgrade duration
- Discuss and consider these to answer the question about any speed gain in the cloud, and
- Present different upgrade test results on Database Cloud Service virtual machines.
Factors that impact Upgrade Duration
- Number of Installed Components: more components installed in the database means more time to upgrade, especially as most component upgrade scripts run serially. This is the biggest time consumer in the upgrade process.
- Statistics: gathering the dictionary and fixed objects statistics upfront speeds up the upgrade and could have a big impact on the upgrade duration.
- Database Architecture: a non-CDB is only one database that has to be upgraded, while in multitenant we need to upgrade the cdb$root, the pdb$seed, and all the user-created PDBs (in parallel, in bunches).
- CPU Count: has a huge impact if you are upgrading a CDB with multiple PDBs, as the PDBs will be upgraded in bunches depending on the CPU count, one bunch after another. In a non-CDB or “Single-Tenant” environment CPU count wouldn’t have an impact in a correctly sized system.
- Complexity of Database Dictionary: if you have a huge database dictionary, then it will obviously take longer to get it updated.
- Database Version Combination: due to the differences in the data dictionaries, upgrading from an older release to 19c involves more steps and scripts to be run than an upgrade from 18c to 19c.
- Memory, CPU and Disk Speed: little impact unless you are running your database on really outdated hardware. A database upgrade is not I/O bound, but disk speed could have an impact while upgrading a very large data dictionary.
- Database Size: as the upgrade does not modify user data, the database size has no impact on the upgrade duration. Upgrading the time zone data is an exception.
- Upgrade Tools: independent of the tool used for the upgrade (AutoUpgrade, DBUA, dbupgrade), the same catctl.pl script will be used in the background.
Is it faster in the Cloud?
Well, when we look at the factors that impact the upgrade duration most, we realize that they relate to the database itself and would not change by moving the database to another environment. Except for the CPU count.
For sure, if you are running on really outdated hardware and move to the cloud, you'll obviously get much better CPU performance. But this is not cloud-specific; you could migrate your database to newer hardware on-premises too (and CPU speed has little impact anyway).
However, having more CPUs available to upgrade more PDBs in parallel in a multitenant environment is something you can benefit from in the Cloud. Just scale up the number of OCPUs enabled (an online operation on Bare Metal and Exadata systems) before upgrade and scale down again after the upgrade finishes.
Even if you are using the Bring Your Own License (BYOL) model and don't have enough licenses for more OCPUs, you can switch to the License Included model, scale up before the upgrade, and then scale down and move back to BYOL after the upgrade.
On-premises, you'd need to license the maximum number of cores in use at all times. However, if you have the capacity and the licenses needed anyway, then scaling up on-premises will give you the same benefits.
How fast is it in the Cloud?
As there is no query or tool that gives you an estimate, the only way to measure it is to test the upgrade on a copy of your database upfront, under comparable hardware and configuration.
I did the following tests on Database Cloud Service virtual machines, upgrading a single instance EE CDB with 15 PDBs (16 PDBs including the pdb$seed) from version 18.12 to 19.9:
Test 1: 4 OCPUs, default components, no statistics
On a VM with 4 OCPUs, 15 components are installed by default, and the dictionary and fixed objects statistics are NOT gathered:
SQL> alter system set "_exclude_seed_cdb_view"=false scope=both;

System altered.

SQL> select con_id, comp_id, comp_name, status from cdb_registry order by con_id, comp_id;

    CON_ID COMP_ID    COMP_NAME                                STATUS
---------- ---------- ---------------------------------------- -----------
         1 APS        OLAP Analytic Workspace                  VALID
         1 CATALOG    Oracle Database Catalog Views            VALID
         1 CATJAVA    Oracle Database Java Packages            VALID
         1 CATPROC    Oracle Database Packages and Types       VALID
         1 CONTEXT    Oracle Text                              VALID
         1 DV         Oracle Database Vault                    VALID
         1 JAVAVM     JServer JAVA Virtual Machine             VALID
         1 OLS        Oracle Label Security                    VALID
         1 ORDIM      Oracle Multimedia                        VALID
         1 OWM        Oracle Workspace Manager                 VALID
         1 RAC        Oracle Real Application Clusters         OPTION OFF
         1 SDO        Spatial                                  VALID
         1 XDB        Oracle XML Database                      VALID
         1 XML        Oracle XDK                               VALID
         1 XOQ        Oracle OLAP API                          VALID
The same components are installed in pdb$seed and in every user-created PDB. I reduced the output to con_id=1 for the sake of space and simplicity.
SQL> select con_id, operation, to_char(max(end_time),'DD-MON-YYYY hh24:mi') AS latest
     from cdb_optstat_operations
     where operation in ('gather_dictionary_stats', 'gather_fixed_objects_stats')
     group by con_id, operation
     order by con_id;

no rows selected
After hitting the "upgrade" button in the cloud console, you can check the progress and the DBUA log on the local system. As the root user:
dbcli list-jobs

ID                                       Description
---------------------------------------- -------------------
...
5a350bc2-4def-41e2-ad8c-05332a3f25ee     Database upgrade
...
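If you need that job id in a script, you can filter it out of the listing. A minimal sketch (the sample output is inlined here for illustration; in practice you would pipe `dbcli list-jobs` itself into the `awk` filter):

```shell
# Sketch: extract the upgrade job id from the listing.
# Normally: dbcli list-jobs | awk '/Database upgrade/ {print $1}'
printf '%s\n' \
  'ID                                       Description' \
  '5a350bc2-4def-41e2-ad8c-05332a3f25ee     Database upgrade' |
awk '/Database upgrade/ {print $1}'
```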
The DBUA log file is located at /u01/app/oracle/cfgtoollogs/dbua/<job_id>/trace.log_<timestamp>. When we have a look into it, we get valuable information:
Upgrade parallelism level configured to: 8
Using default parallel upgrade options, this CDB with 16 PDBs will first upgrade
the CDB$ROOT, and then upgrade at most 4 PDBs at a time using 2 parallel
processes per PDB.

Database: PDB001
Cause: Dictionary statistics do not exist or are stale (not up-to-date).
Action: Gather stale data dictionary statistics prior to database upgrade
in off-peak time using: EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;

Database: PDB001
Cause: None of the fixed object tables have had stats collected.
Action: Gather statistics on fixed objects prior the upgrade.

## you'll see these messages for all PDBs
So, having 4 OCPUs means 8 threads. As each PDB is upgraded using two threads, 4 PDBs are upgraded at a time. This means the 16 PDBs will be upgraded in 4 bunches: pdb$seed and the first 3 user-created PDBs, then PDBs 4 to 7, and so on.
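The bunch math can be sketched in a few lines of shell. This is my own illustration, not an Oracle-provided formula; it assumes 2 threads per OCPU and the default of 2 parallel processes per PDB, as reported by the DBUA log above:

```shell
# Sketch: expected PDB batching for a given OCPU count.
# Assumptions: 2 threads per OCPU, 2 parallel processes per PDB (DBUA default).
ocpus=4
pdbs=16                                    # 15 user-created PDBs + pdb$seed
threads=$(( ocpus * 2 ))
per_bunch=$(( threads / 2 ))               # PDBs upgraded at a time
bunches=$(( (pdbs + per_bunch - 1) / per_bunch ))
echo "${per_bunch} PDBs at a time, ${bunches} bunches"
```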
As a result, I ended up with an upgrade time of 3h 41m:
dbcli describe-job -i 5a350bc2-4def-41e2-ad8c-05332a3f25ee

Task Name                 Start Time                           End Time                             Status
------------------------- ------------------------------------ ------------------------------------ ----------
Database Upgrade          February 15, 2021 2:02:30 PM UTC     February 15, 2021 5:43:46 PM UTC     Success
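To get the wall-clock duration from the job's start and end times, `date` can do the arithmetic. A convenience sketch, assuming GNU `date -d` (available on the DB system's Linux host, but not on BSD/macOS):

```shell
# Sketch: compute the upgrade duration from the dbcli timestamps (GNU date).
start=$(date -u -d 'February 15, 2021 2:02:30 PM UTC' +%s)
end=$(date -u -d 'February 15, 2021 5:43:46 PM UTC' +%s)
secs=$(( end - start ))
echo "$(( secs / 3600 ))h $(( secs % 3600 / 60 ))m"
```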
Test 2: 4 OCPUs, statistics gathered upfront
Same as the previous test, except that I gathered the statistics upfront this time. Execute the following in cdb$root, pdb$seed, and all user-created PDBs:
SQL> EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;
SQL> EXECUTE DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

SQL> select con_id, operation, to_char(max(end_time),'DD-MON-YYYY hh24:mi') AS latest
     from cdb_optstat_operations
     where operation in ('gather_dictionary_stats', 'gather_fixed_objects_stats')
     group by con_id, operation
     order by con_id;

    CON_ID OPERATION                      LATEST
---------- ------------------------------ --------------------------
         1 gather_dictionary_stats        15-FEB-2021 19:03
         1 gather_fixed_objects_stats     15-FEB-2021 19:05
...
Is it exhausting to switch between all the PDBs one by one? Yes, it is! Save the two statements in a script (gather_stats.sql) and run catcon.pl, which will do it all for you in one command:
$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -n 1 -e -b gather_stats -d '''.''' gather_stats.sql &

## Use uppercase -C to exclude specific container(s) and lowercase -c to include specific container(s).
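For completeness, the driver script itself only holds the two statements shown earlier. A minimal sketch of creating it:

```shell
# Sketch: create the gather_stats.sql driver script for catcon.pl,
# containing the two DBMS_STATS calls from above.
cat > gather_stats.sql <<'EOF'
EXECUTE DBMS_STATS.GATHER_DICTIONARY_STATS;
EXECUTE DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
EOF
```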
The total upgrade time went down to 3h 18m. Gathering statistics saved 23 minutes, a 10% time saving compared to Test 1.
Test 3: 8 OCPUs, statistics gathered
This time I gathered the statistics and scaled up to 8 OCPUs instead of 4. A look at the DBUA log file:
Upgrade parallelism level configured to: 16
Using default parallel upgrade options, this CDB with 16 PDBs will first upgrade
the CDB$ROOT, and then upgrade at most 8 PDBs at a time using 2 parallel
processes per PDB.
Now the PDBs are upgraded in two bunches of 8 PDBs each, so it is little surprise that the upgrade time went down to 2h 1m. The 4 additional OCPUs saved a further 1h 17m. In total, 45% time saving compared to Test 1.
Test 4: 16 OCPUs, statistics gathered
By scaling up to 16 OCPUs, I was able to reduce the upgrade time to 1h 30m. The additional 8 OCPUs saved a further 31m. In total, 59% time saving compared to Test 1.
Test 5: 16 OCPUs, statistics gathered, components removed
Even though removing components is not supported in the cloud, I was curious to get the numbers. I followed the steps here to remove OWM, ORDIM, SDO, CONTEXT, XOQ, and APS. Components left for this test:
    CON_ID COMP_ID    COMP_NAME                                STATUS
---------- ---------- ---------------------------------------- -----------
         1 CATALOG    Oracle Database Catalog Views            VALID
         1 CATJAVA    Oracle Database Java Packages            VALID
         1 CATPROC    Oracle Database Packages and Types       VALID
         1 DV         Oracle Database Vault                    VALID
         1 JAVAVM     JServer JAVA Virtual Machine             VALID
         1 OLS        Oracle Label Security                    VALID
         1 RAC        Oracle Real Application Clusters         OPTION OFF
         1 XDB        Oracle XML Database                      VALID
         1 XML        Oracle XDK                               VALID
With this, the upgrade time went down to 1h 8m. Removing the 6 components above saved 22m. In total, 69% time saving compared to Test 1.
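As a sanity check, the quoted percentages follow directly from the measured durations. My own arithmetic sketch, using integer percent:

```shell
# Sketch: recompute each test's saving against Test 1 (3h 41m = 221 minutes).
base=221
for entry in "Test 2:198" "Test 3:121" "Test 4:90" "Test 5:68"; do
  name=${entry%:*}
  mins=${entry#*:}
  saved=$(( base - mins ))
  echo "$name: saved ${saved}m ($(( saved * 100 / base ))% vs Test 1)"
done
```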
Removal of Components
Be aware that removing components is a major change in the database and you have to be careful with it:
- Some components are mandatory, e.g. XDB since 12c
- Removal of some components is not supported in recent releases, e.g. OLS
- Removal of some components requires downtime
- There are some dependencies between components, e.g. SDO (Spatial) depends on Multimedia
- Are you really sure the component is not being used? And will not be used in the future by developers?
It is a much better idea to decide which components are not needed before creating the database, and then not install them at all during database creation.
In the cloud, all components listed above in Test 1 are installed by default. It is not supported to remove any components in the cloud, as Oracle tests all cloud automation functionality and guarantees it only with all components installed.
Conclusion
The most impactful factors on upgrade duration relate to the structure and architecture of the database itself. These are the same whether on-premises or in the cloud. CPU is a vital resource, especially in the multitenant architecture with multiple PDBs.
Fewer components installed in the database mean a faster upgrade. Consider installing only the components you need at database creation time, as removing components afterwards is a major change with many things to consider. In the cloud, all components are installed and it is not supported to remove any of them.
In the cloud, you benefit from the flexibility to scale your OCPUs up and down as needed, just for the duration of the upgrade, which speeds up your upgrade significantly, especially with many PDBs in the same CDB.
To summarize, here are the results of the upgrade tests on Oracle Database Cloud Service virtual machines, upgrading a single instance EE CDB with 15 PDBs (16 PDBs including the pdb$seed) from version 18.12 to 19.9:

Test  Configuration                                         Duration  Saving vs Test 1
----  ----------------------------------------------------  --------  ----------------
1     4 OCPUs, no statistics gathered                       3h 41m    -
2     4 OCPUs, statistics gathered                          3h 18m    10%
3     8 OCPUs, statistics gathered                          2h 01m    45%
4     16 OCPUs, statistics gathered                         1h 30m    59%
5     16 OCPUs, statistics gathered, 6 components removed   1h 08m    69%
To minimize the upgrade time for single PDBs, consider upgrading by using unplug/plug or (refreshable) PDB clones.
Further Reading
- Minimal Downtime Migration and Upgrade using Refreshable PDB Clones
- Automate Patching and Upgrade your Cloud Databases using OCI CLI
- Using Custom Database Software Images to Create, Patch, and Upgrade your Cloud Databases