Save Storage Cost with Hybrid Partitioned Tables in Oracle Autonomous Database

Introduction

Partitioned Tables allow you to subdivide tables into smaller pieces (partitions) that can be managed and accessed at a finer level of granularity. The nice thing is that it is entirely transparent to your SQL SELECT statements and to the application.

External (Partitioned) Tables enable you to access data on external storage as if it were in a table in the database. Only the metadata are stored in the database while the data itself resides on the external storage.

Hybrid Partitioned Tables combine both partitioned and external partitioned tables. It allows to partition a table while storing the recent “hot” partitions in the database and keeping older partitions on the more cost efficient external storage, e.g. Object Storage in the Cloud.

In this blog post we will:

  1. Prepare the Autonomous Database to access Object Storage.
  2. Create a hybrid partitioned table with three partitions, partition 1 (older/cold) as an external partition on Object Storage, and partitions 2 and 3 (newer/hot) inside the database as internal partitions.
  3. Insert data into the table.
  4. Add a new internal partition, partition 4, to the table and insert data into it.
  5. Export the internal partition 2 to Object Storage (as it got “older” now) and drop it from the database.
  6. Use the exported data to create an external partition, so your SELECT statement will retrieve all data as before.

We will do this for LIST and RANGE partitioning.

The Environment

Oracle Autonomous Database on Shared Exadata Infrastructure and a Standard Oracle Object Storage bucket.

Step 1

To be able to access Object Storage from within Oracle Autonomous Database, we need to store the credentials in the database:

SQL> SET DEFINE OFF -- to disable the special character & if contained in the password
SQL> BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OBJECT_STORE_CRED',
    username => 'sinan.petrus.toma@oracle.com',
    password => ')(:CxXyfvGs4+k6RZonU'
  );
END;
/
PL/SQL procedure successfully completed.

SQL> alter database property set default_credential = 'ADMIN.OBJECT_STORE_CRED';
Database altered.

SQL> select owner, credential_name, enabled from dba_credentials where credential_name = 'OBJECT_STORE_CRED';
OWNER      CREDENTIAL_NAME      ENABLED
---------- -------------------- -------------------
ADMIN      OBJECT_STORE_CRED    TRUE

Create a .csv file with the data for your external partition. Here I’ll keep it very simple having only two columns: partition_id and name:

pid1.csv
1,name1

Upload the file into your Object Storage bucket using the Web Console or a Rest API. Verify your credentials by listing the files on the Object Storage:

SQL> select object_name from dbms_cloud.list_objects('OBJECT_STORE_CRED','https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/');
OBJECT_NAME
------------------------------
pid1.csv

Step 2

Create the hybrid partitioned table having the first partition pointing to the .csv file from the previous step:

-- create hybrid partitioned table in CSV format 
SQL> BEGIN  
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE( 
    table_name      => 'HYBRID_PART_TAB',  
    credential_name => 'OBJECT_STORE_CRED',  
    format          => json_object('type' VALUE 'CSV'),  
    column_list     => 'pid number(3), name varchar2(64)',
    partitioning_clause => 'partition by list (pid)
      ( partition p1 values (1) external location (''https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/pid1.csv''), 
        partition p2 values (2), 
        partition p3 values (3) )' 
     );
END;
/
PL/SQL procedure successfully completed.

Here we used the CSV format. However, in later steps when we want to export older partitions, the export will be in DATAPUMP format. If you want to add the older partition again as external partition to the database, the table has to be created with the DATAPUMP format too. So, let’s export the first partition and recreate the table with the DATAPUMP format.

-- export first partition
SQL> BEGIN
 DBMS_CLOUD.EXPORT_DATA(
    credential_name => 'OBJECT_STORE_CRED',
    file_uri_list => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/pid1.dmp',
    format => json_object('type' value 'datapump'),
    query => 'select pid, name from HYBRID_PART_TAB where pid=1'
 );
END;
/
PL/SQL procedure successfully completed.

-- list files on Object Storage
SQL> select object_name from dbms_cloud.list_objects('OBJECT_STORE_CRED','https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/');
OBJECT_NAME
------------------------------
pid1.csv
pid1.dmp
pid1.dmp_aaaaaa

-- recreate table in DATAPUMP format
SQL> drop table HYBRID_PART_TAB;
Table dropped.

SQL> BEGIN  
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE( 
    table_name      => 'HYBRID_PART_TAB',  
    credential_name => 'OBJECT_STORE_CRED',  
    format          => json_object('type' VALUE 'datapump'),  
    column_list     => 'pid number(3), name varchar2(64)',
    partitioning_clause => 'partition by list (pid)
      ( partition p1 values (1) external location (''https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/pid1.dmp'') tablespace DATA, 
        partition p2 values (2),
        partition p3 values (3) )' 
     );
END;
/
PL/SQL procedure successfully completed.

Validate and check the table:

-- validate hybrid partitioned table
SQL> execute DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE (table_name => 'HYBRID_PART_TAB');
PL/SQL procedure successfully completed.

-- query dba_tables 
SQL> select hybrid from dba_tables where owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB';
HYBRID
----------
YES

-- list the partitions
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB'
order by partition_name;
PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P2              DATA            internal
P3              DATA            internal
--> metadata of the external partitions is stored in the SYSTEM tablespace

Query the data in the table:

SQL> select pid, name from HYBRID_PART_TAB order by pid;
       PID NAME
---------- ----------------------------
         1 name1

Step 3

Insert some data into the table:

SQL> INSERT INTO HYBRID_PART_TAB values (1, 'name1a');
            *
ERROR at line 1:
ORA-14466: Data in a read-only partition or subpartition cannot be modified.
-- as partition 1 is an external partition, it's read only

SQL> INSERT INTO HYBRID_PART_TAB values (2, 'name2');
1 row created.

SQL> INSERT INTO HYBRID_PART_TAB values (3, 'name3');
1 row created.

SQL> commit;
Commit complete.

SQL> select pid, name from HYBRID_PART_TAB order by pid;
       PID NAME
---------- ----------------------------
         1 name1
         2 name2
         3 name3

Step 4

Add a further internal partition to the table. Data in that partition will be stored in the database.

SQL> ALTER TABLE HYBRID_PART_TAB ADD PARTITION p4 VALUES (4);
Table altered.

-- check the new partition
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB'
order by partition_name;
PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P2              DATA            internal
P3              DATA            internal
P4              DATA            internal

SQL> INSERT INTO HYBRID_PART_TAB values (4, 'name4');
1 row created.

SQL> commit;
Commit complete.

SQL> select pid, name from HYBRID_PART_TAB order by pid;
       PID NAME
---------- ----------------------------
         1 name1
         2 name2
         3 name3
         4 name4

Step 5

Now, data in partition 2 are getting “old” and we don’t need them on the internal database storage anymore. We export the partitions’ data into the much more cost efficient Object Storage and remove it from the database.

-- export internal partition to Object Storage
SQL> BEGIN
 DBMS_CLOUD.EXPORT_DATA(
    credential_name => 'OBJECT_STORE_CRED',
    file_uri_list => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/pid2.dmp',
    format => json_object('type' value 'datapump'),
    query => 'select pid, name from HYBRID_PART_TAB where pid = 2'
 );
END;
/
PL/SQL procedure successfully completed.

-- check the files are created on Object Storage
SQL> select object_name from dbms_cloud.list_objects('OBJECT_STORE_CRED','https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/');
OBJECT_NAME
------------------------------
pid1.csv
pid1.dmp
pid1.dmp_aaaaaa
pid2.dmp
pid2.dmp_aaaaaa

-- drop the internal partition
SQL> ALTER TABLE HYBRID_PART_TAB DROP PARTITION p2;
Table altered.

-- check partitions
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB'
order by partition_name;
PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P3              DATA            internal
P4              DATA            internal

-- check data
SQL> select pid, name from HYBRID_PART_TAB order by pid;
       PID NAME
---------- ----------------------------
         1 name1
         3 name3
         4 name4

Data in partition 2 are now missing in the SELECT query results.

Step 6

Use the exported data to create an external partition, so your SELECT statement will retrieve all data as before, while keeping the data on Object Storage:

-- add the partition as external
SQL> ALTER TABLE HYBRID_PART_TAB
ADD PARTITION p2 VALUES (2) external location ('https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/pid2.dmp');
Table altered.

-- check partitions
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB'
order by partition_name;
PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P2              DATA            external
P3              DATA            internal
P4              DATA            internal

-- check data
SQL> select pid, name from HYBRID_PART_TAB order by pid;
       PID NAME
---------- ----------------------------
         1 name1
         2 name2
         3 name3
         4 name4

Now, all data are visible again using the same SQL query as before.

Partitioning by DATE or TIMESTAMP

Usually, huge tables are partitioned based on a DATE or a TIMESTAMP column, for example by month keeping the last three months on hot storage, while moving older partitions to cheaper storage. So let’s try again while partitioning by RANGE this time:

  • January partition as external.
  • February and March partitions as internal.
timestamp_pid1.csv
15-jan-2021 10:30:51,product1

-- create table in csv format
SQL> BEGIN  
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE( 
    table_name      => 'HYBRID_PART_TAB_TIMESTAMP',  
    credential_name => 'OBJECT_STORE_CRED',  
    format          => json_object('type' VALUE 'csv'),  
    column_list     => 'sale_date TIMESTAMP, product_name varchar2(64)',
    partitioning_clause => 'partition by RANGE (sale_date) 
      ( partition p1 values less than (to_timestamp(''1-feb-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')) external location (''https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid1.csv''),
        partition p2 values less than (to_timestamp(''1-mar-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')), 
        partition p3 values less than (to_timestamp(''1-apr-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')),
        partition p999 values less than (MAXVALUE) 
)' 
     );
END;
/

-- export to dumpfile
SQL> BEGIN
 DBMS_CLOUD.EXPORT_DATA(
    credential_name => 'OBJECT_STORE_CRED',
    file_uri_list => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid1.dmp',
    format => json_object('type' value 'datapump'),
    query => 'select sale_date, product_name from HYBRID_PART_TAB_TIMESTAMP where sale_date < to_timestamp(''1-feb-2021  00:00:00'', ''dd-mon-yyyy hh24:mi:ss'')'
 );
END;
/

-- recreate using datapump format
SQL> drop table HYBRID_PART_TAB_TIMESTAMP;

SQL> BEGIN  
DBMS_CLOUD.CREATE_HYBRID_PART_TABLE( 
    table_name      => 'HYBRID_PART_TAB_TIMESTAMP',  
    credential_name => 'OBJECT_STORE_CRED',  
    format          => json_object('type' VALUE 'datapump'),  
    column_list     => 'sale_date TIMESTAMP, product_name varchar2(64)',
    partitioning_clause => 'partition by RANGE (sale_date) 
      ( partition p1 values less than (to_timestamp(''1-feb-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')) external location (''https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid1.dmp''),
        partition p2 values less than (to_timestamp(''1-mar-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')), 
        partition p3 values less than (to_timestamp(''1-apr-2021 00:00:00'',''dd-mon-yyyy hh24:mi:ss'')),
        partition p999 values less than (MAXVALUE) 
)' 
     );
END;
/

-- check data
SQL> select to_char(sale_date, 'yyyy-mm-dd hh24:mi:ss') as sale_date, product_name
from HYBRID_PART_TAB_TIMESTAMP
order by sale_date;

SALE_DATE                      PRODUCT_NAME
------------------------------ ----------------------------------------------------------------
2021-01-15 10:30:51            product1

To add a new partition to the table, e.g. for the month April, split the last partition:

SQL> ALTER TABLE HYBRID_PART_TAB_TIMESTAMP 
	SPLIT PARTITION p999 AT (to_date('1-may-2021 00:00:00','dd-mon-yyyy hh24:mi:ss')) 
	INTO (PARTITION p4, 
		  PARTITION p999);

Insert some data into the table:

-- insert data 
SQL> INSERT INTO HYBRID_PART_TAB_TIMESTAMP values (to_timestamp('15-feb-2021 7:20:13','dd-mon-yyyy hh24:mi:ss'), 'product2'); 
SQL> INSERT INTO HYBRID_PART_TAB_TIMESTAMP values (to_timestamp('15-mar-2021 9:41:59','dd-mon-yyyy hh24:mi:ss'), 'product3');
SQL> INSERT INTO HYBRID_PART_TAB_TIMESTAMP values (to_timestamp('15-apr-2021 11:51:8','dd-mon-yyyy hh24:mi:ss'), 'product4');
SQL> commit;

Check the partitions and data:

-- check partitions
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB_TIMESTAMP'
order by partition_name;

PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P2              DATA            internal
P3              DATA            internal
P4              DATA            internal
P999            DATA            internal

-- check data
SQL> select to_char(sale_date, 'yyyy-mm-dd hh24:mi:ss') as sale_date, product_name
from HYBRID_PART_TAB_TIMESTAMP
order by sale_date;

SALE_DATE                      PRODUCT_NAME
------------------------------ ----------------------------------------------------------------
2021-01-15 10:30:51            product1
2021-02-15 07:20:13            product2
2021-03-15 09:41:59            product3
2021-04-15 11:51:08            product4

Export partition 2 (February) to Object Storage and drop the internal partition:

-- export internal partition to Object Storage
SQL> BEGIN
 DBMS_CLOUD.EXPORT_DATA(
    credential_name => 'OBJECT_STORE_CRED',
    file_uri_list => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid2.dmp',
    format => json_object('type' value 'datapump'),
    query => 'select sale_date, product_name from HYBRID_PART_TAB_TIMESTAMP where sale_date between to_timestamp(''1-feb-2021  00:00:00'', ''dd-mon-yyyy hh24:mi:ss'') and to_timestamp(''28-feb-2021  23:59:59'', ''dd-mon-yyyy hh24:mi:ss'')'
 );
END;
/

-- drop the internal partition
SQL> ALTER TABLE HYBRID_PART_TAB_TIMESTAMP DROP PARTITION p2;

Now we try to add the exported data as an external partition as we did previously:

-- add the partition as external
SQL> ALTER TABLE HYBRID_PART_TAB_TIMESTAMP
ADD PARTITION p2 VALUES less than (to_timestamp('1-mar-2021 00:00:00','dd-mon-yyyy hh24:mi:ss')) external location ('https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid2.dmp');
              *
ERROR at line 2:
ORA-14074: partition bound must collate higher than that of the last partition

Uppps… partition can NOT be added as February is not higher than the date in the last partition. This is an expected behavior. The right way to do it is to split the next higher partition. This is the “March” partition. Actually not the “March” partition, but the “less than April” partition. We split it at 1st of March and add the exported data to the newly created partition:

SQL> ALTER TABLE HYBRID_PART_TAB_TIMESTAMP 
	SPLIT PARTITION p3 AT (to_date('1-mar-2021 00:00:00','dd-mon-yyyy hh24:mi:ss')) 
	INTO (PARTITION p2 EXTERNAL LOCATION ('https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/oci_core_emea_od_kasher/b/external_partitions/o/timestamp_pid2.dmp') tablespace DATA, 
		  PARTITION p3);

Check partitions again:

-- check partitions
SQL> select partition_name, tablespace_name, case read_only when 'YES' then 'external' else 'internal' end AS partition_type
from dba_tab_partitions
where table_owner = 'ADMIN' and table_name = 'HYBRID_PART_TAB_TIMESTAMP'
order by partition_name;

PARTITION_NAME  TABLESPACE_NAME PARTITION_TYPE
--------------- --------------- --------------------
P1              SYSTEM          external
P2              SYSTEM          external              
P3              DATA            internal
P4              DATA            internal
P999            DATA            internal
--> metadata of the external partitions is stored in the SYSTEM tablespace

Query the data using the same SQL statement as before:

SQL> select to_char(sale_date, 'yyyy-mm-dd hh24:mi:ss') as sale_date, product_name
from HYBRID_PART_TAB_TIMESTAMP
order by sale_date;

SALE_DATE                      PRODUCT_NAME
------------------------------ ----------------------------------------------------------------
2021-01-15 10:30:51            product1
2021-02-15 07:20:13            product2
2021-03-15 09:41:59            product3
2021-04-15 11:51:08            product4

Conclusion

With hybrid partitioned tables you are able to subdivide a huge table into multiple partitions, e.g. one partition per month or quarter, while storing the most recent partitions inside the database for faster queries and keeping older partitions on the more cost efficient Object Storage.

You have to choose the Standard Object Storage bucket type. Archive Object Storage cannot be used to store external partitions.

Further Reading

Would you like to get notified when the next post is published?