Why do I get “ORA-12899 value too large for column” when migrating to Autonomous Database?

You’ll get this message if

  • Your source database uses a single-byte character set (e.g. WE8ISO8859P1), and
  • Your column width is defined using BYTE length semantics, and
  • You have column values that no longer fit once they are converted to a multi-byte character set (e.g. UTF8), which is what Autonomous Database uses

If you are not familiar with these terms, then let’s explore them by example. Screenshots in gray are from the source database, in turquoise from the target database. Download all SQL queries used in this blog and try them out yourself.

Source Database

For testing, I created a DBCS VM database using the WE8ISO8859P1 character set.

col parameter for a20
col value for a20

select * from nls_database_parameters where parameter = 'NLS_CHARACTERSET';

Let’s create a table with one column that should hold a maximum of 5 characters (using CHAR length semantics), and insert some values into it.

create table tab_char (col1 varchar2(5 CHAR));
insert into tab_char values ('Hello');
insert into tab_char values ('Hellö');

Both inserts succeed, as the column can hold up to 5 characters. Now we will repeat this, but this time using BYTE length semantics, so our column can store a maximum of 5 bytes.

create table tab_byte (col1 varchar2(5 BYTE));
insert into tab_byte values ('Hello');
insert into tab_byte values ('Hellö');

Again, everything is fine, as every character (even the character “ö”) requires one byte in a single-byte character set.
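If you want to double-check this on the source, you can compare the length in characters and in bytes (lengthb returns the byte length, length the character length):

select col1, length(col1) as len_char, lengthb(col1) as len_byte from tab_byte;

Both values are 5 for every row, because each character occupies exactly one byte in WE8ISO8859P1.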

Check the metadata of the tables:

set lines 300
col table_name for a15
col column_name for a15
col data_type for a15
col char_used for a10
col char_length for 999
col data_length for 999

select table_name, column_name, data_type, char_used, char_length, data_length
from dba_tab_cols
where table_name in ('TAB_CHAR', 'TAB_BYTE')
order by 1,2,3;
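On the source database, the output should look roughly like this (CHAR_USED ‘C’ stands for CHAR, ‘B’ for BYTE length semantics):

TABLE_NAME   COLUMN_NAME   DATA_TYPE   CHAR_USED   CHAR_LENGTH   DATA_LENGTH
TAB_BYTE     COL1          VARCHAR2    B           5             5
TAB_CHAR     COL1          VARCHAR2    C           5             5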

As we see, the data length (last column in our query) is 5, regardless of BYTE or CHAR length semantics.

If you define neither CHAR nor BYTE when creating a table, then the value of the parameter NLS_LENGTH_SEMANTICS is used. It can also be set at session level.

select value from v$parameter where name = 'nls_length_semantics';
alter system  set nls_length_semantics=[CHAR | BYTE];
alter session set nls_length_semantics=[CHAR | BYTE];
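As a quick demonstration (tab_default is just a hypothetical example table), a column defined without an explicit qualifier picks up the session setting:

alter session set nls_length_semantics = CHAR;
create table tab_default (col1 varchar2(5));

select column_name, char_used, char_length, data_length
from user_tab_cols
where table_name = 'TAB_DEFAULT';

CHAR_USED = 'C' confirms that the column was created with CHAR length semantics.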

Target Database

My target is an Autonomous Database that uses the UTF8 character set. To simplify testing, I’ll just run the same commands on the target instead of running Data Pump export and import.

So, using CHAR length semantics works fine. Let’s try BYTE semantics next:
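Running the same statements once more on the target (the schema name in the error message below is just an example; it will be your own):

create table tab_byte (col1 varchar2(5 BYTE));
insert into tab_byte values ('Hello');
insert into tab_byte values ('Hellö');

The second insert fails with something like:

ORA-12899: value too large for column "ADMIN"."TAB_BYTE"."COL1" (actual: 6, maximum: 5)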

When using BYTE length semantics, we get the message “too large for column”, as the character “ö” requires 2 bytes in a multi-byte character set. This is why we end up with 6 bytes, which is too large for a 5-byte column.

Now we check the metadata of the tables and see that, in the case of CHAR length semantics, our 5 characters can take up to 20 bytes of storage. So let’s check the length of the values in bytes using the function lengthb:

select col1, lengthb(col1) from tab_char;

Aha! As we see, the string “Hellö” now takes 6 bytes, while in our source database it took only 5 bytes thanks to the single-byte character set.
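If you want to see exactly which bytes are stored, the dump function shows the internal representation; in UTF8 the character “ö” is encoded as the two bytes C3 B6:

select col1, lengthb(col1), dump(col1, 16) from tab_char;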

The Solution

Option 1: we keep the BYTE length semantics and extend the column width in our tab_byte table from 5 to 20 bytes (even though 6 would be enough for this example):

alter table tab_byte modify (col1 varchar2(20 BYTE));

Option 2: we keep the column width at 5, but switch from BYTE to CHAR length semantics (recommended):

alter table tab_byte modify (col1 varchar2(5 CHAR));
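Either way, you can verify the result in the data dictionary: with option 1, DATA_LENGTH grows to 20 while CHAR_USED stays 'B'; with option 2, CHAR_USED switches to 'C'.

select column_name, char_used, char_length, data_length
from user_tab_cols
where table_name = 'TAB_BYTE';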

Where to do the modifications?

Here again, two options:

Option 1: Modify your source database before exporting the data using Data Pump.

Option 2: if you don’t want to modify your source database, then import only the metadata into the target first, modify the column definitions on the target, and finally import the data. Keep in mind that with this approach a parallel Data Pump import will not be possible and the import will take longer to complete.
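A rough sketch of option 2 with Data Pump (connection, directory, dump file, and schema name are placeholders for illustration): first load only the metadata,

impdp admin@myadb_high directory=data_pump_dir dumpfile=exp%U.dmp schemas=MYSCHEMA content=metadata_only

then modify the column definitions on the target (see the next sections), and finally load the data:

impdp admin@myadb_high directory=data_pump_dir dumpfile=exp%U.dmp schemas=MYSCHEMA content=data_only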

How to figure out what columns need to be modified?

The best way is to run the Autonomous Database Schema Advisor, which you should do before a migration anyway. It contains a section that tells you about existing columns using BYTE length semantics.

It also provides you with a SQL statement to retrieve this information.

You could also just use the following query by providing your schemas:

set lines 300
col owner for a30
col table_name for a30
col column_name for a30
col data_type for a30
col char_length for 99999
col char_used for a10

select t.owner, t.table_name, t.column_name, t.data_type, t.char_length, t.char_used
from dba_tab_cols t join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas')
order by t.owner, t.table_name, t.column_name, t.data_type;

How do I easily modify hundreds or thousands of column definitions?

Use the following queries to create the needed SQL commands.

Option 1: keep the BYTE length semantics and extend the column width

set lines 300
set pages 0
col sql for a300
spool run.sql
set echo on

select 'alter table ' || t.owner || '.' || t.table_name || ' modify (' || t.column_name || ' ' || t.data_type || '(' || CASE WHEN t.data_type='CHAR' THEN least(t.char_length*4, 2000) ELSE least(t.char_length*4, 32767) END || ' byte' || '));' as sql
from dba_tab_cols t join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas');

spool off

-- clean up the script run.sql and execute it
!vi run.sql
@run.sql

The maximum column length used here is just an example. Extend your columns only to the length actually needed, to avoid potential performance impact caused by chained rows.
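To find out how long a column really needs to be, you can check the current maximum byte length of its values, for example (tab_byte/col1 just as an illustration):

select max(lengthb(col1)) from tab_byte;

and add a bit of headroom on top of that value in the ALTER TABLE statement.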

Option 2: keep the column width and switch from BYTE to CHAR length semantics

set lines 300
set pages 0
col sql for a300
spool run.sql
set echo on

select 'alter table ' || t.owner || '.' || t.table_name || ' modify (' || t.column_name || ' ' || t.data_type || '(' || t.char_length || ' char' || '));' as sql
from dba_tab_cols t join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas');

spool off

-- clean up the script run.sql and execute it
!vi run.sql
@run.sql

Summary

If the database says your value is too large, then your value IS too large 🙂
