2 Day Developer > Working in a Global Environ... > Setting Up the Globalizatio... > Length Semantics
Length Semantics |
Previous |
Next |
In single-byte character sets, the number of bytes and the number of characters in a string are the same. In multi-byte character sets, a character or code point consists of one or more bytes. Calculating the number of characters based on byte length can be difficult in a variable-width character set. Calculating column length in bytes is called byte semantics, while measuring column length in characters is called character semantics.
Character semantics is useful to define the storage requirements for multi-byte strings of varying widths. For example, in a Unicode database (AL32UTF8), suppose that you need to define a VARCHAR2
column that can store up to five Chinese characters together with five English characters. Using byte semantics, this column requires 15 bytes for the Chinese characters, which are 3 bytes long, and 5 bytes for the English characters, which are 1 byte long, for a total of 20 bytes. Using character semantics, the column requires 10 characters.
The expressions in the following list use byte semantics. Note the BYTE
qualifier in the VARCHAR2
expression and the B
suffix in the SQL function name.
VARCHAR2(20 BYTE)
SUBSTRB(
string
, 1, 20)
The expressions in the following list use character semantics. Note the CHAR
qualifier in the VARCHAR2
expression.
VARCHAR2(20 CHAR)
SUBSTR(
string
, 1, 20)
This section contains the following topic:
The NLS_LENGTH_SEMANTICS
parameter specifies BYTE
(default) or CHAR
semantics. By default, the character datatypes CHAR
and VARCHAR2
are specified in bytes, not characters. Therefore, the specification CHAR(20)
in a table definition allows 20 bytes for storing character data.
The NLS_LENGTH_SEMANTICS
parameter enables you to create CHAR
, VARCHAR2
, and LONG
columns using either byte-length or character-length semantics. NCHAR
, NVARCHAR2
, CLOB
, and NCLOB
columns are always character-based. Existing columns are not affected.
Example: Setting Length Semantics and Creating a Table shows an example of creating a table. When the database character set is WE8MSWIN1252, the last_name
column of the table can hold up to 10 Western European characters, occupying a maximum of 10 bytes. When the database character set is Unicode (AL32UTF8), the last_name
column can still hold up to 10 Unicode characters regardless of the language; however, it can occupy a maximum of 40 bytes.
Setting Length Semantics and Creating a Table
-- reset NLS parameters back to default here ALTER SESSION SET NLS_LANGUAGE = American NLS_TERRITORY = America; CREATE TABLE temp_employees_table ( employee_id NUMBER(4), last_name VARCHAR2(10 CHAR), job_id VARCHAR2(9), manager_id NUMBER(4), hire_date DATE, salary NUMBER(7,2), department_id NUMBER(2)) ; -- cleanup: drop the table DROP TABLE temp_employees_table;