UTF8からUTF8MB4へのONLINE DDL

online-alter-utf8

MySQL8.0.14のリリースノートに以下の様に3つの条件が揃えばONLINEで文字コードをONLINEで変更出来るとあったので確認。utf8(utf8mb3)からutf8mb4という限定された改善だが、utf8をメインで利用されている方も多いと思うのでサービスを停止しないでオンラインでメンテナンスしたいというニーズには有難い改善かもしれません。

ALTER TABLE now can be used to change a column character set in place (without a table rebuild), when these conditions apply:

– The column data type is CHAR, VARCHAR, a TEXT type, or ENUM.

– The character set change is from utf8mb3 to utf8mb4, or any character set to binary.

– There is no index on the column.

Changes in MySQL 8.0.14 (2019-01-21, General Availability)

MySQLではONLINE DDLが積極的に機能拡張されているので、サービス運用していてもメンテナンスしやすいです。ただ、機能拡張が頻繁に行われているので、場合によっては既に読んだ事があるマニュアルも定期的に確認する必要があるかもしれません。自分は、都度ALGORITHMオプションを付けて、ONLINEで処理出来るか確認する事が増えて来た気がします。

リマインド

MySQL5.7でvarchar sizeのオンラインでのサイズ変更が出来る様になりました。(255 byteの境界有り)

  • MySQL8.0.27 with utf8mb4(4byte)
mysql> CREATE TABLE `T1` (
    -> `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
    -> `note` varchar(10) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '-' COMMENT '8.0.27',
    -> `who` varchar(100) COLLATE utf8mb4_general_ci DEFAULT NULL,
    -> PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into T1(note,who) values('10xutf8mb4','Size 100 with utf8mb4');
Query OK, 1 row affected (0.01 sec)

mysql> select *,version() from T1;
+----+------------+-----------------------+-----------+
| id | note       | who                   | version() |
+----+------------+-----------------------+-----------+
|  1 | 10xutf8mb4 | Size 100 with utf8mb4 | 8.0.27    |
+----+------------+-----------------------+-----------+
1 row in set (0.00 sec)

mysql> ALTER TABLE T1 MODIFY note varchar(255), ALGORITHM=INPLACE, LOCK=NONE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
mysql> ALTER TABLE T1 MODIFY note varchar(62), ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (0.07 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE T1 MODIFY who varchar(255), ALGORITHM=INPLACE, LOCK=NONE; /*** もともと255byteを超えた列 ***/
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc T1;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int          | NO   | PRI | NULL    | auto_increment |
| note  | varchar(62)  | YES  |     | NULL    |                |
| who   | varchar(255) | YES  |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
  • MySQL5.7.20 with utf8mb4(4byte)
mysql> CREATE TABLE `T1` (
    -> `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
    -> `note` varchar(10) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '-' COMMENT '5.7.20',
    -> `who` varchar(100) COLLATE utf8mb4_general_ci DEFAULT NULL,
    -> PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into T1(note,who) values('10xutf8mb4','Size 100 with utf8mb4');
Query OK, 1 row affected (0.01 sec)

mysql> select *,version() from T1;
+----+------------+-----------------------+-----------+
| id | note       | who                   | version() |
+----+------------+-----------------------+-----------+
|  1 | 10xutf8mb4 | Size 100 with utf8mb4 | 5.7.20    |
+----+------------+-----------------------+-----------+
1 row in set (0.00 sec)

mysql> ALTER TABLE T1 MODIFY note varchar(255), ALGORITHM=INPLACE, LOCK=NONE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
mysql> ALTER TABLE T1 MODIFY note varchar(62), ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE T1 MODIFY who varchar(255), ALGORITHM=INPLACE, LOCK=NONE; /*** もともと255byteを超えた列 ***/
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> desc T1;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int(11)      | NO   | PRI | NULL    | auto_increment |
| note  | varchar(62)  | YES  |     | NULL    |                |
| who   | varchar(255) | YES  |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
ALTER TABLE and INPLACE

utf8mb3からutf8mb4へのONLINE DDL

カラムのキャラクター設定変更をオンライン出来る様になっているとの事なので確認

  • MySQL8.0.27 Online Alter Change from utf8(utf8mb3) to utf8mb4
mysql> show create table T1\G
*************************** 1. row ***************************
       Table: T1
Create Table: CREATE TABLE `T1` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '-' COMMENT '8.0.27',
  `who` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
1 row in set (0.00 sec)

mysql> insert into T1(note,who) values('10xutf8mb3','Size 100 with utf8mb4');
Query OK, 1 row affected (0.02 sec)

mysql> select *,version() from T1;
+----+------------+-----------------------+-----------+
| id | note       | who                   | version() |
+----+------------+-----------------------+-----------+
|  1 | 10xutf8mb3 | Size 100 with utf8mb4 | 8.0.27    |
+----+------------+-----------------------+-----------+
1 row in set (0.00 sec)

mysql> ALTER TABLE T1 MODIFY note varchar(10) CHARACTER SET utf8mb4, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (0.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table T1\G
*************************** 1. row ***************************
       Table: T1
Create Table: CREATE TABLE `T1` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
  `who` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
1 row in set (0.00 sec)

mysql> ALTER TABLE T1 MODIFY note varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=INPLACE, LOCK=NONE;  /*** Collationを付け忘れたので再実行 ***/
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table T1\G
*************************** 1. row ***************************
       Table: T1
Create Table: CREATE TABLE `T1` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  `who` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
1 row in set (0.00 sec)

mysql>
utf8 -> utf8mb4 online alter
  • MySQL5.7.20 Not Online Change from utf8 to utf8mb4
mysql> show create table T1\G
*************************** 1. row ***************************
       Table: T1
Create Table: CREATE TABLE `T1` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8 NOT NULL DEFAULT '-' COMMENT '5.7.20',
  `who` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.01 sec)

mysql> insert into T1(note,who) values('10xutf8mb3','Size 100 with utf8mb4');
Query OK, 1 row affected (0.01 sec)

mysql> select *,version() from T1;
+----+------------+-----------------------+-----------+
| id | note       | who                   | version() |
+----+------------+-----------------------+-----------+
|  1 | 10xutf8mb3 | Size 100 with utf8mb4 | 5.7.20    |
+----+------------+-----------------------+-----------+
1 row in set (0.00 sec)

mysql> ALTER TABLE T1 MODIFY note varchar(10) CHARACTER SET utf8mb4, ALGORITHM=INPLACE, LOCK=NONE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
mysql> ALTER TABLE T1 MODIFY note varchar(10) CHARACTER SET utf8mb4, ALGORITHM=COPY, LOCK=NONE;
ERROR 1846 (0A000): LOCK=NONE is not supported. Reason: COPY algorithm requires a lock. Try LOCK=SHARED.
mysql> ALTER TABLE T1 MODIFY note varchar(10) CHARACTER SET utf8mb4, ALGORITHM=COPY, LOCK=SHARED;
Query OK, 1 row affected (0.04 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> show create table T1\G
*************************** 1. row ***************************
       Table: T1
Create Table: CREATE TABLE `T1` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) DEFAULT NULL,
  `who` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql>
utf8 -> utf8mb4 not online alter

上記の変更は、自分のケースではメンテナンス時間を取って、MySQL5.7に入れ替えた時に実施したけどオンラインでは出来ないので、MySQL8.0に入れ替えた後にスケジュール組んで入れ替える会社もあるんだろうなとイメージしました。

  • MySQL5.7.20
mysql> ALTER TABLE T_CONV CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=INPLACE, LOCK=NONE;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
mysql> ALTER TABLE T_CONV CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=COPY, LOCK=SHARED;
Query OK, 1 row affected (0.05 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> show create table T_CONV\G
*************************** 1. row ***************************
       Table: T_CONV
Create Table: CREATE TABLE `T_CONV` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) NOT NULL DEFAULT '-',
  `who` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 5.7.20    |
+-----------+
1 row in set (0.00 sec)

  • MySQL8.0.27
mysql> show create table T_CONV\G
*************************** 1. row ***************************
       Table: T_CONV
Create Table: CREATE TABLE `T_CONV` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '-',
  `who` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3
1 row in set (0.01 sec)

mysql> insert into T_CONV(note,who) values('utf8','uft8mb3');
Query OK, 1 row affected (0.02 sec)

mysql> select * from T_CONV;
+----+------+---------+
| id | note | who     |
+----+------+---------+
|  1 | utf8 | uft8mb3 |
+----+------+---------+
1 row in set (0.00 sec)

mysql> ALTER TABLE T_CONV CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table T_CONV\G
*************************** 1. row ***************************
       Table: T_CONV
Create Table: CREATE TABLE `T_CONV` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'PGの場合はSEQUENCE',
  `note` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL DEFAULT '-',
  `who` varchar(100) COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
1 row in set (0.00 sec)

mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.27    |
+-----------+
1 row in set (0.00 sec)
  • Collationサポート(MySQL8.0) まだutf8(utf8_general_ci)もサポートしているので可能ではある。
mysql> show collation like 'utf8_general_ci';
+-----------------+---------+----+---------+----------+---------+---------------+
| Collation       | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+-----------------+---------+----+---------+----------+---------+---------------+
| utf8_general_ci | utf8    | 33 | Yes     | Yes      |       1 | PAD SPACE     |
+-----------------+---------+----+---------+----------+---------+---------------+
1 row in set (0.00 sec)

mysql> show collation like 'utf8mb4_general_ci';
+--------------------+---------+----+---------+----------+---------+---------------+
| Collation          | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+--------------------+---------+----+---------+----------+---------+---------------+
| utf8mb4_general_ci | utf8mb4 | 45 |         | Yes      |       1 | PAD SPACE     |
+--------------------+---------+----+---------+----------+---------+---------------+
1 row in set (0.00 sec)

mysql>
CONVERT

In Case of PostgreSQL

PostgreSQLでサイズを変更する場合は以下のような感じ。DDLでテーブル定義を変更する時は、ロックに注意しながら対応する必要がある。ただ、PostgreSQLではMySQLと異なり、Truncate含むDDLはRollbackする事も出来るので、それぞれのRDBMSの違いを把握して最適な方法を選択すれば良い。

ALTER TABLEは既存のテーブルの定義を変更します。 以下のようにいくつかの副構文があります。 要求されるロックレベルはそれぞれの副構文によって異なることに注意してください。 特に記述がなければACCESS EXCLUSIVEロックを取得します。 複数のサブコマンドが使われるときは、それらのサブコマンドが要求するうち、もっとも高いレベルのロックを取得します。

ALTER TABLE
POC=# select version();
                                                     version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 13.4 (Debian 13.4-4.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
(1 row)


POC=# \d members;
                              Table "public.members"
 Column |          Type          | Collation | Nullable |         Default
--------+------------------------+-----------+----------+-------------------------
 id     | integer                |           | not null |
 name   | character varying(255) |           | not null |
 age    | integer                |           | not null |
 salary | numeric(10,0)          |           | not null |
 memo   | character varying(100) |           |          | NULL::character varying
 regist | date                   |           | not null |
Indexes:
    "members_pkey" PRIMARY KEY, btree (id)
    "idx_uq_members_name" UNIQUE, btree (name)

POC=# ALTER TABLE members ALTER COLUMN memo TYPE character varying(200);
ALTER TABLE
POC=# \d members;
                              Table "public.members"
 Column |          Type          | Collation | Nullable |         Default
--------+------------------------+-----------+----------+-------------------------
 id     | integer                |           | not null |
 name   | character varying(255) |           | not null |
 age    | integer                |           | not null |
 salary | numeric(10,0)          |           | not null |
 memo   | character varying(200) |           |          | NULL::character varying
 regist | date                   |           | not null |
Indexes:
    "members_pkey" PRIMARY KEY, btree (id)
    "idx_uq_members_name" UNIQUE, btree (name)

POC=#

キャラクターセットはDB単位で設定されていて、テーブル単位での指定は出来なそうだがCOLLATIONの指定は出来る様子。ただ、MySQLでもそうだが、仕様が変わらない限り、運用途中でCOLLATIONを変更するケースはレアだと思うので、最初の設定の段階できちんと設定しておけば文字コードも照合順序の変更も発生しないかと思う。

PostgreSQL COLLATION WITH TABLE

POC=# \x
Expanded display is on.
POC=# \l+
List of databases
-[ RECORD 1 ]-----+-------------------------------------------
Name              | POC
Owner             | postgres
Encoding          | UTF8
Collate           | C
Ctype             | C
Access privileges |
Size              | 543 MB
Tablespace        | pg_default
Description       |
-[ RECORD 2 ]-----+-------------------------------------------
Name              | postgres
Owner             | postgres
Encoding          | UTF8
Collate           | C
Ctype             | C
Access privileges |
Size              | 7933 kB
Tablespace        | pg_default
Description       | default administrative connection database
-[ RECORD 3 ]-----+-------------------------------------------
Name              | template0
Owner             | postgres
Encoding          | UTF8
Collate           | C
Ctype             | C
Access privileges | =c/postgres                               +
                  | postgres=CTc/postgres
Size              | 7753 kB
Tablespace        | pg_default
Description       | unmodifiable empty database
-[ RECORD 4 ]-----+-------------------------------------------
Name              | template1
Owner             | postgres
Encoding          | UTF8
Collate           | C
Ctype             | C
Access privileges | =c/postgres                               +
                  | postgres=CTc/postgres
Size              | 7753 kB
Tablespace        | pg_default
Description       | default template for new databases

POC=# \x
Expanded display is off.

POC=# SHOW client_encoding;
 client_encoding
-----------------
 UTF8
(1 row)

POC=# CREATE TABLE TC1_UND (note text COLLATE "und-x-icu");
CREATE TABLE
POC=# CREATE TABLE TC1_JA (note text COLLATE "ja-x-icu");
CREATE TABLE
POC=# \d+ TC_UND;
Did not find any relation named "TC_UND".
POC=#
POC=# \d+ TC1_UND;
                                 Table "public.tc1_und"
 Column | Type | Collation | Nullable | Default | Storage  | Stats target | Description
--------+------+-----------+----------+---------+----------+--------------+-------------
 note   | text | und-x-icu |          |         | extended |              |
Access method: heap

POC=# \d+ TC1_JA;
                                 Table "public.tc1_ja"
 Column | Type | Collation | Nullable | Default | Storage  | Stats target | Description
--------+------+-----------+----------+---------+----------+--------------+-------------
 note   | text | ja-x-icu  |          |         | extended |              |
Access method: heap

今回確認した範囲では特に大きな違いは見受けられなかったが、以下のケースを見てみるとCOLLATEも”C”で適切に処理してくれている。23.2. 照合順序サポートを参考に、時間がある時にもう少し深堀してみようと思う。

POC=# select * from TC1_JA order by note;
     note
---------------
 1
 2
 3
 a
 A
 This is test1
 This is test2
 ぁ
 あ
 ア
 ば
 バ
 ぱ
 パ
(14 rows)

POC=# select * from TC1_UND order by note;
     note
---------------
 1
 2
 3
 a
 A
 This is test1
 This is test2
 ぁ
 あ
 ア
 ば
 バ
 ぱ
 パ
(14 rows)

POC=# select * from TC1_UND order by note collate "C";
     note
---------------
 1
 2
 3
 A
 This is test1
 This is test2
 a
 ぁ
 あ
 ば
 ぱ
 ア
 バ
 パ
(14 rows)

POC=#

参照:

ALTER TABLE

23.3. 文字セットサポート

カテゴリー:

最近のコメント

表示できるコメントはありません。