MySQL's utf8 charset is actually an alias for utf8mb3: it stores at most 3 bytes per character, which covers only the Basic Multilingual Plane (code points up to U+FFFF). This means:
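Emoji characters (4 bytes in UTF-8) are truncated or rejected
CJK characters outside the Basic Multilingual Plane are not supported
With strict SQL mode off, data is silently truncated at the first unsupported character (only a warning is raised)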
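To make the 3-byte limit concrete, here is a minimal Python sketch that flags text utf8mb3 cannot store. The function names `needs_utf8mb4` and `utf8_byte_lengths` are hypothetical helpers for illustration, not part of any MySQL client library. It relies only on the fact that code points above U+FFFF take 4 bytes in UTF-8.

```python
def needs_utf8mb4(text):
    """Return True if any character lies outside the Basic Multilingual Plane.

    Such characters take 4 bytes in UTF-8, which utf8mb3 cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def utf8_byte_lengths(text):
    """Pair each character with its encoded length in UTF-8 bytes."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


if __name__ == "__main__":
    # U+1F600 is an emoji, U+20000 is a CJK Extension B ideograph.
    for sample in ("plain ascii", "\u65e5\u672c\u8a9e", "\U0001F600", "\U00020000"):
        print(repr(sample), needs_utf8mb4(sample))
```

A check like this could run in application code before an insert, so that affected values are caught instead of being truncated by the server.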
Detection
-- Newer servers report the charset as utf8mb3 rather than utf8, so match both
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE CHARACTER_SET_NAME IN ('utf8', 'utf8mb3')
  AND TABLE_SCHEMA NOT IN ('information_schema', 'mysql', 'performance_schema', 'sys');
Or with SQLFluff:
sqlfluff lint --dialect mysql --rules LT01 migration.sql
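Migration files can also be scanned without extra tooling. The following is a rough sketch, assuming the charset is declared with `CHARACTER SET` or `CHARSET` and written unquoted. `find_legacy_utf8` is a hypothetical helper name, and a regex like this will miss quoted names and collation-only declarations.

```python
import re

# Matches "CHARACTER SET utf8", "CHARSET=utf8", "CHARSET utf8mb3", and so on.
# The trailing \b rejects "utf8mb4", because "8" followed by "m" is not a
# word boundary.
_LEGACY_CHARSET = re.compile(
    r"\b(?:CHARACTER\s+SET|CHARSET)\s*=?\s*(utf8(?:mb3)?)\b",
    re.IGNORECASE,
)


def find_legacy_utf8(sql_text):
    """Return (line_number, stripped_line) pairs that declare utf8 or utf8mb3."""
    hits = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        if _LEGACY_CHARSET.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (assumed invocation): python find_legacy_utf8.py migration.sql
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, line in find_legacy_utf8(handle.read()):
            print(f"{number}: {line}")
```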
Fix
-- Convert a table
ALTER TABLE users CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- Set default for new tables
ALTER DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
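When the detection query returns many tables, the ALTER statements can be generated rather than typed by hand. This is a minimal sketch, assuming the rows have already been fetched as `(schema, table)` pairs. `quote_identifier` and `build_convert_statements` are hypothetical helpers, and the default charset and collation simply mirror the statements above.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_convert_statements(rows, charset="utf8mb4", collation="utf8mb4_unicode_ci"):
    """Build one ALTER TABLE statement per distinct (schema, table) pair.

    The detection query returns one row per column, so tables repeat;
    duplicates are skipped while the original order is preserved.
    """
    statements = []
    seen = set()
    for schema, table in rows:
        key = (schema, table)
        if key in seen:
            continue
        seen.add(key)
        statements.append(
            f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Example rows are made up for illustration.
    for statement in build_convert_statements([("mydb", "users"), ("mydb", "users")]):
        print(statement)
```

The sketch only produces text. Reviewing the output and running it against a copy of the data first is prudent, since `CONVERT TO` rewrites every converted table.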
Platforms
MySQL 8.0+: utf8mb4 is the default charset
MariaDB 10.x: utf8 is still an alias for utf8mb3 by default (from 10.6, controlled by the UTF8_IS_UTF8MB3 flag in old_mode)