What is the difference between the utf8_general_ci, utf8_unicode_ci, utf8mb4_general_ci and utf8mb4_unicode_ci collations? Which collation, character set and encoding to choose for a MySQL database
August 25, 2023
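As of MySQL 5.5.3, you should use utf8mb4 rather than utf8. Both names refer to the UTF-8 encoding, but the older utf8 character set has a MySQL-specific restriction: it stores at most three bytes per character, so characters outside the Basic Multilingual Plane (code points above U+FFFF, such as most emoji) cannot be stored.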
Thus, neither utf8_general_ci nor utf8_unicode_ci needs to be used anymore.
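To make the restriction concrete, here is a small Python sketch (it does not talk to MySQL) that checks whether a string stays within the Basic Multilingual Plane and would therefore fit the three-byte limit of the legacy utf8 character set. The function names `utf8_byte_length` and `fits_legacy_utf8` are made up for this illustration.

```python
def utf8_byte_length(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_legacy_utf8(text: str) -> bool:
    """True if every character lies in the Basic Multilingual Plane
    (U+0000..U+FFFF), i.e. needs at most 3 bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    for sample in ["hello", "Привет", "日本語", "😀"]:
        sizes = [utf8_byte_length(ch) for ch in sample]
        print(sample, sizes, fits_legacy_utf8(sample))
```

Latin, Cyrillic and CJK samples pass the check, while the emoji needs four bytes and only fits into utf8mb4.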
As for the newer utf8mb4_general_ci and utf8mb4_unicode_ci collations, unicode is preferred over general. utf8mb4_general_ci is slightly faster at comparing and sorting (a difference that rarely matters today), but it sorts incorrectly in some languages because it does not implement the full Unicode collation rules. For example, it treats ß as equal to s, whereas utf8mb4_unicode_ci treats ß as equal to ss. utf8mb4_unicode_ci does not have these shortcomings.
So, the currently recommended collation for MySQL databases and tables is utf8mb4_unicode_ci, with utf8mb4 as the character set.
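As a rough illustration of applying this recommendation, the Python sketch below only builds the SQL text for creating a database and converting an existing table; it does not connect to a server. The helper names and the `shop` and `orders` identifiers are hypothetical, while the statements themselves follow the standard `CREATE DATABASE ... CHARACTER SET ... COLLATE ...` and `ALTER TABLE ... CONVERT TO CHARACTER SET ...` forms.

```python
CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"


def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_database_sql(database: str) -> str:
    """SQL text that creates a database with the recommended defaults."""
    return (f"CREATE DATABASE {quote_identifier(database)} "
            f"CHARACTER SET {CHARSET} COLLATE {COLLATION};")


def convert_table_sql(table: str) -> str:
    """SQL text that converts an existing table and its text columns."""
    return (f"ALTER TABLE {quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {CHARSET} COLLATE {COLLATION};")


if __name__ == "__main__":
    # Hypothetical names, used only for demonstration.
    print(create_database_sql("shop"))
    print(convert_table_sql("orders"))
```

Converting a table rewrites its text columns, so it is sensible to try the generated statement on a backup copy first.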
Tip: To save space with utf8mb4, use VARCHAR instead of CHAR. Otherwise, MySQL must reserve four bytes for each character in a CHAR CHARACTER SET utf8mb4 column, because that is the maximum possible length of a single character. For example, MySQL must reserve 40 bytes for a CHAR(10) CHARACTER SET utf8mb4 column.
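The arithmetic behind this tip can be sketched in a few lines of Python. The function names are invented for the example, and the VARCHAR figure assumes the usual layout of the encoded bytes plus a one-byte or two-byte length prefix; the exact on-disk size depends on the storage engine and row format.

```python
# Maximum bytes per character for the two MySQL UTF-8 character sets.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}


def char_reserved_bytes(length: int, charset: str = "utf8mb4") -> int:
    """Worst-case bytes reserved for a CHAR(length) column."""
    return length * MAX_BYTES_PER_CHAR[charset]


def varchar_stored_bytes(text: str, max_chars: int) -> int:
    """Approximate bytes for one value in a VARCHAR(max_chars) utf8mb4 column:
    the encoded bytes plus a length prefix (1 byte if the column can never
    exceed 255 bytes, otherwise 2)."""
    prefix = 1 if max_chars * MAX_BYTES_PER_CHAR["utf8mb4"] <= 255 else 2
    return len(text.encode("utf-8")) + prefix


if __name__ == "__main__":
    print(char_reserved_bytes(10))            # CHAR(10), utf8mb4
    print(varchar_stored_bytes("hello", 10))  # five ASCII characters
```

For a column that mostly holds short ASCII values, the variable-length form is several times smaller than the 40 bytes reserved for CHAR(10).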
Note: strictly speaking, utf8mb4_unicode_ci is not an encoding. In MySQL terms it is a collation: a set of comparison and sorting rules defined for a particular character set. That is, utf8mb4_unicode_ci is a collation, utf8mb4 is a character set, and UTF-8 is the variable-length encoding that this character set uses to store characters.
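The relationship between the names can be shown with a short Python sketch that splits a collation name into its parts. It relies on the naming convention that a collation name begins with its character set name and usually ends with a suffix such as ci, cs or bin; the function `split_collation` and its output format are assumptions made for this example.

```python
# Meaning of the common trailing suffixes in collation names.
SENSITIVITY = {
    "ci": "case-insensitive",
    "cs": "case-sensitive",
    "bin": "binary",
}


def split_collation(collation: str) -> dict:
    """Split a collation name into character set, rule set and sensitivity."""
    charset, _, rest = collation.partition("_")
    parts = rest.split("_") if rest else []
    sensitivity = SENSITIVITY.get(parts[-1]) if parts else None
    # Everything between the character set and the suffix names the rules.
    rules = "_".join(parts[:-1]) if sensitivity else rest
    return {"charset": charset, "rules": rules, "sensitivity": sensitivity}


if __name__ == "__main__":
    for name in ["utf8mb4_unicode_ci", "utf8mb4_general_ci", "utf8mb4_bin"]:
        print(name, split_collation(name))
```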