Tag: character encoding

What is the difference between the utf8_general_ci, utf8_unicode_ci, utf8mb4_general_ci and utf8mb4_unicode_ci collations? Which collation, character set and encoding to choose for a MySQL database

As of MySQL 5.5.3, you should use utf8mb4 rather than utf8. Both refer to the UTF-8 encoding, but the older utf8 has a MySQL-specific restriction: it stores at most three bytes per character, so characters above U+FFFF cannot be used. Thus, neither utf8_general_ci nor utf8_unicode_ci needs to be used anymore. As for the newer utf8mb4_general_ci and utf8mb4_unicode_ci collations, unicode is preferred over general. The...
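To make the three-byte limit concrete, here is a minimal Python sketch that checks whether a string contains characters needing four bytes in UTF-8, which is the case the older utf8 character set cannot store. The helper name needs_utf8mb4 is hypothetical and chosen for illustration only; the check runs on the client side and does not query MySQL.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual Plane."""
    # Code points above U+FFFF are encoded as four bytes in UTF-8,
    # which is more than MySQL's legacy utf8 stores per character.
    return any(ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    # Escapes are used so the script itself stays ASCII-only.
    for sample in ("na\u00efve", "\u65e5\u672c\u8a9e", "\U0001F600"):
        widest = max(len(ch.encode("utf-8")) for ch in sample)
        print(ascii(sample), "widest char:", widest, "bytes,",
              "needs utf8mb4:", needs_utf8mb4(sample))
```

Accented Latin letters and CJK characters fit in two or three bytes, while emoji and other supplementary-plane characters take four.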

mysqldump in PowerShell corrupts non-Latin characters when exporting database (SOLVED)

mysqldump is a MySQL utility for creating database and table backups. phpMyAdmin offers a web interface, but it is slower because it runs through intermediaries such as PHP and Apache; mysqldump is much more efficient and can back up very large databases without those limitations. On Windows, however, mysqldump has some nuances. Due to the...

Output encoding issues in PowerShell and third-party utilities running in PowerShell (SOLVED)

What encoding does PowerShell use by default? How to change the default output encoding to UTF-8 in PowerShell. If you run the following command in PowerShell 5: "Testing" > test.file and check the encoding of the newly created test.file, it turns out to be UTF-16LE. If you run the following command in PowerShell 7: "Testing" > test.file And...
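One way to check what a redirect actually wrote is to look at the byte order mark (BOM) at the start of the file. The following is a minimal Python sketch, assuming the file begins with a BOM; a file without one returns None and has to be identified by other means. The function name sniff_bom is hypothetical, and test.file is the file name from the command above.

```python
import codecs
from typing import Optional

# UTF-32 BOMs are checked first because the UTF-32LE BOM
# begins with the same two bytes as the UTF-16LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    try:
        with open("test.file", "rb") as fh:
            print(sniff_bom(fh.read(4)))
    except FileNotFoundError:
        print("test.file not found")
```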

How to find and remove non-UTF-8 characters from a text file

Filtering invalid UTF-8 characters. Files that contain, in addition to ordinary characters, characters that are invalid as UTF-8 cause problems both when utilities process them and when they are opened in text editors. An example of an error in Python 3 when trying to process a file with non-UTF-8 characters: [-] Exception as following: Traceback...
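As an illustration of the idea, here is a minimal Python sketch that locates the first invalid byte and produces a cleaned copy of the data. It is not necessarily the approach the full article takes; the function names are hypothetical, and the sketch relies only on the standard errors="ignore" handler of bytes.decode.

```python
from typing import Optional


def first_invalid_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def strip_invalid_utf8(data: bytes) -> bytes:
    """Drop every byte sequence that cannot be decoded as UTF-8."""
    # errors="ignore" silently skips undecodable bytes; re-encoding
    # then yields data that is guaranteed to be valid UTF-8.
    return data.decode("utf-8", errors="ignore").encode("utf-8")


if __name__ == "__main__":
    raw = b"abc\xffdef"
    print(first_invalid_offset(raw))
    print(strip_invalid_utf8(raw))
```

Dropping bytes loses information, so it is worth checking the reported offset first to see whether the file is simply in another encoding.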