Tag: mysqldump

mysqldump in PowerShell corrupts non-Latin characters when exporting database (SOLVED)

mysqldump is a MySQL utility for creating database and table backups. Unlike phpMyAdmin, which, although it offers a web interface, is a slower tool due to the limitations of intermediates such as PHP and Apache, mysqldump is a much more efficient tool without limitations for backing up very large data.

But on Windows, mysqldump has some nuances. Due to the peculiarities of PowerShell for working with encodings, all non-Latin characters can be corrupted in exported databases. This issue is not seen in CMD, but recent versions of Windows use PowerShell by default, so the issue in question affects all users who run mysqldump to back up databases in Windows.

The following command, executed in PowerShell 7:

.\mysqldump.exe -u root --all-databases > all-databases_ps7.sql

Creates a UTF-8 encoded all-databases_ps7.sql file into which all MySQL databases will be exported using mysqldump (for backup purposes). But in these databases all non-Latin characters will be irretrievably corrupted!

That is, instead of Cyrillic, it will be something like this:

'╨Р╤А╨▒╨╕╤В╤А╨░╨╢╨╜╤Л╨╣ ╨┐╤А╨╛╤Ж╨╡╤Б╤Б: ╤Г╤З╨╡╨▒╨╜╨╕╨║ / ╨Ъ.╨Ь. ╨Р╤А╤Б╨╗╨░╨╜╨╛╨▓, ╨Ф.╨е. ╨Т╨░╨╗╨╡╨╡╨▓, ╨а.╨Э. ╨У╨╕╨╝╨░╨╖╨

To avoid this problem, use mysqldump with the --result-file option. The following command will save the database in the correct encoding:

.\mysqldump.exe -u root --all-databases --result-file=all-databases.sql

You can also use the following two-command construct to fix the encoding problem:

[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("UTF-8")
.\mysqldump.exe -u root --all-databases > all-databases.sql

Output encoding issues in PowerShell and third-party utilities running in PowerShell (SOLVED)

What encoding is used in PowerShell by default. How to change the default output encoding to UTF-8 in PowerShell

If you run the following command in PowerShell 5:

"Testing" > test.file

And check the encoding in the newly created test.file, it turns out that it is UTF-16LE.

If you run the following command in PowerShell 7:

"Testing" > test.file

And check the encoding in the newly created test.file, it turns out that it is UTF-8.

The following command, executed in PowerShell 5:

.\mysqldump.exe -u root --all-databases > all-databases_ps5.sql

Creates a UTF-16LE encoded all-databases_ps5.sql file into which all MySQL databases will be exported using mysqldump (for backup purposes). In these databases, ALL non-Latin characters will be irretrievably corrupted!

That is, instead of Cyrillic, it will be something like this:

'╨Р╤А╨▒╨╕╤В╤А╨░╨╢╨╜╤Л╨╣ ╨┐╤А╨╛╤Ж╨╡╤Б╤Б: ╤Г╤З╨╡╨▒╨╜╨╕╨║ / ╨Ъ.╨Ь. ╨Р╤А╤Б╨╗╨░╨╜╨╛╨▓, ╨Ф.╨е. ╨Т╨░╨╗╨╡╨╡╨▓, ╨а.╨Э. ╨У╨╕╨╝╨░╨╖╨

The following command, executed in PowerShell 7:

.\mysqldump.exe -u root --all-databases > all-databases_ps7.sql

Creates a UTF-8 encoded all-databases_ps7.sql file into which all MySQL databases will be exported using mysqldump (for backup purposes). But in these databases, ALL non-Latin characters will be irretrievably corrupted AGAIN!

That is, it would seem that the default encoding has changed to UTF-8, but the problem with completely corrupted database backups has not gone away.

How to save output from third-party programs in UTF-8 encoding in PowerShell

The above behavior, which corrupts the output of commands in PowerShell, is not acceptable. Let's look at how to ensure that output in PowerShell is saved in UTF-8 encoding.

How to save PowerShell output in UTF-8 encoding, use the Out-File cmdlet

Consider the following command:

.\mysqldump.exe -u root --all-databases > all-databases.sql

As already shown above, it corrupts non-Latin characters due to incorrect encoding.

In some sources, as a solution to the problem, it is recommended to replace the “>” character with “Out-File”, and also specify the encoding using the “-Encoding UTF8” option. That is, use the following command:

.\mysqldump.exe -u root --all-databases | Out-File -Encoding UTF8 all-databases_fixed.sql

In fact, both in PowerShell 7 and in PowerShell 5, this command changes little. First, the command is analogous to the output redirection symbol “>”. Secondly, by default, the Out-File cmdlet uses UTF-8 encoding, that is, it is not necessary to specify it specifically.

But the most important thing is that despite the fact that the data is saved to a file with UTF-8 encoding (in the previous command, this is the all-databases_fixed.sql file), non-Latin characters in this file are still corrupted! The thing is that the Out-File cmdlet initially processes the received data in the wrong encoding. Therefore, it no longer matters how exactly Out-File saves the data – the data is corrupted already at the time of receipt in this cmdlet.

The problem was solved by explicitly specifying the encoding for the received data:

[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("UTF-8")

The subsequent run of the command saved all the data in the correct encoding:

.\mysqldump.exe -u root --all-databases | Out-File -Encoding UTF8 all-databases_fixed.sql

This method works equally well in PowerShell 7 and PowerShell 5.

By the way, the original command using this method also saves the data in the correct encoding:

[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("UTF-8")
.\mysqldump.exe -u root --all-databases > all-databases.sql

Saving output by third-party utilities

In all previous commands, we redirected the output of the mysqldump utility to a file, or to the Out-File cmdlet. But the mysqldump utility has a --result-file option after which you can specify a filename to output. That is, as a result of using this option, you do not need to use output redirection or PowerShell cmdlets.

The following command will save the database in the correct encoding:

.\mysqldump.exe -u root --all-databases --result-file=all-databases.sql

Loading...
X