Correcting weird characters in MySQL

Technology Blog

While working on web development projects I often need to copy the contents of one database to another. Consequently I’ve seen the problem described on Alex Kings’s blog a few times, and been able to deal with it using the technique he suggests.

Today however I saw the same symptoms – apostrophes and some other characters translated to weird glyphs in a database that had been saved and restored apparently using UTF-8 coding throughout. How can that happen, I wondered?

What I didn’t realise was that in order to ensure the entire restore process uses UTF-8, you also need some additional magic instructions to ensure the actual dump file is read as being UTF encoded. The software I use, SQLyog, adds the required lines to the start of every dump file by default but the file I had to work with had been created using phpMyAdmin – an excellent piece of software but not free of imperfections.

Once I’d recognised the problem, it was merely a matter of pasting the following incantations into the top of the dump file and restoring from it again:

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;

I’m sure not all these lines are strictly needed but the restore worked so I’m happy. Hopefully this little piece of knowledge will help save someone else from the desperate measures I’ve seen advocated elsewhere, such as using search and replace on each of the offending characters.

It’s good to understand how these character set mismatches can occur, as it’s not much fun restoring from a dump and finding that your database (or that of an important client) is now studded with peculiar-looking punctuation. If your database is UTF encoded, if possible make sure the dumps you take already have the magic recipe at the front, especially if someone else may be responsible for restoring data from them.

Share this post:
  • Digg
  • Technorati
  • del.icio.us
  • StumbleUpon
  • Facebook
  • Sphinn
  • TwitThis

2 Responses to “Correcting weird characters in MySQL”

  1. Derik Rhodes

    This really helped me out. I was creating a new database and needed to using information from an old database. After dumping the data from the old db into the new db, I noticed funky characters.

    I did not realize that I needed to include those 4 lines that you have posted here. I’d seen those lines before but they looked just like comments to me. I didn’t realize they played any role in setting character information.

    After pasting those lines into the .sql file in question, I imported to the new db and everything was back to normal.

    Thanks again for the post, it really helped me out.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>