A project I’m working on at the moment is going to have multiple language options available, not all of which use the same writing system (e.g. Russian and Chinese).
To lessen the pain commonly associated with internationalisation on the web, it’s beneficial to store and serve everything as UTF-8, an encoding of the Unicode character set. This short summary from the Unicode Consortium may help explain it better:
“Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. … Unicode enables a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.”
Thankfully MySQL has supported Unicode for quite some time now, even if it’s not configured to use it by default.
First, let’s check what our current settings are:
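One way to do that, as a minimal sketch run from the mysql command-line client (the two LIKE patterns are my choice here; any pattern that matches the character set and collation variables would do):

```sql
-- Character set used by the client, connection, results, database and server
SHOW VARIABLES LIKE 'character_set%';

-- Collations (sort/compare rules) paired with those character sets
SHOW VARIABLES LIKE 'collation%';
```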
On a default installation, most of these typically come back as latin1. That’s to be expected, but it’s not really what we want.
Find your MySQL configuration file (on most Linux/BSD systems it’s /etc/my.cnf) and make sure it’s got the following statements under the relevant headers.
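A minimal sketch of what those statements might look like, assuming MySQL 5.5.3 or later, where the utf8mb4 character set is available. The exact values are assumptions: on older servers utf8 (and a matching collation such as utf8_general_ci) would take the place of utf8mb4, and your setup may need further options.

```ini
# Read by client programs such as mysql and mysqldump
[client]
default-character-set = utf8mb4

# Read by the server itself
[mysqld]
# Default character set and collation for the server and for new databases
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
```

The server needs a restart before these take effect. Databases and tables that already exist keep the character set they were created with, so they would have to be converted separately.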