UTF8, PHP and MySQL

Everyone else probably already knows this stuff, but I hit an issue today that took a while to sort out. Fortunately, some kind folks on IRC helped me, but as it's embarrassing to ask for help on the same issue twice, I'm writing down what I've learned!

The problem

Store a £ character in MySQL using UTF-8, then retrieve and display it without any weird characters in front of it.
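For the record, here is my understanding of where the weird character comes from. In UTF-8 the £ sign is two bytes (0xC2 0xA3), and if any layer reads those bytes as ISO-8859-1 you get two characters instead of one. This is a minimal sketch of that effect, assuming the mbstring extension is loaded; the variable names are just mine for illustration:

```php
<?php
// The UTF-8 encoding of the pound sign (U+00A3) is the two bytes C2 A3.
$pound = "\xC2\xA3";

// Simulate a layer that wrongly treats those bytes as ISO-8859-1
// and converts them to UTF-8 again.
$misread = mb_convert_encoding($pound, 'UTF-8', 'ISO-8859-1');

echo bin2hex($pound), "\n";   // c2a3
echo $misread, "\n";          // Â£
```

So a stray Â in front of the £ is a good hint that something between the browser and the database is not talking UTF-8.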

The solution

Make sure that you are using UTF8 everywhere!

The browser:

<?php header("Content-type: text/html; charset=utf-8"); ?>

You can also add a meta tag, which in theory is redundant once the header is sent:


<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Also, note that the <form> element has an 'accept-charset' attribute which should also be set:


<form accept-charset="utf-8" ...>

MySQL:

Make sure that your table's collation is utf8_general_ci and that all string fields within the table also have the utf8_general_ci collation.
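If an existing table has the wrong collation, MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET statement changes the table default and all of its string columns in one go. Below is a rough sketch that only builds the statement; utf8ConversionSql() is a hypothetical helper of mine and 'products' is a made-up table name. The conversion rewrites the stored data, so try it on a backup first.

```php
<?php
// Hypothetical helper: returns the SQL that converts a table and all of
// its string columns to utf8 with the utf8_general_ci collation.
function utf8ConversionSql($table)
{
    // Backtick-quote the identifier, doubling any embedded backticks.
    $quoted = '`' . str_replace('`', '``', $table) . '`';

    return "ALTER TABLE $quoted CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci";
}

// 'products' is a placeholder table name.
$sql = utf8ConversionSql('products');
echo $sql, "\n";
```

You would then run the resulting statement through whichever database connection you already have.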

And here's the really important bit: make sure your client connection is also using UTF-8.

For the mysql extension:


mysql_set_charset('utf8');

or for mysqli:


// $link is the connection returned by mysqli_connect()
mysqli_set_charset($link, 'utf8');

or execute this SQL immediately after connecting:


SET NAMES UTF8;

or for PDO:


$handle = new PDO("mysql:host=localhost;dbname=dbname",
    'username', 'password',
    array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));

or for Zend_Db:


$params = array(
    'host' => 'localhost',
    'username' => 'username',
    'password' => 'password',
    'dbname' => 'dbname',
    'driver_options' => array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8;'),
);
$db = Zend_Db::factory('PDO_MYSQL', $params);

Note that in PHP 5.3.0 and 5.3.1, you cannot use the PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8;' option for PDO as it doesn't work! See bug 47224 for details.
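One alternative worth knowing about: from PHP 5.3.6 onwards the PDO MySQL driver accepts a charset parameter in the DSN itself (earlier versions silently ignore it). Here is a minimal sketch, where buildMysqlDsn() is a hypothetical helper of mine and the host, database name and credentials are placeholders:

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that carries the charset.
function buildMysqlDsn($host, $dbname, $charset = 'utf8')
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = buildMysqlDsn('localhost', 'dbname');
echo $dsn, "\n";

// Usage (needs a running MySQL server, so it is commented out here):
// $handle = new PDO($dsn, 'username', 'password');
```

On versions older than 5.3.6 you still need SET NAMES or one of the other options above.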

For a Zend Framework application that uses Zend_Application, add this to your ini file:

resources.db.params.charset = utf8

Now everything works as expected!

(as long as you don't have an output filter on your view that's too clever for its own good...)

Also, read "About using UTF-8 fields in MySQL" by Joshua Thijssen.
