Everyone else probably already knows this stuff, but I hit an issue today to that took a while to sort out. Fortunately, some kind folks on IRC helped me, but as it's embarrassing to ask for help on the same issue twice, I'm writing down what I've learned!
The problem
Get a £ character stored to MySQL, retrieved and then displayed without any weird characters in front of it using UTF8.
The solution
Make sure that you are using UTF8 everywhere!
The browser:
<?php header("Content-type: text/html; charset=utf-8"); ?>
You can also use a meta tag that is redundant in theory:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Also, note that the the <form> element has an 'accept-charset' attribute which should also be set:
<form accept-charset="utf-8" ...>
MySQL:
Make sure that your table's collation is utf8_general_ci and that all string fields within the table also have the utf8_general_ci collation.
And here's the really important bit: make sure your client connection is also using UTF-8:
For mysql:
mysql_set_charset('utf8');
or for mysqli:
mysqli_set_charset('utf8');
or execute the SQL immediately after connection:
SET NAMES UTF8;
or for PDO:
$handle = new PDO("mysql:host=localhost;dbname=dbname",
'username', 'password',
array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
or for Zend_Db:
$params = array(
'host' => 'localhost',
'username' => 'username',
'password' => 'password',
'dbname' => 'dbname',
'driver_options' => array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8;');
);
$db = Zend_Db::factory('PDO_MYSQL', $params);
Note that in PHP 5.3.0 and 5.3.1, you cannot use the PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES UTF8;' option for PDO as it doesn't work! See bug 47224 for details.
For a Zend Framework application that uses Zend_Application, add this to your ini file:
resources.db.params.charset = utf8
Now everything works as expected!
(as long as you don't have an output filter on your view that's too clever for its own good...)
Also, read About using UTF-8 fields in MySQL by Joshua Thijssen.



March 18th, 2009 at 20:46 #
Hi Rob!
Believe it or not, there's one more thing that can ruin the fun (learned the hard way): Apache's AddDefaultCharset http://httpd.apache.org/docs/2.0/mod/core.html#AddDefaultCharset
I know it's disabled on most servers, but I had the "luck" to come across on a server which had it set to Latin-1 (ISO-8859-1), overwriting every other setting.
Thought to mention it :)
Cheers,
Robert
March 18th, 2009 at 20:50 #
Ah yeah - that didn't come up today, but I've come across that one before too.
Thanks for commenting - no doubt it'll help me when I come back here in 6 months time :)
Regards,
Rob...
March 18th, 2009 at 22:15 #
Rob, in ZF i use MySQLi driver and i set my connection charset in this way:
$dbAdapterMySQLi->getConnection()->set_charset("utf8");
:)
March 19th, 2009 at 06:22 #
How to configure this for Zend_Db if I use config.ini file?
March 19th, 2009 at 06:57 #
Diego,
Thanks for the tip :)
Kreoton,
untested, but try:
db.params.driver_options.0.PDO::MYSQL_ATTR_INIT_COMMAND = "SET NAMES UTF8;"Regards,
Rob...
(Update: don't use the above!)
March 19th, 2009 at 08:43 #
A couple of slightly different methods here for Zend_DB
http://zfsnippets.com/snippets/view/id/13
March 19th, 2009 at 10:18 #
Kreoton & Rob:
I don't think
db.params.driver_options.0.PDO::MYSQL_ATTR_INIT_COMMAND = "SET NAMES UTF8;"
will work. I'm using
db.params.driver_options.1002 = "SET NAMES utf8"
and that's working like a charm (1002 is the integer value for PDO::MYSQL_ATTR_INIT_COMMAND).
Best regards
Stefan
March 19th, 2009 at 10:39 #
@4,7
db.params.driver_options.1002 = "SET NAMES utf8"
will indeed work. I don't think there's any way to use the constants.
When dealing with utf8 you may also need to set the encoding of your script files, especially if the £ is in that file, and not the database.
March 19th, 2009 at 11:08 #
Stefan and David,
Well, it was a complete guess :)
I trust your information as you've actually used it!
Regards,
Rob...
March 19th, 2009 at 11:54 #
I don't think I've met a web developer who hasn't had encoding issues. It's good to have all these points listed.
I just want to know why all the latest software releases aren't defaulting to utf-8?!
March 19th, 2009 at 12:19 #
I'm sure there are many reasons. I know that if utf8 had been the default when I started programming (and using iso-8859-1 for my webpages) it would have caused problems. Probably for compatibility reasons like these
March 19th, 2009 at 13:00 #
Good reference post! Bookmarked in case I ever need this =)
March 19th, 2009 at 13:57 #
Don't forget to use UTF-8 compatible functions in PHP for string comparison.
Most string operations won't work for UTf-8 encoded strings because a utf-8 char is 4 bytes where ascii is 1 byte.
March 19th, 2009 at 14:24 #
Thanks for Rob, Stefan Gehrig and David Caunt I would test this when I back home.
March 19th, 2009 at 16:13 #
bookmarked too, thanks.
March 20th, 2009 at 08:39 #
Don't forget to save your PHP-Files in UTF-8, in case you put your content directly in your PHP-Files.
March 21st, 2009 at 11:03 #
Some more:
- do not write BOM (Byte Order Mark) when saving your PHP files in UTF-8
- make shure you are using multibyte aware functions (mb_...)
- make shure your regular expression use the unicode modifier, and that the input to the expression is a valid utf-8 sequence
If you have access to the servers configuration files, you can tell apache to append the correct header to your scripts and tell MySQL to use utf8 for all the connections.
March 21st, 2009 at 11:33 #
Good tips, thanks.
Rob...
March 23rd, 2009 at 02:02 #
Also a really important article to read if you want to convert your database character set correctly (And avoid the wrong how-to's on the web):
http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/
March 23rd, 2009 at 12:58 #
If you work with Zend Framework, MySQLi as Driver and load your config out of an INI file you can use it this way:
database.params.driver_options.MYSQLI_INIT_COMMAND = "SET NAMES utf8"
OR according to the example above
$params = array(
'host' => 'localhost',
'username' => 'username',
'password' => 'password',
'dbname' => 'dbname',
'driver_options' => array(MYSQLI_INIT_COMMAND => 'SET NAMES UTF8;');
);
$db = Zend_Db::factory('Mysqli', $params);
March 24th, 2009 at 04:48 #
How come this is the first time I hear about this?
Dammit I've been writing é replace functions for years!!
April 7th, 2009 at 11:11 #
I'm not sure #20 is correct, I don't think you can use PHP constants in Ini Config files.
Simply use
database.params.driver_options.1002 = "SET NAMES utf8"
rather than
database.params.driver_options.MYSQLI_INIT_COMMAND = "SET NAMES utf8"
1002 is the value of MYSQLI_INIT_COMMAND
April 8th, 2009 at 05:44 #
David,
You can use PHP constants in the value part of an INI file, but I agree that I don't think you can you use them in the key part.
Regards,
Rob...
April 9th, 2009 at 10:16 #
Yup. This is the first problem, that every Czech programmer faces, when he learns PHP :) Coz Windows uses crazy Windows-1250 encoding.
$db->query('SET NAMES UTF8'); also works for Zend_Db
Also you should use "locale" collation (i.e.: utf8_czech_ci for czech)... otherwise it sorts chars based on their binary value rather then natural language sorting.
April 13th, 2009 at 01:19 #
Hello,
To me
=====
database.params.driver_options.MYSQLI_INIT_COMMAND = "SET NAMES utf8"
=====
doesn't work, but database.params.driver_options.1002 = "SET NAMES utf8" is working perfectly ;-)
April 23rd, 2009 at 21:11 #
Many thanks! mysql_set_charset("utf-8"); worked for me.
May 8th, 2009 at 12:27 #
David (#22), you're mixing up two things. As Salz` in #20 said, MYSQLI_INIT_COMMAND would be the constant to use for the *mysqli* extension. Its numerical value is 3 in contrast with PDO::MYSQL_ATTR_INIT_COMMAND which you should use with PDO and has a numerical value of 1002. But that mysqli constant itself actually also works. Maybe the only reason the PDO constant doesn't work is because of the colons?
In any case, you won't need either of those anymore as of Zend Framework 1.8 at least. You can use:
database.params.charset = utf8
May 8th, 2009 at 13:25 #
#27 - Correct, I was confused. 1.8 is king :)
May 19th, 2009 at 06:18 #
thanks to your site, I fixed encoding on mine. wonderful.
May 22nd, 2009 at 20:16 #
Response to: #22,#23,#25
=====
database.params.driver_options.MYSQLI_INIT_COMMAND = "SET NAMES utf8"
=====
Works perfekt, cause of line 303->313 in http://framework.zend.com/svn/framework/standard/trunk/library/Zend/Db/Adapter/Mysqli.php
especially:
====
$option = @constant(strtoupper($option));
====
Oh, and this is true since many version of Zend Framework.
August 12th, 2009 at 10:33 #
I don't know why but
;resources.db.params.driver_options.1002 = "SET NAMES UTF8;"
works for my fine but
database.params.charset = utf8
NOT
Do you know why?
October 2nd, 2009 at 07:18 #
In newer Zend Framework just use
database.params.charset = utf8
to set the charset in the ini file.
November 5th, 2009 at 17:13 #
greg606 You must be on #ZF prior to version 1.8
November 23rd, 2009 at 00:46 #
Thank You!! It was so frustrating.
February 5th, 2010 at 11:27 #
Thank you all!!!
In my case, the line
database.params.charset = utf8
worked perfect
May 24th, 2010 at 06:14 #
You rock, just saying, cause you saved me hours of trying to figure out where I was going wrong :)
June 25th, 2010 at 03:11 #
thanks I was having problems with Ñs
July 28th, 2010 at 21:08 #
Your tip about mysql_set_charset('utf8') just transformed my workflow. None of all three of my otherwise excellent O'Reilly manuals for PHP (PHP and MySQL, PHP Cookbook, or Programming PHP) even mention this function.
I thought I was doing everything correctly but the one missing piece was that PEAR was not set up to handle UTF-8 encoding.
So simple!
September 28th, 2010 at 06:55 #
Thanks for this tip.
November 8th, 2010 at 20:57 #
It works like a charm, Thanks Rob, thanks everybody!
December 19th, 2010 at 22:54 #
Beware when using:
resources.db.params.charset
and
resources.db.params.driver_options.1002
When you have the first, the second is overwritten when calling _connect. That sucks and it's most likely an ugly bug, as PDO::MYSQL_ATTR_INIT_COMMAND can be used for more than just setting the charset (which is my case)
January 18th, 2011 at 08:12 #
Hi,
Thank you so much for this post. I have spent almost all day trying to figure out how to get correct Lithuanian characters from my MySQL database correctly showing on my php website. Finally, I only needed to add this extra line after making the connection to database:
mysql_set_charset('utf8');
Thanks again, you saved a lot of my time. ;)
March 23rd, 2011 at 21:33 #
Hi Rob,
Today I started a new project and i had to deal with the same issue. My short solution is indeed:
$view->headMeta()->appendHttpEquiv('Content-Type', 'text/html;charset=utf-8');
(obviously you need to echo $this->headMeta() in your layout.phtml)
This is the only place in my code where I define utf-8 charset encoding and database saving is totally OK. And it saves me lots of typing and thus headaches ;)
greetz from Holland
April 17th, 2011 at 07:23 #
Don't forget to save your files according to the charset you want to user either ;-)
It's a common mistake among beginners...
http://web-o-blog.blogspot.com/2011/04/utf-8-with-phpmysql.html
July 14th, 2011 at 10:11 #
I've been looking for this answer for a few weeks now. I tried the php header() like you pointed out and it fixed the problem.
I appreciate this concise blog!
September 19th, 2011 at 22:03 #
Nice post, some real gems in here. Is there a PostgreSQL equivalent of PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8")?