My data in one of my Database columns is not displaying correctly from UTF8.
I recently converted the MySQL database to UTF8 using:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
When displaying data using my PHP script one field appears to not display correctly:
£25
When Im expecting:
£25
Other fields on the same page correctly display £25. The only difference I see is that they are VARCHARs and the problem field is MEDIUMTEXT.
Why does this one field not display correctly (it isn't treated any differently by my script).
Any help ?
You can't just convert the collation like that and not expect issues. There are many problematic characters that are not converted automatically.
What's happened is you likely had a non-UTF8 friendly character (like those MS Office produces) and when you converted it got mangled. If it's just a few fields, I would open them in an editor and remove the offending data. Otherwise you may have to "scrub" your database of these strange characters using a loop and something like utf8_encode
What was the CHARACTER SET of the column before converting?
A3 is the hex for latin1 £.
C2A3 is the hex for utf8 £, but it displays as £ if treated as latin1.
Possibly you only need to declare that your php program is using utf8:
⚈ (deprecated) mysql interface: mysql_set_charset('utf8'); (assuming PHP 5.2.3 & MySQL 5.0.7) Note: "mysql_" interface is deprecated.
⚈ mysqli interface: mysqli_set_charset('utf8') function. See mysqli_set_charset
⚈ PDO interface: set the charset attribute of the PDO dsn.
Please do SELECT HEX(col) FROM tbl WHERE ...
. If you get C2A3...
, then you have correctly converted the table to utf8. If you get C382C2A3...
, you have the dreaded "double encoding". I discuss that further in my blog.
Since you seem to have different fields showing different things, you are likely to have different encodings. Beware, things are getting harder and harder to untangle.