Storing non-English characters gives '?????' - MySQL character set problem

The site I am working on is in Farsi, and all the text is displayed as ????? (question marks). I changed the collation of my DB tables to utf8_general_ci, but it still shows ?????.

I ran the following script to convert all the tables, but that did not work either.

What am I doing wrong?

<?php
// your connection
mysql_connect("mysql.ord1-1.websitesettings.com","user_name","pass");
mysql_select_db("895923_masihiat");

// convert code
$res = mysql_query("SHOW TABLES");
while ($row = mysql_fetch_array($res))
{
    foreach ($row as $key => $table)
    {
        mysql_query("ALTER TABLE " . $table . " CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci");
        echo $key . " =&gt; " . $table . " CONVERTED<br />";
    }
}
?>

Bad news. But first, double check:

SELECT col, HEX(col)...

to see what is in the table. If the hex shows 3F (the code for a literal question mark), then the data is gone. Correctly stored, the dal character (د) should be hex D8AF; hah (ح) is hex D8AD.

What happened:

  • you had utf8-encoded data (good)
  • SET NAMES latin1 was in effect (default, but wrong)
  • the column was declared CHARACTER SET latin1 (default, but wrong)

As you INSERTed the data, it was converted to latin1, which does not have values for Farsi characters, so question marks replaced them.
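
The loss can be reproduced without a database. Here is a minimal sketch, assuming PHP 7+ with the mbstring extension; mb_convert_encoding only stands in for the conversion MySQL performs, so treat it as an illustration of the same kind of lossy step, not MySQL's own code path:

```php
<?php
// Letter dal (U+062F), written as an escape so the result does not
// depend on the encoding of this source file.
$dal = "\u{062F}";

// UTF-8 bytes of the character: what HEX(col) shows when stored correctly.
$utf8Hex = strtoupper(bin2hex($dal));

// Convert to latin1 (ISO-8859-1), which has no dal. mbstring substitutes
// its default replacement character, '?', for the unrepresentable letter.
$latin1 = mb_convert_encoding($dal, 'ISO-8859-1', 'UTF-8');
$latin1Hex = strtoupper(bin2hex($latin1));

echo "utf8:   $utf8Hex\n";   // D8AF
echo "latin1: $latin1Hex\n"; // 3F
```

Once the byte stored is 3F, nothing in it identifies which character it used to be, which is why converting the table afterwards cannot bring the text back.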

The cure (for future INSERTs):

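  • Recode your application using the mysqli_* interface instead of the deprecated mysql_* interface (which was removed in PHP 7).
  • utf8-encoded data (good)
  • mysqli_set_charset($link, 'utf8') right after connecting (or $mysqli->set_charset('utf8') in the object-oriented style)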
  • check that the column(s) and/or table default are CHARACTER SET utf8
  • If you are displaying on a web page, <meta...utf8> should be near the top.
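
The steps above might look roughly like this. It is a sketch under assumptions, not a drop-in replacement: the function names openUtf8Connection and insertTitle, the table articles, and the column title are all hypothetical, and the column is assumed to be declared CHARACTER SET utf8.

```php
<?php
// Open a mysqli connection that talks utf8 to the server.
// Host, user, password and database name are supplied by the caller.
function openUtf8Connection(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $mysqli = new mysqli($host, $user, $pass, $db);

    // Declare that the client sends and expects utf8. This replaces the
    // latin1 connection default that caused the question marks.
    $mysqli->set_charset('utf8');

    return $mysqli;
}

// Insert one utf8-encoded string with a prepared statement.
// `articles` and `title` are hypothetical names used for illustration.
function insertTitle(mysqli $mysqli, string $title): void
{
    $stmt = $mysqli->prepare('INSERT INTO articles (title) VALUES (?)');
    $stmt->bind_param('s', $title);
    $stmt->execute();
    $stmt->close();
}
```

A call would then be something like insertTitle(openUtf8Connection('localhost', 'user_name', 'pass', 'mydb'), 'سلام'), with the PHP source file itself saved as UTF-8. Running SELECT title, HEX(title) afterwards should show multi-byte values such as D8..., not 3F.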

The discussion above is about CHARACTER SET, the encoding of characters. Now for a tip on COLLATION, which is used for comparing and sorting.

If you want these to be treated as equal: 'بِسْمِ' = 'بسم', then use utf8_unicode_ci (instead of utf8_general_ci) for the COLLATION.