My issue is that I have a database which was imported as UTF-8, but its columns default to latin1. As a result, when I set the charset to UTF-8 in PHP, I get � instead of the expected æ character.
When I originally had my encoding set to windows-1252 it worked perfectly, but when I validate my file it says that windows-1252 is legacy and shouldn't be used.
I'm only trying to get rid of the error message, but the problem is that I'm not allowed to change anything in the database at all. Is there any way the data can be output as utf-8 whilst still being stored as latin1 in the DB?
A while ago, I used this function to print text on a hellish page full of different, out-of-control charsets xD:
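function to_entities($string)
{
    // Detect the encoding in strict mode; add more candidate encodings if needed... sigh...
    $encoding = mb_detect_encoding($string, array('UTF-8', 'ISO-8859-1'), true);
    // Convert applicable characters to HTML entities without double-encoding existing ones
    return htmlentities($string, ENT_QUOTES, $encoding, false);
}
print to_entities('á é í ó ú ñ');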
cp1252 (latin1) can handle æ. It is hex E6. In utf8 it is hex C3A6.
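The hex values above can be checked from PHP. This is only a small sketch, assuming the mbstring extension is loaded; the variable names are made up for illustration.

```php
<?php
// Illustration only: the same character "æ" as bytes in each encoding.
$utf8   = "\xC3\xA6";                                         // "æ" encoded as UTF-8
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');  // same character as latin1

echo strtoupper(bin2hex($latin1)), "\n"; // E6
echo strtoupper(bin2hex($utf8)), "\n";   // C3A6

// A lone E6 byte is not a valid UTF-8 sequence, which is why a page
// declared as UTF-8 shows the replacement character for it.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```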
� usually comes from storing text in a latin1 encoding and then displaying it as utf8. So, let's go back to what was stored.
Please provide SHOW CREATE TABLE. I suspect it will say CHARACTER SET latin1, not utf8.
Then run
SELECT col, HEX(col) FROM tbl WHERE ...
to see the hex. (See hex notes above.)
Assuming everything is latin1 so far, the simple (and perhaps expedient) answer is to check the HTML source. I suspect it says
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Changing that to charset=ISO-8859-1 may solve the black diamond problem.
But... latin1 only handles Western European characters. If you need Slavic, Greek, Chinese, etc, then you do need utf8. I'll provide a different answer in that case.
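If the page has to stay UTF-8 and the stored latin1 data cannot be touched, converting at output time is another option. The following is a minimal sketch, not a definitive fix: it assumes mbstring is available and that the connection hands back raw latin1 bytes. `latin1_to_utf8` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: re-encode a latin1 string as UTF-8 before output.
// Assumes $value really is latin1; converting text that is already UTF-8
// would double-encode it.
function latin1_to_utf8(string $value): string
{
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$fromDb = "\xE6"; // "æ" as a single latin1 byte, standing in for a column value
echo bin2hex(latin1_to_utf8($fromDb)), "\n"; // c3a6
```

Do not combine this with a connection charset that already returns UTF-8, or the text will be encoded twice.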
I figured out how to do this after looking through the link that Fred provided, thanks!
In case anyone needs to know what to do: if you have a database connection file, add the following inside it, underneath the mysqli_connect command:
mysqli_set_charset($connectvar, "utf8");
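For context, here is roughly how that line might sit in a connection file. This is a sketch under assumptions: `connect_with_charset` is a hypothetical wrapper, and the host, user, password and database name are placeholders.

```php
<?php
// Hypothetical helper: open a mysqli connection and set its charset right away.
// Host, user, password and database name are placeholders supplied by the caller.
function connect_with_charset(string $host, string $user, string $pass, string $db, string $charset = 'utf8'): mysqli
{
    $link = mysqli_connect($host, $user, $pass, $db);
    if ($link === false) {
        throw new RuntimeException('Connection failed: ' . mysqli_connect_error());
    }
    // Ask the server to send results in $charset, whatever charset the columns use.
    if (!mysqli_set_charset($link, $charset)) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($link));
    }
    return $link;
}

// Usage (placeholder credentials):
// $connectvar = connect_with_charset('localhost', 'user', 'secret', 'mydb');
```

MySQL's "utf8" only covers characters of up to three bytes; passing 'utf8mb4' as the last argument is the usual choice if four-byte characters such as emoji are needed.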