When receiving and writing my XML, some of the fields are coming through like this: Benedíková
But when I parse it out with code like
$xml = simplexml_load_file($filename);
print_r($xml);
...the field changes to this:
BenedÃková
How can I parse it cleanly so that characters like á or í are retained?
As documented in the PHP manual, when you read strings out of a SimpleXMLElement,
they are always in Unicode using the UTF-8 encoding.
This is independent of the encoding used inside the document.
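To see this for yourself, here is a minimal sketch. The sample document and the `<person>`/`<name>` element names are made up for illustration. The XML declares ISO-8859-1 and stores the accented letters as single bytes, yet the string read back is UTF-8:

```php
<?php
// Hypothetical sample: the document declares ISO-8859-1 and stores
// "í" and "á" as the single bytes 0xED and 0xE1.
$source = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
        . "<person><name>Bened\xEDkov\xE1</name></person>";

$xml  = simplexml_load_string($source);
$name = (string) $xml->name;

// Dump the bytes: in the result "í" shows up as c3ad and "á" as c3a1,
// i.e. the two-byte UTF-8 sequences rather than the original single bytes.
echo bin2hex($name), "\n";
```

If a page that is not declared as UTF-8 prints those bytes, the browser shows each accented letter as two characters, which is the kind of garbling shown in the question.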
So if your website does not use UTF-8 as its encoding, you might want to switch to it, or you might want to re-encode those UTF-8 strings into the encoding of your website.
I would normally suggest the first (convert the website to use UTF-8), but that is not always easy to do (and not always the right thing to do), so both variants have their use.
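Both variants could look roughly like the sketch below. It assumes the site is served as ISO-8859-1, which is only a guess, so substitute your real encoding. It also assumes the `iconv` extension is available; `mb_convert_encoding()` from mbstring would work similarly.

```php
<?php
// A UTF-8 string such as SimpleXML returns ("Benedíková" written as raw bytes
// so this file's own encoding does not matter).
$utf8Name = "Bened\xC3\xADkov\xC3\xA1";

// Variant 1: keep UTF-8 and tell the browser about it before any output.
// header('Content-Type: text/html; charset=utf-8');

// Variant 2: re-encode for a page served in another encoding
// (ISO-8859-1 is an assumed target here).
$latin1Name = iconv('UTF-8', 'ISO-8859-1', $utf8Name);

if ($latin1Name === false) {
    // iconv() returns false when the conversion fails,
    // e.g. for a character the target encoding cannot represent.
    $latin1Name = '';
}
```

Keep in mind that the second variant can only represent characters that exist in the target encoding, which is one reason switching the site to UTF-8 is usually the easier long-term choice.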
See as well: