I'm using html_entity_decode($row['Content']) to display some JSON data that contains HTML in a PHP document. The problem is that some of the data being returned has unclosed HTML tags such as <strong>, which then carry on into the content displayed after it.
Is there some way to terminate the HTML?
If you ever accept raw HTML from an outside source to embed into your site, you should always, always reformat and whitelist it. You have no idea what that third-party HTML may contain, and you have no guarantee that it's valid; yet on your site you presumably want guaranteed valid HTML with certain limits on its content (or do you really want to enable the embedding of arbitrary <script> tags...?!).
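That means you want to parse the HTML as it is into a data structure, inspect that structure and remove anything you don't want in there, and then serialise it back into valid HTML.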
Supposedly the best PHP library which does that is HTML Purifier. Without using a library, you would use a lenient HTML parser, something like DOMDocument, to inspect and filter the content, and then the built-in DOMDocument::saveXML to produce the new sanitised HTML.
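As a rough illustration of the DOMDocument route, here is a minimal sketch, not a vetted sanitiser. The function name sanitize_fragment, the default tag whitelist, and the decision to strip every attribute are all assumptions made for this example. It also serialises with DOMDocument::saveHTML instead of saveXML, on the assumption that the result is embedded in an HTML page rather than an XHTML one. For anything security-sensitive, a maintained library such as HTML Purifier is the safer choice.

```php
<?php
// Hypothetical helper: re-parse an untrusted HTML fragment with DOMDocument,
// keep only whitelisted tags, and serialise it again so every tag is closed.
function sanitize_fragment($html, array $allowedTags = ['p', 'br', 'strong', 'em', 'ul', 'ol', 'li'])
{
    $doc = new DOMDocument();

    // The parser is lenient, but it reports broken markup as warnings;
    // collect them internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix is a common trick to make the parser treat
    // the input as UTF-8 (assumption: the incoming data is UTF-8).
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // loadHTML() wraps the fragment in <html><body>...</body></html>.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Snapshot the live node list, because the tree is modified while looping.
    $elements = iterator_to_array($body->getElementsByTagName('*'));
    foreach ($elements as $el) {
        $tag = strtolower($el->nodeName);
        if ($tag === 'script' || $tag === 'style') {
            // Drop these together with their contents.
            $el->parentNode->removeChild($el);
        } elseif (!in_array($tag, $allowedTags, true)) {
            // Unwrap any other unknown tag: keep its children, drop the tag.
            while ($el->firstChild !== null) {
                $el->parentNode->insertBefore($el->firstChild, $el);
            }
            $el->parentNode->removeChild($el);
        } else {
            // Allowed tag: strip all attributes (onclick, style, href, ...).
            while ($el->attributes->length > 0) {
                $el->removeAttribute($el->attributes->item(0)->nodeName);
            }
        }
    }

    // Serialise only the children of <body>, i.e. the cleaned fragment.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

In the situation from the question it would be called as echo sanitize_fragment(html_entity_decode($row['Content']));. Because the fragment is parsed into a tree and written out again, an unclosed <strong> gets its closing tag inside the fragment and no longer leaks into the rest of the page.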