HTML characters are being rendered literally

I am saving html from a <textarea></textarea> to a MySQL database using a PDO driver in PHP. I have tried using htmlentities() & htmlspecialchars(). When the html is requested from the database to display in the DOM (using html_entity_decode() or htmlspecialchars_decode() respectively) it is returning the tags literally.

So, for example:

strong text will return as <strong>strong text</strong>

I'm not entirely sure whether it is significant, but the pages are encoded in ANSI and my doctype is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I am saving html from a <textarea></textarea>

Note that, as far as the browser is concerned, it is sending text. It being HTML or not depends on how you handle it.

I have tried using htmlentities() & htmlspecialchars()

htmlspecialchars() is the correct tool if you want to display the submitted text as text in an HTML document (i.e. if <strong> is typed into the text area and you want the page to show the characters less-than, "strong", greater-than instead of starting a strong element). If you want to treat the submitted text as HTML, don't use these functions.
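To make that concrete, here is a minimal sketch of what htmlspecialchars() produces for the example from the question. The variable names, the sample input, and the choice of ENT_QUOTES and 'UTF-8' are assumptions for illustration; use the character set your pages are actually served in.

```php
<?php
// Hypothetical sample input, as it might arrive from the textarea.
$submitted = '<strong>strong text</strong>';

// Encode the characters that are special in HTML (<, >, &, quotes)
// so the browser displays them as text instead of parsing them as markup.
// Assumption: UTF-8; pass the charset your page really uses.
$encoded = htmlspecialchars($submitted, ENT_QUOTES, 'UTF-8');

echo $encoded, PHP_EOL; // &lt;strong&gt;strong text&lt;/strong&gt;
```

When the browser parses the encoded string, it shows the tags as visible characters and does not create a strong element.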

When the html is requested from the database to display in the DOM (using html_entity_decode() or htmlspecialchars_decode() respectively) it is returning the tags literally.

This is your problem.

You are:

  1. Taking some text
  2. Encoding it for rendering as HTML
  3. Inserting the HTML into the database
  4. Getting the HTML out of the database
  5. Decoding the HTML into text
  6. Inserting the text into an HTML document where it will be treated as HTML
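The six steps above can be sketched as follows to show that the encode and decode calls cancel each other out. The database round trip is left out, and the variable names and flags are assumptions for illustration only.

```php
<?php
// Step 1: hypothetical text from the textarea.
$submitted = '<strong>strong text</strong>';

// Step 2: encode before storing (steps 3 and 4, the database, are omitted here).
$stored = htmlspecialchars($submitted, ENT_QUOTES, 'UTF-8');

// Step 5: decode again after reading it back.
$output = htmlspecialchars_decode($stored, ENT_QUOTES);

// Step 6: what reaches the page is exactly what the user typed,
// so the browser receives raw, unencoded markup.
var_dump($output === $submitted); // bool(true)
```

The encoding at step 2 is undone at step 5, so neither call has any effect on what is finally written into the document.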

What you should do:

  1. Take some text
  2. Insert it into the database
  3. Take it out of the database
  4. Encode it as HTML
  5. Insert the HTML into the HTML document

Use htmlspecialchars() at step 4 and never decode it yourself (the browser will do that when it parses the page).
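A minimal sketch of the five-step flow, under some loud assumptions: an in-memory SQLite database stands in for MySQL so the example runs on its own (this needs the pdo_sqlite extension), and the posts table, its columns, and the variable names are all hypothetical. With MySQL only the PDO connection string would differ.

```php
<?php
// Assumption: in-memory SQLite replaces MySQL so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)'); // hypothetical table

// Step 1: take some text (hypothetical textarea content).
$submitted = '<strong>strong text</strong>';

// Step 2: insert it into the database unmodified, using a prepared statement.
$insert = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$insert->execute([':body' => $submitted]);

// Step 3: take it out of the database.
$select = $pdo->prepare('SELECT body FROM posts WHERE id = :id');
$select->execute([':id' => $pdo->lastInsertId()]);
$body = $select->fetchColumn();

// Step 4: encode it as HTML at output time only.
$html = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

// Step 5: insert the encoded string into the HTML document.
echo '<p>', $html, '</p>', PHP_EOL;
```

The database holds exactly what the user typed, and encoding happens once, at the point where the value is written into HTML.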

If you want the user to be able to enter HTML that will be rendered as HTML, then skip step 4. This leaves you vulnerable to XSS attacks, so unless you completely trust everyone with access, run the data through a DOM-based, whitelisting HTML sanitizer.