How can I allow <img> and <a> tags in innerHTML, but no other tags? (Making a forum)

I am currently programming a forum using only JavaScript (no jQuery, please). It is going very well; however, there is one issue I would love help with.

Currently I am getting the post from a database, assigning it to variable MainPost, and then attaching it to a div via a text node:

     var theDiv = document.getElementById("MainBody");
     var content = document.createTextNode(MainPost);
     theDiv.appendChild(content);

This is working quite well, however, I would LOVE to be able to do this:

     document.getElementById("MainBody").innerHTML += MainPost;

But I know this would allow people to use ANY HTML tag they want, even something like "script" followed by JavaScript code. This would be bad for business, obviously, but I do like the idea of allowing posters to use the "img" tag as well as the "a href" tags. Is there a way to somehow disable all tags except these two for the innerHTML?

Thank you all so much for any help you can offer.

Ok, the first thought that came to my mind when I read this question was to find a regular expression that excludes a specific string from a match. A simple search turned up plenty of results on SO.

Starting point - To remove all the HTML tags from a string (from this answer):

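 // Matches anything shaped like a tag: "<", one or more non-">" characters, ">".
 var regex = /(<([^>]+)>)/ig;
 var body = "<p>test</p>";
 var result = body.replace(regex, "");

 console.log(result); // prints: test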

To exclude a string you would use a negative lookahead, something like this (again from the sources mentioned above):

(?!StringToBeExcluded)
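
As a quick illustration of how that construct behaves, here is a minimal sketch. The pattern and the sample strings are made up for this example; only the (?!...) syntax comes from the line above.

```javascript
// Hypothetical pattern: match "foo" only when it is NOT immediately followed by "bar".
const notFollowedByBar = /foo(?!bar)/;

// The lookahead only inspects the text after "foo"; it does not consume it.
console.log(notFollowedByBar.test("foobaz")); // true
console.log(notFollowedByBar.test("foobar")); // false
```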

Since you want to exclude the <a href and <img tags from being stripped, a suitable regex in your case could be:

(<(?![\/]?a)(?![\/]?img)([^>]+)>)

Explanation :

Think of it as three parts in succession, two negative lookaheads followed by one capturing group:

  1. (?![\/]?a) : Negative lookahead. The match is rejected if the text right after < is "a", optionally preceded by one forward slash, so <a href=...> and </a> are left alone. Be aware that this also spares every other tag whose name starts with "a" (<abbr>, <audio>, ...); writing (?![\/]?a\b) instead keeps it from matching those longer names.
  2. (?![\/]?img) : Same as 1, just here it looks for the string "img". I don't know why I allowed the </img> tag. Yes, <img> doesn't have a closing tag. You could remove the [\/]? bit from it to fix this.
  3. ([^>]+) : Matches one or more characters that are not >, i.e. everything up to the end of the tag. This covers both opening and closing tags.

Now all three parts lie between < and >. You might want to try a regex demo that I've created incorporating them, which ignores all HTML elements except the image and link tags.
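
If you would rather try the pattern directly in JavaScript, something along these lines should work. Treat it as a sketch only: stripDisallowedTags and the sample strings are names I made up for illustration, and the g and i flags are added by analogy with the first snippet. It also shows two limits of this approach: the pattern only looks at the tag name, so attributes on the surviving tags (an onerror handler on an img, for example) pass through untouched, and other tags starting with "a" are not stripped.

```javascript
// The pattern from above, with g (replace every match) and i (ignore case) added.
const disallowedTags = /(<(?![\/]?a)(?![\/]?img)([^>]+)>)/gi;

// Hypothetical helper: removes every tag the pattern matches, keeps the text between tags.
function stripDisallowedTags(html) {
  return html.replace(disallowedTags, "");
}

const post = '<p>Hi <a href="/x">link</a> <img src="a.png"> <script>alert(1)</script></p>';
console.log(stripDisallowedTags(post));
// Hi <a href="/x">link</a> <img src="a.png"> alert(1)

// Limitation 1: any tag name beginning with "a" is spared as well.
console.log(stripDisallowedTags("<abbr>x</abbr>"));
// <abbr>x</abbr>

// Limitation 2: attributes are not inspected, so event handlers survive.
console.log(stripDisallowedTags('<img src=x onerror="alert(1)">'));
// <img src=x onerror="alert(1)">
```

So even with the tag names filtered, the result still needs its attributes checked before it is safe to assign to innerHTML.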

Sidenote - I haven't tested this regex thoroughly. Feel free to play around with it and tweak it according to your needs. In any case, I hope this gets you started in the right direction.