I am writing a regex find/replace that will insert a <span> into every <a href> in a file where a <span> does not already exist. It will allow other tags to be in the <a href>, like <img>, <b>, etc.
Currently I have this regex:
Find: (<a[^>]+?style=".*?color:#(\w{6}).*?".*?>)(.+?)(<\/a>)
Replace: '$1<span style="color:#$2;">$3</span>$4'
It works great, except that if I run it over the same file again, it inserts a <span> inside the existing <span> and it gets messy.
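To make the double-wrapping concrete, here is a minimal PHP sketch that applies the find/replace above with preg_replace. The function name applySpanRegex is hypothetical and only wraps the pattern from the question; it is meant to reproduce the problem, not to fix it.

```php
<?php
// Hypothetical wrapper around the find/replace from the question.
function applySpanRegex(string $html): string
{
    $find = '/(<a[^>]+?style=".*?color:#(\w{6}).*?".*?>)(.+?)(<\/a>)/';
    $replace = '$1<span style="color:#$2;">$3</span>$4';
    return preg_replace($find, $replace, $html);
}

$link = '<a href="http://mywebiste.com/link1.html" target="_blank" '
      . 'style="color:#bfbcba; text-decoration:underline;">Howdy</a>';

$once  = applySpanRegex($link);  // one <span> around "Howdy"
$twice = applySpanRegex($once);  // a second <span> nested inside the anchor
```

On the second pass, group 3 captures the whole existing <span>...</span>, so the pattern has no way of knowing the anchor was already processed.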
Target Example:
We want it to ignore this: <a href="http://mywebiste.com/link1.html" target="_blank" style="color:#bfbcba; text-decoration:underline;"><span style="color:#bfbcba;">Howdy</span></a>
But not this: <a href="http://mywebiste.com/link1.html" target="_blank" style="color:#bfbcba; text-decoration:underline;">Howdy</a>
Or this: <a href="http://mywebiste.com/link1.html" target="_blank" style="color:#bfbcba; text-decoration:underline;"><img src="myimg.gif" />Howdy</a>
--EDIT--
Using the PHP DOM library as suggested in the comments, this is what I have so far:
$doc = new DOMDocument();
$doc->loadHTML($input);
$tags = $doc->getElementsByTagName('a');
foreach ($tags as $tag) {
    $spancount = $tag->getElementsByTagName("span")->length;
    if ($spancount == 0) {
        $element = $doc->createElement('span');
        $tag->appendChild($element);
    }
}
echo $doc->saveHTML();
Currently it detects whether there is a span inside an anchor and, if there isn't, appends an empty span inside the anchor. However, I have yet to figure out how to get the original contents of the anchor inside the span.
Don't use regex for this; it's not well suited to HTML.
Use a DOM library and getElementsByTagName('a'), then iterate through each anchor and check whether it contains a span descendant with getElementsByTagName('span'), using the length property. If it doesn't, create a new span with document.createElement('span'), move the anchor's child nodes into it (appendChild each firstChild of the anchor in turn), and then appendChild the span to the anchor.
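The steps above could be sketched in PHP roughly as follows. This is a minimal sketch, not a definitive implementation: the function name wrapAnchorContents is hypothetical, and copying the colour from the anchor's style attribute with a small regex is an assumption based on the replacement string in the question.

```php
<?php
// Hypothetical helper: wraps the contents of every <a> that has no <span>
// descendant in a new <span>. Anchors that already contain a <span> are
// left untouched, so running it twice does not nest spans.
function wrapAnchorContents(string $html): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of emitting them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->getElementsByTagName('span')->length > 0) {
            continue; // already has a span somewhere inside
        }

        $span = $doc->createElement('span');

        // Assumption: reuse the six-character colour from the anchor's style.
        if (preg_match('/color:\s*#(\w{6})/', $a->getAttribute('style'), $m)) {
            $span->setAttribute('style', 'color:#' . $m[1] . ';');
        }

        // appendChild moves a node, so this empties the anchor into the span.
        while ($a->firstChild !== null) {
            $span->appendChild($a->firstChild);
        }
        $a->appendChild($span);
    }

    return $doc->saveHTML();
}
```

Calling wrapAnchorContents($input) returns the whole document as serialized by saveHTML(), so a bare fragment comes back with the html and body wrappers that loadHTML() adds. Because child nodes are moved rather than copied as strings, nested tags such as <img> or <b> end up inside the span unchanged.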
EDIT: As for grabbing the inner HTML of the anchor, if there are lots of nodes inside, try using this:
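<?php
// Return the serialized HTML of all child nodes of $node.
function innerHTML(DOMNode $node) {
    $html = '';
    foreach ($node->childNodes as $child) {
        // Serialize each child from its own document (PHP 5.3.6+).
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// $anchorRef is the DOMElement for the anchor, e.g. $tag in the loop above.
$html = innerHTML($anchorRef);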
This may also help you out: Change innerHTML of a php DOMElement