I have read many threads on this regex question, but none of the suggested solutions seem to work for me. I am sure that is because I don't really understand regular expressions.
Let's say I have the following string:
$string = '<i>he<a herf="http://www.cnn.com">ll</b>o</i>';
I would like to grab the HTML tags buried inside the word "hello". I am using the following PHP call:
$temp = preg_split('/[0-9A-Za-z]+</', $string);
What I am looking for is an array with the following:
a herf="http://www.cnn.com">, and /b>
I can tack on the leading '<' myself. However, the result of the preg_split call above also seems to include the first '<i>' tag.
My full code:
$string = '<i>he<a herf="http://www.cnn.com">ll</b>o</i>';
$temp = preg_split('/[0-9A-Za-z]+</', $string);
echo('<pre>');print_r($temp);echo('</pre>');
$num = count($temp);
$counter = 1;
foreach($temp as $key=>$tag_stem){
    if($counter<$num) {
        echo('<xmp>');print_r('tag_stem = ' . $tag_stem);echo('</xmp>');
        $temp_tag = '<' . $tag_stem;
        echo('<xmp>');print_r('temp tag = ' . $temp_tag);echo('</xmp>');
        if (empty($temp2)) {
            $temp2 = str_replace($temp_tag, '', $string);
        } else {
            $temp2 = str_replace($temp_tag, '', $temp2);
        }
        echo('<xmp>');print_r('string = ' . $temp2);echo('</xmp>');
        if (strstr($temp_tag, '</')) {
            $temp2 = $temp2 . $temp_tag;
        } else {
            $temp2 = $temp_tag . $temp2;
        }
        echo('<xmp>');print_r("new string = " . $temp2);echo('</xmp>');
    }
    $counter++;
}
$temp_array = explode($word, $string);
echo('<xmp>');print_r("final string = " . $temp2);echo('</xmp>');
My results are as follows:
tag_stem = <i>
temp tag = <<i>
string = <i>he<a herf="http://www.cnn.com">ll</b>o</i>
new string = <<i><i>he<a herf="http://www.cnn.com">ll</b>o</i>
tag_stem = a herf="http://www.cnn.com">
temp tag = <a herf="http://www.cnn.com">
string = <<i><i>hell</b>o</i>
new string = <a herf="http://www.cnn.com"><<i><i>hell</b>o</i>
tag_stem = /b>
temp tag = </b>
string = <a herf="http://www.cnn.com"><<i><i>hello</i>
new string = <a herf="http://www.cnn.com"><<i><i>hello</i></b>
final string = <a herf="http://www.cnn.com"><<i><i>hello</i></b>
Note the first iteration. For whatever reason, it's picking up the first "<i>".
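To narrow this down, here is a minimal sketch that isolates the split from the rest of the loop. As far as I can tell, preg_split() returns every piece between matches, including whatever text comes before the first match, which would explain the leading "<i>". The second half is only one possible direction, assuming the goal is "tags that sit between two runs of letters or digits"; the preg_match_all pattern and the names $parts, $matches and $tags are my own and not from any of the threads I read.

```php
<?php
$string = '<i>he<a herf="http://www.cnn.com">ll</b>o</i>';

// preg_split() keeps the text in front of the first match as element 0,
// so the leading '<i>' is part of the result.
$parts = preg_split('/[0-9A-Za-z]+</', $string);
print_r($parts);
// $parts is: '<i>', 'a herf="http://www.cnn.com">', '/b>', '/i>'

// Hypothetical alternative: capture the tag text that follows a run of
// letters/digits, but only when more letters/digits come right after it.
// The lookahead (?=...) drops the trailing '/i>' without consuming input.
preg_match_all('/[0-9A-Za-z]+<([^>]*>)(?=[0-9A-Za-z])/', $string, $matches);
$tags = $matches[1];
print_r($tags);
// $tags is: 'a herf="http://www.cnn.com">', '/b>'
```

With this input the second call gives the two stems I was after, but I have not tried it against other markup (for example nested tags back to back), so treat it as a starting point only.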