I have a file like:
<div clas='dsfdsf'> this is first div </div>
<div clas='dsfdsf'> this is second div </div>
<div class="remove">
<table>
<thead>
<tr>
<th colspan="2">Mehr zum Thema</th>
</tr>
</thead>
<tbody>
<tr> this is tr</tr>
<tr> this row no 2 </tr>
</tbody>
</table>
</div>
<div clas='sasas'> this is last div </div>
I read the file's content into a variable like this:
$Cont = file_get_contents('myfile');
Now I want to remove the div with the class name 'remove' using preg_replace. I have tried this:
$patterns = "%<div class='remove'>(.+?)</div>%";
$strPageSource = preg_replace($patterns, '', $Cont);
It did not work. What is the correct regular expression for this replacement?
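Try this code. Your pattern fails for two reasons: the file uses double quotes around remove while the pattern expects single quotes, and without the s modifier the dot does not match newlines, so a div spanning several lines never matches.
$strPageSource = preg_replace('%<div class=["\']remove["\']>.*?</div>%is', '<div class="newClass">Newthings</div>', $Cont);
Pass '' as the replacement if you only want to remove the div. Note that this only works as long as no other div is nested inside the one being replaced.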
As has been stated in the comments, you should not use regex to parse HTML, because there is no sane way to extract that <div> if other <div>s are nested inside it. For example:
<div clas='dsfdsf'> this is second div </div>
<div class="remove">
some text <div>nested div</div> more text and some elements<br />
</div>
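To make the failure concrete, here is a small sketch that runs a lazy pattern against a one-line version of the nested example above. The variable names and the shortened input are made up for illustration. The match stops at the first </div>, which belongs to the nested div, so the tail of the block is left behind:

```php
<?php
// Hypothetical input mirroring the nested example above.
$html = '<div class="remove">some text <div>nested div</div> more text</div>';

// Lazy match: .*? stops at the FIRST </div>, which closes the nested div,
// not the outer <div class="remove">.
$result = preg_replace('%<div class="remove">.*?</div>%is', '', $html);

echo $result; // prints " more text</div>"
```

Switching to a greedy .* does not help either: it would run to the last </div> in the whole document and delete everything in between.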
What you want to do is find the location of your <div class="remove"> and then advance through the HTML (parse it) in the following manner:
1) set $nesting_counter = 0
2) proceed through the HTML until you encounter either <div> or </div>
   a) if you found <div>:
      $nesting_counter++ and go to point 2)
   b) if you found </div>:
      if $nesting_counter > 0:
         $nesting_counter-- and go to point 2)
      else:
         you've found the closing tag for your <div class="remove">. Remember the current position and remove the substring from the opening tag up to and including this closing tag.
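The steps above could be sketched in PHP roughly as follows. This is only an illustration under stated assumptions: removeDivByClass() is a hypothetical helper name, the opening tag is assumed to be written exactly as <div class="NAME">, and the markup is assumed to be well-formed. It does not account for <div> text inside comments or scripts.

```php
<?php
// Sketch only: removeDivByClass() is a hypothetical helper, not a built-in.
// Assumes the opening tag is written exactly as <div class="NAME"> and that
// every <div> in the input has a matching </div>.
function removeDivByClass(string $html, string $class): string
{
    $open  = '<div class="' . $class . '">';
    $start = strpos($html, $open);
    if ($start === false) {
        return $html; // target div not present, nothing to remove
    }

    $nesting = 0;                       // step 1
    $offset  = $start + strlen($open);  // scan from just after the opening tag

    // Step 2: jump from one <div ...> or </div> tag to the next.
    while (preg_match('%<(/?)div\b[^>]*>%i', $html, $m, PREG_OFFSET_CAPTURE, $offset)) {
        $tagEnd = $m[0][1] + strlen($m[0][0]);

        if ($m[1][0] === '') {
            $nesting++;                 // a) nested opening tag
        } elseif ($nesting > 0) {
            $nesting--;                 // b) closing tag of a nested div
        } else {
            // b) else: this </div> closes the target div, cut the whole span
            return substr($html, 0, $start) . substr($html, $tagEnd);
        }
        $offset = $tagEnd;
    }

    return $html; // unbalanced markup: leave the input untouched
}
```

With the variable from the question, the call would be $strPageSource = removeDivByClass($Cont, 'remove');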