The script below assigns numerical IDs to paragraphs (e.g. [p id="1"]) in articles extracted from my database, except for the last paragraph, which gets [p id="Last"].
$c = 1;
$r = preg_replace_callback('/(<p( [^>]+)?>)/i', function ($res) {
global $c;
return '<p'.$res[2].' id="'.intval($c++).'">';
}, $text);
$r = preg_replace('/(<p.*?)id="'.($c-1).'"(>)/i', '\1id="Last"\2', $r);
$text = $r;
It works, but when I have error reporting on, I get the following notice: Undefined offset: 2. It isn't critical, but it's kind of a nuisance when I'm testing my pages. Any idea how I can kill it?
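I've improved the regex /<p( [^>]+)?>/i by changing ( [^>]+)? to ([^>]*). The notice appears because, for a bare <p>, the optional group takes no part in the match, and PHP leaves trailing unmatched groups out of the array passed to the callback. With the new version the group itself is not optional; only the characters inside it are. That means the group is always present, even if it is empty:
~<p([^>]*)>~i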
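To make the difference concrete, here is a small sketch that runs both patterns against a bare <p> tag with preg_match. The variable names ($tag, $old, $new) are mine and only for illustration.

```php
<?php
// Illustrative comparison; $tag, $old and $new are hypothetical names.
$tag = '<p>';

// Original pattern: group 2 is optional and takes no part in the match for a bare <p>.
preg_match('/(<p( [^>]+)?>)/i', $tag, $old);

// Revised pattern: group 1 always takes part, but may capture an empty string.
preg_match('~<p([^>]*)>~i', $tag, $new);

var_dump(isset($old[2])); // bool(false): the trailing unmatched group is omitted
var_dump($new[1]);        // string(0) "": the group exists, it is just empty
```

The same arrays are what preg_replace_callback hands to the callback, which is why $res[2] was undefined for tags without attributes.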
Now let's attack the PHP code:
$text = '<p>test</p> another <p class="test">test</p> and another one <p style="color:red">';
$c = 1;
$r = preg_replace_callback('~<p([^>]*)>~i', function($res) use (&$c){
return '<p'.$res[1].' id="'.$c++.'">';
}, $text);
var_dump($r, $c);
Note that I used a closure with use (&$c), which captures $c by reference (&). This way the callback can update $c.
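For completeness, here is one way the numbering and the "Last" step from the question might be combined. Treat it as a sketch under a few assumptions: numberParagraphs() is a hypothetical helper name, and the \b after the p is my addition. Without it, ([^>]*) no longer requires a space after <p, so the pattern would also match tags such as <pre>.

```php
<?php
// Sketch only: numberParagraphs() is a hypothetical helper, not part of the original posts.
function numberParagraphs(string $text): string
{
    $c = 1;
    // \b (an assumption added here) stops the pattern from matching tags like <pre>.
    $r = preg_replace_callback('~<p\b([^>]*)>~i', function ($res) use (&$c) {
        return '<p' . $res[1] . ' id="' . $c++ . '">';
    }, $text);

    // $c is now one past the last number used, so $c - 1 is the final paragraph's id.
    return preg_replace('~(<p\b[^>]*) id="' . ($c - 1) . '">~i', '$1 id="Last">', $r);
}

echo numberParagraphs('<p>one</p><p class="x">two</p><p>three</p>');
// <p id="1">one</p><p class="x" id="2">two</p><p id="Last">three</p>
```

Wrapping the counter in a function keeps $c local, so neither global nor a variable left over from an earlier call can interfere with the numbering.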