I've this function that parses some content to retrieve homemade link tag and convert it to normal link tag.
Possible input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah</p>
Output :
<p>blabalblahhh <a href="whateverpage.htm">text to click</a> blablabah</p>
Here is my code:
$regex = '/\<moolinkx pageid="(.{1,})"\>(.{1,})\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
It works perfectly well if there is only one in the string. But as soon as there is a second one, it doesn't work.
Input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah.</p>
<p>Another <moolinkx pageid="128">text to clickclick</moolinkx> again blablablah.</p>
That's what I got when I print_r($matches):
Array
(
[0] => Array
(
[0] => <moolinkx pageid="121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128">text to clickclick</moolinkx>
)
[1] => Array
(
[0] => 121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128
)
[2] => Array
(
[0] => text to clickclick
)
)
I'm not at ease with regex, so it must be something very trivial... but I can't pinpoint what it is :(
Thank you very much in advance!
NB: This is my first post here, though I've been using this terrific Q&A for ages!
Use a negative Regex:$regex = '/<moolinkx pageid="([^"]+)">([^<]+)<\/moolinkx>/';
Explained demo here: http://regex101.com/r/sI3wK5
You are using a greedy selector, which is recognising everything between the first openning tag and the last closing tag as the content between the tags. Change your regex to:
$regex = '/\<moolinkx pageid="(.+?)"\>(.+?)\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
Notice the .{1,}
has changed to .+?
. The +
means one or more instances, and the ?
tells the regex to select the fewest characters it can to fulfil the expression.