Okay - this has boggled me for days. I've tried regex with negative lookahead, but to no avail.
Basically, in PHP, I need to parse conversation threads and extract the LAST occurrence of http links, which can be either a single link or a consecutive group of 2 or more. So, in example 1, it should return the last link, but in example 2, it should return the last 3 links.
I don’t need to achieve this with a single regex, but I’m not sure what other approaches to try. Any help would be appreciated!!
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
In pharetra elementum dui vel pretium. Quisque rutrum mauris vitae turpis hendrerit facilisis. Sed ultrices imperdiet ornare.
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
In pharetra elementum dui vel pretium. Quisque rutrum mauris vitae turpis hendrerit facilisis. Sed ultrices imperdiet ornare.
http://sample.com/24689.png
http://sample.com/13578.png
http://sample.com/98761.png
1) Split your text on whitespace (the \s delimiter):
$resultArray = preg_split("@\s@", $conversation);
For example, with this input:
$conversation = "Hallo, http://1.de text http://2.de
http://3.de Hello";
(This produces something like the following intermediate result:)
Array
(
[0] => Hallo,
[1] => http://1.de
[2] => text
[3] => http://2.de
[4] =>
[5] => http://3.de
[6] => Hello
)
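As a side note, the empty entry at index 4 can be avoided at split time. The following is only a minimal sketch, assuming it is acceptable to treat any run of whitespace as a single separator: splitting on `\s+` with the `PREG_SPLIT_NO_EMPTY` flag drops empty tokens, so a later loop has nothing to skip. The variable name `$tokens` and the trailing space before the line break in the sample string are illustrative assumptions.

```php
<?php
// Sample text modelled on the example above; the space before "\n" is an assumption
// that reproduces the kind of input that yields an empty entry with a plain \s split.
$conversation = "Hallo, http://1.de text http://2.de \nhttp://3.de Hello";

// Split on runs of whitespace and discard any empty tokens.
$tokens = preg_split('/\s+/', $conversation, -1, PREG_SPLIT_NO_EMPTY);

print_r($tokens);
```

With this input, `$tokens` holds six entries ("Hallo,", the three links, "text" and "Hello") and no empty strings.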
2) Finally, iterate over the result array in reverse. Start "matching" when an entry starts with "http://", and stop as soon as you encounter anything else after that. Ignore empty entries as well as entries containing only whitespace:
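$conversation = "Hallo, http://1.de text http://2.de
http://3.de Hello";
// Split on every single whitespace character; this can leave empty entries.
$resultArray = preg_split('@\s@', $conversation);
$result = array();
$matching = false;
// Walk the tokens from last to first.
for ($i = count($resultArray) - 1; $i >= 0; $i--) {
    if (preg_match('@^http://@', $resultArray[$i])) {
        // A link: start (or continue) collecting.
        $matching = true;
        $result[] = $resultArray[$i];
    } else if (preg_match('@^\s*$@', $resultArray[$i])) {
        // Empty or whitespace-only entry: ignore it.
    } else if ($matching) {
        // First non-link after a run of links: the last group is complete.
        break;
    }
}
// Links were collected back to front, so restore the original order.
echo "<pre>";
print_r(array_reverse($result));
echo "</pre>";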
yields:
Array
(
[0] => http://2.de
[1] => http://3.de
)
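If this is needed in more than one place, the two steps can be wrapped in a function. This is only a sketch under a few assumptions: the name `extractTrailingLinks` is hypothetical, the type hints require PHP 7 or later, and empty tokens are dropped at split time with `PREG_SPLIT_NO_EMPTY` instead of being skipped in the loop. It is applied here to the tail of example 2 from the question.

```php
<?php
/**
 * Hypothetical helper (name and signature are illustrative only):
 * returns the last link, or the last run of consecutive links, found in $text.
 */
function extractTrailingLinks(string $text): array
{
    // Split on runs of whitespace and drop empty tokens.
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $links = [];

    // Walk backwards: skip trailing non-link tokens, collect links,
    // and stop at the first non-link token once at least one link was found.
    for ($i = count($tokens) - 1; $i >= 0; $i--) {
        if (preg_match('@^http://@', $tokens[$i])) {
            $links[] = $tokens[$i];
        } elseif ($links) {
            break;
        }
    }

    // Links were collected back to front; restore document order.
    return array_reverse($links);
}

$example = "Sed ultrices imperdiet ornare.\n"
    . "http://sample.com/24689.png\n"
    . "http://sample.com/13578.png\n"
    . "http://sample.com/98761.png";

print_r(extractTrailingLinks($example));
```

For `$example` this returns the three sample.com links in their original order; for a text whose last link stands alone, it returns a one-element array, and for a text without links it returns an empty array.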