I'm trying to get the unique ID from a string using PHP regex but I'm having problems generating the regex pattern.
Sample URLs:
lesson-computer-networking-xpt2-t9295779.html // need to get 9295779
lesson-summary-t9295778.html // need to get 9295778
lesson-part2-t94.html // need to get 94
The length of the first portion of the string depends on the page title, but the last portion is always -txxxxxxxx.html
Can someone help me to generate the pattern?
You could use either of the regexes below to get the number that follows -t and precedes .html:
-t\K\d+(?=\.html)
OR
(?<=-t).*(?=\.html)
Your PHP code would be:
<?php
$data = <<<'EOT'
lesson-computer-networking-xpt2-t9295779.html
lesson-summary-t9295778.html
lesson-part2-t94.html
EOT;
$regex = '~(?<=-t).*(?=\.html)~';
preg_match_all($regex, $data, $matches);
var_dump($matches);
?>
Output:
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(7) "9295779"
    [1]=>
    string(7) "9295778"
    [2]=>
    string(2) "94"
  }
}
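The first pattern, which uses \K, can be tried the same way. The following is only a minimal sketch on the sample names from the question; the variable name $ids is an assumption introduced for illustration.

```php
<?php
// Sample file names taken from the question.
$data = <<<'EOT'
lesson-computer-networking-xpt2-t9295779.html
lesson-summary-t9295778.html
lesson-part2-t94.html
EOT;

// \K drops the already-matched "-t" from the reported match,
// and \d+ restricts the match to digits only.
$regex = '~-t\K\d+(?=\.html)~';
preg_match_all($regex, $data, $matches);

// $ids is a hypothetical name; $matches[0] holds the full matches.
$ids = $matches[0];
print_r($ids);
```

Because \d+ accepts only digits, this variant should not pick up other text that happens to sit between a -t and .html.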
Explanation:
(?<=-t)
A positive lookbehind asserts that a specific pattern appears immediately before the current position. In our case, the regex engine sets the start of the match just after the string -t.
.*(?=\.html)
Next, it matches all characters up to the string .html (i.e., the lookahead makes the match end just before .html without including it).

You could use something like this:
$url = 'lesson-computer-networking-xpt2-t9295779.html';
$matches = array();
$t = preg_match('/-t(.*?)\.html$/s', $url, $matches);
print_r($matches[1]);
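Since preg_match returns 1 on a match, 0 on no match, and false on error, it may be worth checking the result before reading $matches[1]. Here is a rough sketch of that idea; the function name getLessonId and the non-matching sample name are assumptions made up for illustration.

```php
<?php
/**
 * getLessonId() is a hypothetical helper, not part of the answer above.
 * It returns the captured text, or null when the name does not match.
 */
function getLessonId(string $url): ?string
{
    // Same pattern as in the answer: lazily capture whatever sits
    // between "-t" and a trailing ".html".
    if (preg_match('/-t(.*?)\.html$/s', $url, $matches) === 1) {
        return $matches[1];
    }
    return null; // no match (0) or regex error (false)
}

var_dump(getLessonId('lesson-part2-t94.html'));
var_dump(getLessonId('about.html')); // made-up name without "-t"
```

This avoids an "undefined index" notice when a name does not follow the -t<number>.html scheme.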
This is my pattern:
-t(\d+)\.html
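To illustrate why matching digits only can matter, here is a small comparison sketch. The file name is made up (it is not from the question) and was chosen because its title itself contains -t. The trailing $ anchor is also an addition, not part of the pattern as posted.

```php
<?php
// Hypothetical file name whose title contains "-t" more than once.
$name = 'lesson-time-travel-t12.html';

// Digit-only capture group, anchored at the end of the string.
preg_match('~-t(\d+)\.html$~', $name, $digits);

// The lookaround pattern with .* from the earlier answer, for comparison.
preg_match('~(?<=-t).*(?=\.html)~', $name, $greedy);

echo $digits[1], "\n"; // the numeric ID
echo $greedy[0], "\n"; // starts right after the first "-t" in the title
```

On this input the digit-only pattern captures 12, while the .* pattern starts matching after the first -t and returns ime-travel-t12.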