I read how to use the preg_match function at http://php.net/manual/en/function.preg-match.php, but I don't understand the difference between passing $subject with an offset, as in preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3), and passing substr($subject,3), as in preg_match($pattern, substr($subject,3), $matches, PREG_OFFSET_CAPTURE). Please help me understand, and also check the code below: why does it return an empty array?
<?php
$ch=curl_init();
curl_setopt($ch,CURLOPT_URL,"http://www.1gom.us/ti-le-keo-malaysia.html");
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$content = curl_exec($ch);
curl_close($ch);
$regex = '/<div class="tabbox" id="tabbox">(.*)<\/div>/';
preg_match($regex, $content, $matches, PREG_OFFSET_CAPTURE, 3);
$table = $matches[1];
print_r($table);
?>
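For this pattern, both calls find the same text. They are not strictly equivalent, though: with the offset argument, the positions reported by PREG_OFFSET_CAPTURE are relative to the full string, and assertions such as ^ or a lookbehind still see the whole subject, whereas substr hands preg_match a new, shorter string. Using substr also creates a copy of the string for nothing, when the last parameter of preg_match only asks to begin the search at a particular offset of the subject string.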
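A small sketch of how the two calls compare. The subject string and patterns are made-up sample data, not taken from the question; the matched text is the same either way, while the reported offsets and the behavior of ^ are not.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$subject = 'abcdef';
$pattern = '/def/';

// Offset argument: the search starts at byte 3, offsets refer to the full string.
preg_match($pattern, $subject, $withOffset, PREG_OFFSET_CAPTURE, 3);

// substr(): a new string "def" is searched, offsets refer to that copy.
preg_match($pattern, substr($subject, 3), $withSubstr, PREG_OFFSET_CAPTURE);

echo $withOffset[0][0], ' at ', $withOffset[0][1], "\n"; // def at 3
echo $withSubstr[0][0], ' at ', $withSubstr[0][1], "\n"; // def at 0

// Anchors differ too: ^ still means the start of the full subject.
$anchoredOffset = preg_match('/^def/', $subject, $m, 0, 3);  // 0 (no match)
$anchoredSubstr = preg_match('/^def/', substr($subject, 3)); // 1 (match)
```

Note that with PREG_OFFSET_CAPTURE every entry of $matches is itself an array of the form [text, offset], so the captured text is at index 0 of that entry.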
The reason you obtain an empty result is probably that the dot (.) can't match newlines by default. You can change this behavior with the s modifier:
$regex = '~<div class="tabbox" id="tabbox">(.*?)</div>~s';
(Note the use of a different delimiter, ~, to avoid having to escape slashes. Note too the use of a non-greedy quantifier to stop at the first occurrence of </div>.)
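To see the effect of the s modifier in isolation, here is a minimal sketch. The $content string is a hypothetical stand-in for the fetched page, since the real markup is not shown in the question.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$content = "<div class=\"tabbox\" id=\"tabbox\">\n<p>row 1</p>\n</div>\n<div>other</div>";

$withoutS = '~<div class="tabbox" id="tabbox">(.*?)</div>~';
$withS    = '~<div class="tabbox" id="tabbox">(.*?)</div>~s';

// Without s, the dot cannot cross the newline after the opening tag.
$foundWithoutS = preg_match($withoutS, $content, $m1); // 0

// With s, the dot matches newlines and the lazy quantifier stops at the first </div>.
$foundWithS = preg_match($withS, $content, $m2); // 1

echo trim($m2[1]), "\n"; // <p>row 1</p>
```

The non-greedy quantifier stops at the first </div> it meets, so this approach breaks down as soon as the target div contains nested div elements.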
However, as noted in the comments, extracting information from an HTML document is easier with a DOMDocument/DOMXPath combination (depending on what your document looks like and what you are trying to do).
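A rough sketch of the DOMDocument/DOMXPath route, assuming the page contains a div with id "tabbox" as in the question's pattern. The $content string below is hypothetical sample markup, not the real page.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$content = '<html><body><div class="tabbox" id="tabbox"><p>row 1</p><div>nested</div></div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; collect warnings silently
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//div[@id="tabbox"]');

// Rebuild the inner HTML of the first matching div from its child nodes.
$inner = '';
if ($nodes !== false && $nodes->length > 0) {
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
}
echo $inner, "\n";
```

Unlike the regular expression, the parser keeps track of nesting, so the inner div in the sample is returned as part of the content instead of cutting the match short.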
With $subject, preg_match searches the full string, while with substr($subject,3) it searches only the part of $subject that starts at index 3.
For example, if $subject = "HELLO WORLD", then substr($subject,3) = "LO WORLD".