I need to extract a JSON object from inside a web page's script. This is part of the web page:
<html>
<script>
.....
</script>
<script type=\"text/javascript\">
$(function(){
$(\"#map5\").gMap({ maptype: G_SATELLITE_MAP,
controls: false,
scrollwheel: false,
markers: [
{.....},{......},],
latitude: 24.70115790054175,
longitude: 46.04358434677124,
zoom: 5
});
});
</script>
</head>
<body>
....
</body>
</html>
I want to extract the JSON object that starts with { maptype:. I thought of using a regular expression
to achieve this. Here is what I did:
$html = file_get_contents($url);
$regex_pattern = "/\<script.*/";
preg_match_all($regex_pattern,$html,$matches);
However, my pattern seems to select only the first line of the object! I couldn't figure out how to make it select the whole object.
Any help will be appreciated.
Assalamu alaikum (peace be upon you) :D
Here's how you do it:
$script = <<<FIL
<script type=\"text/javascript\">
$(function(){
$(\"#map5\").gMap({ maptype: G_SATELLITE_MAP,
controls: false,
scrollwheel: false,
markers: [
{.....},{......},],
latitude: 24.70115790054175,
longitude: 46.04358434677124,
zoom: 5
});
});
</script>
FIL;
preg_match_all('/<script[^>]*>.*?\.gMap\(\s*({.*?})\);.*?<\/script>/mis', $script, $m);
var_dump($m[1]);
Your pattern fails because the dot (.) does not match newlines by default. If you want it to match them, you must add the s
modifier at the end of your pattern. The multiline mode (m modifier) is not useful here.
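To see the effect of the s modifier in isolation, here is a minimal sketch. The three-line $text string is a made-up input, not the real page, and the variable names are only for illustration:

```php
<?php
// Hypothetical three-line input, just to show how the dot treats newlines.
$text = "<script>\nvar a = 1;\n</script>";

// Without the s modifier, .* stops at the first newline.
preg_match('/<script.*/', $text, $withoutS);

// With the s modifier, the dot also matches newlines, so .* runs to the end.
preg_match('/<script.*/s', $text, $withS);

var_dump($withoutS[0]); // only the first line
var_dump($withS[0]);    // the whole string
```

This is why the pattern from the question only ever returns the first line of each script tag.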
Try this:
$json = (preg_match('~\.gMap\s*+\(\s*+\K\{.+?\}(?=\s*+\)\s*+;)~s', $html, $result))?
$result[0] : false;
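As a rough check, the same pattern can be run against a small stand-in for the page. The $html content below is an assumption (the markers are simplified placeholders, since the real ones are elided in the question):

```php
<?php
// Hypothetical stand-in for the page source; the marker entries are placeholders.
$html = <<<'HTML'
<script type="text/javascript">
$(function(){
$("#map5").gMap({ maptype: G_SATELLITE_MAP,
controls: false,
markers: [
{ id: 1 },{ id: 2 },],
zoom: 5
});
});
</script>
HTML;

// \K drops everything matched so far, so the result starts at the opening brace.
// The lookahead requires the closing brace to be followed by ");".
$pattern = '~\.gMap\s*+\(\s*+\K\{.+?\}(?=\s*+\)\s*+;)~s';
$json = preg_match($pattern, $html, $result) ? $result[0] : false;

var_dump($json);
```

Note that what comes back is a JavaScript object literal (unquoted keys, the G_SATELLITE_MAP identifier), so it would likely need some cleanup before json_decode could parse it.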