I want to get the img src value using DOMXPath.
Let us say I have this sample.html page:
<div id="wrapper">
<div class="item">
<div class="img-wrapper">
<img src="sample1.jpg"/>
<p class="title">Sample 1</p>
</div>
</div>
<div class="item">
<div class="img-wrapper">
<img src="sample2.jpg"/>
<p class="title">Sample 2</p>
</div>
</div>
<div class="item">
<div class="img-wrapper">
<img src="sample3.jpg"/>
<p class="title">Sample 3</p>
</div>
</div>
</div>
Using cURL, DOMDocument and DOMXPath, I want to get the img src and the title of each item:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://path/to/sample.html');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$result = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument();
$dom->loadHTML($result);
$xpath = new DOMXPath($dom);
$entries = $xpath->query('//div[@id="wrapper"]/div[@class="item"]');
$results = array();
foreach ($entries as $entry) {
    $result = array();
    $result['img'] = $xpath->query("img", $entry)->item(0)->nodeValue;
    $result['title'] = $xpath->query("p[@class='title']", $entry)->item(0)->nodeValue;
    $results[] = $result;
}
return $results;
This returns img as null for every item:
[
{
"img": null,
"title": "Sample 1"
},
{
"img": null,
"title": "Sample 2"
},
{
"img": null,
"title": "Sample 3"
}
]
Please help me get the img src value. Thank you!
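Your XPath expressions for fetching the values aren't quite right. They should be:
$result['img'] = $xpath->query(".//img/@src", $entry)[0]->value;
$result['title'] = $xpath->query(".//p[@class='title']", $entry)[0]->nodeValue;
Note that you select an attribute with @attributeName. An img element has no text content, so its nodeValue is empty even when the element is found. Also, a plain img only matches direct children of $entry, while the img here is nested inside div.img-wrapper. The .// at the start lets XPath find the elements at any depth under the context node ($entry). A bare // would search from the document root instead, ignoring $entry, so every iteration would return the first img on the page.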
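To illustrate the difference between an absolute and a relative path when a context node is passed, here is a minimal sketch. It assumes a trimmed, inline copy of the sample markup (two items only) instead of the cURL fetch, and the variable names $second, $absolute and $relative are made up for the illustration.

```php
<?php
// Inline, trimmed copy of the sample page so the sketch runs without cURL.
$html = <<<'HTML'
<div id="wrapper">
  <div class="item"><div class="img-wrapper"><img src="sample1.jpg"/><p class="title">Sample 1</p></div></div>
  <div class="item"><div class="img-wrapper"><img src="sample2.jpg"/><p class="title">Sample 2</p></div></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Use the second item as the context node.
$second = $xpath->query('//div[@id="wrapper"]/div[@class="item"]')->item(1);

// Absolute path: evaluated from the document root, so the context node is ignored.
$absolute = $xpath->query('//img/@src', $second)->item(0)->value;

// Relative path: evaluated only over the descendants of $second.
$relative = $xpath->query('.//img/@src', $second)->item(0)->value;

echo $absolute, "\n"; // sample1.jpg
echo $relative, "\n"; // sample2.jpg
```

The only difference between the two queries is the leading dot, which is why this mistake is easy to miss inside a loop.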
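You can read an attribute of any DOM element with the getAttribute('attribute_name') method. In your example, use getAttribute('src'):
$result['img'] = $xpath->query(".//img", $entry)->item(0)->getAttribute('src');
$result['title'] = $xpath->query(".//p[@class='title']", $entry)->item(0)->nodeValue;
The leading .// keeps each search relative to $entry. Starting the expression with // would search the whole document and return the first img and title on the page for every item.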
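Putting the getAttribute approach together, a self-contained sketch might look like the following. The function name extractItems is hypothetical, the markup is passed in as a string rather than fetched with cURL, and the null checks are an added assumption about how you would want missing elements handled.

```php
<?php
/**
 * Hypothetical helper: returns one ['img' => ..., 'title' => ...] entry per item.
 */
function extractItems(string $html): array
{
    // Collect libxml warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $results = [];
    foreach ($xpath->query('//div[@id="wrapper"]/div[@class="item"]') as $entry) {
        // Relative queries: only look inside the current item.
        $img = $xpath->query('.//img', $entry)->item(0);
        $title = $xpath->query('.//p[@class="title"]', $entry)->item(0);
        $results[] = [
            // item(0) returns null when nothing matched, so guard before dereferencing.
            'img' => $img instanceof DOMElement ? $img->getAttribute('src') : null,
            'title' => $title !== null ? trim($title->nodeValue) : null,
        ];
    }
    return $results;
}

$html = <<<'HTML'
<div id="wrapper">
  <div class="item"><div class="img-wrapper"><img src="sample1.jpg"/><p class="title">Sample 1</p></div></div>
  <div class="item"><div class="img-wrapper"><img src="sample2.jpg"/><p class="title">Sample 2</p></div></div>
  <div class="item"><div class="img-wrapper"><img src="sample3.jpg"/><p class="title">Sample 3</p></div></div>
</div>
HTML;

$items = extractItems($html);
echo json_encode($items, JSON_PRETTY_PRINT), "\n";
```

With the guards in place, an item that lacks an img or a title yields null for that field instead of triggering an error on a null node.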