$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $img = $div->getElementsByTagName("img");
        var_dump($img->getAttribute("src"));
    }
}
The code above returns empty.
I have the following elements in the DOM:
<div class="doc-banner-icon"><img src="somesrc"></div>
I'm trying to get the img src. Since the page contains many images, I would like to first get the parent div and then extract the image inside it.
The solution is here:
$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if ($div->getAttribute("class") == "doc-banner-icon") {
        $listOfImages = $div->getElementsByTagName("img");
        foreach ($listOfImages as $img) {
            var_dump($img->getAttribute("src"));
        }
    }
}
You aren't missing anything; var_dump doesn't work as you expect on a DOMNodeList. Try this instead:
$listOfImages = $doc->getElementsByTagName("img");
foreach ($listOfImages as $img) {
    $imgClass = $img->getAttribute('class');
    echo $imgClass;
}
In your updated question, just change:
$img->getAttribute("src")
to:
$img->item(0)->getAttribute("src")
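The reason item(0) is needed is that getElementsByTagName() returns a DOMNodeList rather than a single element, even when only one node matches. Below is a minimal sketch of the guarded version, assuming the markup from the question is inlined as a string instead of being downloaded (the names $html, $images, $first and $src are illustrative and not from the original code):

```php
<?php
// Hypothetical inline markup standing in for the downloaded page.
$html = '<html><body><div class="doc-banner-icon"><img src="somesrc"></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$src = null;
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('class') !== 'doc-banner-icon') {
        continue;
    }
    // getElementsByTagName() always returns a DOMNodeList, even for one match.
    $images = $div->getElementsByTagName('img');
    // item(0) returns null when the list is empty, so check before calling methods on it.
    $first = $images->item(0);
    if ($first !== null) {
        $src = $first->getAttribute('src');
    }
    break; // only the first matching div is needed
}

var_dump($src); // string(7) "somesrc"
```

The null check matters on a live page: if the div exists but contains no img, calling getAttribute() on the result of item(0) would fail with an error instead of quietly yielding nothing.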
Given that your selection criteria are fairly complex, you might consider using XPath instead of navigating the DOM manually:
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$xpath = new DOMXPath($doc);
$img = $xpath->query("//div[@class = 'doc-banner-icon']/img");
var_dump($img->item(0)->getAttribute('src'));
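One caveat: @class = 'doc-banner-icon' only matches when the attribute is exactly that string, so a div carrying an additional class would be skipped. Here is a hedged sketch of a more tolerant query, assuming hypothetical inline markup (the extra class name "large", the second div, and the variables $html, $query, $nodes and $src are made up for illustration):

```php
<?php
// Hypothetical markup: the target div carries a second class name.
$html = '<html><body>'
      . '<div class="other"><img src="ignored"></div>'
      . '<div class="doc-banner-icon large"><img src="somesrc"></div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them;
// real-world pages are rarely valid HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Pad the class list with spaces so the name matches as a whole word only.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' doc-banner-icon ')]/img";
$nodes = $xpath->query($query);

$src = null;
// query() returns false for a malformed expression and an empty list for no match.
if ($nodes !== false && $nodes->length > 0) {
    $src = $nodes->item(0)->getAttribute('src');
}

var_dump($src); // string(7) "somesrc"
```

Whether the looser match is wanted depends on the page; if the class attribute is known to contain exactly one name, the simpler equality test is sufficient.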