I need to parse some HTML that looks like this:
<div id="item-1" class="genclass" itemscope itemtype="http://schema.org/Book">
    <meta itemprop="isbn" content="XXXXXXXXXXXX" />
    <meta itemprop="name" content="Neverending Story" />
    <meta itemprop="author" content="Michael Ender" />
    <meta itemprop="publisher" content="MyBooks" />
    <meta itemprop="datePublished" content="1991" />
    <h2 itemprop="offers" itemscope itemtype="http://schema.org/Offer">
        <meta itemprop="price" content="6.6" />
        <meta itemprop="priceCurrency" content="USD" />
    </h2>
</div>
So far I'm trying this:
libxml_use_internal_errors(TRUE);
$dom->loadHTMLFile($url);
libxml_clear_errors();
foreach ($dom->find("meta[itemprop='isbn']") as $books) {
    switch ($books->itemprop) {
        case 'isbn':
            $line['isbn'] = $books->content;
            break;
        case 'name':
            $line['name'] = $books->content;
            break;
        case 'author':
            $line['author'] = $books->content;
            break;
        case 'publisher':
            $line['publisher'] = $books->content;
            break;
        case 'datePublished':
            $line['datePublished'] = $books->content;
            break;
        case 'price':
            $line['price'] = $books->content;
            break;
        default:
            break;
    }
}
print_r($books);
But the result is always blank. What am I doing wrong? I've also tried get_meta_tags() and other approaches.
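For comparison, here is a minimal sketch of one way this could work, assuming $dom is meant to be PHP's built-in DOMDocument. DOMDocument has no find() method (that belongs to libraries such as Simple HTML DOM), so the sketch queries with DOMXPath instead. It also selects every meta tag that carries an itemprop attribute rather than only the isbn one, and prints $line rather than the loop variable. The inline $html string and the $wanted list are illustrative assumptions and are not part of the original code.

```php
<?php
// Sketch only: the markup is inlined here so the example is self-contained.
// For a remote page, $dom->loadHTMLFile($url) could be used instead of loadHTML().
$html = <<<'HTML'
<div id="item-1" class="genclass" itemscope itemtype="http://schema.org/Book">
<meta itemprop="isbn" content="XXXXXXXXXXXX" />
<meta itemprop="name" content="Neverending Story" />
<meta itemprop="author" content="Michael Ender" />
<meta itemprop="publisher" content="MyBooks" />
<meta itemprop="datePublished" content="1991" />
<h2 itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<meta itemprop="price" content="6.6" />
<meta itemprop="priceCurrency" content="USD" />
</h2>
</div>
HTML;

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Hypothetical whitelist of the itemprop values to keep.
$wanted = ['isbn', 'name', 'author', 'publisher', 'datePublished', 'price'];

$line = [];
// Match every <meta> element that has an itemprop attribute, at any depth.
foreach ($xpath->query('//meta[@itemprop]') as $meta) {
    $prop = $meta->getAttribute('itemprop');
    if (in_array($prop, $wanted, true)) {
        $line[$prop] = $meta->getAttribute('content');
    }
}

print_r($line);
```

If the page holds several books, the same idea could be applied per item by first querying the div elements with an itemscope attribute and then running a relative query such as './/meta[@itemprop]' with each div passed as the context node.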