I am trying to retrieve the price of an Amazon product. I tried 2 methods:
file_get_contents -> regex -> it works. I noticed that if JavaScript is enabled, the XPath of the price differs from the XPath when JavaScript is disabled.
Anyway, how can I retrieve the price using xpath?
This is what I am doing but the code returns nothing (even though it is working on any other website):
(The xpath was taken using firebug)
    $url = 'http://www.amazon.com/dp/product/B00TRQPSXM/';
    $path = '/html/body/div[3]/form/table[3]/tbody/tr[1]/td/div/table/tbody/tr[2]';
    $html = file_get_contents($url);
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $elements = $xpath->query($path);
    if ($elements)
    {
        foreach ($elements as $element)
        {
            echo $element->nodeName.'<br>';
            echo $element->nodeValue.'<br>';
        }
    }
Your requests will be blocked after a couple of tries, because Amazon checks for robot access. Instead of scraping their site, which is also against Amazon's terms of service (or whatever it's called), use their API, found at http://developer.amazonservices.com. You will get the price information you are after with this operation.
There is also a PHP SDK you can use.
Either way, file_get_contents() is not an option here. If you want to scrape the page, use cURL and make the request look like it comes from a unique visitor.
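If you do go the cURL route, something along these lines might be a starting point. It is only a sketch: the function names fetchPage() and extractText(), the User-Agent and Accept-Language values, and the id="price" in the sample markup are all made up for illustration and are not taken from Amazon's actual page. One thing worth checking in your own XPath: Firebug shows the browser's DOM, where tbody elements are inserted automatically, while DOMDocument::loadHTML() parses the raw markup and does not add them, so a path containing /tbody/ can come back empty.

```php
<?php
// Sketch only: header values and the sample XPath are illustrative assumptions.

/**
 * Fetch a page with cURL, sending browser-like headers.
 * Returns the response body, or null if the request failed.
 */
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_ENCODING       => '',   // accept every encoding cURL supports
        CURLOPT_TIMEOUT        => 15,
        // Example browser User-Agent string (an assumption, pick your own).
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0',
        CURLOPT_HTTPHEADER     => ['Accept-Language: en-US,en;q=0.9'],
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

/**
 * Run an XPath query against an HTML string and return the trimmed text
 * of the first matching node, or null if nothing matched.
 */
function extractText(string $html, string $query): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string on PHP 8
    }
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

// Offline demo on made-up markup (id="price" is hypothetical).
$sample = '<html><body><table><tr><td><span id="price">$19.99</span></td></tr></table></body></html>';
echo extractText($sample, '//span[@id="price"]'), "\n";
```

For a real page you would pass the result of fetchPage($url) to extractText(), with a short relative query such as //span[@id="..."] rather than a long absolute path copied from the browser, since absolute paths tend to break whenever the layout changes. None of this gets around the blocking or the terms of service mentioned above.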