I have written a PHP parser that should extract the price from a span tag, but when I echo $html to see how the page loads, I get a broken page: only the header and footer load, not the content. The content seems to be loaded externally by JavaScript. How can I load the HTML page with DOM so that the JavaScript runs as well? I need the whole content to load so that I can get the divs and spans. This is my code:
<?php
require_once('simple_html_dom.php');
$url = 'http://oldnavy.gap.com/browse/product.do?cid=99570&vid=1&pid=714649002';
$dom = new domDocument('1.0', 'UTF-8');
$html = file_get_html($url);
echo $html;
if (is_object($html)) {
    foreach ($html->find('span#priceText') as $data) {
        $raw_price = $data->innertext;
        echo $raw_price;
    }
}
?>
Alternative approach
The link you are actually looking for (in its minimal form) is this: http://oldnavy.gap.com/browse/productData.do?pid=714649
Now load that URL using cURL, set a value for the unknownShopperId
cookie, explode the response into an array and get the price you need:
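<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_VERBOSE, true);        // print request details while debugging
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_URL, "http://oldnavy.gap.com/browse/productData.do?pid=714649");
// Send the unknownShopperId cookie with the request
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: unknownShopperId=E853DA3B2607DDAA5F2FE13CE8D32ACF"));
$result = curl_exec($ch);
if ($result === false) {
    // curl_exec() returns false on failure; stop before trying to parse it
    die('cURL error: ' . curl_error($ch));
}
// Split the comma-separated response into an array
$explode = explode(',', $result);
echo 'Original price: ' . $explode[92] . '<br/>' .
     'New price: ' . $explode[93] . '<br/>' .
     'Both prices: ' . $explode[13];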
The result will be: '$14.94'
From now on, if you need another price, you must know the item's pid.
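The fixed array positions used in the cURL snippet depend on the exact layout of the response, so they may not hold for every product. As a rough sketch (not the site's documented API), the two helpers below build the productData.do URL for any pid and pull prices out of the raw response with a regular expression instead of fixed indexes. The function names buildProductDataUrl and extractPrices are hypothetical, and the code assumes prices appear in the response as dollar amounts such as $14.94.

```php
<?php
// Hypothetical helper: builds the productData.do URL for a given pid.
function buildProductDataUrl(string $pid): string
{
    return 'http://oldnavy.gap.com/browse/productData.do?'
        . http_build_query(['pid' => $pid]);
}

// Hypothetical helper: returns every distinct dollar amount found in the
// raw response, in order of first appearance.
// Assumption: prices look like $14.94 or $15 in the response text.
function extractPrices(string $raw): array
{
    preg_match_all('/\$\d+(?:\.\d{2})?/', $raw, $matches);
    return array_values(array_unique($matches[0]));
}
```

You would pass the output of curl_exec() to extractPrices() and the URL from buildProductDataUrl() to CURLOPT_URL. Which of the returned amounts is the original price and which is the sale price still has to be checked against a real response.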