I want to know Groupon active deals so I write a scraper like:
libxml_use_internal_errors(true);
$dom = new DOMDocument();
@$dom->loadHTMLFile('https://www.groupon.com/browse/new-york?category=food-and-drink&minPrice=1&maxPrice=999');
$xpath = new DOMXPath($dom);
$entries = $xpath->query("//li[@class='slot']//a/@href");
foreach($entries as $e) {
echo $e->textContent . '<br />';
}
but when I run this function browser loading all time, just loading something but don't show any error.
How can I fix it? Not just case with Groupon - I also try other websites but also don't work. WHy?
What about using CURL to loading page data.
Not just case with Groupon - I also try other websites but also don't work
I think this code will help you but you should expect unexpected situations for each website which you want to scrap.
<?php
$dom = new DOMDocument();
$data = get_url_content('https://www.groupon.com', true);
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
$entries = $xpath->query("//label");
foreach($entries as $e) {
echo $e->textContent . '<br />';
}
function get_url_content($url = null, $justBody = true)
{
/* Init CURL */
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_HTTPHEADER, []);
$data = curl_exec($ch);
if ($justBody)
$data = @(explode("
", $data, 2))[1];
var_dump($data);
return $data;
}