PHP cURL: GET and POST in the same request

I'm currently requesting a URL with cURL using the code below. It works fine with either the GET parameter appended to the URL or the POST data, but not with both the GET parameter and the POST data together.

However, when I use the Advanced REST Client (an add-on for Google Chrome), it works just fine. Annoyingly, though, I can't see the request it sends, so I can't mimic it.

Here's the call I'm making:

$fields = array(
        'searchPaginationResultsPerPage' => 500
);
// Build the URL-encoded POST body; avoids appending to an undefined $fields_string
$fields_string = http_build_query($fields);

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://www.microgenerationcertification.org/mcs-consumer/installer-search.php?searchPaginationPage=1');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_POST, true); // CURLOPT_POST is a boolean switch, not a field count
curl_setopt($curl, CURLOPT_POSTFIELDS, $fields_string);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 80);
$str = curl_exec($curl);
curl_close($curl);

I'm using this mostly as a test, but I can't seem to get it working. I can always get the first 500 results, but not the next 500.

This works:

$fields = array (
        'searchPaginationResultsPerPage' => 500,
        'searchPaginationPage' => 1 
);

$headers = array (
        "Connection: keep-alive",
        "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.162 Safari/535.19",
        "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        // No manual Accept-Encoding header: cURL does not decompress the response unless CURLOPT_ENCODING is set
        "Accept-Language: en-US,en;q=0.8",
        "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3"
);

$fields_string = http_build_query ( $fields );
$cookie = 'cf6c650fc5361e46b4e6b7d5918692cd=49d369a493e3088837720400c8dba3fa; __utma=148531883.862638000.1335434431.1335434431.1335434431.1; __utmc=148531883; __utmz=148531883.1335434431.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); mcs=698afe33a415257006ed24d33c7d467d; style=default';
$ch = curl_init ();
// Send both parameters in the query string (plain GET), using the string built above
curl_setopt ( $ch, CURLOPT_URL, 'http://www.microgenerationcertification.org/mcs-consumer/installer-search.php?' . $fields_string );
curl_setopt ( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt ( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt ( $ch, CURLOPT_CONNECTTIMEOUT, 80 );
curl_setopt ( $ch, CURLOPT_COOKIE, $cookie );
curl_setopt ( $ch, CURLOPT_HTTPHEADER, $headers );

$str = curl_exec ( $ch );
curl_close ( $ch );

echo $str;

You need to send the cookie information, and make sure cURL is using GET, not POST.

See the demo: http://codepad.viper-7.com/gTThxX (I hope the cookie has not expired by the time you view it).
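To make the "GET, not POST" part explicit, here is a minimal sketch of the same request expressed as an options array. The helper name build_get_options and the cookie value are placeholders of mine, not something the site defines; CURLOPT_HTTPGET only matters if the handle was previously switched to POST, since GET is cURL's default.

```php
<?php
// Hypothetical helper (not part of any library): builds cURL options for a plain GET request.
function build_get_options($baseUrl, array $query, $cookie)
{
    return array(
        // All parameters travel in the query string, nothing in a request body
        CURLOPT_URL            => $baseUrl . '?' . http_build_query($query, '', '&'),
        // Force GET in case the handle was used for a POST earlier
        CURLOPT_HTTPGET        => true,
        // Raw "Cookie:" header contents, copied from a browser session
        CURLOPT_COOKIE         => $cookie,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 80,
    );
}

$options = build_get_options(
    'http://www.microgenerationcertification.org/mcs-consumer/installer-search.php',
    array('searchPaginationPage' => 1, 'searchPaginationResultsPerPage' => 500),
    'mcs=PLACEHOLDER_SESSION_ID' // placeholder value, replace with a live cookie
);
```

You would then apply it with curl_setopt_array($ch, $options) on a handle from curl_init() and call curl_exec($ch) as before.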

I'm not sure why that fails; it looks fine. What happens when you skip cURL and use PHP's stream context instead?

$postdata = http_build_query(
    array(
        'searchPaginationResultsPerPage' => 500
    )
); 
$opts = array('http' =>
    array(
        'method'  => 'POST',
        'header'  => 'Content-type: application/x-www-form-urlencoded',
        'content' => $postdata
    )
);

$context  = stream_context_create($opts);

$result = file_get_contents('http://www.microgenerationcertification.org/mcs-consumer/installer-search.php?searchPaginationPage=1', false, $context);

I had a look at the page you are scraping and noticed the following:

  • When you change the results per page it posts your search again
  • They appear to be using the session to store your search parameters

You are not preserving the session ID when using cURL (and doing so is probably a bit more complex than you'd like), so the request will not behave the same way as it does on the website.
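If you did want to keep the session with cURL, a cookie jar is the usual route. The following is only a sketch under assumptions: build_session_options is a name I made up, the jar is a temporary file, and I am assuming the site only needs its session cookie sent back on later requests.

```php
<?php
// Hypothetical helper: options shared by every request that should reuse one session.
function build_session_options($url, $cookieJarPath, array $postFields = array())
{
    $options = array(
        CURLOPT_URL            => $url,
        // Cookies received from the server are saved here when the handle is closed or freed
        CURLOPT_COOKIEJAR      => $cookieJarPath,
        // Cookies are loaded from here; setting this also enables cURL's cookie engine
        CURLOPT_COOKIEFILE     => $cookieJarPath,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    );
    if ($postFields) {
        // Only switch to POST when there is a body to send
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields, '', '&');
    }
    return $options;
}

// Temporary file used as the cookie jar (the location is an assumption)
$jar = tempnam(sys_get_temp_dir(), 'mcs_cookies_');
$url = 'http://www.microgenerationcertification.org/mcs-consumer/installer-search.php?searchPaginationPage=1';

// The POST that changes the page size; later requests reuse the same jar
$options = build_session_options($url, $jar, array('searchPaginationResultsPerPage' => 500));
```

Reusing the same $jar path (or the same handle) for each following page request should let the server see the same session, provided the session cookie has not expired.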

I did notice, however, that if you append the searchPaginationResultsPerPage parameter to the URL, it works fine. Like this:

http://www.microgenerationcertification.org/mcs-consumer/installer-search.php?searchPaginationPage=0&searchPaginationResultsPerPage=500

That means you could actually use file_get_contents and not worry about cURL at all.
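A rough sketch of that approach, for fetching more than one page: build_search_url is a hypothetical helper, and starting the page numbering at 0 is an assumption taken from the URL above.

```php
<?php
// Hypothetical helper: builds the search URL with both pagination parameters in the query string.
function build_search_url($page, $perPage)
{
    $base = 'http://www.microgenerationcertification.org/mcs-consumer/installer-search.php';
    $query = http_build_query(
        array(
            'searchPaginationPage'           => $page,
            'searchPaginationResultsPerPage' => $perPage,
        ),
        '',
        '&'
    );
    return $base . '?' . $query;
}

// URLs for the first two pages of 500 results each
$urls = array();
for ($page = 0; $page < 2; $page++) {
    $urls[] = build_search_url($page, 500);
}

// Fetching is then a single call per URL; file_get_contents returns false on failure:
// $html = file_get_contents($urls[0]);
```

This relies on allow_url_fopen being enabled, which file_get_contents needs for HTTP URLs.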