I'm using cURL to scrape YouTube like this:
<?php
$url = "http://www.youtube.com/watch?v=RnpyRe_7jZA";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);
$curl_scraped_page = preg_replace("#(<\s*a\s+[^>]*href\s*=\s*[\"'])(?!http)([^\"'>]+)([\"'>]+)#",'$1http://www.youtube.com$2$3', $curl_scraped_page);
echo $curl_scraped_page;
?>
This loads the page, but the YouTube video will not play (it gives me an error). What can I do to make it play? I Googled, but there isn't much information on this problem.
This is part of what I see in my console when I hit the play button:
GET http://r1---sn-5hn7zn7r.c.youtube.com/videoplayback?algorithm=throttle-fact…r%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&sver=3&upn=CKRxxB49gXE 403 (Forbidden) www-watch-extra-vflTE8ErJ.js:85
GET http://tc.v21.cache3.c.youtube.com/videoplayback?algorithm=throttle-factor&…r%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&sver=3&upn=CKRxxB49gXE 403 (Forbidden) tc.v21.cache3.c.youtube.com/videoplayback?algorithm=throttle-factor&burst=4…2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&sver=3&upn=CKRxxB49gXE:1
GET http://r1---sn-5hn7zn7r.c.youtube.com/videoplayback?algorithm=throttle-fact…ver=3&upn=CKRxxB49gXE&ptchn=NickiMinajAtVEVO&ptk=vevo&cpn=uXm1XYfZqNkRDPGT 403 (Forbidden) r1---sn-5hn7zn7r.c.youtube.com/videoplayback?algorithm=throttle-factor&burs…r=3&upn=CKRxxB49gXE&ptchn=NickiMinajAtVEVO&ptk=vevo&cpn=uXm1XYfZqNkRDPGT:1
GET http://tc.v21.cache3.c.youtube.com/videoplayback?algorithm=throttle-factor&…RxxB49gXE&ptchn=NickiMinajAtVEVO&playretry=1&ptk=vevo&cpn=uXm1XYfZqNkRDPGT 403 (Forbidden) tc.v21.cache3.c.youtube.com/videoplayback?algorithm=throttle-factor&burst=4…xB49gXE&ptchn=NickiMinajAtVEVO&playretry=1&ptk=vevo&cpn=uXm1XYfZqNkRDPGT:1
If you look at your console, you will see that the server responded with a 403 error, which means "Forbidden". YouTube probably does not want servers (robots) to download its pages. You could modify the HTTP headers so that the request looks like it comes from a normal browser, for example in PHP:
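// header() only sets headers on the response PHP sends back; the outgoing cURL request takes its User-Agent from CURLOPT_USERAGENT.
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:19.0) Gecko/20100101 Firefox/19.0");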
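With cURL, the User-Agent of the outgoing request is set through the CURLOPT_USERAGENT option. Combined with the code from the question, that could look roughly like the sketch below. The helper names build_curl_options() and fetch_with_user_agent() are made up for illustration, and following redirects is an extra assumption, not something the question asked for.

```php
<?php
// Sketch: fetch a page with cURL while sending a browser-like User-Agent.
// build_curl_options() and fetch_with_user_agent() are hypothetical helpers.

function build_curl_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent, // sets the User-Agent request header
        CURLOPT_FOLLOWLOCATION => true,       // assumption: follow redirects, if any
    ];
}

function fetch_with_user_agent(string $url, string $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, build_curl_options($userAgent));
    $body = curl_exec($ch); // string on success, false on failure
    curl_close($ch);
    return $body;
}

$userAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:19.0) Gecko/20100101 Firefox/19.0";

// Usage (performs a real network request, so it is left commented out):
// echo fetch_with_user_agent("http://www.youtube.com/watch?v=RnpyRe_7jZA", $userAgent);
```

This only changes how the request for the page itself is identified. Whether the video then plays is a separate matter that this sketch does not settle.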
I noticed that it depends on the server configuration. It may not work on yours, but I tested it on my test server and it worked.
The request it sent showed up in the access log as:
194.166.35.216 - - [07/Apr/2013:13:36:39 +0200] "GET / HTTP/1.1" 200 2739 "-" "-"
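To read a line like this, it helps to split it into fields. The sketch below assumes the line is in the common "combined" access log format; the answer does not say which server or log format was used, so that is a guess. In that format the last two quoted fields are the Referer and the User-Agent, and "-" stands for a value that was not present. parse_combined_log_line() is a hypothetical helper, not a PHP built-in.

```php
<?php
// Sketch: split one access-log line (assumed "combined" format) into named fields.
// parse_combined_log_line() is a hypothetical helper name.

function parse_combined_log_line(string $line): ?array
{
    $pattern = '/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null; // the line is not in the assumed format
    }
    return [
        'ip'         => $m[1],
        'time'       => $m[4],
        'request'    => $m[5],
        'status'     => (int) $m[6],
        'bytes'      => $m[7],
        'referer'    => $m[8],
        'user_agent' => $m[9],
    ];
}

$entry = parse_combined_log_line(
    '194.166.35.216 - - [07/Apr/2013:13:36:39 +0200] "GET / HTTP/1.1" 200 2739 "-" "-"'
);

// Prints the User-Agent field the server logged for this request.
echo $entry['user_agent'], "\n";
```

Checking this field on a server you control is a simple way to see which User-Agent a request actually carried.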
Sorry if I could not help you.