I would like to get the HTML code from a page with PHP. So I do this:
$url = 'http://en.wikipedia.org/wiki/New_York_City';
$html = file_get_html($url);
The problem is, Wikipedia doesn't send the <script> tags in response to the PHP request, so the JavaScript doesn't show up in the HTML I get back. I guess that's because Wikipedia sees that the "requester" doesn't have JavaScript enabled, so it doesn't send the <script> tags.
How can I let Wikipedia know that my PHP is JavaScript enabled?
I heard about stream context, but I don't know how to set JavaScript enabled for it.
Thanks to symcbean, here's the solution.
I added:
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
And now it's sending the correct script block.
;)
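Since the question mentions stream contexts: the same User-Agent header can probably also be set per request instead of globally through ini_set(). The following is only a minimal sketch under that assumption. The helper names build_browser_context() and fetch_with_user_agent() are hypothetical (they are not part of PHP or of any library), and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: builds an HTTP stream context that sends the given
// User-Agent string. 'user_agent' and 'timeout' are standard options of
// PHP's http:// stream wrapper.
function build_browser_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds; arbitrary value for this sketch
        ],
    ]);
}

// Hypothetical helper: fetches a URL using the context above.
// Returns the response body as a string, or false on failure.
function fetch_with_user_agent(string $url, string $userAgent)
{
    // file_get_contents() takes the stream context as its third argument.
    return file_get_contents($url, false, build_browser_context($userAgent));
}
```

It could then be called like this: $html = fetch_with_user_agent('http://en.wikipedia.org/wiki/New_York_City', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9'); Unlike ini_set('user_agent', ...), this only affects requests made with that context.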
You could use an iframe.
You could also use something like jQuery to grab the page (or certain parts of the page) onto your website.
It looks like the file_get_html() function is stripping away the <script> blocks, because I tried to request GET /wiki/Main_Page HTTP/1.1 from Fiddler without any request headers, and it did return the <script> blocks in the response.
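One way to check where the <script> blocks go missing is to count them in the raw response string before any HTML parser touches it. This is just a rough sketch: count_script_tags() is a hypothetical helper name, and the regular expression only counts opening tags, it does not validate the markup.

```php
<?php
// Hypothetical helper: counts opening <script> tags in an HTML string,
// case-insensitively. \b keeps it from matching longer tag names that
// merely start with "script".
function count_script_tags(string $html): int
{
    $count = preg_match_all('/<script\b/i', $html);
    // preg_match_all() returns false on error; treat that as zero matches.
    return $count === false ? 0 : $count;
}
```

Running it on the string returned by file_get_contents($url) and comparing the result with what the parser's output contains should show whether the server left the tags out or the parsing step dropped them.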
This should work:
$url = 'http://en.wikipedia.org/wiki/New_York_City';
$html = file_get_contents($url);
Tested it on my local PHP server.