I have a problem when using wget to retrieve the output of a specific PHP script: it looks like this site generates two identically named PHP files.
The first one is smaller, and the second one in the sequence is the correct one. The problem is that every time I run the wget command, I end up with the smaller output file, which does not contain the desired info :(
Is there a way to download the correct file using wget, by adding some sort of identifier to the link, to make sure I'm downloading the correct file?
Here is the command I've been trying:
$ wget http://www.fernsehen.to/index.php
If you run this and capture the traffic with Fiddler or Wireshark, you'll end up with two (2) responses for "http://www.fernsehen.to/index.php", and I need the bigger of the two.
P.S. To get the desired output manually, you can open http://www.fernsehen.to/index.php in Firefox or Chrome and view the page source.
Thank you in advance!
What you want is not really practical. When you visit that page, the server first sends a small file containing a load of JavaScript. That script detects browser features and sends them back to the server in a stateful manner, so the server can produce the exact code required for your browser, probably including things like the supported video codecs. They probably also do some session fingerprinting for DRM purposes, to stop people from doing exactly what you're trying to do.
wget cannot emulate this behaviour because it is not a full browser: it cannot execute all that JavaScript, and even if it could, it would not supply proper browser-like data. You'd have to write an extensive piece of custom code that exactly mimics everything the in-between page is doing to achieve the intended effect. Possible, but not easy, and most certainly not with a basic general-purpose tool like wget.
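If you want to check this for yourself, here is a rough sketch that requests the same URL twice and carries cookies from the first response into the second. The function name fetch_twice and the file names first.html, second.html and cookies.txt are made up for illustration; only the wget options are real. It assumes the server sets cookies on the first response, which has not been verified for this site, and it may well still return the small page, since wget never runs the JavaScript described above.

```shell
#!/bin/sh
# Sketch only: fetch_twice and the output file names are hypothetical,
# chosen for this example. It does NOT execute any JavaScript.
fetch_twice() {
    url=$1
    jar=${2:-cookies.txt}

    # First request: save any cookies, including session cookies.
    wget --quiet --save-cookies "$jar" --keep-session-cookies \
         --output-document=first.html "$url" || return 1

    # Second request: send the saved cookies back to the server.
    wget --quiet --load-cookies "$jar" \
         --output-document=second.html "$url" || return 1

    # Print byte counts; if both match the small page,
    # cookies alone were not enough.
    wc -c first.html second.html
}
```

You would call it as `fetch_twice http://www.fernsehen.to/index.php`. If both files come out the same small size, that supports the explanation above: the server is waiting for data that only the page's script can produce.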