First of all, I'm new to Java and my English is not great, so I hope you can understand my problem.
I want to read the text file from this URL: http://www.cophieu68.com/export/metastock.php?id=AAA
Let me explain. This is a Vietnamese stock data website, and the link above points to the file aaa.txt, which contains the data for the stock with the code AAA. I can get the data for other stocks just by changing the value of the id parameter.
My problem is that what I get back is a bunch of HTML, not the text file I expect (aaa.txt).
And here is my code:
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    public class Main {
        public static void main(String[] args) {
            try {
                URL url = new URL("http://www.cophieu68.com/export/metastock.php?id=AAA");
                URLConnection urlConn = url.openConnection();
                System.out.println(urlConn.getContentType()); // prints text/html
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(urlConn.getInputStream()));
                String text;
                // Print the response body line by line
                while ((text = in.readLine()) != null) {
                    System.out.println(text);
                }
                in.close();
            } catch (MalformedURLException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
Thanks for your help.
The site seems to be sniffing the user-agent to decide what content to send down.
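If you spoof the user-agent as shown below, it works as you'd expect - the response is the plain-text file. Set the header right after openConnection(), before you call getContentType() or getInputStream(), because request properties cannot be changed once the connection has been made:
    urlConn.setRequestProperty("User-Agent", "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.9.0.2) Gecko/20121223 Ubuntu/9.25 (jaunty) Firefox/3.8");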
As you can probably tell, this pretends that the user-agent is Firefox 3.8 on Ubuntu.
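For completeness, here is a minimal sketch of the whole program with the header set in the right place. The class name FetchWithUserAgent and the helper open are made up for illustration, and decoding the response as UTF-8 is an assumption about the file, not something the site documents. It uses URI.create(...).toURL() instead of new URL(...) only because the latter is deprecated in recent Java versions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class FetchWithUserAgent {

    // Hypothetical helper: opens a connection and sets the User-Agent header.
    // Nothing is sent over the network here; the request goes out on first read.
    static URLConnection open(String address, String userAgent) {
        try {
            URLConnection conn = URI.create(address).toURL().openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open(
                "http://www.cophieu68.com/export/metastock.php?id=AAA",
                "Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.9.0.2) "
                        + "Gecko/20121223 Ubuntu/9.25 (jaunty) Firefox/3.8");

        // Assumption: the file can be decoded as UTF-8.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            System.err.println("Request failed: " + e);
        }
    }
}
```

The try-with-resources block closes the reader even if reading fails part-way, which the original loop with a manual in.close() does not guarantee.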
It is probably because the link (http://www.cophieu68.com/export/metastock.php?id=AAA) is sent as an attachment. If you have access to the PHP file, it should do nothing but print the data, and it should include
    header('Content-Type: text/plain');
in the PHP file, before any output is sent.
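If you don't control the server, you can at least check this guess from the Java side by printing the response headers. The following is only a rough sketch: the class name InspectHeaders and the helper isAttachment are invented for illustration, and the check simply looks at whether the Content-Disposition value starts with "attachment".

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.Locale;

class InspectHeaders {

    // Hypothetical helper: true if a Content-Disposition value marks the
    // response as an attachment (e.g. attachment; filename="aaa.txt").
    static boolean isAttachment(String contentDisposition) {
        return contentDisposition != null
                && contentDisposition.trim().toLowerCase(Locale.ROOT).startsWith("attachment");
    }

    public static void main(String[] args) {
        try {
            URLConnection conn = URI
                    .create("http://www.cophieu68.com/export/metastock.php?id=AAA")
                    .toURL()
                    .openConnection();

            // Reading a header field triggers the actual request.
            String disposition = conn.getHeaderField("Content-Disposition");
            System.out.println("Content-Type: " + conn.getContentType());
            System.out.println("Content-Disposition: " + disposition);
            System.out.println("Sent as attachment: " + isAttachment(disposition));
        } catch (IOException e) {
            System.err.println("Could not open connection: " + e);
        }
    }
}
```

If Content-Disposition comes back null and the Content-Type is still text/html, the attachment explanation is unlikely and the user-agent one is the better lead.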