Reading a .txt from a url in Java. Help?

husoski

2012-11-29 17:57:16 UTC

I can't see all of what you're doing because of the Y!A 40 character rule. I use a slightly different method, illustrated below:

import java.net.*;
import java.io.*;

public class URLReader {
    public static void main(String[] args) throws Exception
    {
        // Open the URL and wrap its byte stream in a buffered character reader.
        URL yahoo = new URL("http://www.yahoo.com/");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        yahoo.openStream()));

        // Echo the response one line at a time until the stream ends.
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}

Oops...hit submit instead of preview. Anyway, that works and displays the HTML source for the yahoo.com main page. When I change the URL to your text file, I get what you're probably getting: a 403 "Forbidden" error. I get the same response from wget on Linux. Apparently Project Gutenberg actively tries to block non-interactive downloads, probably to save bandwidth for everyone else.

You'd have to set request headers so that your program looks exactly like a browser. I don't have docs for that, and it's a bad karma move anyway.
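
For reference, here is a minimal sketch of how a request header can be set with the standard java.net.URLConnection class. The User-Agent string, the class name HeaderedURLReader and the example.com address are placeholders I made up for illustration, and I have not verified that any particular header value gets past Project Gutenberg's check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

class HeaderedURLReader {
    // Assumption: an illustrative User-Agent value, not one taken from this thread.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleReader/1.0)";

    // Prepares a connection with the header set. Nothing is sent over the
    // network until the input stream is requested.
    static URLConnection open(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass the real URL as the first argument.
        String address = args.length > 0 ? args[0] : "http://www.example.com/";
        URLConnection conn = open(address);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The only real difference from the URLReader class above is that it calls openConnection() instead of openStream(), which gives you a chance to call setRequestProperty() before the request goes out.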

Kaydell

2012-11-29 18:14:03 UTC

I googled for an answer and found this:

http://docs.oracle.com/javase/tutorial/networking/urls/readingURL.html

I tried out the code and it works for some URLs, but for others, like your URL, I got an HTTP 403 error, which means "forbidden".

So, I think that the Java code is right. I just think that the Gutenberg website restricts automated connections that download its pages.
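
If you want your program to report the 403 itself instead of dying with an exception, something like the following rough sketch might help. It uses the standard java.net.HttpURLConnection class; the StatusCheck class, the describe() helper and the example.com address are hypothetical names of my own, not part of the tutorial code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class StatusCheck {
    // Turns a status code into a short message for the cases discussed here.
    static String describe(int status) {
        if (status == HttpURLConnection.HTTP_OK) {
            return "OK";
        }
        if (status == HttpURLConnection.HTTP_FORBIDDEN) {
            return "Forbidden: the server refused the request";
        }
        return "HTTP status " + status;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass the real URL as the first argument.
        String address = args.length > 0 ? args[0] : "http://www.example.com/";
        // The cast is valid for http:// and https:// URLs.
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try {
            // getResponseCode() sends the request and returns the numeric status.
            System.out.println(describe(conn.getResponseCode()));
        } finally {
            conn.disconnect();
        }
    }
}
```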

Susan

2016-05-18 05:23:35 UTC

I'm not sure if I understand what you're asking, but if you haven't already done so, try putting the text files in the .jar file with the program. That might work. Also, try using the complete path to the file instead of just its name. I hope this helps.
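
Here is a small sketch of what reading a text file bundled in the .jar could look like, using the standard Class.getResourceAsStream method. The resource name /book.txt and the class name BundledTextReader are made up for the example, and UTF-8 is only an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BundledTextReader {
    // Reads every line from the stream and closes it when done.
    // Assumption: the text is UTF-8 encoded.
    static List<String> readLines(InputStream source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical resource name. The leading slash means the path is
        // resolved from the root of the classpath (the top level of the .jar).
        InputStream source = BundledTextReader.class.getResourceAsStream("/book.txt");
        if (source == null) {
            System.err.println("book.txt was not found on the classpath");
            return;
        }
        for (String line : readLines(source)) {
            System.out.println(line);
        }
    }
}
```

getResourceAsStream returns null when the resource is missing, so it's worth checking for that before trying to read.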