Problems reading from URLs, Java (resolved)

  • Thread starter amp88
I'm currently attempting to write a small program that reads content from a website and stores it on my PC.

The basic program flow is to read a URL from a String array, create a URLConnection instance and strip away superfluous data from the webpage, leaving only what I want. Then I simply write that data out to a set of text files...

However, things aren't going as well as I'd hoped. The first URL is read correctly and the first file is written without problem. The second URL, however, is not initialised from the second value in the String array; the first value is used on all subsequent runs. I haven't done much web stuff in Java (basically the only thing I've done was ripping the Chuck Norris quotes site), and I've nicked most of my web code from JavaAlmanac.com and Google. I've listed some of the source below (not at all complete, almost all of it is missing; I've only left in what I believe to be the problem areas, though I can post the rest of the source if requested), so hopefully someone can give me some hints or tips as to how to solve my problem.

TIA.

Code:
public class Parser
{
	public static void main(String[] args)
	{
		Vector results = new Vector();
		
		String[] URLs = {"1.html", "2.html", "3.html", "4.html", "5.html"};
		
		for(int j=0; j<URLs.length; j++)
		{
			try
			{
				URL newURL = new URL(URLs[j]);
				URLConnection newURLC = newURL.openConnection();
				
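				// Read from the connection opened above (was F1URLC, a leftover name that is not declared in this listing)
				BufferedReader in = new BufferedReader(new InputStreamReader(newURLC.getInputStream()));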
				
				// Strip useless data, keep useful data (about 120 lines of code, no point posting here)

				in.close();
				
				int num = 1;
				
				File outputFile = new File(filename+".txt");
				FileOutputStream outFileStream = new FileOutputStream(outputFile);
				PrintWriter outStream = new PrintWriter(outFileStream);
				
				// Output file data
				
				outStream.close();
			}

			catch(Exception e)
			{
				System.out.println(inputLine);
				e.printStackTrace();
			}
			
			try{ Thread.sleep(2500); }catch(Exception t) {}
		}
	}
}
 
skip0110
Isn't this the right thing to use?

http://java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#openStream()

Then you can do something like:
Code:
URL url = new URL("http://www.whatever.com/");
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
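
For anyone finding this later, here is a rough, self-contained sketch of the openStream() approach. The class name UrlLines and the method readLines are made up for illustration, UTF-8 is assumed as the page encoding, and try-with-resources needs Java 7 or later (newer than the 1.4.2 docs linked above), so treat it as one possible shape rather than the way to do it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original poster's program.
public class UrlLines {
    // Reads every line behind the given URL into a list.
    // Assumption: the content is text encoded as UTF-8.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Calling readLines once per entry in the String array means each URL gets its own stream and its own list, so nothing is shared between loop iterations.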

EDIT: Oddly enough I was just doing something like this in C#, heh...

That gives exactly the same behaviour as my original code did :(

Thanks for the quick reply though.

edit: Found the problem after walking through the code. I'd forgotten to reset a boolean control variable after the first pass through the loop :(
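
The stripped-out parsing code was never posted, so the following is only a guess at the general shape of that kind of bug, with made-up names (FlagResetSketch, firstMatches, found, marker). It sketches how a flag declared outside the loop can leave the first page's data in place for every later page, and how declaring it inside the loop body avoids that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the original poster's code.
public class FlagResetSketch {
    // For each page, keeps the first line containing the marker (or null if none).
    static List<String> firstMatches(List<List<String>> pages, String marker) {
        List<String> results = new ArrayList<>();
        for (List<String> page : pages) {
            // Declared inside the loop, so both start fresh for every page.
            // If 'found' lived outside the loop it would still be true on the
            // second page, and no later page would ever update 'kept'.
            boolean found = false;
            String kept = null;
            for (String line : page) {
                if (!found && line.contains(marker)) {
                    kept = line;
                    found = true;
                }
            }
            results.add(kept);
        }
        return results;
    }
}
```

Keeping per-iteration state scoped to the loop body means there is nothing to remember to reset by hand.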
 
