Load a Document from a URL

Problem

You need to fetch and parse an HTML document from the web and find data within it (screen scraping).

Solution

Use the static Jsoup.parse(URL url, int timeoutMillis) method:

URL url = new URL("http://example.com/");
Document doc = Jsoup.parse(url, 3*1000);
String title = doc.title();

Description

The parse(URL url, int timeoutMillis) method fetches and parses an HTML document. If an error occurs while fetching the URL, it throws an IOException, which you should handle appropriately.

The timeout parameter specifies how long to wait for a connection and for content, in milliseconds; if the timeout is exceeded, an IOException is thrown.
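
The error handling described above might look like the following minimal sketch. The class name FetchTitleExample and the helper fetchTitle are hypothetical names used only for illustration. The sketch assumes that a timeout surfaces as java.net.SocketTimeoutException (a subclass of IOException), so it is caught first and reported separately from other fetch failures.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.util.Optional;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class FetchTitleExample {

    // Hypothetical helper: returns the page title, or an empty Optional
    // if the URL is malformed, the fetch fails, or the timeout is exceeded.
    static Optional<String> fetchTitle(String address, int timeoutMillis) {
        try {
            // new URL(...) throws MalformedURLException (an IOException)
            // for an unknown protocol or otherwise invalid address.
            URL url = new URL(address);
            Document doc = Jsoup.parse(url, timeoutMillis);
            return Optional.of(doc.title());
        } catch (SocketTimeoutException e) {
            // Assumption: timeouts arrive as SocketTimeoutException.
            System.err.println("Timed out fetching " + address);
            return Optional.empty();
        } catch (IOException e) {
            System.err.println("Could not fetch " + address + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Optional<String> title = fetchTitle("http://example.com/", 3 * 1000);
        System.out.println(title.orElse("(no title available)"));
    }
}
```

Returning an Optional keeps the failure case explicit for callers; depending on your application, rethrowing or retrying may be more appropriate than logging and returning empty.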

This method only supports web URLs (http and https protocols); if you need to load from a file, use the parse(File in, String charsetName) method instead.
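
For the file case, a minimal sketch using parse(File in, String charsetName) could look like this. The class name ParseFileExample, the helper titleOf, and the temporary sample file are assumptions made for illustration; the charset is assumed to be UTF-8.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ParseFileExample {

    // Hypothetical helper: parses a local HTML file and returns its title.
    static String titleOf(File in) throws IOException {
        // The charset is assumed to be UTF-8 here.
        Document doc = Jsoup.parse(in, "UTF-8");
        return doc.title();
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample page to a temporary file so the sketch
        // can run without any existing file on disk.
        File tmp = File.createTempFile("sample", ".html");
        tmp.deleteOnExit();
        String html = "<html><head><title>Local page</title></head><body></body></html>";
        Files.write(tmp.toPath(), html.getBytes(StandardCharsets.UTF_8));

        System.out.println(titleOf(tmp));
    }
}
```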