Maintaining a request session
Problem
You want to perform multiple HTTP requests using the same configuration, and retain cookies across these requests.
Solution
Use the Jsoup.newSession() method to create a new session, represented by the Connection interface:
// Create a new session with settings applied to all requests:
Connection session = Jsoup.newSession()
    .timeout(45 * 1000)
    .maxBodySize(5 * 1024 * 1024);

// Make the first request:
Document req1 = session.newRequest("https://example.com/auth")
    .data("auth-code", "my-secret-token")
    .post();

// Make a following request with the same settings, and cookies set from req1:
Document req2 = session.newRequest("https://example.com/admin/")
    .get();
Description
The session created by newSession() supports making multiple requests with the same configuration. Any request-level settings applied on that session will be applied to each actual request.
Cookies set by responses to those requests will be kept in a cookie jar for use in later requests.
The newRequest(String url) method returns a Connection object that is pre-configured with the session settings, but those settings can be overridden for that specific request.
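As a minimal sketch of that override behavior, the following builds a session, derives two requests from it, and changes the timeout on only one of them. No request is executed; the code only inspects the configured values through request().timeout(). The class name, the helper method, and the /health URL are hypothetical, and the sketch assumes a jsoup version that provides the newRequest(String) overload.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class SessionOverrideSketch {
    // Hypothetical helper: a session whose settings are shared by derived requests.
    static Connection newConfiguredSession() {
        return Jsoup.newSession()
            .timeout(45 * 1000)
            .maxBodySize(5 * 1024 * 1024);
    }

    public static void main(String[] args) {
        Connection session = newConfiguredSession();

        // Starts with the session's settings, including the 45-second timeout.
        Connection inherited = session.newRequest("https://example.com/admin/");

        // Overrides the timeout for this one request only (hypothetical URL).
        Connection quick = session.newRequest("https://example.com/health")
            .timeout(5 * 1000);

        System.out.println("inherited timeout: " + inherited.request().timeout());
        System.out.println("overridden timeout: " + quick.request().timeout());
        System.out.println("session timeout: " + session.request().timeout());
    }
}
```

The override is applied to the Connection returned by newRequest(), so the session itself and any later requests derived from it should keep the original timeout.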
Sessions are thread-safe, meaning multiple threads can call newRequest() on the same session concurrently. Each request object should only be used by a single worker thread at a time.
The session's cookie store is accessible via Connection.cookieStore(). This is maintained in memory for the lifetime of the session. For longer sessions, you can save the cookie store to disk by serializing it.
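One possible way to persist the cookie store is sketched below. Because java.net.HttpCookie does not implement Serializable, this sketch writes each cookie as a tab-separated line of text instead of using Java object serialization. The class name CookieStoreFiles, the file format, and the choice of fields are assumptions made for illustration, not part of jsoup. Expiry information is not preserved, so restored cookies behave as session cookies.

```java
import java.io.IOException;
import java.net.CookieStore;
import java.net.HttpCookie;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CookieStoreFiles {
    // Writes one cookie per line: name, value, domain, path, secure (tab-separated).
    static void save(CookieStore store, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (HttpCookie c : store.getCookies()) {
            lines.add(String.join("\t",
                c.getName(),
                c.getValue(),
                c.getDomain() == null ? "" : c.getDomain(),
                c.getPath() == null ? "" : c.getPath(),
                Boolean.toString(c.getSecure())));
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    // Reads the lines written by save() and adds each cookie to the given store.
    static void load(CookieStore store, Path file) throws IOException {
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String[] f = line.split("\t", -1);
            if (f.length != 5) continue; // skip malformed lines
            HttpCookie c = new HttpCookie(f[0], f[1]);
            c.setVersion(0); // plain name=value form when sent in a Cookie header
            if (!f[2].isEmpty()) c.setDomain(f[2]);
            if (!f[3].isEmpty()) c.setPath(f[3]);
            c.setSecure(Boolean.parseBoolean(f[4]));
            store.add(null, c); // no source URI; matched by domain on later lookups
        }
    }
}
```

With a jsoup session, these helpers would be called as CookieStoreFiles.save(session.cookieStore(), file) before shutdown and CookieStoreFiles.load(session.cookieStore(), file) on a fresh session, since cookieStore() returns a java.net.CookieStore.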