Class HttpConnection
- All Implemented Interfaces:
-
Connection
Connection.
- See Also:
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic classstatic classstatic classNested classes/interfaces inherited from interface org.jsoup.Connection
Connection.Method -
Field Summary
Fields (Modifier and Type, Field, Description):
static final String CONTENT_ENCODING
static final String CONTENT_TYPE
static final String DEFAULT_UA: Many users would get caught by not setting a user-agent and therefore getting different responses on their desktop vs in jsoup, which would otherwise default to Java.
static final String FORM_URL_ENCODED
static final String MULTIPART_FORM_DATA
-
Constructor Summary
Constructors -
Method Summary
Methods (Modifier and Type, Method: Description):
auth(@Nullable RequestAuthenticator authenticator): Set the authenticator to use for this connection, enabling requests to URLs, and via proxies, that require authentication credentials.
static Connection connect(String url): Create a new Connection, with the request URL specified.
static Connection connect(URL url): Create a new Connection, with the request URL specified.
cookie(String name, String value): Set a cookie to be sent in the request.
cookies(Map<String, String> cookies): Adds each of the supplied cookies to the request.
cookieStore(): Get the cookie store used by this Connection.
cookieStore(CookieStore cookieStore): Provide a custom or pre-filled CookieStore to be used on requests made by this Connection.
@Nullable Connection.KeyVal data(String key): Get the data KeyVal for this key, if any.
data(String... keyvals): Add one or more request key, val data parameter pairs.
data(String key, String value): Add a request data parameter.
data(String key, String filename, InputStream inputStream): Add an input stream as a request data parameter.
data(String key, String filename, InputStream inputStream, String contentType): Add an input stream as a request data parameter.
data(Collection<Connection.KeyVal> data): Adds all of the supplied data to the request data parameters.
data(Map<String, String> data): Adds all of the supplied data to the request data parameters.
execute(): Execute the request.
followRedirects(boolean followRedirects): Configures the connection to (not) follow server redirects.
get(): Execute the request as a GET, and parse the result.
header(String name, String value): Set a request header.
headers(Map<String, String> headers): Sets each of the supplied headers on the request.
ignoreContentType(boolean ignoreContentType): Ignore the document's Content-Type when parsing the response.
ignoreHttpErrors(boolean ignoreHttpErrors): Configures the connection to not throw exceptions when an HTTP error occurs (4xx - 5xx, e.g. 404 or 500).
maxBodySize(int bytes): Set the maximum bytes to read from the (uncompressed) connection into the body, before the connection is closed, and the input truncated (i.e. the body content will be trimmed).
method(Connection.Method method): Set the request method to use, GET or POST.
newRequest(): Creates a new request, using this Connection as the session-state and to initialize the connection settings (which may then be independently changed on the returned Connection.Request object).
onResponseProgress(Progress<Connection.Response> handler): Set the response progress handler, which will be called periodically as the response body is downloaded.
parser(Parser parser): Provide a specific parser to use when parsing the response to a Document.
post(): Execute the request as a POST, and parse the result.
postDataCharset(String charset): Set the character-set used to encode the request body.
proxy(String host, int port): Set the HTTP proxy to use for this request.
proxy(Proxy proxy): Set the proxy to use for this request.
referrer(String referrer): Set the request referrer (aka "referer") header.
request(): Get the request object associated with this connection.
request(Connection.Request request): Set the connection's request.
requestBody(String body): Set a POST (or PUT) request body.
requestBodyStream(InputStream stream): Set the request body.
response(): Get the response, once the request has been executed.
response(Connection.Response response): Set the connection's response.
sslContext(SSLContext sslContext): Set a custom SSL context for HTTPS connections.
sslSocketFactory(SSLSocketFactory sslSocketFactory): Set a custom SSL socket factory for HTTPS connections.
timeout(int millis): Set the total maximum request duration.
url(String url): Set the request URL to fetch.
url(URL url): Set the request URL to fetch.
userAgent(String userAgent): Set the request user-agent header.
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.jsoup.Connection:
newRequest, newRequest
-
Field Details
-
CONTENT_ENCODING
public static final String CONTENT_ENCODING- See Also:
-
DEFAULT_UA
public static final String DEFAULT_UA
Many users would get caught by not setting a user-agent and therefore getting different responses on their desktop vs in jsoup, which would otherwise default to Java. So by default, use a desktop UA.
- See Also:
-
CONTENT_TYPE
public static final String CONTENT_TYPE- See Also:
-
MULTIPART_FORM_DATA
public static final String MULTIPART_FORM_DATA- See Also:
-
FORM_URL_ENCODED
public static final String FORM_URL_ENCODED- See Also:
-
-
Constructor Details
-
Method Details
-
connect
Create a new Connection, with the request URL specified.- Parameters:
-
url- the URL to fetch from - Returns:
- a new Connection object
-
connect
Create a new Connection, with the request URL specified.- Parameters:
-
url- the URL to fetch from - Returns:
- a new Connection object
-
newRequest
Description copied from interface: Connection
Creates a new request, using this Connection as the session-state and to initialize the connection settings (which may then be independently changed on the returned Connection.Request object).
- Specified by:
-
newRequestin interfaceConnection - Returns:
- a new Connection object, with a shared Cookie Store and initialized settings from this Connection and Request
-
url
Description copied from interface:ConnectionSet the request URL to fetch. The protocol must be HTTP or HTTPS.- Specified by:
-
urlin interfaceConnection - Parameters:
-
url- URL to connect to - Returns:
- this Connection, for chaining
-
url
Description copied from interface:ConnectionSet the request URL to fetch. The protocol must be HTTP or HTTPS.- Specified by:
-
urlin interfaceConnection - Parameters:
-
url- URL to connect to - Returns:
- this Connection, for chaining
-
proxy
Description copied from interface: Connection
Set the proxy to use for this request. Set to null to disable a previously set proxy.
- Specified by:
-
proxyin interfaceConnection - Parameters:
-
proxy- proxy to use - Returns:
- this Connection, for chaining
-
proxy
Description copied from interface:ConnectionSet the HTTP proxy to use for this request.- Specified by:
-
proxyin interfaceConnection - Parameters:
-
host- the proxy hostname -
port- the proxy port - Returns:
- this Connection, for chaining
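As a rough illustration of how the two proxy overloads relate, the sketch below builds a `java.net.Proxy` of type HTTP from a host and port using only the JDK. The class and method names (`ProxySketch`, `httpProxy`) are hypothetical, and leaving the address unresolved is an assumption of this sketch, not a statement about how jsoup constructs the proxy internally.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

final class ProxySketch {
    /**
     * Builds an HTTP proxy descriptor from a hostname and port.
     * The address is left unresolved, so no DNS lookup happens here
     * (an assumption of this sketch).
     */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy.example.com", 8080);
        InetSocketAddress address = (InetSocketAddress) proxy.address();
        System.out.println(proxy.type() + " " + address.getHostString() + ":" + address.getPort());
    }
}
```

A `Proxy` built this way is the kind of object the single-argument overload accepts, which is useful when a non-HTTP proxy type such as SOCKS is needed.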
-
userAgent
Description copied from interface:ConnectionSet the request user-agent header.- Specified by:
-
userAgentin interfaceConnection - Parameters:
-
userAgent- user-agent to use - Returns:
- this Connection, for chaining
- See Also:
-
timeout
Description copied from interface: Connection
Set the total maximum request duration. If a timeout occurs, a SocketTimeoutException will be thrown.
The default timeout is 30 seconds (30,000 millis). A timeout of zero is treated as an infinite timeout.
This timeout specifies the combined maximum duration of the connection time and the time to read the full response.
Implementation note: when this Connection is backed by HttpURLConnection (rather than HttpClient, as used in JVM 11+), this timeout is implemented by setting both the socket connect and read timeouts to half of the specified value.
- Specified by:
-
timeoutin interfaceConnection - Parameters:
-
millis- number of milliseconds (thousandths of a second) before timing out connects or reads. - Returns:
- this Connection, for chaining
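The implementation note about the HttpURLConnection fallback can be pictured with plain JDK calls. This is a minimal sketch of the halving described above, not jsoup's actual code; `TimeoutSplit` and `open` are hypothetical names, and no network traffic occurs because the connection is opened but never connected.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class TimeoutSplit {
    /**
     * Creates (without connecting) an HttpURLConnection whose connect and read
     * timeouts are each half of the requested total. A total of 0 gives 0 for
     * both, which HttpURLConnection treats as an infinite timeout.
     */
    static HttpURLConnection open(String url, int totalMillis) {
        if (totalMillis < 0) throw new IllegalArgumentException("timeout must be >= 0");
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            int half = totalMillis / 2;
            conn.setConnectTimeout(half); // budget for establishing the socket
            conn.setReadTimeout(half);    // budget for each blocking read
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = open("http://example.com/", 30_000);
        System.out.println(conn.getConnectTimeout() + " / " + conn.getReadTimeout()); // 15000 / 15000
    }
}
```

One consequence of this split is that a slow connect cannot borrow unused time from the read phase, so the effective limit per phase is half of the configured value.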
- See Also:
-
maxBodySize
Description copied from interface: Connection
Set the maximum bytes to read from the (uncompressed) connection into the body, before the connection is closed, and the input truncated (i.e. the body content will be trimmed). The default maximum is 2MB. A max size of 0 is treated as an infinite amount (bounded only by your patience and the memory available on your machine).
- Specified by:
-
maxBodySizein interfaceConnection - Parameters:
-
bytes- number of bytes to read from the input before truncating - Returns:
- this Connection, for chaining
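The truncation rule above (stop after a byte limit, with 0 meaning unlimited) can be sketched with a plain stream reader. This only illustrates the documented semantics and assumes nothing about jsoup's internal buffering; `CappedRead` and `read` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class CappedRead {
    /** Reads at most maxBytes from the stream; maxBytes == 0 means no limit. */
    static byte[] read(InputStream in, int maxBytes) {
        if (maxBytes < 0) throw new IllegalArgumentException("maxBytes must be >= 0");
        boolean capped = maxBytes > 0;
        int remaining = maxBytes;
        byte[] buffer = new byte[8192];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            while (!capped || remaining > 0) {
                int want = capped ? Math.min(buffer.length, remaining) : buffer.length;
                int n = in.read(buffer, 0, want);
                if (n == -1) break;          // end of input reached before the cap
                out.write(buffer, 0, n);
                if (capped) remaining -= n;  // anything past the cap is never read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = new byte[100];
        System.out.println(read(new ByteArrayInputStream(body), 10).length); // 10
        System.out.println(read(new ByteArrayInputStream(body), 0).length);  // 100
    }
}
```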
-
followRedirects
Description copied from interface:ConnectionConfigures the connection to (not) follow server redirects. By default, this is true.- Specified by:
-
followRedirectsin interfaceConnection - Parameters:
-
followRedirects- true if server redirects should be followed. - Returns:
- this Connection, for chaining
-
referrer
Description copied from interface:ConnectionSet the request referrer (aka "referer") header.- Specified by:
-
referrerin interfaceConnection - Parameters:
-
referrer- referrer to use - Returns:
- this Connection, for chaining
-
method
Description copied from interface:ConnectionSet the request method to use, GET or POST. Default is GET.- Specified by:
-
methodin interfaceConnection - Parameters:
-
method- HTTP request method - Returns:
- this Connection, for chaining
-
ignoreHttpErrors
Description copied from interface: Connection
Configures the connection to not throw exceptions when an HTTP error occurs (4xx - 5xx, e.g. 404 or 500). By default, this is false; an IOException is thrown if an error is encountered. If set to true, the response is populated with the error body, and the status message will reflect the error.
- Specified by:
-
ignoreHttpErrorsin interfaceConnection - Parameters:
-
ignoreHttpErrors- true if HTTP errors should be ignored; false (default) if an IOException should be thrown on an HTTP error. - Returns:
- this Connection, for chaining
-
ignoreContentType
Description copied from interface: Connection
Ignore the document's Content-Type when parsing the response. By default, this is false: an unrecognised content-type will cause an IOException to be thrown. (This is to prevent producing garbage by attempting to parse a JPEG binary image, for example.) Set to true to force a parse attempt regardless of content type.
- Specified by:
-
ignoreContentTypein interfaceConnection - Parameters:
-
ignoreContentType- set to true if you would like the content type ignored on parsing the response into a Document. - Returns:
- this Connection, for chaining
-
data
Description copied from interface:ConnectionAdd a request data parameter. Request parameters are sent in the request query string for GETs, and in the request body for POSTs. A request may have multiple values of the same name.- Specified by:
-
datain interfaceConnection - Parameters:
-
key- data key -
value- data value - Returns:
- this Connection, for chaining
-
sslSocketFactory
Description copied from interface: Connection
Set a custom SSL socket factory for HTTPS connections.
Note: if set, the legacy HttpURLConnection will be used instead of the JVM's HttpClient.
- Specified by:
-
sslSocketFactoryin interfaceConnection - Parameters:
-
sslSocketFactory- SSL socket factory - Returns:
- this Connection, for chaining
- See Also:
-
sslContext
Description copied from interface: Connection
Set a custom SSL context for HTTPS connections.
Note: when using the legacy HttpURLConnection, only the SSLSocketFactory from the context will be used.
- Specified by:
-
sslContextin interfaceConnection - Parameters:
-
sslContext- SSL context - Returns:
- this Connection, for chaining
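To make the relationship between an SSLContext and its SSLSocketFactory concrete, here is a small JDK-only sketch. It builds a context with the platform's default key and trust material and extracts the socket factory, which per the note above is the only part consulted on the legacy path. `SslContextSketch`, `defaultTls` and `socketFactoryOf` are hypothetical names.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;

final class SslContextSketch {
    /** Creates a TLS context initialised with the JVM's default key and trust managers. */
    static SSLContext defaultTls() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, null, null); // nulls select the platform defaults
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLS is not available on this platform", e);
        }
    }

    /** The socket factory derived from a context. */
    static SSLSocketFactory socketFactoryOf(SSLContext context) {
        return context.getSocketFactory();
    }

    public static void main(String[] args) {
        SSLContext context = defaultTls();
        System.out.println(context.getProtocol() + " factory ready: " + (socketFactoryOf(context) != null));
    }
}
```

A context initialised with custom trust managers, for example for a private certificate authority, would be passed to sslContext in the same way.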
-
data
Description copied from interface: Connection
Add an input stream as a request data parameter. For GETs, has no effect, but for POSTs this will upload the input stream.
Use the Connection.data(String, String, InputStream, String) method to set the uploaded file's mimetype.
- Specified by:
-
datain interfaceConnection - Parameters:
-
key- data key (form item name) -
filename- the name of the file to present to the remote server. Typically just the name, not path, component. -
inputStream- the input stream to upload, that you probably obtained from a FileInputStream. You must close the InputStream in a finally block. - Returns:
- this Connection, for chaining
- See Also:
-
data
Description copied from interface: Connection
Add an input stream as a request data parameter. For GETs, has no effect, but for POSTs this will upload the input stream.
- Specified by:
-
datain interfaceConnection - Parameters:
-
key- data key (form item name) -
filename- the name of the file to present to the remote server. Typically just the name, not path, component. -
inputStream- the input stream to upload, that you probably obtained from a FileInputStream. You must close the InputStream in a finally block. -
contentType- the Content Type (aka mimetype) to specify for this file. - Returns:
- this Connection, for chaining
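The requirement to close the upload stream is usually met with try-with-resources, which behaves like a finally block. The sketch below shows only that lifecycle using JDK classes. `UploadStreamSketch`, `withStream` and `demo` are hypothetical names, and the `upload` function is a stand-in for whatever code passes the stream to `data(...)` and executes the request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

final class UploadStreamSketch {
    /** Opens the file, hands the stream to the upload step, and always closes it afterwards. */
    static <T> T withStream(Path file, Function<InputStream, T> upload) {
        try (InputStream in = Files.newInputStream(file)) {
            return upload.apply(in); // the stream is closed even if this throws
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demonstration with a temporary file; the "upload" here only counts bytes. */
    static int demo() {
        Function<InputStream, Integer> countBytes = in -> {
            try {
                return in.readAllBytes().length;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        try {
            Path tmp = Files.createTempFile("upload", ".bin");
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
            int count = withStream(tmp, countBytes);
            Files.delete(tmp);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 5
    }
}
```

The request must be executed inside the upload step, since the stream is no longer readable once the try block exits.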
-
data
Description copied from interface:ConnectionAdds all of the supplied data to the request data parameters- Specified by:
-
datain interfaceConnection - Parameters:
-
data- map of data parameters - Returns:
- this Connection, for chaining
-
data
Description copied from interface: Connection
Add one or more request key, val data parameter pairs.
Multiple parameters may be set at once, e.g.:
.data("name", "jsoup", "language", "Java", "language", "English");
creates a query string like:
?name=jsoup&language=Java&language=English
For GET requests, data parameters will be sent on the request query string. For POST (and other methods that contain a body), they will be sent as body form parameters, unless the body is explicitly set by Connection.requestBody(String), in which case they will be query string parameters.
- Specified by:
-
datain interfaceConnection - Parameters:
-
keyvals- a set of key value pairs. - Returns:
- this Connection, for chaining
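The mapping from alternating key and value arguments to a query string can be sketched without jsoup. The helper below is illustrative only: `QueryStringSketch` and `toQuery` are hypothetical names, and `URLEncoder` applies form encoding (a space becomes `+`), which is an assumption here and may differ in detail from jsoup's own encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryStringSketch {
    /** Turns alternating key, value arguments into a query string; repeated keys are kept. */
    static String toQuery(String... keyvals) {
        if (keyvals.length % 2 != 0) {
            throw new IllegalArgumentException("Must supply an even number of key value pairs");
        }
        StringBuilder query = new StringBuilder();
        for (int i = 0; i < keyvals.length; i += 2) {
            query.append(i == 0 ? '?' : '&')
                 .append(URLEncoder.encode(keyvals[i], StandardCharsets.UTF_8))
                 .append('=')
                 .append(URLEncoder.encode(keyvals[i + 1], StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // prints ?name=jsoup&language=Java&language=English
        System.out.println(toQuery("name", "jsoup", "language", "Java", "language", "English"));
    }
}
```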
-
data
Description copied from interface:ConnectionAdds all of the supplied data to the request data parameters- Specified by:
-
datain interfaceConnection - Parameters:
-
data- collection of data parameters - Returns:
- this Connection, for chaining
-
data
Description copied from interface:ConnectionGet the data KeyVal for this key, if any- Specified by:
-
datain interfaceConnection - Parameters:
-
key- the data key - Returns:
- null if not set
-
requestBody
Description copied from interface: Connection
Set a POST (or PUT) request body. Useful when a server expects a plain request body (such as JSON), and not a set of URL encoded form key/value pairs. E.g.:
Jsoup.connect(url)
    .requestBody(json)
    .header("Content-Type", "application/json")
    .post();
If any data key/vals are supplied, they will be sent as URL query params.
- Specified by:
-
requestBodyin interfaceConnection - Returns:
- this Connection, for chaining
- See Also:
-
requestBodyStream
Description copied from interface: Connection
Set the request body. Useful for posting data such as byte arrays or files, when the server expects a single request body (and not a multipart upload). E.g.:
Jsoup.connect(url)
    .requestBodyStream(new ByteArrayInputStream(bytes))
    .header("Content-Type", "application/octet-stream")
    .post();
Or, use a FileInputStream to send data from disk.
You should close the stream in a finally block.
- Specified by:
-
requestBodyStreamin interfaceConnection - Parameters:
-
stream- the input stream to send. - Returns:
- this Connection, for chaining
- See Also:
-
header
Description copied from interface:ConnectionSet a request header. Replaces any existing header with the same case-insensitive name.- Specified by:
-
headerin interfaceConnection - Parameters:
-
name- header name -
value- header value - Returns:
- this Connection, for chaining
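The replace-by-case-insensitive-name rule can be illustrated with a map ordered by `String.CASE_INSENSITIVE_ORDER`. This is a sketch of the documented behaviour only; `HeaderSketch` is a hypothetical class, and jsoup's internal header storage may well be different.

```java
import java.util.Map;
import java.util.TreeMap;

final class HeaderSketch {
    // Keys that differ only in letter case are treated as the same header.
    private final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    /** Sets a header, replacing any existing value stored under the same name in any case. */
    HeaderSketch header(String name, String value) {
        headers.put(name, value);
        return this; // for chaining, mirroring the style of the API above
    }

    String get(String name) {
        return headers.get(name);
    }

    int size() {
        return headers.size();
    }

    public static void main(String[] args) {
        HeaderSketch request = new HeaderSketch()
            .header("Content-Type", "text/plain")
            .header("content-type", "application/json"); // replaces the first value
        System.out.println(request.size() + " " + request.get("CONTENT-TYPE")); // 1 application/json
    }
}
```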
- See Also:
-
headers
Description copied from interface:ConnectionSets each of the supplied headers on the request. Existing headers with the same case-insensitive name will be replaced with the new value.- Specified by:
-
headersin interfaceConnection - Parameters:
-
headers- map of headers name -> value pairs - Returns:
- this Connection, for chaining
- See Also:
-
cookie
Description copied from interface:ConnectionSet a cookie to be sent in the request.- Specified by:
-
cookiein interfaceConnection - Parameters:
-
name- name of cookie -
value- value of cookie - Returns:
- this Connection, for chaining
-
cookies
Description copied from interface:ConnectionAdds each of the supplied cookies to the request.- Specified by:
-
cookiesin interfaceConnection - Parameters:
-
cookies- map of cookie name -> value pairs - Returns:
- this Connection, for chaining
-
cookieStore
Description copied from interface:ConnectionProvide a custom or pre-filled CookieStore to be used on requests made by this Connection.- Specified by:
-
cookieStorein interfaceConnection - Parameters:
-
cookieStore- a cookie store to use for subsequent requests - Returns:
- this Connection, for chaining
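A pre-filled store of the kind mentioned above can be prepared with the JDK's in-memory implementation. The sketch below only builds and inspects the store. `CookieStoreSketch` and `prefilled` are hypothetical names, and the URL and cookie values are placeholders.

```java
import java.net.CookieManager;
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;

final class CookieStoreSketch {
    /** Creates an in-memory cookie store holding one cookie associated with the given URL. */
    static CookieStore prefilled(String url, String name, String value) {
        CookieStore store = new CookieManager().getCookieStore(); // JDK default in-memory store
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setPath("/"); // make the cookie apply to every path on the host
        store.add(URI.create(url), cookie);
        return store;
    }

    public static void main(String[] args) {
        CookieStore store = prefilled("https://example.com/", "session", "abc123");
        System.out.println(store.getCookies()); // [session="abc123"]
    }
}
```

A store built like this could be shared between several connections so that a login performed on one is visible to the others.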
-
cookieStore
Description copied from interface:ConnectionGet the cookie store used by this Connection.- Specified by:
-
cookieStorein interfaceConnection - Returns:
- the cookie store
-
parser
Description copied from interface: Connection
Provide a specific parser to use when parsing the response to a Document. If not set, jsoup defaults to the HTML parser, unless the response content-type is XML, in which case the XML parser is used.
- Specified by:
-
parserin interfaceConnection - Parameters:
-
parser- alternate parser - Returns:
- this Connection, for chaining
-
get
Description copied from interface:ConnectionExecute the request as a GET, and parse the result.- Specified by:
-
getin interfaceConnection - Returns:
- parsed Document
- Throws:
-
IOException- on error
-
post
Description copied from interface:ConnectionExecute the request as a POST, and parse the result.- Specified by:
-
postin interfaceConnection - Returns:
- parsed Document
- Throws:
-
IOException- on error
-
execute
Description copied from interface:ConnectionExecute the request.- Specified by:
-
executein interfaceConnection - Returns:
-
the executed
Connection.Response - Throws:
-
IOException- on error
-
request
Description copied from interface:ConnectionGet the request object associated with this connection- Specified by:
-
requestin interfaceConnection - Returns:
- request
-
request
Description copied from interface:ConnectionSet the connection's request- Specified by:
-
requestin interfaceConnection - Parameters:
-
request- new request object - Returns:
- this Connection, for chaining
-
response
Description copied from interface:ConnectionGet the response, once the request has been executed.- Specified by:
-
responsein interfaceConnection - Returns:
- response
-
response
Description copied from interface:ConnectionSet the connection's response- Specified by:
-
responsein interfaceConnection - Parameters:
-
response- new response - Returns:
- this Connection, for chaining
-
postDataCharset
Description copied from interface: Connection
Set the character-set used to encode the request body. Defaults to UTF-8.
- Specified by:
-
postDataCharsetin interfaceConnection - Parameters:
-
charset- character set to encode the request body - Returns:
- this Connection, for chaining
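The effect of the body charset is easiest to see on a non-ASCII character. The JDK-only sketch below form-encodes the same value under two charsets. `FormCharsetSketch` and `encode` are hypothetical names, and the use of `URLEncoder` is an assumption for illustration, not a description of jsoup's encoder.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;

final class FormCharsetSketch {
    /** Form-encodes a value using the named character set. */
    static String encode(String value, String charsetName) {
        return URLEncoder.encode(value, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(encode(value, "UTF-8"));      // caf%C3%A9
        System.out.println(encode(value, "ISO-8859-1")); // caf%E9
    }
}
```

A server that decodes the body with a different charset from the one used to encode it will see corrupted characters, so this setting should match what the receiving form handler expects.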
-
auth
Description copied from interface: Connection
Set the authenticator to use for this connection, enabling requests to URLs, and via proxies, that require authentication credentials.
The authentication scheme used is automatically detected during the request execution. Supported schemes (subject to the platform) are basic, digest, NTLM, and Kerberos.
To use, supply a RequestAuthenticator function that:
- validates the URL that is requesting authentication, and
- returns the appropriate credentials (username and password)
For example, to authenticate both to a proxy and a downstream web server:
Connection session = Jsoup.newSession()
    .proxy("proxy.example.com", 8080)
    .auth(auth -> {
        if (auth.isServer()) { // provide credentials for the request url
            Validate.isTrue(auth.url().getHost().equals("example.com"));
            // check that we're sending credentials where we expect, and not redirected out
            return auth.credentials("username", "password");
        } else { // auth.isProxy()
            return auth.credentials("proxy-user", "proxy-password");
        }
    });
Connection.Response response = session.newRequest("https://example.com/adminzone/").execute();
The system may cache the authentication and use it for subsequent requests to the same resource.
Implementation notes
For compatibility, on a Java 8 platform, authentication is set up via the system-wide default Authenticator.setDefault(Authenticator) method via a ThreadLocal delegator. Whilst the authenticator used is request specific and thread-safe, if you have other calls to setDefault, they will be incompatible with this implementation.
On Java 9 and above, the preceding note does not apply; authenticators are directly set on the request.
If you are attempting to authenticate to a proxy that uses the basic scheme and will be fetching HTTPS URLs, you need to configure your Java platform to enable that, by setting the jdk.http.auth.tunneling.disabledSchemes system property to "". This must be executed prior to any authorization attempts. E.g.:
static {
    System.setProperty("jdk.http.auth.tunneling.disabledSchemes", "");
    // removes Basic, which is otherwise excluded from auth for CONNECT tunnels
}
- Specified by:
-
authin interfaceConnection - Parameters:
-
authenticator- the authenticator to use in this connection - Returns:
- this Connection, for chaining
-
onResponseProgress
Description copied from interface: Connection
Set the response progress handler, which will be called periodically as the response body is downloaded. Since documents are parsed as they are downloaded, this is also a good proxy for the parse progress.
The Response object is supplied as the progress context, and may be read from to obtain headers etc.
- Specified by:
-
onResponseProgressin interfaceConnection - Parameters:
-
handler- the progress handler - Returns:
- this Connection, for chaining
-