All Classes and Interfaces
Class
Description
A single key + value attribute.
The attributes of an Element.
A Character Data node, to support CDATA sections.
Implementation of ArrayList that watches out for changes to the contents.
CharacterReader consumes tokens off a string.
The safelist based HTML cleaner.
Collects a list of elements that match the supplied criteria.
Base combining (and, or) evaluator.
A comment node.
The Connection interface is a convenient HTTP client and session object to fetch content from the web, and parse it into Documents.
Common methods for Requests and Responses
A Key:Value tuple(+), used for form data.
GET and POST HTTP methods.
Represents an HTTP request.
Represents an HTTP response.
A data node, for contents of style, script tags etc, where contents should not show in text().
Internal static utilities for handling data.
An HTML Document.
A Document's output settings control the form of the text() and html() methods.
The output serialization syntax.
A <!DOCTYPE> node.
An HTML Element consists of a tag name, attributes, and child nodes (including text nodes and other elements).
A list of Elements, with methods that act on every element in the list.
HTML entities, and escape routines.
Evaluates that an element matches the selector.
Evaluator for any / all element matching
Evaluator for attribute name matching
Abstract evaluator for attribute name/value matching
Evaluator for attribute name prefix matching
Evaluator for attribute name/value matching
Evaluator for attribute name/value matching (value containing)
Evaluator for attribute name/value matching (value ending)
Evaluator for attribute name/value matching (value regex matching)
Evaluator for attribute name != value matching
Evaluator for attribute name/value matching (value prefix)
Evaluator for element class
Evaluator for matching Element (and its descendants) data
Evaluator for matching Element's own text
Evaluator for matching Element (and its descendants) text
Evaluator for matching Element (but not its descendants) wholeText.
Evaluator for matching Element (and its descendants) wholeText.
Evaluator for element id
Evaluator for matching by sibling index number (e = idx)
Abstract evaluator for sibling index matching
Evaluator for matching by sibling index number (e > idx)
Evaluator for matching by sibling index number (e < idx)
Evaluator for matching the first sibling (css :first-child)
Evaluator for matching the last sibling (css :last-child)
CSS-compatible Evaluator for :eq (CSS :nth-child)
CSS pseudo-class :nth-last-child
CSS pseudo-class :nth-of-type
CSS3 pseudo-class :root
Evaluator for matching Element (and its descendants) text with regex
Evaluator for matching Element's own text with regex
Evaluator for matching Element's own whole text with regex.
Evaluator for matching Element (and its descendants) whole text with regex.
Evaluator for tag name
Evaluator for tag name that ends with suffix; used for *|el
Evaluator for tag name that starts with prefix; used for ns|*
An HTML Form Element provides ready access to the form fields/controls that are associated with it.
HTML Tree Builder; creates a DOM from Tokens.
Implementation of Connection.
Signals that an HTTP request resulted in a non-OK HTTP response.
The core public access point to the jsoup functionality.
A node that does not hold any children.
The base, abstract Node model.
Node filter interface.
Filter decision.
Iterates through a Node and its tree of descendants, in document order, and returns nodes of the specified type.
A depth-first node traversor.
Node visitor interface.
A Parse Error records an error in the input HTML that occurs in either the tokenisation or the tree building phase.
A container for ParseErrors.
Parses HTML or XML into a Document.
Controls parser case settings, to optionally preserve tag and/or attribute name case.
Parses a CSS selector into an Evaluator tree.
A Range object tracks the character positions in the original input source where a Node starts or ends.
A Position object tracks the character position in the original input source where a Node starts or ends.
A RequestAuthenticator is used in Connection to authenticate, if required, to proxies and web servers.
Provides details for the request, to determine the appropriate credentials to return.
Safe-lists define what HTML (elements and attributes) to allow through the cleaner.
CSS-like element selector, that finds elements matching a query.
A SerializationException is raised whenever serialization of a DOM element fails.
A StreamParser provides a progressive parse of its input.
Tag capabilities.
A text node.
A character queue with parsing helpers.
Deprecated.
Signals that an HTTP response returned a MIME type that is not supported.
Validators to check that method arguments meet expectations.
Validation exceptions, as thrown by the methods in Validate.
Helper class to transform a Document to an org.w3c.dom.Document, for integration with toolsets that use the W3C DOM.
Implements the conversion by walking the input.
An XML Declaration.
Use the XmlTreeBuilder when you want to parse XML without any of the HTML DOM rules being applied to the document.
Use UncheckedIOException instead.
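As a rough illustration of how a few of the types described above fit together, the sketch below parses a string into a Document, selects elements with a CSS-like query, and runs untrusted markup through the safelist-based cleaner. It assumes a recent jsoup release (one that ships `Safelist`) is on the classpath; the class name `JsoupIndexSketch` and its helper methods are hypothetical and exist only for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jsoup.
public class JsoupIndexSketch {

    // Parse an HTML string into a Document, then collect the href
    // attribute of every <a> element that has one.
    static List<String> linkTargets(String html) {
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("a[href]");
        List<String> hrefs = new ArrayList<>();
        for (Element link : links) {
            // attr() returns the raw attribute value; with no base URI
            // supplied, relative links are returned as written.
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    // Run untrusted HTML through the cleaner, keeping only the elements
    // and attributes permitted by the basic safelist.
    static String cleanBasic(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        String html = "<p><a href='/a'>A</a> <a>no href</a> <a href='/b'>B</a></p>";
        System.out.println(linkTargets(html));
        System.out.println(cleanBasic("<p>Hi <script>alert(1)</script><b>there</b></p>"));
    }
}
```

The static methods on `Jsoup` are the usual starting point; `Connection`, `Parser`, and `Cleaner` can be used directly when finer control over fetching, parsing, or cleaning is needed.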