Package org.jsoup.parser
Class TokenQueue
java.lang.Object
org.jsoup.parser.TokenQueue
All Implemented Interfaces:
AutoCloseable
A character reader with helpers focused on parsing CSS selectors. Used internally by jsoup; the API is subject to change.
-
Constructor Summary
Constructors
TokenQueue(String data)
    Create a new TokenQueue.
-
Method Summary
Modifier and Type, Method
    Description
void advance()
    Drops the next character off the queue.
String chompBalanced(char open, char close)
    Pulls a balanced string off the queue.
void close()
char consume()
    Consume one character off queue.
void consume(String seq)
    Consumes the supplied sequence off the queue, case-insensitively.
String consumeCssIdentifier()
    Consume a CSS identifier (ID or class) off the queue.
String consumeElementSelector()
    Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
String consumeTo(String seq)
    Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
String consumeToAny(String... seq)
    Consumes to the first sequence provided, or to the end of the queue.
boolean consumeWhitespace()
    Pulls the next run of whitespace characters off the queue.
static String escapeCssIdentifier(String in)
    Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
boolean isEmpty()
    Is the queue empty?
boolean matchChomp(char c)
    If the queue matches the supplied (case-sensitive) character, consume it off the queue.
boolean matchChomp(String seq)
    If the queue case-insensitively matches the supplied string, consume it off the queue.
boolean matches(char c)
    Tests if the next character on the queue matches the character, case-sensitively.
boolean matches(String seq)
    Tests if the next characters on the queue match the sequence, case-insensitively.
boolean matchesAny(char... seq)
    Tests if the next characters match any of the sequences, case-sensitively.
boolean matchesWhitespace()
    Tests if queue starts with a whitespace character.
boolean matchesWord()
    Test if the queue matches a tag word character (letter or digit).
String remainder()
    Consume and return whatever is left on the queue.
String toString()
static String unescape(String in)
    Unescape a \ escaped string.
-
Constructor Details
-
TokenQueue
Create a new TokenQueue.
Parameters:
data - string of data to back queue.
-
-
Method Details
-
isEmpty
Is the queue empty?
Returns:
true if no data left in queue.
-
consume
Consume one character off queue.
Returns:
first character on queue.
-
advance
Drops the next character off the queue.
-
matches
Tests if the next characters on the queue match the sequence, case-insensitively.
Parameters:
seq - String to check queue for.
Returns:
true if the next characters match.
-
matches
Tests if the next character on the queue matches the character, case-sensitively.
-
matchesAny
Tests if the next characters match any of the sequences, case-sensitively.
Parameters:
seq - list of chars to case-sensitively check for
Returns:
true if any matched, false if none did
-
matchChomp
If the queue case-insensitively matches the supplied string, consume it off the queue.
Parameters:
seq - String to search for, and if found, remove from queue.
Returns:
true if found and removed, false if not found.
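The match-then-consume behavior described for matchChomp can be illustrated with a small standalone sketch. It does not call jsoup and is not jsoup's implementation; the class MatchChompSketch and its fields are hypothetical names chosen for illustration, and only the JDK's String.regionMatches is used for the case-insensitive comparison.

```java
// Standalone illustration only; MatchChompSketch is a hypothetical class, not part of jsoup.
final class MatchChompSketch {
    private final String data; // backing string
    private int pos = 0;       // index of the next unread character

    MatchChompSketch(String data) {
        this.data = data;
    }

    /** Case-insensitive test: does the unread part start with seq? */
    boolean matches(String seq) {
        return data.regionMatches(true, pos, seq, 0, seq.length());
    }

    /** If the unread part starts with seq (ignoring case), skip past it. */
    boolean matchChomp(String seq) {
        if (!matches(seq)) {
            return false; // nothing is consumed on a failed match
        }
        pos += seq.length();
        return true;
    }

    /** Returns everything not yet consumed and empties the queue. */
    String remainder() {
        String rest = data.substring(pos);
        pos = data.length();
        return rest;
    }

    public static void main(String[] args) {
        MatchChompSketch queue = new MatchChompSketch("AND foo");
        System.out.println(queue.matchChomp("or"));  // no match, queue unchanged
        System.out.println(queue.matchChomp("and")); // matches "AND" ignoring case
        System.out.println("[" + queue.remainder() + "]");
    }
}
```

The point of the sketch is that a failed match leaves the read position untouched, so callers can try several alternatives in sequence.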
-
matchChomp
If the queue matches the supplied (case-sensitive) character, consume it off the queue.
-
matchesWhitespace
Tests if the queue starts with a whitespace character.
Returns:
true if the queue starts with whitespace
-
matchesWord
Tests if the queue matches a tag word character (letter or digit).
Returns:
true if the queue matches a word character
-
consume
Consumes the supplied sequence off the queue, case-insensitively. If the queue does not start with the supplied sequence, this throws an IllegalStateException, so check with matches() before calling.
Parameters:
seq - sequence to remove from head of queue.
-
consumeTo
Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
Parameters:
seq - String to end on (and not include in return, but leave on queue). Case-sensitive.
Returns:
The matched data consumed from queue.
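As a rough illustration of the consumeTo contract (return the text before the terminator, leave the terminator on the queue, or return everything if the terminator never appears), here is a standalone sketch. ConsumeToSketch is a hypothetical class written for this example; it does not use jsoup and makes no claim about how jsoup implements the method.

```java
// Standalone illustration only; ConsumeToSketch is a hypothetical class, not part of jsoup.
final class ConsumeToSketch {
    private final String data; // backing string
    private int pos = 0;       // index of the next unread character

    ConsumeToSketch(String data) {
        this.data = data;
    }

    boolean isEmpty() {
        return pos >= data.length();
    }

    /**
     * Returns the unread text up to (but excluding) the first case-sensitive
     * occurrence of seq. The terminator itself stays unread. If seq does not
     * occur, the whole remainder is returned and the queue becomes empty.
     */
    String consumeTo(String seq) {
        int idx = data.indexOf(seq, pos);
        int end = (idx == -1) ? data.length() : idx;
        String consumed = data.substring(pos, end);
        pos = end;
        return consumed;
    }

    public static void main(String[] args) {
        ConsumeToSketch queue = new ConsumeToSketch("div.content > p");
        System.out.println("[" + queue.consumeTo(">") + "]"); // text before '>'
        System.out.println("[" + queue.consumeTo("#") + "]"); // no '#': rest of the queue
        System.out.println(queue.isEmpty());
    }
}
```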
-
consumeToAny
Consumes to the first sequence provided, or to the end of the queue. Leaves the terminator on the queue.
Parameters:
seq - any number of terminators to consume to. Case-insensitive.
Returns:
consumed string
-
chompBalanced
Pulls a balanced string off the queue. E.g. if the queue is "(one (two) three) four", calling this with ( and ) as opener and closer will return "one (two) three", and leave " four" on the queue. Unbalanced openers and closers can be quoted (with ' or ") or escaped (with \). Those escapes will be left in the returned string, which is suitable for regexes (where we need to preserve the escape), but unsuitable for contains text strings; use unescape for that.
Parameters:
open - opener
close - closer
Returns:
data matched from the queue
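The balanced extraction described above can be sketched in plain Java with a depth counter that ignores quoted and backslash-escaped characters. This is a simplified standalone illustration, not jsoup's code: BalancedChompSketch is a hypothetical class, it works on a whole String rather than a queue, and it assumes the input starts with the opener and that opener and closer are distinct, non-quote characters.

```java
// Standalone illustration only; BalancedChompSketch is a hypothetical class, not part of jsoup.
final class BalancedChompSketch {

    /**
     * Returns a two-element array: [0] is the balanced content between the
     * outermost opener and its closer, [1] is the text left after the closer.
     * Escapes and quotes are left in the returned content unchanged.
     * Assumption: input starts with open, and open != close.
     */
    static String[] chompBalanced(String input, char open, char close) {
        if (input.isEmpty() || input.charAt(0) != open || open == close) {
            throw new IllegalArgumentException(
                "input must start with the opener, and opener must differ from closer");
        }
        int depth = 0;
        boolean inSingle = false; // inside '...'
        boolean inDouble = false; // inside "..."
        boolean escaped = false;  // previous character was an unescaped backslash
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                escaped = false;              // escaped character never affects depth
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '\'' && !inDouble) {
                inSingle = !inSingle;
            } else if (c == '"' && !inSingle) {
                inDouble = !inDouble;
            } else if (!inSingle && !inDouble) {
                if (c == open) {
                    depth++;
                } else if (c == close && --depth == 0) {
                    return new String[] { input.substring(1, i), input.substring(i + 1) };
                }
            }
        }
        throw new IllegalArgumentException("no balanced closer found");
    }

    public static void main(String[] args) {
        String[] parts = chompBalanced("(one (two) three) four", '(', ')');
        System.out.println("[" + parts[0] + "]"); // balanced content
        System.out.println("[" + parts[1] + "]"); // what would stay on the queue
    }
}
```

Because the escapes are preserved in the returned content, a caller that needs the literal text would still have to unescape it afterwards, as the description notes.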
-
unescape
Unescape a \ escaped string.
Parameters:
in - backslash escaped string
Returns:
unescaped string
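A minimal sketch of backslash unescaping, assuming the simple rule that an unescaped backslash is dropped and the character after it is kept literally. UnescapeSketch is a hypothetical standalone class; jsoup's own handling of edge cases (for example a trailing lone backslash) may differ.

```java
// Standalone illustration only; UnescapeSketch is a hypothetical class, not part of jsoup.
final class UnescapeSketch {

    /**
     * Drops each escaping backslash and keeps the character that follows it.
     * Assumption of this sketch: a lone backslash at the end is dropped.
     */
    static String unescape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        boolean escaped = false; // previous character was an escaping backslash
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\\' && !escaped) {
                escaped = true;  // remember the escape, emit nothing yet
                continue;
            }
            out.append(c);       // literal character, or the escaped one
            escaped = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("one \\( two \\) three")); // backslashes before parens removed
        System.out.println(unescape("a\\\\b"));                // an escaped backslash becomes one backslash
    }
}
```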
-
escapeCssIdentifier
Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
-
consumeWhitespace
Pulls the next run of whitespace characters off the queue.
Returns:
whether any whitespace was consumed
-
consumeElementSelector
Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
Returns:
tag name
-
consumeCssIdentifier
Consume a CSS identifier (ID or class) off the queue.
Note: For backwards compatibility this method supports improperly formatted CSS identifiers, e.g. 1 instead of \31.
Returns:
The unescaped identifier.
Throws:
IllegalArgumentException - if an invalid escape sequence was found. Afterward, the state of the TokenQueue is undefined.
See Also:
-
remainder
Consume and return whatever is left on the queue.
Returns:
remainder of queue.
-
toString
-
close
Specified by:
close in interface AutoCloseable
-