Package org.jsoup.parser
Class TokenQueue
java.lang.Object
org.jsoup.parser.TokenQueue
All Implemented Interfaces: AutoCloseable
public class TokenQueue extends Object implements AutoCloseable
A character reader with helpers focused on parsing CSS selectors. Used internally by jsoup. The API is subject to change.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method: Description
void advance(): Drops the next character off the queue.
String chompBalanced(char open, char close): Pulls a balanced string off the queue.
void close()
char consume(): Consume one character off queue.
void consume(String seq): Consumes the supplied sequence off the queue, case-insensitively.
String consumeCssIdentifier(): Consume a CSS identifier (ID or class) off the queue.
String consumeElementSelector(): Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
String consumeTo(String seq): Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
String consumeToAny(String... seq): Consumes to the first sequence provided, or to the end of the queue.
boolean consumeWhitespace(): Pulls the next run of whitespace characters off the queue.
static String escapeCssIdentifier(String in): Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
boolean isEmpty(): Is the queue empty?
boolean matchChomp(char c): If the queue matches the supplied (case-sensitive) character, consume it off the queue.
boolean matchChomp(String seq): If the queue case-insensitively matches the supplied string, consume it off the queue.
boolean matches(char c): Tests if the next character on the queue matches the character, case-sensitively.
boolean matches(String seq): Tests if the next characters on the queue match the sequence, case-insensitively.
boolean matchesAny(char... seq): Tests if the next character matches any of the supplied characters, case-sensitively.
boolean matchesWhitespace(): Tests if queue starts with a whitespace character.
boolean matchesWord(): Test if the queue matches a tag word character (letter or digit).
String remainder(): Consume and return whatever is left on the queue.
String toString()
static String unescape(String in): Unescape a \ escaped string.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Constructor Details
-
Method Details
-
isEmpty
public boolean isEmpty()
Is the queue empty?
Returns: true if no data left in queue.
-
consume
public char consume()
Consume one character off queue.
Returns: first character on queue.
-
advance
public void advance()
Drops the next character off the queue.
-
matches
public boolean matches(String seq)
Tests if the next characters on the queue match the sequence, case-insensitively.
Parameters: seq - String to check queue for.
Returns: true if the next characters match.
-
matches
public boolean matches(char c)
Tests if the next character on the queue matches the character, case-sensitively.
-
matchesAny
public boolean matchesAny(char... seq)
Tests if the next character matches any of the supplied characters, case-sensitively.
Parameters: seq - list of chars to case-sensitively check for
Returns: true if any matched, false if none did
-
matchChomp
public boolean matchChomp(String seq)
If the queue case-insensitively matches the supplied string, consume it off the queue.
Parameters: seq - String to search for, and if found, remove from queue.
Returns: true if found and removed, false if not found.
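The match-then-consume behaviour described here can be illustrated with a small standalone sketch. MiniQueue is a hypothetical class written only for this illustration; it is not jsoup's implementation and assumes nothing about how TokenQueue stores its input.

```java
// Standalone illustration; MiniQueue is a hypothetical class, not part of jsoup.
final class MiniQueue {
    private final String data;
    private int pos = 0; // index of the next unread character

    MiniQueue(String data) {
        this.data = data;
    }

    // True if the unread input starts with seq, ignoring case.
    boolean matches(String seq) {
        return data.regionMatches(true, pos, seq, 0, seq.length());
    }

    // Consumes seq only when it sits at the head of the unread input.
    boolean matchChomp(String seq) {
        if (matches(seq)) {
            pos += seq.length();
            return true;
        }
        return false;
    }

    // Consumes and returns everything not yet read.
    String remainder() {
        String rest = data.substring(pos);
        pos = data.length();
        return rest;
    }
}
```

With the input "DIV > p", matchChomp("div") succeeds despite the case difference and leaves " > p" unread, while a failed match leaves the position untouched.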
-
matchChomp
public boolean matchChomp(char c)
If the queue matches the supplied (case-sensitive) character, consume it off the queue.
-
matchesWhitespace
public boolean matchesWhitespace()
Tests if queue starts with a whitespace character.
Returns: true if the queue starts with whitespace
-
matchesWord
public boolean matchesWord()
Test if the queue matches a tag word character (letter or digit).
Returns: true if the queue starts with a word character
-
consume
public void consume(String seq)
Consumes the supplied sequence off the queue, case-insensitively. If the queue does not start with the supplied sequence, will throw an illegal state exception; check with matches() first to guard against that condition.
Parameters: seq - sequence to remove from head of queue.
-
consumeTo
public String consumeTo(String seq)
Pulls a string off the queue, up to but exclusive of the match sequence, or to the queue running out.
Parameters: seq - String to end on (and not include in return, but leave on queue). Case-sensitive.
Returns: The matched data consumed from queue.
-
consumeToAny
public String consumeToAny(String... seq)
Consumes to the first sequence provided, or to the end of the queue. Leaves the terminator on the queue.
Parameters: seq - any number of terminators to consume to. Case-insensitive.
Returns: consumed string
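The "consume up to the earliest terminator, but leave the terminator behind" rule can be sketched without jsoup. ConsumeToSketch and its two-element return value are assumptions made for this illustration; the real method returns only the consumed string and keeps the rest on the queue.

```java
// Standalone illustration; ConsumeToSketch is hypothetical, not part of jsoup.
final class ConsumeToSketch {
    // Returns {consumed, remaining}. The consumed part stops just before the
    // earliest position where any terminator matches, ignoring case.
    static String[] consumeToAny(String input, String... terminators) {
        int end = input.length(); // default: no terminator found, consume everything
        scan:
        for (int i = 0; i < input.length(); i++) {
            for (String t : terminators) {
                if (!t.isEmpty() && input.regionMatches(true, i, t, 0, t.length())) {
                    end = i;
                    break scan;
                }
            }
        }
        return new String[] { input.substring(0, end), input.substring(end) };
    }
}
```

For the input "div.card > P" with terminators ">" and " ", the consumed part is "div.card" and the remaining part still begins with the space that ended the scan.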
-
chompBalanced
public String chompBalanced(char open, char close)
Pulls a balanced string off the queue. E.g. if queue is "(one (two) three) four", (,) will return "one (two) three", and leave " four" on the queue. Unbalanced openers and closers can be quoted (with ' or ") or escaped (with \). Those escapes will be left in the returned string, which is suitable for regexes (where we need to preserve the escape), but unsuitable for contains text strings; use unescape for that.
Parameters: open - opener
close - closer
Returns: data matched from the queue
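The balancing rule above (nested openers and closers count, quoted or escaped ones do not) can be sketched as a depth counter. BalancedSketch is a hypothetical standalone class for illustration, not jsoup's code; it assumes open and close are different characters and returns null where the input is unbalanced, which may differ from how TokenQueue reports that case.

```java
// Standalone illustration; BalancedSketch is hypothetical, not part of jsoup.
final class BalancedSketch {
    // Returns the text between the first opener and its matching closer,
    // or null if the input never balances.
    static String chompBalanced(String input, char open, char close) {
        int depth = 0;           // current nesting level
        int start = -1;          // index just after the outermost opener
        char quote = 0;          // active quote character, or 0 outside quotes
        boolean escaped = false; // true when the previous character was a backslash

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                escaped = false;            // an escaped character is never structural
            } else if (c == '\\') {
                escaped = true;             // the backslash itself stays in the result
            } else if (quote != 0) {
                if (c == quote) quote = 0;  // closing quote
            } else if (c == '\'' || c == '"') {
                quote = c;                  // opening quote
            } else if (c == open) {
                if (depth++ == 0) start = i + 1;
            } else if (c == close) {
                if (depth == 0) return null; // closer before any opener
                if (--depth == 0) return input.substring(start, i);
            }
        }
        return null; // ran out of input before the outermost opener was closed
    }
}
```

Called with "(one (two) three) four" and the pair ( and ), the sketch returns "one (two) three". Escapes are kept in the result, matching the note above that a separate unescape step is needed for plain text.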
-
unescape
public static String unescape(String in)
Unescape a \ escaped string.
Parameters: in - backslash escaped string
Returns: unescaped string
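As a rough picture of backslash unescaping, the sketch below drops each escaping backslash and keeps the character that follows it, so a doubled backslash yields a single one. UnescapeSketch is a hypothetical standalone class; it is an assumption about the general technique, and jsoup's own unescape may handle further cases (such as CSS hex escapes) differently.

```java
// Standalone illustration; UnescapeSketch is hypothetical, not part of jsoup.
final class UnescapeSketch {
    static String unescape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        boolean escaped = false; // true when the previous character was an escaping backslash
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (!escaped && c == '\\') {
                escaped = true;   // drop the backslash, keep whatever follows
            } else {
                out.append(c);    // literal character, or the escaped character itself
                escaped = false;
            }
        }
        return out.toString();
    }
}
```

Applied to the escaped output of a balanced match such as a \) b, this produces a ) b, which is the form suited to plain text comparison.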
-
escapeCssIdentifier
public static String escapeCssIdentifier(String in)
Given a CSS identifier (such as a tag, ID, or class), escape any CSS special characters that would otherwise not be valid in a selector.
-
consumeWhitespace
public boolean consumeWhitespace()
Pulls the next run of whitespace characters off the queue.
Returns: true if any whitespace was consumed, false otherwise
-
consumeElementSelector
public String consumeElementSelector()
Consume a CSS element selector (tag name, but | instead of : for namespaces (or *| for wildcard namespace), to not conflict with :pseudo selects).
Returns: tag name
-
consumeCssIdentifier
public String consumeCssIdentifier()
Consume a CSS identifier (ID or class) off the queue.
Note: For backwards compatibility this method supports improperly formatted CSS identifiers, e.g. 1 instead of \31.
Returns: The unescaped identifier.
Throws: IllegalArgumentException - if an invalid escape sequence was found. Afterward, the state of the TokenQueue is undefined.
-
remainder
public String remainder()
Consume and return whatever is left on the queue.
Returns: remainder of queue.
-
toString
public String toString()
Overrides: toString in class Object
-
close
public void close()
Specified by: close in interface AutoCloseable
-