jsoup 1.6.3 released

2012-May-28

I am happy to announce that jsoup 1.6.3 has been released and is now available for download. This release brings a number of bug fixes and improvements, most notably restored support for Google App Engine and several parsing fixes.

jsoup is a Java library for working with real-world HTML. It provides a convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

Improvements

  • Fixed GAE support: load HTML entities from a file on startup, instead of embedding in the class.
  • In HTML whitelists, when defining allowed attributes for a tag, automatically add the tag to the allowed list.
  • If a node has no parent, previousSibling() and nextSibling() now return null instead of throwing a NullPointerException.
  • Updated Node.siblingNodes() and Element.siblingElements() to exclude the current node (a node is not its own sibling).
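
The sibling changes in the last two items can be pictured with a small, library-free sketch. ToyNode is a hypothetical stand-in written for this illustration and is not jsoup's Node class. Returning an empty list from siblingNodes() for a detached node is also an assumption of the sketch, not something stated in these notes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a DOM node; this is NOT jsoup's Node class.
final class ToyNode {
    final String name;
    private ToyNode parent;                                   // null while the node is detached
    private final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name) { this.name = name; }

    // Attaches child to this node and returns this node for chaining.
    ToyNode append(ToyNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    // A detached node has no siblings: return null rather than dereferencing a null parent.
    ToyNode nextSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this) + 1;
        return i < parent.children.size() ? parent.children.get(i) : null;
    }

    ToyNode previousSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this) - 1;
        return i >= 0 ? parent.children.get(i) : null;
    }

    // Every child of the parent except this node: a node is not its own sibling.
    // Assumption of this sketch: a detached node yields an empty list.
    List<ToyNode> siblingNodes() {
        if (parent == null) return Collections.emptyList();
        List<ToyNode> siblings = new ArrayList<>(parent.children);
        siblings.remove(this);
        return siblings;
    }
}
```

With three children a, b, and c under one parent, b.siblingNodes() in this sketch contains only a and c, and calling nextSibling() on the parentless root returns null.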

Bug fixes

  • Fixed parsing of group-or commas in CSS selectors, to correctly handle sub-queries containing commas.
  • Fixed HTML entity parser to correctly parse entities like frac14 (letter + number combo).
  • Fixed issue where contents of a script tag within a comment could be incorrectly parsed.
  • Fixed NPE when HTML fragment parsing a
  • Fixed issue with :all pseudo-tag in HTML sanitizer when cleaning tags previously defined in the whitelist.
  • Fixed NPE in Parser.parseFragment() when context parameter is null.
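
To show why commas inside sub-queries need care (the first fix above), here is a minimal sketch of splitting a selector group only on top-level commas. It is an illustration of the idea under my own assumptions, not jsoup's actual query parser. SelectorGroupSketch and splitTopLevel are hypothetical names, and the sample query is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; NOT jsoup's query parser.
final class SelectorGroupSketch {

    // Splits a selector group on commas that sit outside parentheses,
    // brackets, and quoted strings, so sub-queries stay intact.
    static List<String> splitTopLevel(String query) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;      // nesting level of ( ) and [ ]
        char quote = 0;     // active quote character, or 0 when outside quotes

        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (quote != 0) {
                if (c == quote) quote = 0;            // closing quote
            } else if (c == '"' || c == '\'') {
                quote = c;                            // opening quote
            } else if (c == '(' || c == '[') {
                depth++;
            } else if ((c == ')' || c == ']') && depth > 0) {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma ends the current alternative.
                parts.add(current.toString().trim());
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        parts.add(current.toString().trim());
        return parts;
    }
}
```

In this sketch, splitTopLevel("div:has(a, b), p") yields the two alternatives "div:has(a, b)" and "p", whereas a naive split on every comma would cut the sub-query in half.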

Many thanks to everyone who contributed patches, suggestions, and bug reports. If you have any suggestions for the next release, I would love to hear them; please get in touch via the mailing list or contact me directly.