jsoup 1.4.1 released

2010-Nov-23: jsoup 1.4.1 has been released and is now available for download.

Change log

  • Added ability to load and parse HTML from an input stream.
  • Implemented Node.clone() to create deep, independent copies of Nodes, Elements, and Documents.
  • Added :not() selector, to find elements that do not match the selector. E.g. div:not(.logo) finds divs that do not have the "logo" class name.
  • Added Elements.not(String query) method, to filter unwanted elements out of selector results.
  • Implemented DataNode.setWholeData(String) to allow updating of script and style data contents.
  • Relaxed parse rules of H1 - H6, to allow nested content. This is against spec, but matches browser and publisher behaviour.
  • Relaxed parse rule of SPAN to treat as block, to allow nested block content.
  • Fixed issue in Jsoup.connect when extracting the character set from the content-type header; quoted charset declarations are now supported.
  • Fixed Jsoup.connect so that it follows redirects between http and https URLs.
  • Document normalisation now more enthusiastically enforces the correct document structure.
  • Added support for Node.outerHtml() when the node has no parent (e.g. when it has been removed from its DOM tree).
  • Fixed support for HTML entities with numbers in their names (e.g. &frac34;, &sup1;).
  • Fixed absolute URL generation from relative URLs that consist only of a query string.
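
The new API entries in the change log above (input stream parsing, Node.clone(), the :not() selector, Elements.not(String) and DataNode.setWholeData(String)) can be tried together in a small sketch like the one below. The class name, the helper method names, the sample markup and the http://example.com/ base URI are all made up for illustration, and the code assumes the jsoup jar is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.DataNode;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Jsoup141Sketch {
    // Hypothetical sample markup, used only for illustration.
    static final String HTML =
        "<html><head><script>var a = 1;</script></head><body>"
        + "<div class='logo'>Logo</div>"
        + "<div class='main'>Main</div>"
        + "<div class='side'>Side</div>"
        + "</body></html>";

    // Parse from an input stream; the base URI is an assumed placeholder.
    static Document parseFromStream() {
        InputStream in = new ByteArrayInputStream(HTML.getBytes(StandardCharsets.UTF_8));
        try {
            return Jsoup.parse(in, "UTF-8", "http://example.com/");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // :not() inside the selector: divs that do not have the "logo" class.
    static int countWithNotSelector() {
        return parseFromStream().select("div:not(.logo)").size();
    }

    // Elements.not(String): select all divs, then drop the unwanted ones.
    static int countWithNotMethod() {
        return parseFromStream().select("div").not(".logo").size();
    }

    // clone() makes a deep copy, so editing the copy leaves the original alone.
    static String originalTextAfterEditingClone() {
        Document original = parseFromStream();
        Document copy = original.clone();
        copy.select("div.main").first().text("Changed");
        return original.select("div.main").first().text();
    }

    // setWholeData(String) replaces the contents of a script's data node.
    static String updatedScriptData() {
        Element script = parseFromStream().select("script").first();
        for (DataNode node : script.dataNodes()) {
            node.setWholeData("var a = 2;");
        }
        return script.data();
    }

    public static void main(String[] args) {
        System.out.println(countWithNotSelector());
        System.out.println(countWithNotMethod());
        System.out.println(originalTextAfterEditingClone());
        System.out.println(updatedScriptData());
    }
}
```

The two counting helpers should agree with each other: :not() filters while selecting, whereas Elements.not(String) filters a result set that has already been selected.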
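
To illustrate what the last change log item is about: a link such as ?page=2 is a relative URL made up of nothing but a query string, and it is expected to keep the path of the page it appears on. The sketch below shows one way to get that result using only java.net.URI. It is not taken from the jsoup source; the class name, the resolve helper and the example URLs are hypothetical.

```java
import java.net.URI;

public class QueryOnlyUrlSketch {
    // Resolve relUrl against baseUrl. When relUrl is only a query string,
    // prepend the base path first so the file part of the path is kept.
    // (Illustrative helper, not jsoup's implementation.)
    static String resolve(String baseUrl, String relUrl) {
        URI base = URI.create(baseUrl);
        if (relUrl.startsWith("?")) {
            relUrl = base.getRawPath() + relUrl;
        }
        return base.resolve(relUrl).toString();
    }

    public static void main(String[] args) {
        // Query-only relative URL: the path of the base page is preserved.
        System.out.println(resolve("http://example.com/path/file", "?page=2"));
        // Ordinary relative URL: resolved against the base directory.
        System.out.println(resolve("http://example.com/path/file", "other?x=1"));
    }
}
```

In jsoup itself, absolute URLs are normally obtained through the library's own API rather than a helper like this; the sketch only shows the resolution behaviour one would expect for query-only links.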