HtmlCleaner

Free and open source HTML parser

HtmlCleaner Ranking & Summary


  • License: BSD
  • Price: FREE
  • Publisher Name: Vladimir Nikic
  • Operating Systems: Mac OS X
  • File Size: 1.6 MB



HtmlCleaner Description

HtmlCleaner is a free, open-source HTML parser written in Java. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. Any serious consumption of such documents first requires cleaning up the mess and bringing order to tags, attributes, and ordinary text. For a given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those most web browsers use to create the Document Object Model. However, the user may provide a custom tag and rule set for tag filtering and balancing.

Requirements:

  • Java 1.6 or later

What's New in This Release:

  • Parsing transformations were added to make it easy to skip or change specified tags or attributes during the cleanup process.
  • Several more constructors were added to class HtmlCleaner, making it possible to reuse the same cleaner properties across multiple cleaner instances.
  • Code cleanup.
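
To give a feel for the kind of tag balancing the description mentions, here is a minimal, self-contained Java sketch. It is not HtmlCleaner's code and does not use its API: the class name TagBalancerSketch, the method balance, and the small list of void tags are all hypothetical, chosen only for illustration. The sketch handles attribute-free tags only and copies text through unchanged, whereas a real cleaner also deals with attributes, entities, comments, and browser-like nesting rules.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration of tag balancing; NOT HtmlCleaner's implementation.
public class TagBalancerSketch {
    // Illustrative subset of HTML tags that never take a closing tag (assumption).
    private static final Set<String> VOID_TAGS =
            new HashSet<>(Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    // Matches attribute-free tags only, e.g. <b>, </b>, <br>, <br/>.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)\\s*/?>");

    public static String balance(String html) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // stack of currently open tags
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            out.append(html, last, m.start()); // copy text between tags verbatim
            last = m.end();
            String name = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            if (VOID_TAGS.contains(name)) {
                // Emit void tags in self-closed form; ignore stray </br> and the like.
                if (!closing) out.append('<').append(name).append("/>");
            } else if (!closing) {
                open.push(name);
                out.append('<').append(name).append('>');
            } else if (open.contains(name)) {
                // Close every tag opened after the matching one, then the match itself.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            }
            // A closing tag with no matching open tag is silently dropped.
        }
        out.append(html, last, html.length());
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) out.append("</").append(open.pop()).append('>');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(balance("<p><b>bold<i>both</b>text"));
    }
}
```

Given the input <p><b>bold<i>both</b>text, the sketch closes the dangling <i> before </b> and appends the missing </p>, so the result nests properly. The stack-based approach is just one simple way to get well-formed output; it makes no attempt to reproduce the browser-like rules the description attributes to HtmlCleaner.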


HtmlCleaner Related Software