HTML-to-XML Component

Transforms HTML into well-formed XML for parsing.

HTML-to-XML Component Ranking & Summary


  • License: Shareware
  • Publisher Name: Chilkat Software, Inc.
  • Operating Systems: Windows 7/Vista/2003/XP/2000/2008/98/NT
  • File Size: 784 KB


HTML-to-XML Component Description

The Chilkat HTML-to-XML component transforms HTML into well-formed XML for parsing. In effect, it is an HTML parser and scraper: once HTML is converted to XHTML (i.e. well-formed XML), the many existing XML parsing components and libraries can be used for HTML parsing and scraping. It also includes HTML to plain-text conversion. The internal conversion process is much more sophisticated than what the simple regular-expression freeware snippets found on the Internet can accomplish, and it does more than simply remove HTML tags from a document.

  • File-to-file HTML to XML conversion.
  • Memory-to-memory HTML to XML conversion.
  • Convert character encoding during the conversion process.
  • Flexibility in controlling how HTML entities are handled.
  • Automatically convert HTML entities to the corresponding 8-bit characters.
  • Optionally drop all text formatting tags from the output.
  • Drop/undrop specific tags from the output.
  • HTML to plain-text conversion.
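The general idea described above (turn loosely structured HTML into well-formed XML, then hand it to an ordinary XML parser) can be illustrated with a minimal sketch. This is not the Chilkat component or its API; it uses only the Python standard library, and the names `HtmlToXmlSketch`, `html_to_xml`, `drop_tags`, and the `<root>` wrapper element are assumptions made up for this illustration. It handles only simple cases: unclosed tags, void elements such as `<br>`, HTML entities, and dropping selected tags.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HtmlToXmlSketch(HTMLParser):
    """Illustrative converter (hypothetical, single use per instance)."""

    def __init__(self, drop_tags=()):
        # convert_charrefs=True decodes entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.drop_tags = set(drop_tags)  # tags removed from the output
        self.out = []                    # pieces of the XML output
        self.stack = []                  # currently open elements

    def handle_starttag(self, tag, attrs):
        if tag in self.drop_tags:
            return
        # Boolean attributes (value None) get their own name as the value.
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags that are dropped, void, or were never opened.
        if tag in self.drop_tags or tag in VOID_TAGS or tag not in self.stack:
            return
        # Close every element still open inside this one, then the element.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Re-escape decoded text so the output stays well-formed.
        self.out.append(escape(data))

    def to_xml(self, html):
        self.feed(html)
        self.close()
        while self.stack:  # close anything left open at end of input
            self.out.append(f"</{self.stack.pop()}>")
        return "<root>" + "".join(self.out) + "</root>"


def html_to_xml(html, drop_tags=()):
    """Memory-to-memory conversion of an HTML string to an XML string."""
    return HtmlToXmlSketch(drop_tags).to_xml(html)


if __name__ == "__main__":
    xml = html_to_xml("<p>Fish &amp; chips<br><b>bold <i>nested</p>")
    print(xml)
    root = ET.fromstring(xml)            # any XML parser can now be used
    print("".join(root.itertext()))      # crude plain-text extraction
```

Because the result is well-formed, the standard `xml.etree.ElementTree` calls such as `find` and `itertext` work on it directly. A production converter has to deal with far more than this sketch does (character encodings, scripts and styles, invalid attribute names, HTML's implicit tag-closing rules), which is the gap a dedicated component is meant to fill.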

