HTML-to-XML Component
Designed for the purpose of transforming HTML into well-formed XML for parsing.
HTML-to-XML Component Ranking & Summary
- License: Shareware
- Publisher Name: Chilkat Software, Inc.
- Publisher web site:
- Operating Systems: Windows 7/Vista/2003/XP/2000/2008/98/NT
- File Size: 784KB
HTML-to-XML Component Description
The Chilkat HTML-to-XML component is designed to transform HTML into well-formed XML for parsing. In effect, it is an HTML parser and scraper. Once HTML is converted to XHTML (i.e. well-formed XML), the many existing XML parsing components and libraries can be used for HTML parsing and scraping. It also includes HTML to plain-text conversion. The internal conversion process is much more sophisticated than what can be accomplished with the simple regular-expression freeware snippets found on the Internet, and it does more than simply remove HTML tags from an HTML document.
* File-to-file HTML to XML conversion.
* Memory-to-memory HTML to XML conversion.
* Convert character encoding during the conversion process.
* Flexibility in controlling how HTML entities are handled.
* Automatically convert HTML entities to the corresponding 8-bit characters.
* Optionally drop all text formatting tags from the output.
* Drop/undrop specific tags from the output.
* HTML to plain-text conversion.
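To illustrate why converting HTML to well-formed XML makes it usable with ordinary XML parsers, here is a minimal sketch using only the Python standard library. It is not the Chilkat API and is far simpler than the conversion the component performs; the names `HtmlToXmlBuilder` and `html_to_xml`, and the wrapping `root` element, are hypothetical and chosen for this example only. It closes unclosed tags, treats void elements such as `<br>` as empty, and decodes HTML entities to characters.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HtmlToXmlBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from loosely written HTML."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # assumed wrapper so output has one root
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. <input disabled>) get an empty string.
        attributes = {name: value or "" for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, implicitly closing anything
        # opened inside it; stray end tags with no match are ignored.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text after a child element belongs to that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_xml(html):
    """Memory-to-memory conversion: HTML string in, XML string out."""
    builder = HtmlToXmlBuilder()
    builder.feed(html)
    builder.close()  # flushes any buffered trailing text
    return ET.tostring(builder.root, encoding="unicode")


# Malformed input: <p>, <b> and <i> are never properly closed.
xml_text = html_to_xml("<p>Fish &amp; chips<br><b>bold <i>nested</b> text")
doc = ET.fromstring(xml_text)  # an ordinary XML parser now accepts it
print(xml_text)
print([element.tag for element in doc.iter()])
```

Once the output parses as XML, standard tools such as `find` with path expressions can be used for scraping. This sketch does not attempt character-encoding conversion, tag dropping, or sanitizing attribute names that are invalid in XML, all of which a production converter would need to handle.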
HTML-to-XML Component Related Software