XmlFormatter

Format and compress XML documents

XmlFormatter Ranking & Summary


  • License: MIT/X Consortium Lic...
  • Price: FREE
  • Publisher Name: P. Andreas Moeller


XmlFormatter Description

XmlFormatter is an open source Python class that formats XML documents. It differs from other formatters by handling whitespace according to a distinct set of formatting rules (see below), treating element content as objects and mixed content as written text. Formatting is suspended for elements marked as preserved. It is most useful for tasks involving corrections or presentation. Typical usage looks like this::

    from xmlformatter import Formatter

    formatter = Formatter(indent="4")
    print(formatter.format_file("/home/pa/doc.xml"))

The Object Style reflects the storage of object properties. All surrounding whitespace is therefore removed and sequences of whitespace are collapsed. Given the following input::

    <complex> <real> 4.4E+12</real> <imaginary>5.4E-11 </imaginary></complex>

The following shows the XML document formatted by Object Style::

    <complex> <real>4.4E+12</real> <imaginary>5.4E-11</imaginary></complex>

The Text Style reflects the storage of written text. Text is expected within mixed content. Leading and trailing whitespace is therefore moved from text nodes in nested elements to the surrounding text nodes. Note: if no surrounding text node can be found, xmlformatter inserts a text node containing a single blank outside the nested element. Sequences of whitespace are collapsed to a single blank::

    <poem> Es<em> war</em> einmal und <em>ist </em>nicht mehr...</poem>

The nested elements are handled like object properties, but whitespace is merged into text nodes instead of being removed::

    <poem>Es <em>war</em> einmal und <em>ist</em> nicht mehr...</poem>

Both styles are used together in an XML document.
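The two styles described above can be approximated with the standard library alone. The sketch below is not XmlFormatter's implementation: the helper names `collapse`, `object_style` and `text_style` are hypothetical, `text_style` only handles nested elements that have no children of their own, and neither helper re-indents the result.

```python
import re
import xml.etree.ElementTree as ET


def collapse(text):
    """Strip both ends and collapse every whitespace run to one blank."""
    return " ".join(text.split())


def object_style(elem):
    """Treat content as object properties: drop surrounding whitespace."""
    elem.text = collapse(elem.text or "") or None
    for child in elem:
        object_style(child)
        child.tail = collapse(child.tail or "") or None
    return elem


def text_style(elem):
    """Treat content as written text (leaf child elements only)."""
    children = list(elem)
    # texts[i] precedes child i, texts[i + 1] follows it.
    texts = [elem.text or ""] + [child.tail or "" for child in children]
    for i, child in enumerate(children):
        inner = child.text or ""
        if inner[:1].isspace():    # leading whitespace moves to the preceding text
            texts[i] += " "
        if inner[-1:].isspace():   # trailing whitespace moves to the following text
            texts[i + 1] = " " + texts[i + 1]
        child.text = collapse(inner)
    # Collapse whitespace runs, then trim the outer edges of the mixed content.
    texts = [re.sub(r"\s+", " ", text) for text in texts]
    texts[0] = texts[0].lstrip()
    texts[-1] = texts[-1].rstrip()
    elem.text = texts[0]
    for child, tail in zip(children, texts[1:]):
        child.tail = tail
    return elem


obj = ET.fromstring("<complex>  <real>  4.4E+12</real>\n  <imaginary>5.4E-11  </imaginary></complex>")
print(ET.tostring(object_style(obj), encoding="unicode"))

poem = ET.fromstring("<poem> Es<em> war</em> einmal und <em>ist </em>nicht mehr...</poem>")
print(ET.tostring(text_style(poem), encoding="unicode"))
```

The second call reproduces the formatted poem shown above; the first yields the `complex` element with no whitespace left between its children.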
The formatting rules are:

A: surrounding whitespace is removed from element content
B: leading whitespace is removed from element content
C: trailing whitespace is removed from element content
D: leading whitespace in nested elements is moved to the preceding text node (or inserted) within mixed content
E: trailing whitespace in nested elements is moved to the following text node (or inserted) within mixed content
F: sequences of whitespace (n>0) are replaced by a single blank " " within element and mixed content
G: a linebreak and whitespace indent elements within element content

The following example marks the described whitespace by its label within an XML document::

    <root>AAAAAAAA<number>BBBB4.4E+12CCC</number>AAAAAAAA<poem>BBBBEs<em>DDDDwar</em> einmal und <em>istEEEE</em>nicht mehrFFFFFein <strong>riesengroßer</strong><em>DDDDTeddybär</em>,Fder aßFFFFdie <em>MilchEEEE</em>und trank das BrotFFFFund als er starb da <strong>war erEEEE</strong><em>tot</em>.CCCC</poem>AAAA</root>

The following shows the formatted XML document, with each run of whitespace replaced by a single blank::

    <root> <number>4.4E+12</number> <poem>Es <em>war</em> einmal und <em>ist</em> nicht mehr ein <strong>riesengroßer</strong> <em>Teddybär</em>, der aß die <em>Milch</em> und trank das Brot und als er starb da <strong>war er</strong> <em>tot</em>.</poem></root>

Options

Formatting can be controlled by several parameters passed when constructing the XmlFormatter object. Elements that are to be left unformatted are given as a list of element names, called preserve. All descendants of preserved elements are left unformatted as well::

    from xmlformatter import Formatter

    # [...] stands for the list of element names to preserve
    formatter = Formatter(preserve=[...])
    print(formatter.format_file("/home/pa/doc.xml"))

The indenting can be raised by indent (default 2).
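As a rough illustration of what indenting element content means (rule G and the indent option), the following sketch uses only the standard library's `xml.etree.ElementTree.indent`, which requires Python 3.9 or later. It is not how XmlFormatter works internally, and the function name `indent_xml` and its parameters `width` and `char` are this sketch's own, not XmlFormatter options.

```python
import xml.etree.ElementTree as ET


def indent_xml(xml_text, width=2, char=" "):
    """Re-indent element content using `width` repetitions of `char` per level."""
    root = ET.fromstring(xml_text)
    # ET.indent inserts a line break plus indentation between child elements
    # and leaves text-only elements such as <real>4.4E+12</real> untouched.
    ET.indent(root, space=char * width)
    return ET.tostring(root, encoding="unicode")


doc = "<complex><real>4.4E+12</real><imaginary>5.4E-11</imaginary></complex>"
print(indent_xml(doc))                       # two blanks per level
print(indent_xml(doc, width=1, char="\t"))   # one tab per level
```

Raising `width` widens each nesting level, while `char` switches between blanks and tabs.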
The indenting character can be set by indentChar::

    from xmlformatter import Formatter

    formatter = Formatter(indent="1", indentChar="\t")
    print(formatter.format_file("/home/pa/doc.xml"))

Indenting can be suppressed by setting compress to True or choosing indent = 0::

    from xmlformatter import Formatter

    formatter = Formatter(compress=True)
    print(formatter.format_file("/home/pa/doc.xml"))

The encoding of the input document can be set by encoding_input. By default the encoding is UTF-8 or is read from the XML declaration. The encoding of the output can be set by encoding_output::

    from xmlformatter import Formatter

    formatter = Formatter(encoding_input="ISO-8859-1", encoding_output="ISO-8859-1")
    print(formatter.format_file("/home/pa/doc.xml"))

Methods

XmlFormatter can parse XML documents given by path or by string::

    from xmlformatter import Formatter

    formatter = Formatter()
    # file
    print(formatter.format_file("/home/pa/doc.xml"))
    # string
    formatted = formatter.format_string("<root>XML document</root>")

xmlformat.py

XmlFormatter includes a command line tool, xmlformat.py, that wraps the XmlFormatter class. The parameters are named like the options::

    xmlformat <--infile file | file>

xmlformat.py can also read from STDIN::

    cat /home/pa/doc.xml | python xmlformat.py

Note

XmlFormatter is built on top of the expat parser and is therefore limited by expat. XmlFormatter is published under the MIT license.
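Because the formatter relies on expat, a document that expat rejects cannot be formatted. The sketch below checks well-formedness with Python's own `xml.parsers.expat` binding before any formatting is attempted. The helper name `check_well_formed` is hypothetical, and how XmlFormatter itself reports parser errors is not stated in this description, so this only demonstrates the underlying parser.

```python
import xml.parsers.expat as expat


def check_well_formed(data):
    """Return None if expat accepts `data`, otherwise a short error message."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(data, True)
    except expat.ExpatError as err:
        reason = expat.ErrorString(err.code)
        return "line %d, column %d: %s" % (err.lineno, err.offset, reason)
    return None


print(check_well_formed("<root>XML document</root>"))
print(check_well_formed("<root><a></root>"))
```

Running such a check first makes it possible to report the position of the problem before handing the document to a formatter.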

