treecompare

Python library to compare large trees of arbitrary objects

treecompare Ranking & Summary

  • License: BSD License
  • Price: FREE
  • Publisher Name: Ruy Asan
  • Publisher web site: https://github.com/rubyruy/

treecompare Description

treecompare is a Python library for comparing trees of various objects in a way that yields useful "paths" to each difference. Simply knowing that two object blobs differ is hardly useful without knowing where exactly the differences are located. For text blobs, text-diff utilities can solve this problem, but they are ill suited for dealing with arbitrary data structures such as dictionaries, where key order doesn't matter.

Installation

Available from the usual places:

    sudo pip install treecompare

Or just use setuptools for developing:

    git clone <fork>
    cd treecompare
    python setup.py develop

Usage

Basic usage:

    >>> from treecompare import diff
    >>> differences = diff(
    ...   expected = {
    ...     "glossary": {
    ...       "title": "example glossary",
    ...       "GlossDiv": {
    ...         "title": "S",
    ...         "GlossList": {
    ...           "GlossEntry": {
    ...             "ID": "SGML",
    ...             "SortAs": "SGML",
    ...             "GlossTerm": "Standard Generalized Markup Language",
    ...             "Acronym": "SGML",
    ...             "Abbrev": "ISO 8879:1986",
    ...             "GlossDef": {
    ...               "para": "A meta-markup language, used to create markup languages such as DocBook.",
    ...               "GlossSeeAlso":
    ...             },
    ...             "GlossSee": "markup"
    ...           }
    ...         }
    ...       }
    ...     }
    ...   },
    ...   actual = {
    ...     "glossary": {
    ...       "title": "example glossary",
    ...       "GlossDiv": {
    ...         "title": "S",
    ...         "GlossList": {
    ...           "GlossEntry": {
    ...             "ID": "SGML",
    ...             "SortAs": "SGML",
    ...             "GlossTerm": "Standard Generalized Markup Language",
    ...             "Acronym": "SGML",
    ...             "Abbrev": "ISO 8879:1986",
    ...             "GlossDef": {
    ...               "para": "A meta-markup language, used to create markup leenguages such as DocBook.",
    ...               "GlossSeeAlso":
    ...             },
    ...             "GlossSee": "markup"
    ...           }
    ...         }
    ...       }
    ...     }
    ...   }
    ... )
    >>> print '\n'.join(map(str,differences))
    : expected 'A meta-markup language, used to create markup languages such as DocBook.', got 'A meta-markup language, used to create markup leenguages such as DocBook.' - diff:
    - A meta-markup language, used to create markup languages such as DocBook.
    ?                                                ^
    + A meta-markup language, used to create markup leenguages such as DocBook.
    ?                                                ^^

The part of each output line before the colon is the "path" of the difference; it shows exactly how one can navigate the objects to get to the differing parts. As the difference is inside a reasonably large block of text, it is highlighted even further using text-diffs.

Matching options

You can configure different matching strategies using the options argument:

    >>> diff(, )
    : "expected 'A', got 'a'"), Difference(: "expected 'b', got 'B'")]
    >>> diff(, , options='ignore_case')
    []

You can also pass multiple matching options using a tuple of strings.

Scoped options

As each 'node' in a tree of objects has a "path", it is easy to refer to specific sections of your tree using nothing more complicated than a regex. You can use this to specify matching options for only parts of your tree:

    >>> diff(
    ...   options = {
    ...     r'^\\': ('assert_includes', 'ignore_case')
    ...   },
    ...   expected = ,
    ...   actual =
    ... )
    : "expected 'orCHards', got 'orchards'")]

Note how the ignore_case option allowed "orAnGe" to match "ORANGE", but did not allow "orchards" to match "orCHards", as r'^\\' does not match the path ''.

Supported options

  • ignore: Don't bother matching objects at this path.
  • assert_includes: Instead of regular equality matching, verify that the object at path is included in the tuple at the corresponding path of the expected object.
  • ignore_key: Ignore the 'key' (e.g. order index for lists, string key for dicts) when comparing nodes at path. Most useful when performing any-order comparisons.
  • ignore_case: Use case-insensitive comparison for strings at this path.
  • ignore_spacing: Ignore absolutely all whitespace (including line endings) except for purposes of separating words.
  • ignore_line_whitespace: When comparing strings, normalize line endings and ignore any leading or trailing whitespace.

XML Diff

The optional XML diffing module works exactly the same:

    >>> from treecompare.xml import diff_xml
    >>> differences = diff_xml("""<?xml version="1.0" encoding="UTF-8"?>
    ... <menu id="file" value="File">
    ...   <popup>
    ...     <menuitem value="New" onclick="CreateNewDoc()" />
    ...     <menuitem value="Open" onclick="OpenDoc()" />
    ...     <menuitem value="Close" onclick="CloseDoc()" />
    ...   </popup>
    ... </menu>""",
    ... """<?xml version="1.1" encoding="UTF-8"?>
    ... <menu id="file" value="File">
    ...   <popup>
    ...     <menuitem value="New" onclick="CreateNewDoc()" />
    ...     <menuitem value="Open" onclick="OpenDuck()" />
    ...     <menuitem value="Close" onclick="CloseDoc()" />
    ...   </popup>
    ... </menu>""")
    >>> print '\n'.join(map(str,differences))
    ?xml@version: expected u'1.0', got u'1.1'
    /0<menu>/1<popup>/3<menuitem>@onclick: expected u'OpenDoc()', got u'OpenDuck()'

The number before each element in the path is the node's index in its parent (counting text nodes). You can use the ignore_key option to match certain elements in any order.

Extending

A diff function contains a list of implementation classes that perform the actual work. The default diff contains implementations for the major Python builtins. diff_xml adds further implementations for XML nodes.

ImplementationBase is, not surprisingly, the base class for implementations. Your subclass must be able to answer to:

1. cls.can_diff(obj) - are you able to diff this object? Note: a default implementation is provided that simply does an isinstance check against cls.diffs_types; setting that class attribute should suffice in most cases.
2. self.diff(expected, actual) - the actual implementation; must return a list of Difference objects.

For the vast majority of diff implementations, one only really needs to recursively diff certain object attributes and append something to the current "path" for each attribute. The ChildDiffingMixing allows you to do this very easily - you need only implement a method that yields each (path, child) pair. Everything else, including options handling, is handled for you.

The XML differ implementation illustrates how easy this is:

    from xml.dom import minidom as dom  # assumption: the original snippet does not show where dom comes from
    from treecompare.implementations import ChildDiffingMixing, ImplementationBase

    class DiffXMLElement(ChildDiffingMixing, ImplementationBase):
        diffs_types = dom.Element

        def path_and_child(self, el):
            # The tag name and each attribute are leaf children of the element.
            yield ":tag", el.tagName
            for name, value in el.attributes.items():
                yield "@%s" % name, value
            # Child nodes are addressed by their index in the parent.
            for i, child in enumerate(el.childNodes):
                if hasattr(child, 'tagName'):
                    yield "/%d<%s>" % (i, child.tagName), child
                else:
                    yield "/%d:text" % i, child.data

Nothing else to it!

Finally, you have to register your implementation with a differ function. A factory function is provided that generates your own copy of diff(), with all the default builtin implementations already included plus any you add:

    from treecompare.differ import make_differ
    custom_diff = make_differ(MyCustomDiff, SomeOtherImplementation)

Note that can_diff is called for each implementation in order, and only the first match is used. If you want your custom implementation to override a builtin, you may manipulate the custom_diff.implementations list directly.
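To make the (path, child) idea concrete, here is a minimal, self-contained sketch in plain Python. It does not use or reproduce treecompare's own code: path_and_child as a free function, toy_diff, the "root" path prefix, and the message wording are all hypothetical choices made for this illustration only.

```python
def path_and_child(obj):
    """Yield (path_suffix, child) pairs for containers; nothing for leaves.

    Hypothetical helper for this sketch, not part of treecompare.
    """
    if isinstance(obj, dict):
        # Sort keys so the result does not depend on dict ordering.
        for key in sorted(obj, key=repr):
            yield "[%r]" % (key,), obj[key]
    elif isinstance(obj, (list, tuple)):
        for index, child in enumerate(obj):
            yield "[%d]" % index, child


def toy_diff(expected, actual, path="root"):
    """Return a list of (path, message) tuples, one per difference found."""
    # A dict never matches a non-dict, even if the path suffixes coincide.
    if isinstance(expected, dict) != isinstance(actual, dict):
        return [(path, "expected %r, got %r" % (expected, actual))]

    expected_children = dict(path_and_child(expected))
    actual_children = dict(path_and_child(actual))

    if not expected_children and not actual_children:
        # Leaves (or empty containers): fall back to plain equality.
        if expected != actual:
            return [(path, "expected %r, got %r" % (expected, actual))]
        return []

    differences = []
    for suffix, expected_child in expected_children.items():
        child_path = path + suffix
        if suffix in actual_children:
            # Recurse, extending the path by this child's suffix.
            differences.extend(
                toy_diff(expected_child, actual_children[suffix], child_path))
        else:
            differences.append((child_path, "missing from actual"))
    for suffix in actual_children:
        if suffix not in expected_children:
            differences.append((path + suffix, "not in expected"))
    return differences


differences = toy_diff(
    {"title": "S", "tags": ["GML", "XML"]},
    {"title": "s", "tags": ["GML", "XSL"], "extra": 1},
)
for path, message in differences:
    print("%s: %s" % (path, message))
```

Because each difference carries its path as a plain string, restricting a rule to one part of the tree could be as simple as testing that string against a regular expression, which is the intuition behind scoped options.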


treecompare Related Software