It is often said there are more generalized diff libraries for Python than stars in the Milky Way. Here I compare them all.
Did I miss one or do you have suggestions for other criteria? Feel free to open an issue!
While regular UNIX diff is well-suited to comparing files and strings of
characters, there are libraries for most programming languages that compare
structured data like JSON or that language's JSON equivalent (in Python: dicts,
lists and primitives, called "JSON-ish" from here on). They produce patches
written in terms of that structure rather than in terms of lines and
characters. That's what I mean by "generalized diff". I guess other terms would
be "data diff", "structure diff", "structured data diff", "JSON diff", or
something along those lines.
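To make the idea concrete, here is a minimal sketch of a generalized diff over JSON-ish data. The function name `structural_diff` and the operation format (dicts with `op`, `path` and `value`) are made up for this illustration and only loosely modeled on JSON Patch; none of the libraries compared here is claimed to work this way. Lists are compared naively index by index, and dict keys are assumed to be strings without `/`.

```python
from typing import Any


def structural_diff(old: Any, new: Any, path: str = "") -> list:
    """Return a list of operations that turn `old` into `new`.

    Hypothetical toy format: each operation is a dict with "op"
    ("add", "remove" or "replace"), a "/"-separated "path", and
    a "value" for "add" and "replace".
    """
    # Different types (including bool vs. int) are replaced wholesale.
    if type(old) is not type(new):
        return [{"op": "replace", "path": path, "value": new}]

    if isinstance(old, dict):
        ops = []
        for key in old:
            child = f"{path}/{key}"
            if key not in new:
                ops.append({"op": "remove", "path": child})
            else:
                ops.extend(structural_diff(old[key], new[key], child))
        for key in new:
            if key not in old:
                ops.append({"op": "add", "path": f"{path}/{key}", "value": new[key]})
        return ops

    if isinstance(old, list):
        # Naive positional comparison: no detection of moved or inserted items.
        ops = []
        common = min(len(old), len(new))
        for i in range(common):
            ops.extend(structural_diff(old[i], new[i], f"{path}/{i}"))
        # Remove surplus items from the end so earlier indices stay valid.
        for i in range(len(old) - 1, common - 1, -1):
            ops.append({"op": "remove", "path": f"{path}/{i}"})
        for i in range(common, len(new)):
            ops.append({"op": "add", "path": f"{path}/{i}", "value": new[i]})
        return ops

    # Primitives: equal values produce no operation.
    if old != new:
        return [{"op": "replace", "path": path, "value": new}]
    return []


old = {"name": "Ada", "tags": ["math", "poetry"], "age": 36}
new = {"name": "Ada", "tags": ["math", "engines"], "email": "ada@example.org"}
ops = structural_diff(old, new)
for op in ops:
    print(op)
```

The patch talks about keys and indices (`/tags/1`, `/age`) instead of lines and characters, which is the defining property of a generalized diff. Real libraries differ mainly in how cleverly they handle lists and in the patch format they emit.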
Name / URL | Supported data structures | Diff/patch output formats | Can apply patches | Diffing time complexity |
---|---|---|---|---|
jsonpatch | | | ✔ | ? |
DeepDiff | | | ❌ | O(n) when not ignoring order, otherwise ? |
jycm | | | ❌ | ? |
jsondiff | | | ✔ (undocumented) | ? |
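As a rough illustration of what the "Can apply patches" column means: a library that can apply patches takes a document plus a patch and produces the changed document. The sketch below does this for a made-up patch format (dicts with `op`, `path` and `value`, where `path` is a `/`-separated list of keys and indices). The function `apply_patch` and the format are assumptions for this example only, not the API of any library in the table.

```python
import copy
from typing import Any


def apply_patch(doc: Any, ops: list) -> Any:
    """Apply a list of toy operations to a copy of `doc` and return it.

    Hypothetical format: {"op": "add" | "remove" | "replace",
    "path": "/key/0/...", "value": ...}. Keys must not contain "/".
    """
    result = copy.deepcopy(doc)  # never mutate the caller's data
    for op in ops:
        if op["path"] == "":
            # An empty path addresses the whole document.
            result = copy.deepcopy(op["value"])
            continue
        *parents, last = op["path"].lstrip("/").split("/")
        # Walk down to the container that holds the addressed item.
        target = result
        for token in parents:
            target = target[int(token)] if isinstance(target, list) else target[token]
        if isinstance(target, list):
            index = int(last)
            if op["op"] == "add":
                target.insert(index, op["value"])
            elif op["op"] == "remove":
                del target[index]
            else:  # "replace"
                target[index] = op["value"]
        elif op["op"] == "remove":
            del target[last]
        else:  # "add" and "replace" are both plain assignments on a dict
            target[last] = op["value"]
    return result


doc = {"name": "Ada", "tags": ["math", "poetry"], "age": 36}
patch = [
    {"op": "replace", "path": "/tags/1", "value": "engines"},
    {"op": "remove", "path": "/age"},
    {"op": "add", "path": "/email", "value": "ada@example.org"},
]
patched = apply_patch(doc, patch)
print(patched)
```

Working on a deep copy keeps the original document intact, which makes it easy to check a round trip: diff two documents, apply the patch to the first, and compare the result with the second.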
These projects have names that might suggest they'd fit the bill, but were excluded from the comparison for the stated reasons:
- json-diff: Is just a CLI tool with no (documented) API.
- json-diff-patch: Is just a CLI tool with no (documented) API (yet). Also not released to PyPI yet.