10

Every diff tool I've found compares line by line rather than character by character. Is there a library that reports the differences within single-line strings? A percentage difference would also be useful, though I assume that is usually a separate function.

Tor Valamo
  • Isn't this a duplicate of http://stackoverflow.com/questions/1721738/using-diff-or-anything-else-to-get-character-level-diff-between-text-files ? – Aleksandr Levchuk May 07 '11 at 21:35

4 Answers

5

This algorithm diffs word-by-word:

http://github.com/paulgb/simplediff

It is available in Python and PHP, and it can also produce HTML-formatted output using the <ins> and <del> tags.

slebetman
  • Good, but whitespace should matter too. A tab replaced by a space would be a difference not picked up by this. – Tor Valamo Jan 09 '10 at 20:44
  • The source code looks simple enough. You can easily change it to split on empty string instead of whitespace so you can diff character-by-character. – slebetman Jan 09 '10 at 21:07
  • Actually this one works really well if you pass the strings directly to diff() instead of going through stringDiff(). It then works on a char-by-char basis, because strings are sequences in Python, and the output of the function is easy to work with too. I do wonder about the overhead of looking for the longest common substring when each item is only one char, though I may be misunderstanding the code. – Tor Valamo Jan 09 '10 at 21:22
4

I was looking for something similar recently, and came across wdiff. It operates on words, not characters, but is this close to what you're looking for?

Michael Williamson
3

One thing you could try is to split both strings into one character per line and then run diff on the result. It's a dirty hack, but at least it should work and it is quite easy to implement.

Alternatively, you can split each string into a list of characters in Python and use difflib. See the Python difflib reference.

JPvdMerwe
3

You can implement a simple Needleman–Wunsch algorithm, which computes a global alignment of two sequences. The pseudocode is available on Wikipedia: http://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
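For illustration, here is a minimal Python sketch of Needleman–Wunsch on two strings. The scoring values (match +1, mismatch -1, gap -1), the function name `needleman_wunsch` and the use of `-` as the gap marker are all assumptions chosen for the example, not something the Wikipedia pseudocode prescribes.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Gaps are written as '-', which is ambiguous if the inputs contain '-'.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Walk back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("abc", "ac")
    print(top)
    print(bottom)
    print("score:", total)
```

Positions where the two aligned strings differ, or where one side holds a gap, are the character-level differences. The table needs O(n·m) time and memory, so this is only practical for fairly short strings.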

Pierre