I'm looking for a tool that compares two text strings and returns a measure of their similarity (e.g. 95%). It needs to run on a platform that supports Java libraries.

My best guess is that I need some fuzzy logic comparison tool that would do the fuzzy match and then return the similarity level.

I've seen some posts here related to fuzzy search, but I need the exact opposite: I don't want to set some parameters and have similar entries returned. Instead, I already have the entries on hand and need the similarity score derived from them.

Can you advise me on that? Many thanks

mikolajek

2 Answers

Apache Commons Lang's StringUtils has a method, getLevenshteinDistance, that computes the Levenshtein distance between two strings. http://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringUtils.html

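Levenshtein distance is the "edit distance" between two strings: the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. A smaller distance means the strings are more similar. I'm not sure whether this counts as "fuzzy", though.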

Example: int distance = StringUtils.getLevenshteinDistance("cat", "hat"); // distance is 1 (one substitution)
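Since the question asks for a percentage rather than a raw distance, here is a minimal sketch of one way to get there, using plain Java with no library dependency. The class name `StringSimilarity`, the method names, and the normalization (dividing by the length of the longer string) are my own assumptions for illustration; they are not part of Commons Lang, and other normalizations are possible.

```java
public class StringSimilarity {

    // Levenshtein distance via dynamic programming, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows for the next iteration.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 100% for identical strings, 0% when every
    // position of the longer string has to be edited.
    static double similarityPercent(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 100.0; // two empty strings are treated as identical
        }
        return 100.0 * (maxLen - levenshtein(a, b)) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("cat", "hat"));
        System.out.println(similarityPercent("cat", "hat"));
    }
}
```

With this normalization "cat" and "hat" come out at roughly 67%, because one of three characters differs. If Commons Lang is already on the classpath, the hand-written `levenshtein` method could presumably be swapped for the StringUtils call while keeping the same normalization.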

mrQWERTY

There is now a library that does exactly that: https://github.com/intuit/fuzzy-matcher

mob