I am a beginner in text matching and indexing algorithms, so I need some guidance on which approach or algorithms to use to identify potential household members when I have their addresses and emails. The criterion is that people with similar addresses and emails are likely to belong to the same household.
In some cases the addresses are identical; in others the format is slightly different
- e.g. {3 kellock road , house 04} vs {house 04 , 3 kellock road}
What should my approach be in terms of data engineering and algorithms? Would simply sorting the text and indexing it work, or would a fuzzy string matching algorithm be better?
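To make the sorting idea concrete, here is a rough sketch of what I have in mind, using base R only. The helper name `normalize_address` and the third sample address are made up for illustration, and the cleaning rules (lowercasing, dropping punctuation, stripping leading zeros) are my own assumptions rather than an established method. It sorts the tokens of each address into a key, so reordered addresses compare equal, and then uses `adist()` (edit distance) on the keys to see how far apart the remaining near-matches are.

```r
# Hypothetical helper: turn an address into an order-independent key
normalize_address <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", " ", x)          # replace punctuation with spaces
  tokens <- strsplit(trimws(x), "\\s+")    # split each address into words
  vapply(tokens, function(tok) {
    tok <- sub("^0+(?=[0-9])", "", tok, perl = TRUE)  # "04" becomes "4"
    paste(sort(tok), collapse = " ")                  # sort the words, rejoin
  }, character(1))
}

addresses <- c(
  "3 kellock road , house 04",
  "house 04 , 3 kellock road",
  "3 kelock rd, house 4"        # made-up variant with a typo and abbreviation
)

keys <- normalize_address(addresses)

# Reordered addresses collapse to the same key
same_key <- keys[1] == keys[2]

# Pairwise edit distance between keys; small values suggest a likely match
d <- adist(keys)
print(keys)
print(d)
```

With this, the two orderings from my example get the same key, while the typo variant only shows up as close in the distance matrix, so I would still need some distance threshold, and I am not sure how to choose it or how to combine it with the email comparison.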
An example would be appreciated. I will be working in R.
Thanks