Hi, thank you for your interest! I believe that most fuzzy search implementations lack accuracy in one aspect or another. The primary goals of my library are accuracy and query performance. However, I haven't looked into Fuse yet. I'm highly interested in hearing feedback from people who have tried both libraries with their datasets.
LoganDark|2 years ago
kmschaal|2 years ago
Distance definitions such as the Levenshtein and Damerau-Levenshtein distances provide a solid basis for discussions on accuracy. However, they are costly to compute and hence not widely adopted in fuzzy search libraries.
I started by using the known filter equation for the Levenshtein distance and computed a quality score with a leightweight formula. Then, I realized that the filter equation can be extended to the Damerau-Levenshtein distance by sorting the characters of the 3-grams.
In my tests, this implementation worked well. Please let me know how it works for you if you test it.