top | item 3489703

Approximate Regex Matching in Python

83 points| coderdude | 14 years ago |hackerboss.com | reply

12 comments

[+] rplnt|14 years ago|reply

Another approximate matching library is FuzzyWuzzy[1]. It has some handy functions and seems easier to use. Although might not be as powerful as this.

1. https://github.com/seatgeek/fuzzywuzzy

[+] schwa|14 years ago|reply

Bonus points for the racist project name too.

[+] Wilduck|14 years ago|reply

This is really cool, but I'm curious what the performance implications are when doing fuzzy matching? I noticed that the list of types of errors did not include transposition. I wonder if the reason for this comes down to performance/technical reasons.

[+] coderdude|14 years ago|reply

The author wrote more about the library here: http://laurikari.net/tre/about/

He talks about the time complexity ("O(M^2N), where M is the length of the regular expression and N is the length of the text"), and he talks more about it under "Predictable matching speed." It doesn't look like he talks about how the fuzziness feature affects that but I could have missed it.

[+] metafunctor|14 years ago|reply

Adding transposition I've thought about, and received a number of requests for, but never got around to doing it. The reason isn't performance, but doing a change like this would get quite involved with the guts of the library, and I simply don't have the time...