(no title)
lvvlvv | 15 years ago
Can you explain your thought process for why worst case complexity is worse than other? Did you do any measurements? I believe you are incorrect. I am author of this page. I've just did quick test for SS = "aaa ...aaaBaaa...aaa", with SS_size=240. My algorithm is faster than 3 BMs out of 4 tested.
> Do not use this algorithm carelessly. For example, if you use it in a thoughtless way in your web server, you may open yourself to a DoS attack.
Same can be said about all BM, naive and BSD's memmem/strstr. Possibility of DOS for substring search algorithm was known for a long time - but it never materialized. The cure is trivial - limit substring size.
> That immediately made me think he might have devised it for genomics research.
No, it wasn't. It was devised when I was preparing for Google interview (which I failed). And of cause algorithms which pre-index haystack will always be faster. I myself do not consider haystack pre-indexing algorithms a "text search" algorithms (maybe incorrectly).
> gross exaggeration
If you count pre-indexing algorithms and edge cases - then you are correct. For most common case - can you show me something faster (from a student or even yourself)?
eru|15 years ago
You write on the page that your algorithm has "O(NM) worst case complexity." Compare Boyer-Moore's time complexity. The grandfather was interested in asymptotics. Not in some examples.
> Same can be said about all BM, naive and BSD's memmem/strstr.
Red herring.
> Possibility of DOS for substring search algorithm was known for a long time - but it never materialized.
What are you talking about?
> If you count pre-indexing algorithms and edge cases - then you are correct. For most common case - can you show me something faster (from a student or even yourself)?
What do you mean by egde cases? All the cases that make your program run slow?
Using your terminology, KMP has complexity O(N + M). That's better than O(MN).