top | item 33566987

(no title)

joedrago | 3 years ago

I read an article a million years ago which I thought described this mechanism in a super friendly way:

https://perl.plover.com/Regex/article.html

It uses the idea of "putting a penny down" on the graph and moving it around, and I found the conversational tone of the description to be very easy to grok. Assuming this is the exact same mechanism (which the author cites), it might resonate for some of you too.

discuss

order

photochemsyn|3 years ago

Nice article. On this bit: " It's about how you would write a regex package from scratch, in a language like C that doesn't already have regexes."

There's a 30-line C regex matching program that Rob Pike wrote and Brian Kernighan wrote about here. It only implements . ^ $ * (where c is any literal):

"A Regular Expression Matcher" (2007)

https://www.cs.princeton.edu/courses/archive/spr09/cos333/be...

> "Rob's implementation itself is a superb example of beautiful code: compact, elegant, efficient, and useful. It's one of the best examples of recursion that I have ever seen, and it shows the power of C pointers. Although at the time we were most interested in conveying the important role of a good notation in making a program easier to use and perhaps easier to write as well, the regular expression code has also been an excellent way to illustrate algorithms, data structures, testing, performance enhancement, and other important topics." - Brian Kernighan

tgv|3 years ago

As I said before: that's a simple backtracker with exponential complexity, and it adding more operators will trigger that behavior faster. I think Thompson's NFA matcher is O(n m), but I didn't check the patent. If it isn't, it's fairly easy to make it that.

jedimastert|3 years ago

Holy smoke, what an amazing article. This has nagged the back of my brain for a very long time.