(no title)
musingsole | 2 years ago
For years I struggled to grok the difference between regular expression and parsing. The conversations I would hear about the topic always left me feeling there must be some interesting meat to the difference...and so I would go read up on it only to find myself with the same problem a few months down the line.
The only difference is layers. Enough regular expression becomes a parser. A slim enough compiler is just a templating engine.
Insisting it's a deep, dark magic and not the advanced application of `str.replace` does the industry a disservice.
serjeem|2 years ago
Respectfully, this perspective contradicts the computer science underlying our work--there are in fact different categories of string "languages" that can be parsed using different technologies, and regular expressions are not enough for most "languages".
See https://en.wikipedia.org/wiki/Chomsky_hierarchy for the different categories of languages and the required tech to parse them.
I consider CS theory to be about the deepest and darkest magic I get to use on a daily basis :P.
> The only difference is layers. Enough regular expression becomes a parser. A slim enough compiler is just a templating engine.
You can't add enough layers of standard regex to parse, say, C++ or even a tree of matching parenthesis. You need a different tech like CFGs.
Hope this helps improve the grokking! The "interesting meat" here actually points at some really deep, fundamental CS theory that's really worth knowing.
musingsole|2 years ago
I appreciate the academic frameworks that have led to the development of systems, however as a working engineer, ignoring where the rubber meets the road is my entire complaint.
senkora|2 years ago
It is fundamentally impossible to correctly parse e.g. HTML with regular expressions. See: https://stackoverflow.com/a/1732454
rob74|2 years ago
mattgreenrocks|2 years ago