xiaq | 1 year ago
I didn't have much time in the talk to go into the reason, so here it is:
- You'll need a more complex lexer to parse a shell-like syntax. For example, one common thing you do with a lexer is discard whitespace, but shell syntax is whitespace-sensitive: "a$x" and "a $x" (double quotes not part of the code) are different things: the first is a single word containing a string concatenation, the second is two separate words.
- If your parser backtracks a lot, lexing can improve performance: you're not going back characters, only tokens (and there are fewer tokens than characters). Elvish's parser doesn't backtrack. (It does use lookahead fairly liberally.)
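To make the whitespace-sensitivity point concrete, here's a minimal sketch of a word splitter that treats unquoted whitespace as a word boundary but keeps adjacent segments in one word. This is purely illustrative, not Elvish's actual code; `Segment` and `splitWords` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Segment is one piece of a word: either a literal string or a
// variable reference like $x.
type Segment struct {
	Kind string // "bareword" or "variable"
	Text string
}

// splitWords splits input into words at unquoted whitespace, keeping
// adjacent segments in the same word, so "a$x" is one word (a
// concatenation of two segments) while "a $x" is two words.
func splitWords(src string) [][]Segment {
	var words [][]Segment
	var cur []Segment
	flush := func() {
		if len(cur) > 0 {
			words = append(words, cur)
			cur = nil
		}
	}
	i := 0
	for i < len(src) {
		switch {
		case src[i] == ' ' || src[i] == '\t':
			flush() // whitespace ends the current word
			i++
		case src[i] == '$':
			j := i + 1
			for j < len(src) && !strings.ContainsRune(" \t$", rune(src[j])) {
				j++
			}
			cur = append(cur, Segment{"variable", src[i+1 : j]})
			i = j
		default:
			j := i
			for j < len(src) && !strings.ContainsRune(" \t$", rune(src[j])) {
				j++
			}
			cur = append(cur, Segment{"bareword", src[i:j]})
			i = j
		}
	}
	flush()
	return words
}

func main() {
	fmt.Println(len(splitWords("a$x")))  // 1: one word with two segments
	fmt.Println(len(splitWords("a $x"))) // 2: two separate words
}
```

A conventional whitespace-discarding lexer would hand the parser the same tokens for both inputs, which is exactly why it doesn't fit here.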
Having a lexerless parser does mean that you have to constantly deal with whitespace in every place though, and it can get a bit annoying. But personally I like the conceptual simplicity and not having to deal with silly tokens like LBRACE, LPAREN, PIPE.
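For a sense of what "constantly dealing with whitespace" looks like, here's a hypothetical fragment of a lexerless recursive-descent parser (not Elvish's actual code) that has to skip spaces explicitly at each step, since no lexer has done it centrally:

```go
package main

import "fmt"

// parser reads raw characters directly; there is no token stream.
type parser struct {
	src string
	pos int
}

// skipSpaces must be called wherever whitespace is allowed -- the
// chore a lexer would otherwise handle once, up front.
func (p *parser) skipSpaces() {
	for p.pos < len(p.src) && (p.src[p.pos] == ' ' || p.src[p.pos] == '\t') {
		p.pos++
	}
}

// parseList parses a whitespace-separated list of barewords,
// skipping spaces before each element.
func (p *parser) parseList() []string {
	var items []string
	for {
		p.skipSpaces()
		start := p.pos
		for p.pos < len(p.src) && p.src[p.pos] != ' ' && p.src[p.pos] != '\t' {
			p.pos++
		}
		if p.pos == start {
			return items
		}
		items = append(items, p.src[start:p.pos])
	}
}

func main() {
	p := &parser{src: "echo  foo   bar"}
	fmt.Println(p.parseList()) // [echo foo bar]
}
```

Every production that allows surrounding whitespace needs its own `skipSpaces` call, which is the annoyance traded away for not having a separate token layer.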
I have not used parser generators enough to comment about the benefits of using them compared to writing a parser by hand. The handwritten one works well so far :)
throwaway2016a | 1 year ago
But I do get your meaning. I've written a lot of tokenizers by hand as well; sometimes the hand-written code is easier to follow. Config files for grammars can get convoluted fast.
Again, I didn't mean it as criticism. But your talk title does start with "How to write a programming language and shell in Go", so given the title I think lexers/tokenizers are worth noting.
xiaq | 1 year ago
The authoritative tone of "how to write ..." is meant in jest, but obviously by doing that I risk being misunderstood. A more accurate title would be "how I wrote ...", but it's slightly boring and I was trying hard to get my talk proposal accepted you see :)