Let’s Build a Compiler (1995)

240 points | undreren | 6 years ago | compilers.iecc.com

41 comments

[+] eismcc|6 years ago|reply
Looks like the series took a number of years to finish. From the “Back to the Future” installment:

I won't spend a lot of time making excuses; only point out that things happen, and priorities change. In the four years since installment fourteen, I've managed to get laid off, get divorced, have a nervous breakdown, begin a new career as a writer, begin another one as a consultant, move, work on two real-time systems, and raise fourteen baby birds, three pigeons, six possums, and a duck.

The author has quite a diverse work history:

https://www.linkedin.com/in/jack-crenshaw

[+] merricksb|6 years ago|reply
Previous HN discussions:

https://news.ycombinator.com/item?id=6641117 (2013)

https://news.ycombinator.com/item?id=1727004 (2010)

https://news.ycombinator.com/item?id=232024 (2008)

(Links provided for info, not complaining about dupes.)

[+] brianobush|6 years ago|reply
I really enjoyed Crenshaw's series on a Sweet 16-like interpreter in Embedded Systems Programming (Programmer's Toolbox section) around 1999. He has a charming way of tying technology to history from his point of view.

Found one article: https://m.eet.com/media/1171254/toolbox.pdf

[+] ainar-g|6 years ago|reply
I love this series! Does anybody know if anyone has made a version that outputs AMD64 assembly instead of 68000? Or is there some kind of gentle introduction to a reduced subset of the AMD64 instruction set, so that I could do it myself? The instruction set is huge, so having a small set of primitive instructions that get the job done would be nice.
[+] barrkel|6 years ago|reply
You can go a long way with not much more than a couple dozen instructions (mnemonics, that is; counting addressing-mode encodings there are more opcodes, but Jack's compiler delegates that work to the assembler). MOV, MOVZX, JMP, CALL, RET, CMP, TEST, Jcc, ADD, SUB, MUL, DIV, AND, OR, NOT, XOR, INC, DEC would about cover it for basic control flow and integer operations.
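To illustrate how far a handful of mnemonics goes, here is a minimal Python sketch of a single-pass, recursive-descent expression compiler in the spirit of the series. It is not Crenshaw's code. The class and function names are made up for this example, the output is Intel-syntax x86-64 text meant for an assembler, and it uses PUSH, POP, and IMUL, which are not in the list above, to hold intermediate results and to multiply.

```python
class Compiler:
    """Hypothetical one-pass compiler for +, -, * and parentheses over integers."""

    def __init__(self, src):
        self.src = src.replace(" ", "")
        self.pos = 0
        self.out = []  # emitted assembly lines

    def peek(self):
        # Current character, or "" at end of input.
        return self.src[self.pos] if self.pos < len(self.src) else ""

    def emit(self, line):
        self.out.append(line)

    def number(self):
        start = self.pos
        while self.peek().isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError(f"expected a number at position {start}")
        self.emit(f"mov rax, {self.src[start:self.pos]}")

    def factor(self):
        if self.peek() == "(":
            self.pos += 1
            self.expression()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.pos += 1
        else:
            self.number()

    def term(self):
        self.factor()
        while self.peek() == "*":
            self.pos += 1
            self.emit("push rax")       # save the left operand
            self.factor()               # right operand ends up in rax
            self.emit("pop rcx")
            self.emit("imul rax, rcx")

    def expression(self):
        self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            self.emit("push rax")       # save the left operand
            self.term()                 # right operand ends up in rax
            self.emit("pop rcx")
            if op == "+":
                self.emit("add rax, rcx")
            else:
                self.emit("sub rcx, rax")   # rcx = left - right
                self.emit("mov rax, rcx")


def compile_expr(src):
    """Return the assembly lines that leave the value of src in rax."""
    compiler = Compiler(src)
    compiler.expression()
    if compiler.peek():
        raise SyntaxError(f"unexpected {compiler.peek()!r}")
    return compiler.out


print("\n".join(compile_expr("1+2*3")))
```

Every result lands in rax and the hardware stack holds pending left operands, so the generated code is naive but needs no register allocation.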

Adding floating point arithmetic needs another dozen or so instructions. The x87 opcodes (FLD, FST, FADD etc.) are very easy to program against, since it's a stack machine: a post-order traversal of an expression tree is usually sufficient as long as the eight-entry register stack won't overflow. SSE2 instructions are more usual for x64, though.
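A rough Python sketch of that post-order idea follows. The tuple-based tree format, the function names, and the NASM-style operand syntax are all assumptions made for illustration. Leaves are names of 64-bit float variables; interior nodes are `(op, left, right)`. The sketch refuses any tree that would need more than the eight x87 stack registers rather than spilling to memory.

```python
X87_STACK_SLOTS = 8  # the x87 register stack holds eight values

# Popping forms: with no operands they combine st1 (left) and st0 (right),
# store the result in st1, then pop. NASM-style syntax is assumed here.
X87_OPS = {"+": "faddp", "-": "fsubp", "*": "fmulp", "/": "fdivp"}


def stack_depth(node):
    """Stack slots needed to evaluate node left-to-right in post-order."""
    if isinstance(node, str):
        return 1
    _, left, right = node
    # While the right side is evaluated, the left result occupies one slot.
    return max(stack_depth(left), 1 + stack_depth(right))


def emit_x87(node, out):
    """Post-order walk: emit both operands, then the operator."""
    if isinstance(node, str):
        out.append(f"fld qword [{node}]")
    else:
        op, left, right = node
        emit_x87(left, out)
        emit_x87(right, out)
        out.append(X87_OPS[op])


def compile_x87(tree):
    """Return x87 instructions that leave the value of tree in st0."""
    if stack_depth(tree) > X87_STACK_SLOTS:
        raise OverflowError("expression needs more than 8 x87 stack slots")
    out = []
    emit_x87(tree, out)
    return out


# (a + b) * (c - d)
tree = ("*", ("+", "a", "b"), ("-", "c", "d"))
print("\n".join(compile_x87(tree)))
```

A real compiler would spill to memory or reorder operands instead of raising, and operand order for the subtract and divide forms differs between assemblers, so check the output against the one you use.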

Large portions of the full instruction set are vector operations which you wouldn't realistically be emitting for a didactic toy compiler. You can get by perfectly well without the string operations too.

[+] jimws|6 years ago|reply
This tutorial uses Pascal as the implementation language. Is there a similar tutorial done in a language that is mainstream today? Like Python? Go? Rust?
[+] mhh__|6 years ago|reply
Modern Compiler Implementation in [x] follows a similar structure (in much more depth) and is available in Java, C, or (the correct choice) ML.
[+] einpoklum|6 years ago|reply
I would be a bit wary of a 30-year-old book on compiler construction. If it were more on the theoretical side then fine, but this sounds kind of hands-on. "Let's build a compiler with 1980s tools!" doesn't sound very appealing.

That is just a shallow impression though. Convince me I'm wrong?

[+] makotoNagano|6 years ago|reply
Sometimes learning on older technology teaches more of the fundamentals than learning practical skills directly.

For example, I studied how to write assembly code for an STM32F0 microcontroller. I would never do that in practice, but it worked wonders in teaching me the intricacies of a processor at a very low level.

[+] chooseaname|6 years ago|reply
>That is just a shallow impression though. Convince me I'm wrong?

When developers ask, "I am curious about compilers, where should I start?", a common answer, even today, is "the dragon book". This is just bad advice. There are better introductory texts on the subject. However, if we step back, there's no reason not to suggest "Let's Build a Compiler" even though it seems dated at this point. Why is that? Because the dev is only curious, and the content of "Let's Build a Compiler" is still, to this day, generally a good introduction to what compilers actually do. It's light on theory and heavier on practicality, and it can easily satisfy that curiosity.

[+] EamonnMR|6 years ago|reply
This sort of tutorial is best not followed exactly, but used as a framework to, for example, implement a scripting language you invent in a language you're learning. Oh, and it's a bunch of fun, really.
[+] Riverheart|6 years ago|reply
You could just take the code as pseudocode and translate it to the language of your choice. I personally like hands-on tutorials regardless of complexity or best practice because there's value in basic comprehension of any subject. It gives you a better foundation to learn more modern stuff.
[+] userbinator|6 years ago|reply
The basics haven't changed. In some ways, technology has gone backwards over time, in many cases adding more complexity for no real gain. It's good to start from simplicity.