top | item 11904950

(no title)

like you are saying "it's a problem", and like I'm saying "that's the problem I'd like to see solved"

As an example, what they teach us in school, and what large projects like NASA's have to do, is to first agree on a specification for interfaces, then write code to the interface, then iron out the kinks. Working on a project like that, and the more so the bigger the project, we soon discover that there are many local wins to be had if only we could change the interface we agreed on, because "we didn't know enough when we agreed", etc.
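To make "agree on the interface, then write code to it" concrete, here is a minimal Python sketch. Everything in it (`Token`, `Lexer`, `WhitespaceLexer`, `count_kinds`) is a hypothetical name made up for illustration, not something from a real project.

```python
from typing import Iterable, NamedTuple, Protocol


class Token(NamedTuple):
    kind: str
    text: str


class Lexer(Protocol):
    """The agreed interface: anything with this method counts as a lexer."""

    def tokenize(self, source: str) -> Iterable[Token]: ...


class WhitespaceLexer:
    """One throwaway implementation; it never has to name the protocol."""

    def tokenize(self, source: str) -> Iterable[Token]:
        return [Token("num" if w.isdigit() else "word", w) for w in source.split()]


def count_kinds(lexer: Lexer, source: str) -> dict:
    """Downstream code written only against the interface."""
    counts = {}
    for tok in lexer.tokenize(source):
        counts[tok.kind] = counts.get(tok.kind, 0) + 1
    return counts


print(count_kinds(WhitespaceLexer(), "add 1 2"))
```

The "local win" temptation shows up as soon as `count_kinds` wants something `Token` doesn't carry (a line number, say), which is exactly the kind of interface change that is cheap locally and expensive for everyone else.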

An example of what I'm saying (as a thought-experiment solution): if a real, live compiler project were written to clean specs (even if the specs came after the code), then there'd be a lexer, a parser, etc., and for a little homebrew project like this one you could write your own lexer from scratch, testing it all the while against the rest of a functioning compiler. You probably would not finish it, because you would learn in a series of "aha" moments what "the hard parts" are and how they are solved.
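A rough sketch of what "testing your own lexer against a functioning one" could look like, assuming Python's standard `tokenize` module stands in for the real project's lexer. The homebrew lexer below (`homebrew_lex`, a made-up name) only handles integers, names, and a few operators; that limitation is the point, since the inputs where it disagrees with the reference are where the "aha" moments would come from.

```python
import io
import re
import tokenize

# Homebrew lexer: integers, identifiers, and a handful of operators only.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<OP>[-+*/=()]))"
)


def homebrew_lex(source):
    """Return a list of (kind, text) pairs for a one-line expression."""
    source = source.rstrip()
    pos = 0
    tokens = []
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at {pos}: {source[pos]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def reference_lex(source):
    """Same shape of output, produced by the stdlib tokenizer."""
    keep = {tokenize.NUMBER: "NUMBER", tokenize.NAME: "NAME", tokenize.OP: "OP"}
    readline = io.StringIO(source + "\n").readline
    return [
        (keep[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in keep  # drop NEWLINE, ENDMARKER, and other bookkeeping tokens
    ]


for line in ["x = (1 + 22) * y", "total = total - 1"]:
    assert homebrew_lex(line) == reference_lex(line), line
```

Feeding it something like a float or a string literal makes the homebrew side fail while the reference carries on, which tells you what to learn next.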

So you could abandon your own piece, but at the same time you would be now equipped to contribute to the real project.

Or you could move on to working on the parser... lather, rinse, repeat.

No need to tell me all "the reasons that doesn't work"... I know the reasons, and it's useful to identify the laundry list of them, but the part I'm interested in is the attitude that "hey, this is worth solving" and "hey, this could be solved..."

abecedarius|9 years ago

I've done something like that for the CPython bytecode compiler: https://github.com/darius/500lines/blob/master/bytecode-comp...

"From clean specs" didn't really hold because the bytecode VM needs better documentation. I started to address that with a version of the VM in Python (cutting down Ned Batchelder's byterun): https://github.com/darius/tailbiter and I've started a very spare-time project to redo the parser as well (in the same repo). It'd be neat to see this getting used in a compiler course -- CPython's about as simple as a popular compiler gets.
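For anyone who hasn't looked at what the CPython compiler emits, a small sketch using the standard `dis` module (not code from either repo above). The exact opcode names vary between CPython versions, so no output is shown; `opnames` is just an illustrative helper name.

```python
import dis


def opnames(source):
    """Compile an expression and list the names of the bytecode instructions."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


names = opnames("a + b * 2")
print(names)
```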

(I agree with the grandparent that it took a lot of time to learn all I needed to about CPython internals -- and for the parser there's more to learn.)

tmm|9 years ago

Does the LLVM introduction tutorial[0] kind of fit what you're suggesting? You learn how to implement a toy language called Kaleidoscope on top of the LLVM infrastructure with one data type (64-bit float), if/else, for loop, and a few other things.

It covers the lexer, parser, AST generation, and a few other things.
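To give a flavour of the lexer, parser, and AST steps, here is a tiny sketch in Python. It is not taken from the tutorial (which is in C++) and does not use LLVM at all; the grammar, the nested-tuple AST, and the names `lex` and `parse` are assumptions made for illustration. Like Kaleidoscope, every number is a float.

```python
import re


def lex(source):
    """Split source into number, identifier, and operator strings."""
    return re.findall(r"\d+\.?\d*|[A-Za-z_]\w*|[-+*/()]", source)


def parse(tokens):
    """Recursive-descent parse into a nested-tuple AST with the usual precedence."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected token {tok!r}")
        if tok[0].isdigit():
            return ("num", float(tok))  # the one data type: a float
        return ("var", tok)

    def term():
        node = primary()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, primary())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(lex("1 + 2 * x")))
```

The tutorial goes well beyond this (functions, control flow, code generation); the sketch only covers the front end for arithmetic expressions.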

There's also one for writing a backend targeting a fake hardware architecture.

[0] http://llvm.org/docs/tutorial/index.html

sheepleherd|9 years ago

thanks, that's very good.

quick critique (wanted to contribute to this conversation while it's active rather than delve deeply into LLVM for the rest of the day :) it's (naturally and understandably) written from the perspective of "this is how it is, if you want to connect with what we do here's what you need to do".

As a pedagogical tool (that is still a compiling tool) it could use an intro of more "here is what a lexer needs to do, here's how/why we chose to do it, here is why what is downstream belongs downstream, here is an example using a language syntax that is extremely simple" (C is not), "here is an alternative way you could try to do it", etc.

But definitely you point up a good way to start toward [mystic music] "my dream goal" in this example.

Again, though, I'm wishing that there were tools and "a way" that ALL projects could be managed this way: not just one great compiler, but the several great compilers and editors, and all-the-types-of-things-people-keep-having-the-urge-to-reinvent.