Olin mentions the Markup language, which I extended so that I could write my thesis and generate both HTML and LaTeX. Here's the resulting web version: http://draves.org/cmu-research/diss/main.html
All the diagrams and charts were made by calling out to external programs like dot and gplot, with the source all held in one big text file, edited with emacs of course.
"You could also program the -10 in a beautiful, roughly-C-level language from CMU, called Bliss. I see C and I remember Bliss, and I could weep."
I see C and I wonder if that's the best we can do for a systems programming language. Maybe C is just a local optimum that we have invested too much manpower in to ever switch to anything else?
> I see C and I wonder if that's the best we can do for a systems programming language. Maybe C is just a local optimum that we have invested too much manpower in to ever switch to anything else?
At this point the ecosystem around C is too big to justify the cost of switching for the domains it's used in. Any new contender is going to have to offer something new, without giving up any performance or control over the machine.
With Rust (disclaimer: I work on Rust) we're shooting for safety to give us that edge—having the compiler reliably prevent use-after-frees and buffer overflows instead of discovering them through 0-days, without the traditional approaches that sacrifice performance and control over the machine (GC), seems pretty compelling to me. Even if we do succeed in getting enough new projects to use a safe language to make a difference, though, C will still be immortal; once a language has that much staying power it'll be around forever.
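A minimal sketch of the safety claim (my own illustration, not from any real codebase): Rust's ownership rules make a use-after-move a compile-time error, and slice access is bounds-checked, so these bug classes become compile errors or defined failures instead of silent memory corruption.

```rust
// Checked access: returns None instead of reading past the end of the buffer.
fn checked_get(buf: &[i32], i: usize) -> Option<i32> {
    buf.get(i).copied()
}

fn main() {
    let v = vec![1, 2, 3];
    let w = v; // ownership moves to `w`; using `v` afterwards would not compile:
    // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`

    // Out-of-range access is a defined, safe result rather than an overflow.
    assert_eq!(checked_get(&w, 10), None);
    assert_eq!(checked_get(&w, 2), Some(3));
}
```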
The irony is that no one wishes to port it to "modern" AArch64 and x86_64, even though, with its proper modular design and a GC and compiler written in a high-level language (same as MIT Scheme), it at least seems like a manageable effort. The goal is that it could be comparable with Haskell in code quality.
A compact and clean native Scheme compiler (such as Gambit) could be a very nice tool. The problem, as usual, is funding and the ignorance of investors who know no other word but Java.
So much of this sounds interesting, so what happened to T? Is there any code for the T3 variant? I presume you need T1 and T2 to bootstrap it, but I would love to see some of this stuff!
>There was, for example, a snooty French paper that sort of dismissed Lamping as an "autodidact," before proceeding to build (with, let me be careful to note, proper credit given to John) on his work
I assumed I might find this article in [Essays], but I couldn't find it except on the 8th page of [Index]. Does this article really only exist in the index?
I think it might be An empirical study of list structure in Lisp [1], by Douglas W. Clark and C. Cordell Green. I haven't read it because it is behind the ACM paywall. I'm pretty sure that the Clark he mentions must be Douglas W. Clark, who has at least five citations in the Jones and Lins Garbage Collection book (1st ed), and is the only Clark listed (under his own name, at least) in the bibliography (of the 1st ed).
If you are looking for a simple GC algorithm that might be suitable for use in a toy lisp language, you might check out Simple Generational Garbage Collection and Fast Allocation by Andrew W. Appel [2]. I think it has many of the same characteristics and is also pretty simple. For a while I had thought that this must be the one that Shivers meant, and that he just misremembered the author.
Edit: The reason I think it is this particular paper by Clark and Green is due to a discussion of it on page 140 (section 6.8) of Jones and Lins (this page is visible on Amazon in search inside):
> Experiments with a recursive copying collector by Douglas Clark and Cordell Green produced a cdr-cell linearisation -- the property that a cell that points to another will be next to each other in Tospace after collection -- of over 98 percent [Clark and Green, 1977]. The incidence of off-page pointers was also low (between 2.7 and 8.4 percent).
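The cdr-cell linearisation described in the quote can be modelled in a few lines. Below is a toy Python sketch of my own (not the Clark/Green or Appel collector): a copying collector that chases cdr pointers eagerly, so the spine of a list lands contiguously in tospace.

```python
# Toy model of cdr-linearising copying collection: follow the cdr chain
# before the car, so consecutive list cells get consecutive tospace slots.

class Cons:
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr
        self.forward = None  # forwarding pointer once the cell is copied

def copy_list(root, tospace):
    """Copy all cells reachable from root, walking cdr chains eagerly."""
    def copy_cell(cell):
        if not isinstance(cell, Cons):
            return cell          # immediate value, nothing to copy
        if cell.forward is not None:
            return cell.forward  # already moved; reuse the forwarding pointer
        new = Cons(cell.car, None)
        cell.forward = new
        tospace.append(new)      # allocate at the next free tospace slot
        new.cdr = copy_cell(cell.cdr)  # cdr copied immediately -> adjacency
        new.car = copy_cell(cell.car)
        return new
    return copy_cell(root)

# Build the list (1 2 3) interleaved with garbage cells, then collect.
c3 = Cons(3, None); garbage = Cons(99, None)
c2 = Cons(2, c3);   more_garbage = Cons(98, garbage)
c1 = Cons(1, c2)
tospace = []
new_root = copy_list(c1, tospace)
# The live spine is now contiguous: slots 0, 1, 2 of tospace.
assert [c.car for c in tospace] == [1, 2, 3]
assert tospace[0].cdr is tospace[1] and tospace[1].cdr is tospace[2]
```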
This isn't the Clark GC, but if you're writing a lisp compiler and want to just use a GC (and not Boehm), give http://www.ravenbrook.com/project/mps a look. It is pretty nice.
"[..] I had never been to California before, so I was discovering San
Francisco, my favorite city in the US and second-favorite city in the world.
[..]"
Out of curiosity does somebody know what is PG's first-favorite city in the
world?
"[..] It was also a massive validation of a thesis Steele had argued for his
Master's, which was that CPS was a great intermediate representation for a
compiler [..]"
What does CPS mean? Further down in the article
"[..] Richard Kelsey took his front end, which was a very aggressive CPS-based
optimiser, and extended it all the way down to the ground to produce a
complete, second compiler, which he called "TC" for the "Transformational
Compiler." His approach was simply to keep transforming the program from one
simple, CPS, lambda language to an even simpler one, until the language was so
simple it only had 16 variables... r1 through r15, at which time you could
just kill the lambdas and call it assembler. [..]"
So I guess CPS means Continuation-Passing-Style? But then I wonder what PG
means by "CPS was a great intermediate representation for a compiler"?
Even further down we can see PG's own opinion:
"[..] So the lineage of the CPS-as-compiler-IR thesis goes from Steele's
Rabbit compiler through T's Orbit to SML/NJ. At which point Sabry & Felleisen
at Rice published a series of very heavy-duty papers dumping on CPS as a
representation and proposing an alternate called A-Normal Form. ANF has been
the fashionable representation for about ten years now; CPS is out of favor.
This thread then sort of jumps tracks over to the CMU ML community, where it
picks up the important typed-intermediate-language track and heads to Cornell,
and Yale, but I'm not going to follow that now. However, just to tell you
where I am on this issue, I think the whole movement from CPS to ANF is a bad
idea (though Sabry & Felleisen's technical observations and math are as rock solid as one would expect from people of their caliber). [..]
Fascinating stuff, though I do not what IRs are used by the
modern compilers like LLVM, GHC, Java, .Net?
"[..] Jim Philbin, like Kelsey, also went to NEC, where he built an operating
system tuned for functional programming languages, STING (or perhaps it was
spelled "STNG" -- in any event it was pronounced "sting"). He built it in T,
of course, and one could see that it had its roots in his work on T3.
(Implementing the runtime for a functional language, in some sense, requires
you to implement a little virtual OS on top of the real, underlying OS.) [..]"
- Does anybody know what happened to this OS? The only link that I found is
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.4...
The summary of this article then goes to say "[..] Sting is currently a
prototype implemented on an eight processor Silicon Graphics PowerSeries 480
running Unix. The next step is to integrate Sting into the micro-kernel of an
operating system such as Mac [..]" - Would be interesting if this fruited in
some new works for VMs for functional languages.
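The CPS-versus-ANF contrast in the quoted passage can be made concrete. Here is a purely illustrative sketch (not taken from Rabbit, Orbit, or SML/NJ) of the expression (a + b) * c in each form, using Python functions to stand in for IR terms:

```python
# CPS: every intermediate result is handed to a continuation; control
# never "returns" in the direct-style sense.
def cps_expr(a, b, c, k):
    def add_k(t1):            # continuation receiving the value of a + b
        return k(t1 * c)      # multiply, then throw the result to k
    return add_k(a + b)

# ANF: every intermediate result is named with a let-binding (a plain
# assignment here), but ordinary returns remain.
def anf_expr(a, b, c):
    t1 = a + b                # each intermediate gets an explicit name...
    t2 = t1 * c
    return t2                 # ...and control returns normally

assert cps_expr(2, 3, 4, lambda r: r) == 20
assert anf_expr(2, 3, 4) == 20
```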
> Fascinating stuff, though I do not know what IRs are used by modern compilers like LLVM, GHC, Java, and .NET.
There are a couple of things going on with CPS as implemented in this style of compilers that make it particularly nice for writing optimization passes:
1) There are only function calls ("throws") and no returns. Though there's no theoretical reduction in the stuff you have to track (since what was a direct-style return is now a throw to the rest of the program starting at the return point), there's a huge reduction in the complexity of your optimizations because you're not tracking the extra thing. For some examples, you can look in the Manticore codebase, where even simple optimizations like contraction and let-floating are implemented in both our early direct-style IR (BOM) and our CPS-style IR (CPS). The latter is gobs more readable, and there's no way at all I'd be willing to port most of the harder optimizations, like reflow-informed higher-order inlining or useless-variable elimination, to BOM.
2) IRs are cheap in languages like lisp and ML. So you write optimizations as a tree-to-tree transformation (micro-optimization passes). This style makes it much easier to enforce invariants. If you look at the internals of most compilers written in C++, you'll see far fewer copies of the IR made and a whole bunch of staged state in the object that's only valid at certain points in the compilation process (e.g., symbols fully resolved only after phase 2b, but possibly invalid for a short time during optimization 3d unless you call function F3d_resolve_symbol...). Just CPS'ing the representation of a C++ compiler without also making it easy to write your optimization passes as efficient tree-to-tree transformations will not buy you much, IMO.
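A minimal sketch of point 1 (illustrative names only, nothing from the Manticore codebase): the same function in direct style and in CPS. In the CPS version nothing ever returns a value to its caller in the direct-style sense; each "return" becomes a throw to the continuation `k`.

```python
# Direct style: the recursive call returns a value, which the caller
# then multiplies -- two things to track (the call and the return).
def fact_direct(n):
    if n == 0:
        return 1
    return n * fact_direct(n - 1)

# CPS: the pending multiplication is folded into the continuation, so
# the only control construct left is a call ("throw").
def fact_cps(n, k):
    if n == 0:
        return k(1)                         # throw 1 to the rest of the program
    return fact_cps(n - 1, lambda r: k(n * r))

assert fact_direct(5) == 120
assert fact_cps(5, lambda r: r) == 120      # identity continuation = top level
```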
melling | 12 years ago
http://stackoverflow.com/questions/7654848/what-are-the-new-...
spot | 12 years ago
> All the diagrams and charts were made by calling out to external programs like dot and gplot, with the source all held in one big text file, edited with emacs of course.
That was finished in 1997.
Todd | 12 years ago
It does seem to have some interesting features for a systems-level language -- for example, expressions over statements and references over pointers.
juliangamble | 12 years ago
Is there a reference to this paper available?
cwzwarich | 12 years ago
http://users.soe.ucsc.edu/~abadi/Papers/proofs.ps
The reference to Lamping being 'autodidactical' is at the top of the second column.
graemian | 12 years ago
Right now, a great sadness comes over me when I think about reading one of your articles :-)
EdwardCoffin | 12 years ago
which is linked to from the page Lisp Links: http://paulgraham.com/lisplinks.html
which is linked to from the top level page Lisp http://paulgraham.com/lisp.html
EdwardCoffin | 12 years ago
[1] http://dl.acm.org/citation.cfm?id=359427
[2] http://www.cs.ucsb.edu/~ckrintz/racelab/gc/papers/appel88sim...