C0, an Imperative Programming Language for Novice Computer Scientists (2010) [pdf]

106 points| philonoist | 8 years ago |citeseerx.ist.psu.edu | reply

111 comments
[+] macintux|8 years ago|reply
For everyone complaining about how they taught themselves pointers while in diapers so why can't college freshmen already do this stuff...

1st: Restricting the set of students who can succeed in CS 101 to those who have already done extensive programming is one way to make sure the field never diversifies.

2nd: Computer science is not (supposed to be) about programming. Let's teach them how to think before we teach them how to be language lawyers.

[+] mac01021|8 years ago|reply
Computer science is (and must be) about programming.

It need not be about C, or present-day digital computers, or even anything to do with the von Neumann model.

But there must always be a formal notation for describing computations, and students of Computer Science must always learn to treat that notation with rigor and come to grips with its nuances.

Probably too they must always deal with the concept of indirection, and think about references to (objects that contain) other references.

I'll grant you, though, a language with a saner notation for describing pointers (and types in general) would be easier to learn than C, and could still be equally useful, if not more so.

[+] norswap|8 years ago|reply
But should a language based on C, of all things, be taught?

I suspect they do this to transition students to real C later... Won't that give them bad habits later down the line?

Seems like a lot of work just to avoid going with Java or Go. With Java, eschew inheritance (not a bad habit at all) and wildcards at first, and you're fine.

[+] grayhatter|8 years ago|reply
1st, are you sure that's a good thing? I don't mean diversification, I mean graduating students who can't grasp concepts that experts are sure are easy/simple. Pointers, like everything else, are hard... until you understand them. (Perhaps CS 101 needs an additional prerequisite.)

2nd, sure, then pick something easy, like [any of the thousands of other languages]. Don't hand them a "broken" version of C, just because it's easier.

2nd part 2... What do you think CS is if not about programming?

[+] murph-almighty|8 years ago|reply
I was an ECE major at CMU.

C0 basically served as an introduction to C, with contract annotations. The contract annotations were used to enforce more thinking about your code (i.e. proving your code worked via pre/post/loop conditions). There were still pointers, there was just a bit of syntactic sugar surrounding it.

About two-thirds of the way through the course we switched to full-blown C, at which point we had to learn pointers. The intro to computer systems course (think malloc, proxy, and buffer overflow labs) used C as well, and that in turn required us to use pointers. There's no "losing out" on learning pointers. It's basically a step-stool to C.

At the time, I had a pretty good grasp of pointers and thought C0 was a bit silly; I'd never used contracts in my life. I currently work as a full-stack web developer (bit of a twist from my intended path) and I've never seen anyone "write out contracts" like I did in C0. I now think it was a valuable exercise in that it forces you to define the parameters that allow your code to run.

[+] dbcurtis|8 years ago|reply
C with training wheels?

So it seems to me that this course is trying to accomplish two things that are mostly orthogonal: 1) teach principles of computation, 2) teach where the semicolons and curly braces go.

#2 is a valid goal if you eventually want to ramp someone up on C/C++ without suffering language switching turbulence. But overall, I would expect that an intro course should be teaching fundamentals first. Language syntax is a completely secondary issue with respect to learning good imperative programming practices.

Languages come and go. Semicolons come and go. Yes, language switching causes speed bumps, I switch between C (embedded) and Python regularly. Twitchy-semicolon-pinky is a thing. Inventing another language to teach the basic corpus of imperative programming ideas seems like a lot of effort for little lasting payoff.

[+] tinalumfoil|8 years ago|reply
Although I learned to program past when it was popular, I can't shake the feeling that BASIC is the best way to teach programming. The basis of any computer is the sequential execution of instructions, and it's this model of computation that informs things like algorithmic complexity or Turing completeness. Even if teaching BASIC wasn't possible, I would still prefer to see a subset of x86 assembly taught than something ridiculous like "a simpler C" which obscures the core functionality of a computer with abstractions, without teaching a practical language.
[+] hackermailman|8 years ago|reply
The course it was made for at CMU (15-122) is primarily about writing contracts for imperative programs. I guess they decided C0 was the best way to do that, since students have to take C anyway in 15-213 afterwards. According to Robert Harper's blog, he apparently pushed for freshmen to land directly in an SML course where they would learn both imperative and functional methods, but for legacy reasons they all agreed C should still be taught, with an emphasis on correctness/contracts, so C0 was born.
[+] lmm|8 years ago|reply
If you're willing to break compatibility with C, why not make a clean break and use a language that better supports the things you want? The authors go on (correctly) at great length about the difficulty of union types in C compared to an ML-family language; why not use an ML-family language? (My university used Standard ML for their first programming course and I see it as a good choice)
[+] instruction-ptr|8 years ago|reply
The course that uses C0 goes on later to C, and is literally titled “Principles Of Imperative Computation”. We use Standard ML for “Introduction to Functional Programming”.

- source: I’m a TA for this course

[+] instruction-ptr|8 years ago|reply
Also - we don’t break complete compatibility with C. We just restrict the amount of C things they are allowed to use
[+] Avshalom|8 years ago|reply
I mean I get the idea but

  No floating point datatypes
  No ... switch statements or do...while loops
seem like terrible things to leave out. Occasional gotchas of floating point notwithstanding, having only integers means no division.
[+] lou1306|8 years ago|reply
No floating point types is a little harsh, but IMHO one should only use them when they really understand their implications. Otherwise they might be tempted to do Bad Stuff like using float for currencies, etc.
[+] instruction-ptr|8 years ago|reply
“No division”? How do you reason this? Integer division is a thing
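A minimal C sketch of the two points above, integer division and keeping currency out of float. The names `split_result` and `split_cents` are made up for illustration and come from nowhere in the paper or the course:

```c
#include <stdint.h>

/* Hypothetical helper: split a total held in whole cents into equal shares.
   Integer division in C truncates toward zero, so the remainder is returned
   alongside the quotient instead of being silently dropped. */
typedef struct {
    int32_t share;     /* cents each person gets */
    int32_t leftover;  /* cents that could not be split evenly */
} split_result;

/* Caller must pass people != 0; dividing by zero is undefined in C. */
split_result split_cents(int32_t total_cents, int32_t people) {
    split_result r;
    r.share = total_cents / people;
    r.leftover = total_cents % people;
    return r;
}
```

Splitting 1000 cents three ways gives a share of 333 with 1 cent left over, so nothing is lost to rounding the way it could be with a float total.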
[+] rootbear|8 years ago|reply
The first structured programming language I learned was a home grown language called Simpl-T, at the University of Maryland. It did not have floating point, but one could still do an awful lot with it. Other members of the Simpl-X family did have floating point, as I recall.
[+] thecompilr|8 years ago|reply
C language is as simple as it gets. The only concept people struggle with is pointers, but even that is a very simple concept. Teaching something that is simpler than C to computer scientists will breed very poor engineers.
[+] flafla2|8 years ago|reply
I am currently a TA for 15-122 at Carnegie Mellon, the class that C0 was originally made for. I would argue that C0 does an excellent job at breeding good programmers because of the following features:

1. Contracts in C0 bake preconditions, post conditions, and loop invariants right into the language. Contracts are only checked when in "dynamic recompilation" mode (basically debug mode). This allows you to check the correctness of your code more easily (and, importantly, in an easily gradable way).

2. The language is designed in such a way to introduce C to programmers that are used to other higher level languages. In the C0 phase of the course we introduce students to fundamental concepts like memory allocation, code reasoning and contracts, binary representations of numbers, and unit testing. Next we transition to C1 (included in the C0 install) which introduces typedefs and other features of C. At this point we focus on introducing data structures. Finally we make the final transition to C, where we talk about undefined behavior, gcc, stack vs heap, etc.

I've taken this course myself, and I would argue that this measured approach to learning C is far superior to throwing students (most of whom have no low-level programming experience) straight into C.

[+] dasimon|8 years ago|reply
While well-constructed C programs are as minimal and straightforward as programs come, the language is certainly full of pitfalls (just look at the underhanded C contest). Keep in mind that most computer science students in their first intro course have not yet developed the mental models that experienced programmers take for granted. Students will try doing all sorts of things which don't make much sense, and C is overly permissive of these things unless you introduce additional tooling, e.g. valgrind, which can become very overwhelming for intro-to-CS students who are still feeling lost.

I agree that C0 on its own is overly simplistic, but the way it is used at CMU is as 'C with training wheels' - strict type checking, no potentially-unsafe pointers to the stack, dynamic array bounds checking, etc. Then, two thirds into the semester, students transition to real C, and learn how to correctly manage those potential pitfalls.
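As a rough illustration of the "pointers to the stack" pitfall in plain C (not C0), here is a sketch with a hypothetical `make_counter` function. The broken variant appears only inside a comment, since actually using its result would be undefined behavior:

```c
#include <stdlib.h>

/* BROKEN, shown for contrast only:
     int *make_counter_bad(void) { int c = 0; return &c; }
   The local c stops existing when the function returns, so the caller
   would receive a dangling pointer. */

/* Hypothetical safe version: the int lives on the heap, so the pointer
   stays valid until the caller frees it. Returns NULL if malloc fails. */
int *make_counter(int start) {
    int *c = malloc(sizeof *c);
    if (c != NULL) {
        *c = start;
    }
    return c;
}
```

The caller owns the result and is responsible for calling `free` on it. That bookkeeping is the part that a garbage-collected teaching language defers until later.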

This is how students at CMU have been learning C since 2010, and graduates don't seem to be any worse off for it.

[+] SeanDav|8 years ago|reply
> "The only concept people struggle with is pointers, but even that is a very simple concept. "

The concept of pointers is simple, but their implementation has all sorts of subtle layers:

- interactions and apparent fungibility between pointers, arrays and strings

- how about: ( * (void( * )())0)();

- what about: * (p++) vs ( * p)++ or "++ * p", "* p++" and "* ++p" (Quotes for display reasons only)

- & vs * and various combinations of these

- pointers to pointers

- pointers to arrays of strings

- pointers to allocated memory that is/are not in scope anymore.

- Null pointer

- Many others!
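For the increment/dereference combinations in the list above, a small C sketch may help. The function names are invented for illustration, and each function isolates one of the four spellings:

```c
/* *p++ parses as *(p++): read the current element, then advance the pointer. */
int read_then_advance(int **pp) {
    int *p = *pp;
    int v = *p++;
    *pp = p;      /* hand the advanced pointer back to the caller */
    return v;
}

/* (*p)++ increments the pointed-to element; the pointer does not move.
   The expression yields the element's old value. */
int bump_element(int *p) {
    return (*p)++;
}

/* *++p advances the pointer first, then reads the element it now points at. */
int advance_then_read(int **pp) {
    int *p = *pp;
    int v = *++p;
    *pp = p;
    return v;
}

/* ++*p increments the pointed-to element and yields its new value. */
int prebump_element(int *p) {
    return ++*p;
}
```

With `int a[3] = {10, 20, 30}` and `p` starting at `a`, calling these four in order returns 10, 20, 30 and 31, and leaves the array as `{10, 21, 31}` with `p` pointing at `a[2]`.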

[+] fenwick67|8 years ago|reply
> C language is as simple as it gets

Two types of files, compiler directives, pointers, malloc and friends, are you really sure about that? Any non-trivial C program will use these, on TOP of the basic syntax that first timers have to learn.

There is a reason it is usually not taught as a first language, and Java, Python, C# or even Assembly are.

[+] CodesInChaos|8 years ago|reply
C is a very tricky language, because a lot of constructs that work when naively translated to assembly (on common CPUs like x86 or ARM) are actually undefined behaviour.
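Signed integer overflow is one well-known instance: an `add` instruction on x86 or ARM simply wraps, but `a + b` on C `int`s is undefined if the result does not fit. A sketch of a check that stays within defined behavior, using a hypothetical helper name `checked_add`:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true if the sum
   fits in an int; returns false (leaving *out untouched) otherwise.
   The range test runs before the addition, so no overflow ever occurs. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The tempting alternative, computing `a + b` first and then testing whether the result "went negative", already relies on the undefined operation, so a compiler is allowed to optimize that test away.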
[+] camus2|8 years ago|reply
C is not simple. C is a language with a limited syntax and set of constructs, which is not quite the same thing. Being proficient in C means learning correct manual memory management, macros, an obtuse std lib, a compilation tool chain and management tools like Make, autotools and co. This is absolutely not simple. And don't get me started on string management with C.
[+] eru|8 years ago|reply
C only has a few parts, but their interaction is tricky. That relates also to C's lack of (non-leaking) abstractions.

At the very least, you want something with friendlier behavior in the face of errors than what C gives you.

[+] ci5er|8 years ago|reply
> The only concept people struggle with is pointers

Why is that?

I've tried to explain pointers to people who struggle with the very small number of pointer-related concepts, and I ... fail.

But I don't know why I am failing. These are not stupid people - and I speak the English language well enough - so those aren't the reasons...

[+] maxxxxx|8 years ago|reply
Maybe my brain is weird but I read Kernighan/Ritchie and had no problem grasping pointers. It's actually one of the simpler concepts compared to concurrency and others. But I know a lot of smart people who don't get pointers. I wonder if it's the teaching.
[+] eru|8 years ago|reply
> The only concept people struggle with is pointers, but even that is a very simple concept.

Recursion?

[+] SeanDav|8 years ago|reply
I would have thought that Pascal is a great candidate for a simpler, safer language that still has enough features to write serious applications.
[+] JTbane|8 years ago|reply
One can see why they chose a safer version of C for an intro course: beginner C programmers can fall into a wide variety of difficult-to-debug pitfalls (memory leaks, out-of-bounds reads and writes, undefined behavior) that can make the task of teaching and grading a nightmare.

My university opted to use C++ as the introductory programming language, since it is somewhat safer. Now they are transitioning to even safer languages like Python and JavaScript. While I don't agree with it, I can see why it is necessary to reduce the headache for instructors.

C0 seems like a good tool to use for teaching, although it may lead to bad habits later.

To wrap it up, in my opinion students should definitely be exposed to the concepts of the C family of languages early on: imperative programming, pointers, dynamic memory allocation, et cetera. All the pitfalls associated with these concepts also need to be explored.

The language doesn't necessarily matter as long as the students are allowed to make mistakes and learn from them.

[+] Shoop|8 years ago|reply
Here is some more information about C0:

C0 is used in 15-122, Carnegie Mellon's second semester introductory programming class. I am currently taking this class. The class teaches C0 for the first two thirds of the semester and then transitions to C for the final part of the semester.

C0 has essentially the same syntax as C, but for pedagogical purposes the semantics of the language have been changed to be much more similar to Java. Instead of manual memory management, there is a garbage collector, and user-defined structs can only be allocated on the heap. Additionally, there is only one integer type, which is a signed 32-bit two's complement integer. Overflow is defined as wrapping. Strings are opaque immutable types implemented at a language level, rather than just null-terminated buffers. C0 catches out-of-bounds array accesses at run time. While C0 internally stores array lengths to perform this check, the length is not accessible from C0 code, so array lengths must be passed around just the same as in C.
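To show roughly what those two guarantees amount to, here is an emulation in plain C. This is only a sketch: `wrap_add`, `checked_array` and `checked_get` are hypothetical names, not part of C0 or its runtime. Also, where C0 aborts on a bad index, this version just reports failure:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Wrapping 32-bit addition. Unsigned arithmetic wraps by definition in C,
   so the sum is computed as uint32_t and the bits are copied back into an
   int32_t (which the standard requires to be two's complement). */
int32_t wrap_add(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    int32_t out;
    memcpy(&out, &sum, sizeof out);
    return out;
}

/* An array that carries its own length, as a bounds-checking runtime
   has to do behind the scenes. */
typedef struct {
    int32_t *data;
    int32_t len;
} checked_array;

/* Reads a->data[i] into *out and returns true when 0 <= i < len;
   returns false without touching memory otherwise. */
bool checked_get(const checked_array *a, int32_t i, int32_t *out) {
    if (i < 0 || i >= a->len) {
        return false;
    }
    *out = a->data[i];
    return true;
}
```

Under these definitions `wrap_add(INT32_MAX, 1)` yields `INT32_MIN` instead of invoking undefined behavior, and an index of `len` or `-1` is rejected before any read happens.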

C0's biggest feature beyond C is that it allows for contracts. Functions can have preconditions and postconditions. There are also loop invariant contracts and assertions. These contracts are checked at run time when a specific flag is passed to the compiler or interpreter (C0 has both, and uses a bytecode internally). A major focus of the class is proving the safety and correctness of the programs we write using these contracts. We are encouraged to prove safety and correctness using only logical reasoning from contract to contract, not operational reasoning. A co-requisite of the class is a course in proofs. There seems to be a heavy influence on the teaching style from Dijkstra's "On the cruelty of really teaching computing science" [1].
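A rough approximation of that contract style in plain C, using `assert` where C0 would use `//@requires`, `//@loop_invariant` and `//@ensures` annotations. This is a sketch and not course material: `find` and `is_in` are illustrative names, and the invariant is checked at the top of each iteration, which is close to, but not exactly, how C0 evaluates it:

```c
#include <assert.h>
#include <stdbool.h>

/* Specification helper (hypothetical name): true if x occurs in A[lo, hi). */
static bool is_in(int x, const int *A, int lo, int hi) {
    for (int i = lo; i < hi; i++) {
        if (A[i] == x) return true;
    }
    return false;
}

/* Linear search: returns an index of x in A[0, n), or -1 if x is absent. */
int find(int x, const int *A, int n) {
    /* Precondition; in C0 this would be a //@requires line. */
    assert(n >= 0);

    int result = -1;
    for (int i = 0; i < n; i++) {
        /* Loop invariant; in C0 a //@loop_invariant line.
           Every element before index i has already been ruled out. */
        assert(0 <= i && i <= n);
        assert(!is_in(x, A, 0, i));
        if (A[i] == x) {
            result = i;
            break;
        }
    }

    /* Postcondition; in C0 a //@ensures line. */
    assert((result == -1 && !is_in(x, A, 0, n)) ||
           (0 <= result && result < n && A[result] == x));
    return result;
}
```

Compiling with `-DNDEBUG` turns the `assert` calls into no-ops, loosely mirroring the way the contracts are only checked when the flag is passed.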

The 15-122 class has written assignments focusing on the proofs part of the course and programming assignments which for the most part are automatically graded by autolab [5]. Our code is tested against a number of unit tests to ensure its correctness and our contracts are also tested to see if they can catch bugs in faulty implementations.

C0's implementation is not open source, but it appears to be used in ongoing research [2]. Here is the language website and course website [3], [4].

[1] https://www.cs.utexas.edu/~EWD/transcriptions/EWD10xx/EWD103...

[2] http://www.cs.cmu.edu/~fp/papers/cc016.pdf

[3] http://c0.typesafety.net/

[4] http://www.cs.cmu.edu/~15122/

[5] http://www.autolabproject.com/

[+] acmustudent|8 years ago|reply
A CMU freshman on HN and over 900 days old. The one thing I don’t like about this school is that most CS majors aren’t the HN type.
[+] azhenley|8 years ago|reply
Does anyone know of a website containing resources for teaching this? I'd also be interested in seeing who is using it nowadays.
[+] pklausler|8 years ago|reply
Using a weird local subset of C as an introductory imperative programming language seems like a really bad idea to me, unless you're using your weird local subset of C as the source language for a compiler construction course.

There are just so many other good options.

[+] fpig|8 years ago|reply
I thought this would be for kids, I don't see the point of a simplified language for college students.

However, if, as someone said, they use it in a compiler-writing course, then for that purpose it makes sense to have a simplified language. I had to do that back in college and basically used a C subset for it (for the compiled language, not the implementation language).

[+] instruction-ptr|8 years ago|reply
To quote a fellow TA below:

“I am currently a TA for 15-122 at Carnegie Mellon, the class that C0 was originally made for. I would argue that C0 does an excellent job at breeding good programmers because of the following features:

1. Contracts in C0 bake preconditions, post conditions, and loop invariants right into the language. Contracts are only checked when in "dynamic recompilation" mode (basically debug mode). This allows you to check the correctness of your code more easily (and, importantly, in an easily gradable way).

2. The language is designed in such a way to introduce C to programmers that are used to other higher level languages. In the C0 phase of the course we introduce students to fundamental concepts like memory allocation, code reasoning and contracts, binary representations of numbers, and unit testing. Next we transition to C1 (included in the C0 install) which introduces typedefs and other features of C. At this point we focus on introducing data structures. Finally we make the final transition to C, where we talk about undefined behavior, gcc, stack vs heap, etc.

I've taken this course myself, and I would argue that this measured approach to learning C is far superior to throwing students (most of whom have no low-level programming experience) straight into C.”

[+] marcosdumay|8 years ago|reply
It is garbage collected, and includes both a heap of safety features and better compiler messages. This is a very nice set of features for people learning about data structures and memory management.
[+] augustk|8 years ago|reply
Why not use Oberon instead?
[+] baldfat|8 years ago|reply
Here is the Tutorial Wiki