I want to comment on one quote from the article:
"""It's impossible to be "as fast as C" in all cases while remaining safe""".
C is not that fast. One of the major problems is that it's close to hardware. 1970s hardware, that is. Ken Thompson reportedly once said: "I'm not going to do nibbles. I have an 8 bit processor".
A good example of how bad it has become is that modern processors have a rather good understanding of the 'string' concept, and offer instructions to process them. C offers a char*.
Another problem is that C has strict contracts on how parameters are to be passed through, and combined with separate compilation units, this hurts compilers when they try to optimize things.
I see greater potential for a safe higher-level language to align more closely with modern-day hardware than C does. Some nice examples: linear types can avoid garbage altogether, and coroutines can be expressed clearly and correctly using monads.
On the other hand, raw performance is rarely needed, and most cycles are burned interpreting things like Python and PHP.
As a counterpoint, Alex Stepanov (the primary designer and implementer of the C++ Standard Template Library) had this crucial insight about the C programming language's pervasiveness:
> Let's consider now why C is a great language. It is commonly believed that C is a hack which was successful because Unix was written in it. I disagree. Over a long period of time computer architectures evolved, not because of some clever people figuring how to evolve architectures---as a matter of fact, clever people were pushing tagged architectures during that period of time---but because of the demands of different programmers to solve real problems. Computers that were able to deal just with numbers evolved into computers with byte-addressable memory, flat address spaces, and pointers. This was a natural evolution reflecting the growing set of problems that people were solving. C, reflecting the genius of Dennis Ritchie, provided a minimal model of the computer that had evolved over 30 years. C was not a quick hack. As computers evolved to handle all kinds of problems, C, being the minimal model of such a computer, became a very powerful language to solve all kinds of problems in different domains very effectively. This is the secret of C's portability: it is the best representation of an abstract computer that we have. Of course, the abstraction is done over the set of real computers, not some imaginary computational devices. Moreover, people could understand the machine model behind C. It is much easier for an average engineer to understand the machine model behind C than the machine model behind Ada or even Scheme. C succeeded because it was doing the right thing, not because of AT&T promoting it or Unix being written with it.
> A good example of how bad it has become is that modern processors have a rather good understanding of the 'string' concept, and offer instructions to process them. C offers a char*.
The x86 string functionality that I'm aware of was directly derived from C string functionality, and any decent library of course uses them. You don't directly map to those underlying opcodes because that would be silly, and would completely undermine any platform independence you might have.
The same for vectorization. You don't explicitly express vectorization in your code, but of course all decent C compilers can easily and robustly generate such code.
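As a rough illustration (the function name here is invented), this is the kind of plain scalar loop that mainstream C compilers such as gcc and clang will typically auto-vectorize at higher optimization levels, with no SIMD anywhere in the source:

```c
#include <stddef.h>

/* No intrinsics, no pragmas: with optimization enabled (e.g. -O3),
 * compilers commonly turn this loop into SSE/AVX or NEON code on
 * their own.  The 'restrict' qualifiers promise the arrays don't
 * alias, which is often what makes vectorization legal. */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

The point is that the vectorization lives in the compiler, not in the source, so the same code stays portable across instruction sets.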
C isn't close to the hardware (beyond very high level notions like "contiguous memory"). But it's a simple enough language that it's very heavily optimizable.
I see this comment pop back up every once in a while. I'm sure your arguments have value; yet, for some reason, C is always among the winners in almost every benchmark I've ever seen, and also in my personal experience (except for numerical stuff, where Fortran might be a little faster).
Why is that so? Do more modern languages need another twenty years before they can compete with C (and eventually beat it performance-wise)?
> C is not that fast. One of the major problems is that it's close to hardware.
These days the most important factor in performance, in almost all programs, is memory access.
Writing cache-friendly code is made relatively easy by C. You control the placement of your data into aligned cache lines. You control the order in which the cache lines are accessed.
The cost of the computational instructions is usually swallowed by the cost of the cache misses.
C's common ABI (for x86-64) now passes up to six parameters via registers and doesn't unnecessarily spill to the stack. Additionally, link-time optimization now allows cross-compilation-unit inlining, which can optimize away the ABI costs.
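As a sketch of the kind of control described above (the struct and field names are invented for illustration), C lets you pin records to cache-line boundaries and then walk them in a prefetcher-friendly order:

```c
#include <stdalign.h>  /* alignas (C11) */
#include <stddef.h>

/* Hypothetical record, aligned so that each element of an array of
 * these starts on its own 64-byte cache line. */
struct particle {
    alignas(64) float pos[3];
    float vel[3];
};

/* Sequential traversal: iteration i touches the cache line right
 * after iteration i-1's, an access order the hardware prefetcher
 * handles well. */
float sum_x_velocity(const struct particle *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].vel[0];
    return sum;
}
```

Both the placement (via `alignas` and struct layout) and the access order (via the loop) are spelled out in the source, which is the kind of control the comment is pointing at.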
One performance issue C does have is zero-terminated strings. String-processing code is constantly hampered by having to call strlen() or the equivalent to find the end of a string. Zero-terminated strings also mean that 'slices' of strings cannot themselves be strings; a copy must be made.
Of course, a C program doesn't have to use C strings, but so many libraries and APIs do (including the C standard library) that they are unavoidable for most practical purposes.
D eliminates this problem by using dynamic arrays to represent strings. A dynamic array is a pointer/length pair.
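For illustration, the pointer/length representation can be sketched in C as well (this struct and function are hypothetical, not any standard API):

```c
#include <stddef.h>

/* A string slice: pointer plus length, no terminator needed. */
struct slice {
    const char *ptr;
    size_t len;
};

/* Taking a substring is O(1) and allocation-free: the result just
 * points into the parent's buffer.  With zero-terminated strings
 * this needs a copy, since writing a terminator mid-string would
 * clobber the parent. */
struct slice subslice(struct slice s, size_t start, size_t len)
{
    struct slice r = { s.ptr + start, len };
    return r;
}
```

The catch, as noted above, is that such slices can't be passed to the many libraries and APIs that expect zero-terminated `char*`.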
> Another problem is that C has strict contracts on how parameters are to be passed through, and combined with separate compilation units, this hurts compilers when they try to optimize things.
Calling conventions can be a bottleneck, but this is not unique to C; any other compiled language that allows separate compilation units to be linked together has the same issue. Techniques like LTCG (link-time code generation) can avoid it.
But given that C is almost always at or near the top of benchmarks both for size and speed, maybe it's not that much of a bottleneck after all.
The post is interesting, but was written in 2010. As you'd expect, some of the details have changed since then. For instance, garbage collected pointers have been moved out of the core language into a library.
I didn't realize how old it was until I got to the bottom. It's cool how the project has stayed true to the goals laid out way back then. While lots of things have been changed and removed, as far as I can tell, none of those high-level goals has been compromised, while some have even been reinforced.
I don't know much about Ada but it seems like Rust vs. Ada is perhaps the more salient question than Rust vs. C++. Why was Ada not good enough in terms of a very "safe" language?
Ada was good enough, but it suffers from a bad reputation.
In the early days it was deemed too complex to implement, although I would say C++ became even more complex.
The companies that sold Ada compilers had customers with deep pockets, so Ada compilers were too expensive and required workstations to be used properly.
When affordable Ada compilers became available, not many cared about it.
Nowadays it has found its place where human lives are at risk. Many avionics systems, train control systems, and hospital devices are coded in Ada.
I do attend FOSDEM regularly and also get the feeling its use is increasing in Europe thanks to the security exploits in languages tainted by C compatibility.
Isn't Ada geared more towards runtime assertions of correctness of programs (contracts and such) rather than ensuring memory safety? Can you statically guarantee memory safety in Ada when using manual memory management?
It seems that one can choose to use both runtime assertions and static correctness proofs in SPARK. But I don't know if that extends to statically ensuring memory safety in low-level code.
Swift came out less than two weeks ago, as a beta announcement. So most certainly nobody comes from a "modern language background like Swift".
Plus, it's not like semicolons are any big deal. If they are in a language, you add them and move on. Dead simple to add, minimal noise, instantly familiar to most programmers of C-derived languages. The only community that regularly complains about them is hipster (for lack of a better term) JavaScript programmers.
Well, semicolons have special meaning in Rust. In some contexts, like the last expression in a function or other expression block, omitting the semicolon makes the block evaluate to that expression's value; adding the semicolon discards it. This is good, because it lets you avoid writing the "return" keyword all over the place, but it also means a single character determines whether a value is returned.
haberman: I hadn't heard of this -- is it still true now that segmented stacks are gone? If so, why does C need a separate stack?
ehsanu1: tl;dr: C's stack is interleaved with Rust's stack.