Getting Past C

461 points| ingve | 9 years ago |blog.ntpsec.org

486 comments

[+] e3b0c|9 years ago|reply
Rust has some very desirable properties to me. Writing Rust programs from scratch is not as scary as I'd heard on the internet either. The documentation is excellent, the compiler diagnostic messages are very helpful, and the notorious borrow checker didn't stand in my way that much. And I love Cargo and crates.io. I have some projects where Rust is a saner choice than Go or other GC-based languages.

That said, there are actually drawbacks of Rust compared with Go, IMHO. When facing a moderately large project written by others, diving into the project is not as smooth as with Go. There is no good full-source code indexer like cscope/GNU Global/Guru for symbol navigation across multiple dependent projects. Full-text searching with grep/ack does not fill the gap well either, since many symbols, with their different scopes/paths, are allowed to have the same identifier without explicitly specifying the full path. That makes troubleshooting/tracing a large, unfamiliar codebase quite daunting compared with Go.

[+] xorxornop|9 years ago|reply
Hmm, I've had a very nice experience using Rusty Code in VS Code. Some useful refactoring functionality is missing for sure, but a lot of that will become possible quite shortly from RLS (Rust Language Server, a la how TypeScript works in VSC), and if your preferred editor has support for the language server spec (it's an open source common spec, not specific to Rust), it will support it at parity, too.
[+] Manishearth|9 years ago|reply
YouCompleteMe has pretty good JumpToDefinition support for Rust.

You can also use https://github.com/nrc/rust-dxr to index Rust code via DXR.

IIRC ctags also works with Rust.

RLS should cover this pretty well too once it happens.

[+] wocram|9 years ago|reply
I haven't had too many issues with Intellij-rust or racer failing to jump between symbols.

There are also many other tools that provide indexing, e.g. IDE plugins, Kythe, and the Rust Language Server.

[+] steveklabnik|9 years ago|reply
RLS will hopefully fill that gap.
[+] dreta|9 years ago|reply
Can anybody make a strong case to me as to why are buffer overflows considered an issue in C when it takes like 10 minutes to write and test an array implementation that prevents that from ever happening? I do agree that C has issues (though in my opinion neither Rust nor Go addresses almost any of them), I just don't understand why are buffer overflows such a huge problem in C when the same thing is going to come up when trying to work with memory in Rust.
[+] MaulingMonkey|9 years ago|reply
> Can anybody make a strong case to me as to why are buffer overflows considered an issue in C when it takes like 10 minutes to write and test an array implementation that prevents that from ever happening?

The CVE database. Just because you 'can' write such an array implementation doesn't mean you will, doesn't mean your third party libs will, doesn't mean any of your legacy code uses it, and certainly doesn't mean you will properly test said array implementation.

The number of mitigations added to C compilers and OSes dealing mostly with C and C++ code. ASLR, W^X, /GS, -fstack-protector-all, AddressSanitizer, ... - note the lack of similar tools, or demand for them, for, say, JavaScript - despite it enjoying a similar ubiquity.

I ask this in bad faith: I encourage you to share a single nontrivial codebase which actually creates the abstraction you've described and religiously adheres to using it throughout. As to why this is in bad faith: I'm defining "nontrivial" here to mean using 3rd party APIs - which will operate on C style arrays, not your project specific safe wrappers - and thus by definition won't be "religiously" sticking to said abstractions when using said APIs. By these definitions, the codebase I'm asking for doesn't exist - by definition. Even relaxing the "third party" rule, I haven't actually worked on a nontrivial C or C++ codebase without buffer overflow problems.

Now, e.g. Rust will have the same problems when interacting with C APIs - and nontrivial programs will end up doing so eventually. However, by virtue of the language itself embracing safe-by-default, you're less likely to run into the same problems when consuming Rust APIs.

You can also use third party static analysis tools to ensure you're using a "safe C subset" (such as MISRA C), but "nobody" does that.

[+] dbaupp|9 years ago|reply
The lack of generics means your array implementation is going to either:

- be implemented with macros and token pasting, and result in a ton of mental overhead because you'll have a pile of types like array_foo for an array of `foo`s, and array_bar for an array of `bar`s, along with a pile of corresponding `foo * array_foo_get(array_foo, size_t)` and `bar * array_bar_get(array_bar, size_t)` functions.

- or, have a runtime cost and lose type safety by storing void* and casting when accessing.

The first case is even worse than it sounds: e.g. I don't know how you handle arrays of types with spaces in them (like `unsigned char`, or `struct bar`) with a macro. And, we haven't even thought about const correctness yet, which would probably require having const_array_foo, const_array_bar (etc.) types defined too.

(And, of course, these only solve one facet of the problems with C's pointers: there's no way to defend against use-after-free or dangling pointers.)

[+] staticassertion|9 years ago|reply
Because your 'safe' implementation will certainly have a performance cost, and won't be the default. This is why, despite C++ providing std::array, you'll still find buffer overflows in C++ code. C++'s std::array provides the safe 'at' function but you're opting into a performance penalty and it's not the more familiar [] syntax.

Rust arrays/vectors are safe-by-default. To use the unchecked, unsafe version requires using the 'unsafe' keyword.

  let v = vec![0, 1, 2];
  unsafe {
      let x = v.get_unchecked(5);
  }

This means you can basically grep audit for vulnerabilities, and the above code should be very rare.
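
For contrast with the unchecked call above, here is a minimal sketch of the safe-by-default side. The function names `checked_read` and `sum_prefix` are made up for illustration; only the standard slice method `get` is relied on.

```rust
/// Checked read: returns None instead of touching memory outside the slice.
/// (`checked_read` is a hypothetical helper name, not a standard function.)
fn checked_read(v: &[i32], index: usize) -> Option<i32> {
    v.get(index).copied()
}

/// Sums the first `n` elements, or returns None if the slice is shorter than `n`.
/// (`sum_prefix` is likewise a hypothetical helper.)
fn sum_prefix(v: &[i32], n: usize) -> Option<i32> {
    // `get(..n)` yields None when the range runs past the end of the slice.
    v.get(..n).map(|prefix| prefix.iter().sum::<i32>())
}
```

Plain indexing such as `v[5]` on a three-element vector is also bounds-checked: it panics rather than reading out of bounds, so neither path needs `unsafe`.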

[+] naasking|9 years ago|reply
> I just don't understand why are buffer overflows such a huge problem in C when the same thing is going to come up when trying to work with memory in Rust.

False. Buffer overflows in C can overwrite the program's memory, so it can be hijacked and supplanted with the attacker's code. This cannot happen in Rust (unless unsafe code has the vulnerability), or any memory safe language.

Sure you can implement a safe array/buffer abstraction and use it in your C programs that abort on invalid indexing. Now how many actually do this? Very few given the prevalence of C programs on vulnerability disclosure lists.

[+] jdmichal|9 years ago|reply
Obviously, one can program C to do anything, and write all the provably safe abstractions wished. But, that's not really the point. The point is that doing such is not the default. It requires engagement and knowledge of the programmer, especially on distributed projects with loose communication, such as many open source projects. And it only takes one programmer mistake to bring the whole house of cards down.

Why allow programmers to make mistakes? That was fine in the 70's when resources for compiler execution were limited. I don't see any reason for it today.

I mean, just look at the underhanded C contestants and especially winners for ways in which your program can completely blow up for extremely subtle reasons.

[+] qznc|9 years ago|reply
Libc does not use your safe array. You cannot pass your safe array to read and other syscalls.
[+] pcwalton|9 years ago|reply
Because:

1. Buffer overflows aren't considered the most insidious issue in C nowadays. That award would probably go to use after free, which is not so easy to fix.

2. In C, it is easier and faster to do the wrong thing. Compare "char buf[256]; strcpy(buf, foo); ..." to "array_t buf = array_create(strlen(foo) + 1); strcpy(buf.ptr, foo); ... array_destroy(buf);"

3. Buffer overflows do not in fact come up routinely in Rust the way they do in C.
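
As a rough illustration of points 2 and 3, this is roughly what the fixed-buffer copy looks like in safe Rust. `copy_into` is a hypothetical helper written for this sketch, not a standard library function; the slice operations it uses are standard.

```rust
/// Copies `src` into the front of `buf`, refusing input that does not fit.
/// On failure, returns how many bytes were left over.
/// (`copy_into` is a hypothetical name used only for this sketch.)
fn copy_into(buf: &mut [u8], src: &[u8]) -> Result<usize, usize> {
    if src.len() > buf.len() {
        return Err(src.len() - buf.len());
    }
    // The sub-slice is bounds-checked, and `copy_from_slice` requires
    // both sides to have the same length.
    buf[..src.len()].copy_from_slice(src);
    Ok(src.len())
}
```

Even without the explicit length check, the slicing expression would panic on oversized input instead of writing past the end of the buffer, which is the difference from the `strcpy` case.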

[+] josteink|9 years ago|reply
> Can anybody make a strong case to me as to why are buffer overflows considered an issue in C when it takes like 10 minutes to write and test an array implementation that prevents that from ever happening?

That it's not a default, enforced language feature, and that most developers aren't going to spend those 10 minutes when they need an array.

They'll just use the language-provided array-implementation instead. Which in C is very, very unsafe.

[+] strictfp|9 years ago|reply
Because no one ever does that? Or at the very least people don't think of that as idiomatic C.
[+] Sanddancer|9 years ago|reply
Compiler vendors have been resistant towards putting in such features. Bounds checking slows things down, and the performance race is very much a thing in C compiler implementations -- a compiler that can deliver a few percentage points better code can be a big win to teams working on compute heavy problems. C11 has Annex K, which has a lot of safety features, like bounds-checking variants of the string and memory functions. Unfortunately, none of the major vendors have fully implemented it even as an option. Which is a shame because it would solve a lot of problems, while requiring minimal rewrites for a lot of code.
[+] ssalazar|9 years ago|reply
I think it breaks down when that array has to interact with system libraries or the C stdlib in any way. A lot of C string functions have weird gotchas related to terminators and sizes, and any IO you're doing will involve raw buffers being passed into or out of a system IO function that doesn't understand custom array types.
[+] dibanez|9 years ago|reply
It's useful to be able to remove safety checks for speed. I have a C++ code where all data is in array objects. Bounds checking is a compile time option, and it makes the overall code 2X slower. I can do testing with bounds checking on, but once it gets to a supercomputer that needs to be removed. Address sanitizing by compilers is an even more effective tool for this, especially for C. Bounds checking is critical for security, but if you're only concerned with correct execution then a segfault is not much different from an exception.
[+] ChemicalWarfare|9 years ago|reply
Technically you probably could limit yourself to using a "safe" subset of C - basically no pointer arithmetic, no strcpy() etc - but that would defeat the purpose of using C in the first place.
[+] JKCalhoun|9 years ago|reply
Or perhaps just some helper functions in C that wrap array and pointer allocation/access to provide sanity checks. Seems like moving to a new language is rather extreme....
[+] gsdean|9 years ago|reply
I believe your post is the strong argument you desire aka hubris
[+] jstewartmobile|9 years ago|reply
When writing modern C in a disciplined way, it is not as bad as the Rustophiles will make it out to be, but still a problem.

Scenarios where I frequently end up fixing other people's memory errors:

1. No Error Handling: not checking an error condition on a function that allocates, then using the uninitialized pointer anyway

2. Sloppy Error Handling: jumping to abort from an error without freeing what has already been allocated

3. Faith in \0: still using the old string functions

I'm on the fence about the whole thing, so others may be able to field something more compelling.

[+] ArkyBeagle|9 years ago|reply
C is the new "goto".

Y'all please, please note that dreta said "... an array implementation that prevents that from ever happening..."

[+] jstewartmobile|9 years ago|reply
A 62 KLOC secure NTP server seems like an ideal project for this kind of experiment. I imagine it would be self-contained enough to actually use Rust or Golang instead of just treating them like FFI scripters.
[+] awinter-py|9 years ago|reply
> One such cleanup: we’ve made a strong start on banishing unions and type punning from the code. These are not going to translate into any language with the correctness properties we want.

Really? This sounds like idiomatic rust to me (heavy with enums).

[+] Ericson2314|9 years ago|reply
C unions have no discriminant. But yeah it's a pity—I'd try to hack up discriminated unions then. Or convert to Rust unions and then enums and remove unsafety.
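
A minimal sketch of the difference, with a made-up `Value` type: a Rust enum carries its discriminant, and `match` must handle every variant, so there is no way to read the payload as the wrong type in safe code.

```rust
/// A tagged union: the compiler stores and checks the discriminant.
/// (`Value` and `as_float` are hypothetical names used only for illustration.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
}

/// Converts either variant to f64; the match is checked for exhaustiveness.
fn as_float(v: Value) -> f64 {
    match v {
        Value::Int(i) => i as f64,
        Value::Float(f) => f,
    }
}
```

A C union used for type punning has no equivalent of the tag, so a mechanical translation would first need to work out which field is live at each use.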
[+] aaron-lebo|9 years ago|reply
Outside of the language war bubble it's really great to see a post like this. Practical concerns, reasonable advantages/disadvantages of each language, a real project dealing with real timelines. Thanks!
[+] jstimpfle|9 years ago|reply
Was going to say the same. In the past ESR has come across as a patently arrogant gun maniac, but the first part of this post is great for all the reasons you mentioned.

(What irritated me though was the switch to first-person narrative at the end).

[+] eduren|9 years ago|reply
I'm excited to see where this goes because it could go a long way towards providing concrete data for the large "work to replace old infrastructure C code with (Rust||Go||Modern C++)" discussion that has been taking place.

More data points will help to inform discussion, or at the very least add structure to the flame wars.

[+] tptacek|9 years ago|reply
This is literally the only thing I can think of that "NTPsec" can do that would result in the project having any relevance. I understand why some very specific sites are chained to the ntpd codebase, but the vast, overwhelming majority of the ntpd deployed base not only isn't tied to ntpd, but also doesn't need 99% of what ntpd does. Trying to "secure" that codebase always seemed to me like a very silly windmill to tilt at.
[+] giancarlostoro|9 years ago|reply
I wish more mention of D would happen. It is compatible with C and C++ libraries and features GC without sacrificing the good things of C and C++. I always loved the idea of Rust and Go but they are nowhere near C or C++ where it matters to me. D fits the bill, otherwise I just use Python. I like being able to design software in my own way as opposed to being told how to do it.
[+] falcolas|9 years ago|reply
How much would non-standard architecture support matter for ntp? Given how many architectures Linux supports, I would think that C would still be the best choice, until these other languages gain support for those missing architectures.

Or perhaps it's a good opportunity for a language which offers transpilation with ANSI C as the target?

[+] w8rbt|9 years ago|reply
I would start by getting the code to compile with g++, then begin migrating the dangerous C constructs to safe C++ constructs. IMO, that would be a safe, reasonable thing to do.
[+] zzzcpan|9 years ago|reply
After reading this post the idea of a C-to-C translator that injects bounds checking, etc. comes to mind. Such a translator could be used by OS distributions to provide safety in the least intrusive way and possibly completely automatically for many C codebases they have in their repositories. Translating into Go or Rust, on the other hand, cannot scale beyond some individual projects that decide to undertake such efforts. Mainstream C compilers could implement safety features too, but realistically it cannot happen, as it's not something most people care about. So, a C-to-C translator might be the best bet with the most impact.
[+] acjohnson55|9 years ago|reply
It's not yet ready for primetime, but Scala Native (http://scala-native.readthedocs.io/en/latest/) might just make a splash in the systems space. I don't think it has anything like ownership yet, but I wouldn't be surprised if it eventually develops that capability. I think you can get it to run without GC, too, but using C Stdlib memory management. Although, that largely defeats the memory-safety.

Just throwing it out there as something to keep an eye on!

[+] jaco8|9 years ago|reply
Looking at the current new and coming languages I would take a hard look at Nim. It may not be there yet, but it looks highly appealing, is as fast as the often mentioned Rust, and compiles significantly faster. http://nim-lang.org/
[+] nimmer|9 years ago|reply
Nim compile times are faster and often the execution time is faster, too. Also it's very expressive, almost like Python.
[+] rafinha|9 years ago|reply
I didn't understand why Rust and Go are natural alternatives to C. Wouldn't C++ be a more natural option? (Despite the fact that both Go and Rust are developed by third party companies)
[+] jstewartmobile|9 years ago|reply
I did a lot of C++ years ago, so maybe things have changed since then, but I think Rust and Go addressed a lot of the design flubs of C++.

My experiences getting things to compile across GCC and Visual C++, dealing with strings (especially Microsoft's WCHAR), reliable integer sizes (pre stdint.h), and debugging templates were not things I would wish on anyone.

Re-doing some of my side projects in Go and Rust was a lot more enjoyable. I could focus on what I was doing instead of trying to work around deficiencies in the language and its libraries.

[+] jimbokun|9 years ago|reply
"out of C into a language with no buffer overruns, and in general much stronger security and correctness guarantees."

Doesn't sound like C++.

[+] AgentME|9 years ago|reply
None of the safety issues brought up about C there are solved by C++. C++ is (nearly) a superset of C, so it inherits all of those issues.
[+] blub|9 years ago|reply
C++ would be a great alternative, assuming the developer(s?) working on the project didn't hate it, as is the tradition: http://esr.ibiblio.org/?p=532

Oh well, at least they're moving from C, which will be a big win either way.

[+] tannhaeuser|9 years ago|reply
My experience is that using C for main(argc,argv)-style programs is rarely a problem. Trouble comes when using long running single-address space containers for service-like abstractions with pthreads etc.; in that kind of environment, malloc() and co. don't cut it because even if you get memory allocation right, unless using pooled memory allocators, memory fragmentation is becoming a serious (ie. unsurmountable) problem.

It's been said over and over since at least the Java times that creating OS processes for individual service invocations is bad for performance, but I've never seen proof for this statement in the form of a benchmark.

Even the OpenBSD developers (who know a thing or two wrt. security of memory allocation schemes) diss process-per-service-invocation architectures in their httpd implementation (eg. calling their CGI bridge "slowcgi" and favouring fcgi over it).

Isn't that inconsequential? I mean if there's a performance problem with CGI-like process-per-service invocations, why not target these problems at the OS level (or via pooling of network connections or whatever the bottleneck is)?

[+] amadvance|9 years ago|reply
Rust is surely fine, and an improvement over C, but its main advantage is that all the Rust code is being written now, when everyone takes more care about security.

It doesn't have to deal with 40 years of bad legacy code written by sloppy developers.

You can obtain similar quality in a modern C codebase, using tools like static and dynamic analyzers. In fact, today the hardest issues come from multi-threading. I won't even dare to write multi-threaded apps without helgrind/TSAN.

And Rust doesn't help in this regard. From: https://doc.rust-lang.org/nomicon/races.html 'So it's perfectly "fine" for a Safe Rust program to get deadlocked or do something incredibly stupid with incorrect synchronization.'
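
For context, the Nomicon page quoted here separates data races, which safe Rust rules out at compile time, from general race conditions and deadlocks, which it does not. A minimal sketch of the first half, with a hypothetical `parallel_count` function: the shared counter only compiles because it is wrapped in `Arc<Mutex<_>>`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from several threads and returns the total.
/// (`parallel_count` is a hypothetical name used only for this sketch.)
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The lock is the only way to reach the integer.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Nothing here stops a program from taking two such locks in inconsistent order and deadlocking, which is the part the quote is about.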

[+] ericfrederich|9 years ago|reply
I like Go, I'd love to write libraries in it, but as far as I can tell you can't really create a C compatible shared library from it. That is still the common denominator if you want to call into it from other languages. I'd love to write Python programs with performance critical stuff in Go.
[+] asdaksdhksajd|9 years ago|reply
Why does it take 62KLOC of C to distribute time? That's more code than the whole plan9 kernel.
[+] arunmu|9 years ago|reply
I do not contest the opinion that Rust is a good language, but it slightly hurts me when people club C and C++ together. One can easily write correct-by-construction code using modern C++. Use of meta-programs allows you to create typesafe constructs. It provides you with zero cost abstractions to specify ownership of resources and ..... <I can go on> One just has to strive not to use the C baggage that comes with it.
[+] kcudrevelc|9 years ago|reply
> Under Linux, some SECCOMP initialization and capability dances having to do with dropping root and closing off privilege-escalation attacks as soon as possible after startup.

I was under the impression that these specific things were actually quite hard to do in Go. I believe that both setuid/setgid and seccomp_load change the current OS thread (only), and since Go multiplexes across multiple threads and gives programmers very little control over which ones are used for what goroutines, I'm not sure how you would, for example, apply a seccomp context across all threads in a Go program. setuid/setgid are currently unsupported for this reason, with the best method being "start a subprocess and pass it file descriptors" (https://github.com/golang/go/issues/1435).

I'd be interested to hear if others have found ways to actually do this reliably for all OS threads underlying a running Go process.

[+] eggy|9 years ago|reply
I didn't see Ada or ATS mentioned upon a quick read here. I don't program in either beyond exploratory play, but they seem to fit the need here, no?