To be absolutely clear, the behavior exhibited in the OP is indeed considered a bug by the Rust developers: see https://github.com/rust-lang/rust/issues/16012#issuecomment-... for the latest discussion. TL;DR: this currently isn't exploitable on Windows, and patches to LLVM adding support for stack probes will ideally allow this to be easily solved for non-Windows platforms as well.
> The Rust program got a segmentation fault because it attempted to write to inaccessible memory, but only through the stack pointer. None of the undefined behaviors disallow this, which I think is why it’s ok for this Rust program to segfault.
What I got out of that is that Rust does not work as advertised if there are still situations where a program could segfault. The entire premise of Rust, as I understood it at least, is that it does things in a safe manner and the programmer does not have to worry about it. Now I learned that there are undefined behaviors. In my view, for a language that bills itself as safe, there should not exist such things as undefined behaviors. As far as I am concerned, then, based on the advertising of Rust, this is false advertising.
(Basically, stack probes are only implemented on Windows, and we need them on the other platforms, but it hasn't been implemented yet, for various reasons.)
> In my view, for a language that bills itself as safe, there should not exist such things as undefined behaviors. As far as I am concerned, then, based on the advertising of Rust, this is false advertising.
The key thing from the reference [0]:
> "Type checking provides the guarantee that these issues are never caused by safe code."
It's subtle, but I think the situation is that "some segfaults are caused by undefined behaviour, and some undefined behaviour causes segfaults". Neither fully contains the other. One thing I was trying to get across is that a segmentation fault has a very specific meaning, and that meaning is not "bad thing was done with pointers".
It would be better if everyone refrained from making comments like this about any project, ever. You have made an inflammatory, unnuanced, and ill-informed interpretation of a complicated issue and used it to publicly trash a lot of people's hard work. You are lowering the discourse.
I know that Rust in particular tries to be very open about the caveats to its claims, but this is the kind of commentary that causes any ambitious project to try to minimize and hide its weaknesses instead of openly and honestly discussing them. I opened this discussion knowing that there would be a comment just like yours, and it made my heart feel heavy.
Of course there are cases in which you can get segfaults, and not only in Rust. The single fact of having a C FFI is enough to break all safety guarantees of your language.
What Rust does promise is that undefined behavior will never happen if you don't use code blocks marked as unsafe (which include the C FFI). In other words, if you are able to trigger undefined behavior without using unsafe code, that is a compiler or library bug.
EDIT: as noted by steveklabnik, this is not undefined behavior. Actually you should get a better error message instead of a segmentation fault (see the link to the issue on GitHub in steveklabnik's comment).
For those wondering about segfaults specifically in Rust (I know it's not the point of the blog post but it might be interesting to others), this thread talks about why they occur/whether they'll ever be eliminated entirely:
The first sample code: "This program segfaults because the entire stack is set to 0 at program start."
I'd be surprised; as a strong general rule, the stack does not get zeroed [Edit: see end of thread! It's the OS zeroing everything - learn something every day]. I'd expect it to segfault because the pointer value is whatever leftover non-zero value happens to be in that piece of memory, so it points into random memory the user program shouldn't be messing with (sticking in a printf to output the value of the pointer confirms this on at least one system). I wouldn't be surprised if some implementations took security really seriously and zeroed everything, or if a debug build was zero-happy, but under normal circumstances the stack doesn't get zeroed.
> It's the OS zeroing everything - learn something every day
No, the operating system (the kernel, actually) does not zero out anything. The runtime linker would be the one initializing static memory declared in the ELF BSS sections at execution time. The rest (including the stack and heap) is set up by the prologue (crti.o, crt1.o, and crts.o, depending on the OS).
> Curiously, I found that if I had a buffer size of even 1 byte over (8 MB - 8 KB), I still got the segfault. I’m not yet sure what’s going on there!
This is because of gcc padding. Programs have to allocate whole pages from the OS, so if you want just one int, you have to get a whole page for it (compilers can optimize this in some conditions). This is a result of the MMU working on blocks of memory rather than single bytes (a performance issue, I think). As far as I know, the default page size is 4 KB.
Another reason may be that the compiler tries to allocate 2^n bytes for performance, and 8 KB is close enough, I think.
The main problem there is that local non-static variables get placed on the stack. Stack space is allocated by generated code simply by decrementing the stack pointer, without any explicit calls to the OS. On a typical modern unix, only a few pages of stack are actually mapped, and the kernel handles page faults on neighboring pages by allocating more stack pages. Because it is possible that a function needs more than one page of stack space, there is more than one such "magic" stack page, but there is still some finite number of them (otherwise there would be no way to distinguish between accesses beyond the end of the stack that should grow it and accesses to random unmapped memory). Thus if you allocate some ridiculously large things on the stack and access them in the direction opposite to stack growth, you may get a segfault (accessing these structures in "the right order" is by no means sufficient for this to be safe, because there are things like red zones, signals, other local variables...).
This is true only for the first thread; other threads have a fixed stack size specified at thread creation (on Linux it's 8 MB by default), but usually even a thread's stack pages are actually allocated only on first access.
On UNIX, if you really want to bump the stack by arbitrary amounts, the most portable way is to preallocate your own stack of sufficient size and then use that (either by abusing sigaltstack(), via makecontext()/setcontext(), or possibly by creating a new thread). But generally, having large local variables is not exactly a good idea.
Ah, so the idea is that there are already 8 KB of actual pages set up for the stack by this point. So when I try to get another (8 MB - 8 KB + 1 B), that blows things up? I wonder if I can watch this happen in /proc/$pid/maps or somewhere else around there.
Too bad memory is not better segmented, then. For instance, when linking against a library, that library's memory ends up in the same "segment" as the program itself. Therefore, right now, you can totally screw up a library's internal data structures without even causing a segfault directly.
These are called guard pages. Attempts to write there would result in a segmentation fault.
...which are caught by the OS and used to either truly kill the process when the stack overflows, or to dynamically allocate more memory as the stack grows downwards. That's how it works on Windows, at least; I'm not as clear about Linux.
The robust solution to this problem is not hardcoding the pipe buffer size and changing the size of pipe buffers within your program to match your hardcoded value, but rather calling fpathconf() to query the pipe buffer size for the pipe FD you are working with.
[0]: http://doc.rust-lang.org/reference.html#behavior-considered-...
https://users.rust-lang.org/t/rust-guarantees-no-segfaults-w...