A 30 minute introduction to Rust

[+] ChuckMcM|12 years ago|reply

This is an excellent summary Steve, it also points out one of the challenges of 'System' languages, which is the requirement for 'unsafe.'

One of the first things I did when I started working on Oak (which became Java) was to see about writing an OS in it. The idea being that if you could write an OS in a 'safe' language then you could have a more reliable OS. But we were unable to write it completely in Oak/Java, and that led to some interesting discussions about what the minimum set of 'unsafe' actions required by a systems language might be.

Sadly we did not get to explore that very much, although I did pass it on as a possible thesis topic to some interns who came through Sun at the time. I'd be interested in your experience with what actions require 'unsafe' and if you have seen a canonical set that might point toward a process to get to a 'safe' OS.

[+] steveklabnik|12 years ago|reply

Glad you liked it, Chuck.

Some of my closest friends in college specialized in operating systems, and we (mostly them) worked on http://xomb.org , an exokernel in D. I'd hope that today we'd choose Rust instead.

Julia Evans has been writing a _fantastic_ series about a kernel in Rust: http://jvns.ca/blog/categories/kernel/

As she says "This is typical of a lot of Rust code I’m writing – I need to write a lot of unsafe code."

I'll have to give this 'minimum set' idea some thought. I think that safety will be more useful for things like kernel modules than in the kernel itself, though I'm not sure why I think that, exactly. Hmmmmm...

[+] kibwen|12 years ago|reply

For reference, the following actions are what are enabled within `unsafe` blocks in Rust:

1. Dereferencing raw pointers (a.k.a. "unsafe" pointers). Note that's just dereferencing: there's nothing inherently unsafe about just creating and passing around the pointers themselves.

2. Calling a Rust function that has been marked with the entirely-optional `unsafe` keyword.

3. Calling an external function via the C FFI, all of which are automatically considered unsafe.
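To make the first two items concrete, here is a minimal sketch in present-day Rust (the syntax has changed a lot since this thread was written). The names `read_through` and `demo` are made up for illustration, and the FFI case is left out so the example has no C dependency.

```rust
/// A Rust function marked `unsafe`: the caller promises that `p` is valid.
/// (`read_through` is a hypothetical helper, not a library function.)
unsafe fn read_through(p: *const i32) -> i32 {
    // Dereferencing the raw pointer is the operation that needs `unsafe`.
    unsafe { *p }
}

fn demo() -> (i32, i32) {
    let x = 7;

    // Creating and copying raw pointers is allowed in safe code.
    let p: *const i32 = &x;
    let q = p;

    // 1. Dereferencing a raw pointer requires an `unsafe` block.
    let a = unsafe { *q };

    // 2. Calling a function marked `unsafe` requires one as well.
    let b = unsafe { read_through(p) };

    (a, b)
}

fn main() {
    println!("{:?}", demo());
}
```

Everything outside the two `unsafe` blocks is checked as ordinary safe code, which is the point of keeping the list so short.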

[+] scott_s|12 years ago|reply

You'll probably be interested in the Singularity project at Microsoft Research: http://research.microsoft.com/en-us/projects/singularity/

Much of the kernel is implemented in managed code.

[+] seanmcdirmid|12 years ago|reply

Sun eventually did JavaOS, though I'm not sure how much of its code was Java.

IBM did Jikes RVM (old name: Jalapeno) which was mostly, if not purely Java. They handled the bootstrapping problem with some clever ahead-of-time compilation + meta-programming. Even their GC is implemented in Java.

[+] minimax|12 years ago|reply

I think we can be a little bit more charitable towards C++. Modern compilers will let you know if you try to do something as obviously incorrect as returning a pointer to a stack variable.

    $ cat > foo.cpp <<EOF
    > int *dangling(void)
    > {
    >     int i = 1234;
    >     return &i;
    > }
    > EOF
    
    $ clang++ -Werror -c foo.cpp
    foo.cpp:4:13: error: address of stack memory associated with local variable 'i'
          returned [-Werror,-Wreturn-stack-address]
        return &i;
                ^
    1 error generated.

[+] steveklabnik|12 years ago|reply

Thank you! Maybe I should explicitly show a new/free example instead, or does that end up having a similar warning?

I haven't written serious C++ in years, so I have some blind spots. Others on the Rust team have done quite a bit, so they tend to pick up my slack in exactly this manner.

[+] tptacek|12 years ago|reply

This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.

The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it. That bug does pop up every once in a while, but almost always in the context of a function that returns a reference to one of many different possible variables depending on some condition in the function.

Does Rust really call its threads "green threads"? Green threads have a weird reputation.

Copy like "this allows you to, well, read and write the data" could be tightened up; it's an attempt at conversational style that doesn't add much. "That doesn't seem too hard, right?" is another example of the same thing.

How much of Rust concurrency is this covering? How much of its memory model? Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?

[+] pcwalton|12 years ago|reply

> This isn't so much an introduction to Rust as it is an introduction to Rust's concurrency model.

Ownership is really central to Rust. It's central to both memory management and concurrency: to work with Rust you need to understand it.

> The example of returning a reference to an automatic variable isn't super compelling, since every competent C/C++ programmer knows not to do it.

That's just a simple example. The same logic also prevents iterator invalidation and use-after-free, which are things that do occur in the real world and lead to security vulnerabilities.
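As a rough illustration of the iterator invalidation case, written in present-day Rust: the commented-out loop is the kind of code the borrow checker refuses, and the function body is one safe alternative. `double_evens` is a made-up name for this sketch.

```rust
/// Appends the double of every even element to the vector.
fn double_evens(values: &mut Vec<i32>) {
    // Rejected by the compiler: `values` is borrowed by the iterator,
    // so it cannot also be borrowed mutably for `push`.
    //
    // for v in values.iter() {
    //     if v % 2 == 0 {
    //         values.push(v * 2);
    //     }
    // }

    // Accepted: finish iterating first, then mutate.
    let extra: Vec<i32> = values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * 2)
        .collect();
    values.extend(extra);
}

fn main() {
    let mut values = vec![1, 2, 3, 4];
    double_evens(&mut values);
    println!("{:?}", values);
}
```

In C++ the equivalent of the commented-out loop compiles and may reallocate the vector while it is being iterated.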

> Does Rust really call its threads "green threads"? Green threads have a weird reputation.

They're M:N threads, multiplexed among multiple hardware threads, like Go or Erlang.

> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?

Sort of, but I think that's an uncharitable way to say it. The trick is that the language allows these unsafely-implemented primitives to be safely used from safe code. As long as the unsafe code is correct, the safe code is guaranteed to be safe. From a trust point of view this is really no different from building the features into the compiler: either you trust the compiler (which, as it's a compiler, is unsafe) or you trust the unsafe portions of the standard library. But it's way easier, and more flexible, to hack on libraries than to hack things into the compiler—as you get to write code, not code to generate code.

Furthermore, the safe part of the language is so powerful that you rarely ever need "unsafe": you can do practically everything you might need to do, including shared memory with locks, in the safe language, without GC. Only if you really need to squeeze out the last amount of performance, or if you need to interface with C libraries, do you need "unsafe".
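A small sketch of "shared memory with locks in the safe language", assuming present-day `std::sync::Arc` and `std::sync::Mutex` rather than the `Arc`/`RWArc` types of the Rust version discussed in this thread. The function name `count_in_parallel` is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter
/// `per_worker` times, then returns the final count.
fn count_in_parallel(workers: usize, per_worker: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // The integer is only reachable through the lock guard.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```

There is no `unsafe` in the calling code; whatever unsafe code exists lives inside the library types.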

[+] steveklabnik|12 years ago|reply

Salespeople qualify leads by determining if you're ready to buy or not. If you're not, they stop wasting time on you. The general idea for a quick introduction is to qualify your lead. So this isn't an "introduction to Rust's syntax"; it's "an introduction to why you should (or should not) care about Rust."

> since every competent C/C++ programmer knows not to do it.

Everyone knows, yet programs still segfault. The point is that the language helps you be competent. Static analysis is very useful.

> Does Rust really call its threads "green threads"? Green threads have a weird reputation.

I agree. Rust has N:M mapped threads by default, but recently added 1:1 as well.

> it's an attempt at conversational style that doesn't add much.

I happen to write like I talk, it gets good and bad reviews. A more neutral style would be more appropriate if/when this gets pulled into Rust itself, thanks, that's a great point.

> Does the whole concept of Rust concurrency and memory protection boil down to "the language provides an 'unsafe', and then people write libraries to do things with it"?

I think this is an unfair characterization, but I gave it to you, so that's a criticism of me, not you. I tried to point out that unsafe exists for exceptional cases only: it's not something that you need unless you're doing something dangerous for a specific reason. I personally have only ever written unsafe when wrapping something with FFI.

Introductions are hard because you never know how much depth to go into; maybe I should go into these bits a little more in depth.

Thanks for the great feedback. :)

[+] saosebastiao|12 years ago|reply

tptacek, I've been meaning to ask this question to someone with some extensive security experience: Is there a compelling story for security researchers and engineers for low-level languages with an emphasis on memory safety (like Rust or Cyclone)? From my admittedly limited perspective, it seems like it could eliminate a lot of mistakes that lead to insecure software, but then again, I don't know how common memory-flaw exploits are.

[+] lmm|12 years ago|reply

Green threads are the standard term for language-level threads that are not OS-level threads.

[+] pjmlp|12 years ago|reply

> since every competent C/C++ programmer knows not to do it.

They are hard to come by, in this time and age, of cutting down costs everywhere while offshoring components.

[+] lucian1900|12 years ago|reply

Rust's tasks can either use native threads or green threads multiplexed on top of a pool of native threads, without change in API.

[+] pcwalton|12 years ago|reply

I like this tutorial because it dives straight into the most unique/unfamiliar parts of Rust (ownership/references) and gets them out of the way. It's a "learn the hard way"-style tutorial, and I think that's the best approach. Once you learn how ownership and borrowing work, along with ARCs and concurrency, everything else is really simple and just naturally falls out.

[+] brson|12 years ago|reply

I like this a lot, and think it's the best intro to Rust yet. The thing that concerns me a bit is that it presents the special cases in concurrency without impressing some of the most important points. Primarily, the channel example presents the send as copying, which in this case it is, but one of the main advantages of Rust's channels and owned types is that message passing of heap-allocated types does not need to copy. It probably doesn't stress hard enough that Rust tasks do not share memory before saying, 'oh, but really you can share memory if you need to', though I see that the Arc and RWArc examples are good ways to introduce the concept of using unsafe code to provide safe abstractions.
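The "no copy" point can be checked with a short sketch, assuming present-day `std::sync::mpsc` channels rather than the channel API of the time. It compares the address of a vector's heap buffer before sending and after receiving; `buffer_is_moved_not_copied` is a made-up name.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a heap-allocated vector to another thread and reports whether the
/// receiver sees the same heap buffer (ownership moved, elements not copied).
fn buffer_is_moved_not_copied() -> bool {
    let data: Vec<u64> = (0..1024).collect();
    let addr_before = data.as_ptr() as usize;

    let (tx, rx) = mpsc::channel();
    let receiver = thread::spawn(move || {
        let received: Vec<u64> = rx.recv().unwrap();
        (received.as_ptr() as usize, received.len())
    });

    // `data` is moved into the channel; using it after this line
    // would be a compile error.
    tx.send(data).unwrap();

    let (addr_after, len) = receiver.join().unwrap();
    addr_before == addr_after && len == 1024
}

fn main() {
    println!("same buffer: {}", buffer_is_moved_not_copied());
}
```

Only the small vector header (pointer, length, capacity) travels through the channel; the 1024 elements stay where they were allocated.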

[+] noelwelsh|12 years ago|reply

The focus on C++ as point of comparison is understandable given Mozilla's background, but in Internet land most systems software runs on the JVM, and is written in Java, or increasingly, Scala (see LinkedIn and Twitter, for example).

The issues of memory layout and the like come up here, and unlike Rust the JVM doesn't give much control of this aspect. See Martin Thompson's blog for an example of someone very concerned with issues of performance on the JVM (http://mechanical-sympathy.blogspot.co.uk/). I believe Rust could see a lot of adoption within this community as a "better" Scala -- a modern high-level language that allows dropping down to bit-twiddling when performance is an issue. It needs higher kinded types before it will work for me, but I hear that is on the road-map.

BTW, I've read a few Rust tutorials and they all fail for me in the same way: too much waffle and not enough getting down to the details. I understand the difference between stack allocation, reference counting, and GC, I get why shared mutable state is a bad idea, etc. What I want is a short document laying out the knobs Rust provides (mutable vs immutable, ownership, allocation) and how I can twiddle said knobs.

[+] Aloisius|12 years ago|reply

> but in Internet land most systems software runs on the JVM, and is written in Java

[[Citation needed]].

The Internet land I've lived in mostly lives in C with a smattering of non-JVM scripting languages (Python, Ruby, PHP, etc) on top.

[+] steveklabnik|12 years ago|reply

I think that's totally true. It's not just so much because of Mozilla, but also because Rust is probably closer to C++ than Java...

The official tutorial contains much of that information.

[+] samth|12 years ago|reply

I think the emphasis on "unsafe" isn't helpful. As far as I can tell, the only thing that "unsafe" is enabling is that Arc and RWArc are written in Rust rather than in C in the runtime (the way they'd be in Go, or Erlang, or Haskell). The things that make Rust able to do what it does are ownership and tasks and lifetimes and affine types -- all the things the post covers before talking about "unsafe".

Also, it gives the impression that there's something fundamentally unsafe about all of this, whereas the whole point is that these abstractions are _safe_ to use.

[+] danso|12 years ago|reply

A little OT...but what's with Svbtle's apparent default styling of links? There's no indication that any particular word or sentence contains a link, which basically makes those links invisible to readers. Or do lots of people read web articles by randomly hovering the mouse around the text?

But relevant to the OP...I generally try to save useful tutorials like this on my pinboard, which often doesn't pick up the meta-description text. So I double-click to copy the first paragraph and paste it into pinboard...except in the OP, I kept on clicking on text that was hiding links underneath.

It's a strange UI decision, and one that seems to discourage the use of outbound links...if you can't see the links, then what is the purpose of them? For spiders?

[+] Zecc|12 years ago|reply

> There's no indication that any particular word or sentence contains a link

There is a subtle grey underline, which I'm sure can be nearly invisible depending on your screen.

[+] steveklabnik|12 years ago|reply

Yeah, I think that they have been made lighter recently. Bummer :/

[+] acqq|12 years ago|reply

"Rust does not have the concept of null."

How can I have the pointer to something that is maybe allocated or maybe present? Do I have to have additional booleans for such uses? Isn't that a waste?

How can I effectively build complex data structures like graphs, tries etc then?

I'd like to see that covered too.

[+] kibwen|12 years ago|reply

Rust uses the `Option` type for that (name taken from Scala and ML, it's called `Maybe` in Haskell).

http://static.rust-lang.org/doc/master/std/option/index.html

What's neat is that if you stuff a pointer inside an `Option`, then not only is it guaranteed to be memory-safe but it also compiles down to a plain old nullable pointer at runtime, so there's no extra overhead while still retaining safety.
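A quick way to see this in present-day Rust is to compare sizes. This is only a sketch; `describe` and `option_is_pointer_sized` are hypothetical names, and `Box<i32>` stands in for the owned pointer of the time.

```rust
use std::mem::size_of;

/// The compiler forces both cases to be handled before the value is used.
fn describe(slot: Option<Box<i32>>) -> String {
    match slot {
        Some(value) => format!("points at {}", value),
        None => String::from("nothing here"),
    }
}

/// `None` is represented as the null pointer, so wrapping a pointer type
/// in `Option` does not make it any larger.
fn option_is_pointer_sized() -> bool {
    size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>()
        && size_of::<Option<&i32>>() == size_of::<*const i32>()
}

fn main() {
    println!("{}", describe(Some(Box::new(5))));
    println!("{}", describe(None));
    println!("pointer sized: {}", option_is_pointer_sized());
}
```

So there is no extra boolean: the "is it there?" information is carried by the pointer value itself.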

[+] Skinney|12 years ago|reply

You use the Option type. Option can either be None or Some(T). I think Rust optimizes this to (non)null, so there is very little, if any, overhead compared to the equivalent C++ code.
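For the question about tries and similar structures, here is one possible sketch in present-day Rust where each missing child is simply `None`. `TrieNode` and its methods are made up for illustration, and the code assumes words contain only lowercase ASCII letters.

```rust
/// One trie node; a missing child is `None` rather than a null pointer.
#[derive(Default)]
struct TrieNode {
    children: [Option<Box<TrieNode>>; 26],
    is_word: bool,
}

impl TrieNode {
    /// Inserts `word`, which is assumed to be lowercase ASCII ('a'..='z').
    fn insert(&mut self, word: &str) {
        let mut node = self;
        for byte in word.bytes() {
            let index = (byte - b'a') as usize;
            // Allocate the child only if it is not there yet.
            let child = node.children[index]
                .get_or_insert_with(|| Box::new(TrieNode::default()));
            node = &mut **child;
        }
        node.is_word = true;
    }

    /// Returns true if exactly `word` was inserted earlier.
    fn contains(&self, word: &str) -> bool {
        let mut node = self;
        for byte in word.bytes() {
            let index = (byte - b'a') as usize;
            match node.children[index].as_deref() {
                Some(child) => node = child,
                None => return false,
            }
        }
        node.is_word
    }
}

fn main() {
    let mut trie = TrieNode::default();
    trie.insert("rust");
    println!("{} {}", trie.contains("rust"), trie.contains("rus"));
}
```

Graphs with cycles or shared nodes need a different ownership story (indices into a vector, or reference counting), so this only covers the tree-shaped case.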

[+] maxerickson|12 years ago|reply

I personally dislike the style of tutorial that has lots of 'we' and 'let's' in it.

I suppose part of that comes from the tendency for such tutorials to provide revelations instead of motivators. For example, in this tutorial there is 'look at this C++ code because I said to' and then two sentences later it explains that the C++ code ends up in a garbage value.

But this is probably very much a point of style and I'm sure lots of people think my view is stupid.

[+] patrickaljord|12 years ago|reply

Thanks for the tutorial! Rust seems a bit too complex to me. Like a C++ on steroid that wants to do and be everything. Nothing wrong with that but not my cup of tea. I'd rather stick to C if I need tight memory management, it is way simpler and straight forward. And if I need concurrency, I'll stick to Golang (or Erlang). Really, it's such a pleasure to read some golang after reading this 30 minutes of Rust. Anyway, just my opinion.

[+] pcwalton|12 years ago|reply

> Rust seems a bit too complex to me. Like a C++ on steroid that wants to do and be everything.

No, that's explicitly not a goal. The goal is to enable zero-cost abstractions and memory safety. Everything here is in service of that goal.

> I'd rather stick to C if I need tight memory management, it is way simpler and straight forward.

The problem is that the "simple and straightforward" model of C leads to a lot of very not-simple-and-straightforward time in front of Valgrind to get the program to work, or worse, to fix security vulnerabilities.

[+] duaneb|12 years ago|reply

I would argue that Rust is both simpler and more consistent than either C or C++. No preprocessor, no pre/post increment, a real type system (instead of templates): there's just not the mountain of undefined parts of the language that tends to be a large problem. Scala, in spite of being on a VM, is far closer to C++ IMHO in terms of attitude.

I actually find golang harder to read without parametric types (sans built-in slices and maps).

[+] masklinn|12 years ago|reply

> Like a C++ on steroid

Rust most definitely isn't "like a C++ on steroid" from a complexity standpoint. It tries (and, I think, mostly succeeds) to be a significantly simpler and more coherent language.

It does make things which are implicit in C or C++ (e.g. ownership) explicit. That's a good thing, you need to know your ownership in C or C++, the language just doesn't help you much.

[+] steveklabnik|12 years ago|reply

> I'd rather stick to C if I need tight memory management, it is way simpler and straight forward.

Well, if you don't want your C to segfault, you have to understand these concepts anyway, they're just implicit to the language. C's memory management may be simple, but it's surely not easy.

[+] pjmlp|12 years ago|reply

> C if I need tight memory management, it is way simpler and straight forward

If you work alone, yes. Good luck on a 50+ developer team with high attrition rates.

[+] azth|12 years ago|reply

> Really, it's such a pleasure to read some golang after reading this 30 minutes of Rust.

Incidentally, I find reading Go to be much more verbose and even heavier cognitively compared to reading Rust code.

[+] rcthompson|12 years ago|reply

Thanks for this straightforward and accessible intro to some of Rust's unique features!

[+] steveklabnik|12 years ago|reply

You're very welcome.

[+] steveklabnik|12 years ago|reply

I like the overall structure, but I'm not sure about throwing so much syntax without explaining it in detail.

[+] scott_s|12 years ago|reply

I think the amount of syntax you've shown is fine. Since there's C++ and Rust code that both do the same thing, it allows readers familiar with C++ to infer what the Rust syntax means. This is a much quicker process than reading "Here's how you write an if clause ..." prose.

When it comes to showing new languages to experienced programmers, I prefer showing code, and explaining what it accomplishes. Experienced programmers will start building up a Bayesian model of the syntax without being explicitly told.

I think of this as the "Dive Into Python" approach: http://www.diveintopython.net/

[+] niix|12 years ago|reply

Thanks for this. I've been thinking about getting into Rust recently and this motivates me to do so now.

[+] Jemaclus|12 years ago|reply

The last time I touched C code was my sophomore year in college, so maybe 12 years ago? As a result, the last time I had to deal with pointers and such was back then, as well.

I'm primarily a web-dev. Ruby, PHP, and Javascript are the languages I'm most familiar with at the moment.

Are there any Rust for Dummies-style tutorials floating around? As simple as this introduction is, it was still over my head...

[+] steveklabnik|12 years ago|reply

I have you covered in that case as well: http://www.rustforrubyists.com/

I want to provide a version of the 30 minute intro that's not strictly for systems people as well, but you have to start somewhere.

[+] bsaul|12 years ago|reply

Is Rust borrowing any kind of code from what is used for Objective-C's ARC technology for detecting the lifetime of a variable and automatically freeing the resource? Is it a commonly known algorithm?

[+] steveklabnik|12 years ago|reply

Everything is written in Rust, so we're not borrowing the code.

Reference counting is very common: http://en.wikipedia.org/wiki/Reference_counting

[+] 0x001E84EE|12 years ago|reply

I'm a big fan of that kudos button! Very nice website and an interesting introduction to Rust.

[+] nfoz|12 years ago|reply

I find it annoying to the point of being offensive. Actions being activated on mouse-hover is terrible; I did not intend to give "kudos" but was curious what might be linked under it, and suddenly an action is recorded. Now I can't consider any webpage safe and have to watch where my mouse goes; that's wrong.

[+] steveklabnik|12 years ago|reply

Thanks! I don't make any of that, it's just Svbtle.

[+] 45g|12 years ago|reply

> You can see how this makes it impossible to mutate the state without remembering to acquire the lock.

Not quite true. Looking at the type signature of e.g. RWArc::write I see this:

    fn write<U>(&self, blk: |x: &mut T| -> U) -> U

which means I could probably do:

    let mut n = local_arc.write(|nums| {
         nums[num] += 1;
         return ~(*nums);
     });
    n[2] = 42;

[+] eridius|12 years ago|reply

You're not letting anything escape. You're copying `nums` and then mutating your copy. And this only works because `nums` can be implicitly copied like this. If you changed it from `[int, ..3]` to `~[int]` you'd get a compiler error (as `~[T]` cannot be implicitly copied).
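The same distinction can be reproduced in present-day Rust. This is a sketch using `std::sync::RwLock` in place of `RWArc` and `[i32; 3]` in place of `[int, ..3]` (neither of the old forms exists any more); `copy_out_then_mutate` is a made-up name.

```rust
use std::sync::{Arc, RwLock};

/// Returns (contents behind the lock, local copy) after mutating the copy.
fn copy_out_then_mutate() -> ([i32; 3], [i32; 3]) {
    let shared = Arc::new(RwLock::new([1, 2, 3]));

    let mut local = {
        let mut nums = shared.write().unwrap();
        nums[0] += 1;
        // `[i32; 3]` is `Copy`, so this copies the array out of the lock;
        // no reference to the protected data escapes.
        let copy = *nums;
        copy
    };

    // This mutates only the local copy.
    local[2] = 42;

    let inside = *shared.read().unwrap();
    (inside, local)
}

fn main() {
    let (inside, local) = copy_out_then_mutate();
    println!("behind the lock: {:?}, local copy: {:?}", inside, local);
}
```

With a `Vec<i32>` behind the lock, `*nums` would be a compile error (it cannot be moved out of the guard), and an explicit `nums.clone()` would be needed to get an independent copy.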

156 comments