
Why did we choose Rust to develop TiKV?

159 points| cyber1 | 8 years ago |pingcap.github.io | reply

200 comments

[+] eikenberry|8 years ago|reply
TL;DR: the author likes Rust and wanted to use it. The article reads like some devs rationalizing to management what they want to do. These types of things are fine, but dev to dev it is obvious that they just want to use this cool tech. Good for them.
[+] dullgiulio|8 years ago|reply
Even more, Go is excluded because (in the author's opinion) goleveldb is not as mature as RocksDB. Thus they would have had to use cgo, which is far from optimal, slower, etc.

The title should have been: "C++11 or Rust? We chose Rust". In a greenfield project like this one seems to be, it's a choice I would approve.

A personal note regarding future comments to this thread: I have had enough of negative advertising against Go in every language related thread. People who use Go are not stupid: they know the language limits and tradeoffs and are okay with them. Deal with it.

[+] alexnewman|8 years ago|reply
Nonsense. I love Go and Rust, but I would never develop a database in a garbage-collected language again. By the way, I have done it a few times. Go is better than Java, but it still:

Can't call into C as fast
Has micro GC pauses which affect performance
Doesn't give me explicit control of the hardware

Go is great for middleware, but the authors got it right.

[+] lobster_johnson|8 years ago|reply
That's not a fair summary at all. For example, they explain why C++ was off the table. They also use Go for TiDB (which is the high-level distributed query engine layered on top of TiKV), but for TiKV they needed fast and easy access to C, which Go, in its current state, can never provide.
[+] pornel|8 years ago|reply
I can't wait until Rust is no longer perceived as a hipster, trendy choice.

It's a very solid language in the no-GC niche and shouldn't need a blog post from every project that uses it.

Does it have to be 30 years old before it's not "new" and weird?

[+] greydius|8 years ago|reply
Haskell is almost 30 years old and still considered 'new' and 'weird'.
[+] wyldfire|8 years ago|reply
Here at HN Rust seems to come up frequently and it's indeed why I decided to take a look.

For my many colleagues who do not read HN, Rust is something that they might have heard of but haven't given any real attention to. For that matter, most haven't looked at golang either, and they probably couldn't tell you what the difference between the two is.

IMO it will take a while before it grabs the attention of the mainstream software community.

[+] luckydude|8 years ago|reply
I'm (very) old school, C is still my language of choice (though it could use some help). I like the syntax, it's pretty simple.

I tried playing with Rust and found the syntax to be off-putting. I really wonder why each new language feels it is important to come up with a different syntax to say the same stuff. Go did a lot better than Rust in this respect, at least in my opinion.

It may be that appealing to C programmers isn't a thing any more, but if it is, then Rust could have done better. And, yes, I get that the syntax isn't the selling point of Rust, trust me, I get it. I just don't get why make people wade through some weird syntax when you don't need to.

[+] mac01021|8 years ago|reply
"New and weird" is always relative to the industry or domain of application. In each domain, whether it be avionics or web development, managers are not going to want to use it until it has a proven record of success in that domain.
[+] zerr|8 years ago|reply
On the other hand, I can't wait until Swift becomes a general, non-Apple language, available on most platforms (including the most popular one) "with batteries" - that will be the end of Go and Rust, I think :)
[+] cyber1|8 years ago|reply
I think TiKV is a good example of a team choosing Rust over modern C++. Rust gives the same performance and is as close to the metal as C when necessary. It catches all possible memory management mistakes at compile time, as long as the code is not "unsafe", and this is really great!

With Rust I can hack without fear! I don't have to remember the tons of C++ rules described in http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines, the "C++ programming language" book, https://herbsutter.com/gotw/, etc., and I can focus on algorithms and implementation.

C++ combines a lot of different paradigms; maybe it would be more correct to say "C++ paradigm hell"! No one understands which C++ subset is the right one. Even Bjarne Stroustrup said, "Within C++, there is a much smaller and cleaner language struggling to get out." So where is this "smaller and cleaner language"? What is the idiomatic style in C++? Is it the Google guidelines, the C++ Core Guidelines, or some other enormous guide?

I have looked inside a lot of C++ projects, and each of them has a different style and uses different paradigms; sometimes they look like different languages!

Rust, Go, C, Java code bases look the same, they have their own idiomatic style, their own way.

I think Rust is the next step in the evolution of system programming language.

[+] blub|8 years ago|reply
C and Java code bases certainly don't look the same. In fact, claiming that about C is simply insulting to the intellect of anyone reading your overenthusiastic message.

I also sincerely doubt that Rust code bases will look the same in ten years. It supports functional and OO paradigms and it's attracting very different classes of programmers. Recently someone wrote a post about difficulties with some OO concepts in Rust, and a top reply said that they never encountered such issues because they program in a functional way.

Go is an exception here, but as soon as they extend the language in a significant way (e.g. templates) differences will start to appear.

C++ doesn't have an idiomatic style because it's used in very different ways by different people. It's impossible to have a fixed style and address the mass market.

[+] jcelerier|8 years ago|reply
> After years of usage of GC, it is very hard to go back time for manually managing the memory.

... are you guys sure of your "experienced C++ developers"? There's as much memory management in modern C++ as in a GC'ed language: none. Create your objects with `make_unique` or `make_shared` according to what makes sense (or just enforce `make_shared` if you're really dubious of the coding abilities of your team, but at this point you'll have problems whatever you do).

[+] fsloth|8 years ago|reply
"...make_shared"

Sorry, I'm going to be a bit harsh now.

There is a host of distinctive differences between garbage collection and reference counting. Yes, both are memory handling strategies. That's where the similarities end.

"Just slap it in a shared pointer" is never good advice without knowing what 'it' is or what kind of system it exists in.

[+] blub|8 years ago|reply
Yeah, that read like some Java advertising from the 90s.

I wish more of these posts were honest and said "we picked X cause we think it's cool and we're gonna get paid to learn it". But they have to make up some convoluted explanation that sounds rational and acceptable instead.

[+] p0nce|8 years ago|reply
I think you are being needlessly inflammatory. The list of 3 items against C++ makes it pretty clear they do have experienced C++ developers.
[+] tentaTherapist|8 years ago|reply
But then your program uses slow, cache-unfriendly and much-reviled reference counting. I'd even prefer Ocaml and its fully-featured GC if C++ is only fast in artificial benchmarks and not in idiomatic code, which apparently must use RC.

Note: I haven't used C++ at all on any project larger than a single file.

[+] pjc50|8 years ago|reply
And do all the C++ libraries take only smart pointers as arguments and return only smart pointers as return values?
[+] latch|8 years ago|reply
First sentence is why I dread reading comments. Why you gotta be like that?
[+] stepik777|8 years ago|reply
And how would smart pointers help you if you need to return a pointer to a member from a function? Does C++ protect you from a moved-from unique_ptr? Or from iterator invalidation? Or maybe you can safely use a non-atomic shared pointer if you don't need to send it across threads?
[+] jhasse|8 years ago|reply
You'll have to manually break cycles using weak_ptr though.
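
For comparison, the same cycle-breaking idea exists in Rust as `Rc` plus `Weak`. This is only a minimal sketch, assuming a hypothetical `Node` type (every name below is made up for illustration): the parent holds its children strongly and each child holds its parent weakly, so dropping the parent frees it even though the child still points back.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type: a parent owns its children strongly,
// while each child points back to its parent weakly.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Create a node with no parent and no children.
fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Link `child` under `parent`. Only the parent -> child edge is strong,
// so the two nodes never keep each other alive.
fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

// Name of the parent, or None if the parent has already been dropped.
fn parent_name(child: &Node) -> Option<String> {
    child.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let root = new_node("root");
    let leaf = new_node("leaf");
    adopt(&root, &leaf);
    println!("parent while root is alive: {:?}", parent_name(&leaf));

    // The weak back-reference does not count as ownership,
    // so dropping `root` really frees the root node.
    drop(root);
    println!("parent after dropping root: {:?}", parent_name(&leaf));
}
```

If the back-reference were an `Rc` instead of a `Weak`, the two nodes would keep each other alive and leak, which is the same trap as a `shared_ptr` cycle.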
[+] sriram_malhar|8 years ago|reply
Considering that their team likes Go, it seems strange to me that they would consider Rust over Go for the storage layer. A storage layer should be IO-bound, and should hardly trouble the CPU; the choice of language really should not be a determining factor. The big wins in that space are architectural, not language specific.
[+] jerf|8 years ago|reply
"A storage layer should be IO-bound, and should hardly trouble the CPU; the choice of language really should not be a determining factor."

This used to be true, but it's out-of-date now. You can now get a network pipe in to a system that a rather beefy multi-core CPU using a user-space TCP stack can barely keep up with, let alone do any real work, and if you can scrape up the PCI express lanes, putting a few of the latest SSDs into a system can start getting you theoretical maximum bandwidth numbers that just a few years ago looked more like what you'd expect for a RAM bandwidth number.

I'm of the opinion that it was already not as true as commonly supposed 5 years ago (in my experience using slow languages on putatively IO-bound tasks was still noticeably slower than using fast languages), but the latest in network pipes and SSDs have really ended it. It's true that on most desktop systems you've still got more CPU than you know what to do with, but as you step into the serious database space that's not true anymore. For a serious database I wouldn't be perturbed if someone looked at Go's performance and just plain discarded it on the spot, even before considering GC issues. It's very fast for a scripting language; it's fairly slow for a compiled language. "The compiler spends hardly any time on optimization" is not what you want to read about your database implementation language.

(I've got one of the nvme SSDs in my laptop, and it is interesting to see just how many CPU bottlenecks there still are in systems nowadays. In some sense, I really shouldn't ever see a "loading" screen because you "ought" to be able to read things off of my SSD fast enough to completely fill my RAM in 5-10 seconds; "merely" loading Firefox ought to be somewhere in the 50ms range. In practice I still see loading screens and load waits, because the CPUs are still doing things. Lots of things that used to be dominated by and hidden in the load time, but aren't anymore.)

[+] AYBABTME|8 years ago|reply
Developing a storage layer as elaborate as RocksDB from scratch is quite an endeavor, and wanting to just use RocksDB instead of making your own is a smart decision. From there, Go is sort of easy to throw out of the picture: using cgo kills performance and safety. I say this as a person who uses Go as his workhorse, has used it for many years, and has a favorable opinion of it.
[+] sanxiyn|8 years ago|reply
Rust would use less memory than Go. (Dropbox also likes Go and used Rust over Go for the storage layer, and when asked, memory usage was their primary reason.)
[+] Shorel|8 years ago|reply
And lack of experience with D.

Just kidding, great job!

[+] bpicolo|8 years ago|reply
Anybody have experience with TiDB? How does it stack up against CockroachDB? It seems hard to find a comparison. We probably hear less about it mostly because it's developed in China? Looks like it's an impressive piece of tech, though.
[+] lobster_johnson|8 years ago|reply
A big difference is that TiDB is not ready for production yet.

Having followed the project for a while, another distinction is that TiDB is operationally more complex. You need to build and deploy TiDB (high-level query engine), TiKV (key/value store) and PD ("placement driver", which coordinates sharding and data migration) separately. TiDB is stateless and can be scaled freely, but TiKV and PD are both stateful and implement their own distributed consensus systems. PD actually embeds Etcd, whereas TiKV has its own Raft implementation in Rust. Compare this to Cockroach, which has a single monolithic daemon that you deploy everywhere, which contains the distributed query engine, the key/value store, the consensus/cluster coordinator, etc. (There may be benefits or drawbacks to the difference in design; I don't know the internals of either project well enough to debate that.)

For an internal project I'm working on, running TiKV standalone actually looks very interesting, but it's not very well documented yet.

[+] baldfat|8 years ago|reply
> its innovation in the type system and syntax gives it a unique edge in developing Domain-Specific Libraries (DSL).

I think Racket still has the edge for producing DSL?

[+] noncoml|8 years ago|reply
The one reason I would give is Algebraic Data Types.
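
To make that concrete, here is a rough sketch of what algebraic data types look like in Rust. The `Command` and `Reply` enums and the `apply` function are hypothetical, invented for this example against a toy in-memory map; they are not TiKV's API. The point is that each variant carries its own data and `match` has to handle every variant.

```rust
use std::collections::HashMap;

// Hypothetical command type for a toy key-value store (not TiKV's API).
// Each variant carries exactly the fields that make sense for it.
enum Command {
    Get { key: String },
    Put { key: String, value: String },
    Delete { key: String },
}

// The outcome is a sum type too, so "not found" is not a magic value.
#[derive(Debug, PartialEq)]
enum Reply {
    Value(String),
    NotFound,
    Done,
}

// Apply one command to the store. The compiler rejects this `match`
// if a Command variant is left unhandled.
fn apply(store: &mut HashMap<String, String>, cmd: Command) -> Reply {
    match cmd {
        Command::Get { key } => match store.get(&key) {
            Some(v) => Reply::Value(v.clone()),
            None => Reply::NotFound,
        },
        Command::Put { key, value } => {
            store.insert(key, value);
            Reply::Done
        }
        Command::Delete { key } => match store.remove(&key) {
            Some(_) => Reply::Done,
            None => Reply::NotFound,
        },
    }
}

fn main() {
    let mut store = HashMap::new();
    let put = Command::Put { key: "a".to_string(), value: "1".to_string() };
    println!("{:?}", apply(&mut store, put));
    println!("{:?}", apply(&mut store, Command::Get { key: "a".to_string() }));
    println!("{:?}", apply(&mut store, Command::Delete { key: "missing".to_string() }));
}
```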
[+] baq|8 years ago|reply
it'd be enough for me to say 'rust is kinda like c++ in terms of performance and complexity but without the 0 pointer'
[+] jhasse|8 years ago|reply
Rust also has a "0 pointer": https://doc.rust-lang.org/std/option/ The equivalent of a null pointer exception in Rust is an unwrap panic.

(IIRC if you use Option<Box<...>> None will even be represented by a null pointer internally)
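
A small sketch of both points, with hypothetical helper names (`find_even`, `describe`): the absence of a value is an ordinary `Option` that has to be matched or unwrapped explicitly, and `Option<Box<T>>` takes no more space than `Box<T>`, which the standard library documents as guaranteed.

```rust
use std::mem::size_of;

// Hypothetical lookup that may find nothing: the "no result" case is
// part of the return type instead of being a null pointer.
fn find_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|v| **v % 2 == 0)
}

// The caller has to deal with both cases before touching the value.
fn describe(values: &[i32]) -> String {
    match find_even(values) {
        Some(v) => format!("found {}", v),
        None => "nothing".to_string(),
    }
}

fn main() {
    println!("{}", describe(&[1, 3, 4]));
    println!("{}", describe(&[1, 3]));

    // Option<Box<T>> is the same size as Box<T>: None reuses the null
    // pointer value, so the Option adds no extra tag.
    println!(
        "same size: {}",
        size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>()
    );

    // Calling find_even(&[1, 3]).unwrap() here would panic at run time,
    // which is the rough equivalent of a null pointer exception.
}
```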

[+] gypsyharlot|8 years ago|reply
I'd say: "It's like C++, only you get the runtime errors at compile-time".
[+] amelius|8 years ago|reply
If you had lots of circularly referencing data structures, would it make more sense to choose a garbage-collected language like Go?
[+] int_19h|8 years ago|reply
Most circular data structures still have some node that is semantically the owner of the whole thing. Having true circular ownership is much rarer.
[+] StreamBright|8 years ago|reply
It is kind of funny how software engineers can engage in lengthy discussion about tooling. Imagine the same for architects. Instead of looking at the building they would talk about the type of hammer they used while building it.
[+] arjie|8 years ago|reply
With software, the material you construct your creations from influences the means. Architects most certainly do argue about whether they should use cross-laminated timber, reinforced concrete, glulam, or steel. They talk about these things and write long pieces on them. The materials influence the design of the building.

They don't talk about it on blogs on the Internet because that's not where the audience is. But they do talk about this.

[+] AYBABTME|8 years ago|reply
I think "hammer" is pretty diminutive as a parallel for a choice of language. I feel a better parallel would be "architects talking about steel alloys versus composite materials" which is not crazy.
[+] mrkgnao|8 years ago|reply
Is this a hammer question, or one of e.g., concrete vs. wood?
[+] klakier|8 years ago|reply
When I see posts like "Why did we choose something over something other", I'm like "Nobody cares"
[+] sgift|8 years ago|reply
And with that attitude we get the same problems over and over again.
[+] oelmekki|8 years ago|reply
Looks like you care enough for opening the comment page and dropping yours :)
[+] meche123|8 years ago|reply
+1

This post is just an advertisement for their product.