
Using Rust in Mercurial

341 points| oblio | 8 years ago |mercurial-scm.org

147 comments

[+] yeukhon|8 years ago|reply
So my understanding is the hg developers are planning to rewrite a large part of hg in Rust. If so, this is an unfortunate blow to Python, because I often hear Python developers cite hg as one of the largest Python-based applications (regardless of the C extensions), and I do so myself. I'd certainly feel sad if that is the case.

I studied hg quite in depth for a semester as an undergraduate, when I was implementing "bitbucket" myself. To be really honest, the codebase was easy to navigate, and function names were pretty consistent with the actual hg commands/internal spec. While it is probably hard to write true unit tests for the code itself (you'd have to monkeypatch like crazy using mock, which suggests the functions contain a lot of code), overall the codebase quality was pretty good for complex software. I just had to learn the variable name abbreviations, get used to them, and refer back to the Hg paper.

Anyhow, whatever the decision is, I'd learn a bit of Rust to help out :-)

P.S. I am still trying to find an answer to this: if FB uses Hg, then what about their git code?

[+] durin42|8 years ago|reply
> rewrite a large part of hg in Rust

That's not really the right story. The motivating factor is that we'd like to write /less C/, not /less Python/. It's likely in the long-term that parts of the code that are performance sensitive will move out of Python, but we've been doing that for years, and we just keep finding new things to optimize as the scale of Mercurial repositories keeps increasing.

[+] kibwen|8 years ago|reply
> If so, this is an unfortunate blow to Python

As both a Rust and Python user, I don't necessarily agree. This isn't a rewrite to purge Python; it's clear from the post that Mercurial values Python's flexibility, and wants to bend over backwards to ensure as little of that flexibility is lost in whatever transition may occur. And Python's credentials would be secure with or without Mercurial being written in it. Frankly, I think that swapping out critical bits for a low-level language is a great way to scale a dynamic-language codebase without completely sacrificing the usual ergonomics of dynamism.

[+] cat199|8 years ago|reply
> If so, this is an unfortunate blow to Python

I doubt this will have much impact - bazillions of other things use python

[+] etanol|8 years ago|reply
> this is an unfortunate blow to Python

Just as unfortunate as how Python turned its back on Mercurial

[+] nindalf|8 years ago|reply
> what about their git code?

Could you elaborate?

[+] nixpulvis|8 years ago|reply
I'm very happy to hear about any and all blows to Python! Especially ones caused by Rust.
[+] puddums|8 years ago|reply
Initial reaction to headline was it sounded like (a) no more python, and (b) this is a decided future direction.

Instead, it sounds like this is a proof-of-concept for flipping the main 'hg' command from being python + C extensions, to instead being a rust binary with an embedded python interpreter. Part of the rationale appears to be performance, but also smoothing out cross platform experience, especially on Windows.

Pulling out some related snippets:

-----

While Python is still a critical component of Mercurial and will be for the indefinite future, I'd like Mercurial to pivot away from being pictured as a "Python application" and move towards being a "generic/system application." In other words, Python is just an implementation detail.

-----

Desired End State

hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today). Python code seamlessly calls out to functionality implemented in Rust. Fully self-contained Mercurial distributions are available (Python is an implementation detail / Mercurial sufficiently independent from other Python presence on system)

-----

"Standalone Mercurial" is a generic term given to a distribution of Mercurial that is standalone and has minimal dependencies on the host (typically just the C runtime library). Instead, most of Mercurial's dependencies are included in the distribution. This includes a Python interpreter.

-----

This patch should be considered early alpha and RFC quality.

[+] tiles|8 years ago|reply
What are these quotes from? Sounds like it contains some info the original post doesn't.
[+] puddums|8 years ago|reply
... and now looks like the HN headline was improved to be more representative of the content compared to the earlier headline.
[+] bulldoa|8 years ago|reply
What does "embed a Python interpreter" mean? Are they actually writing a Python interpreter in Rust so they can write Python code and compile it to Rust?
[+] krschultz|8 years ago|reply
Leaving aside the actual project at hand, this is a great example of a well thought out project plan. There is a clear rationale, clear end state, a bunch of known problems to tackle, a front loading of risk, and it delivers incremental value along the way.
[+] steveklabnik|8 years ago|reply
I was wondering how serious this was. I don't know a lot about how mercurial is developed, but https://twitter.com/indygreg/status/937527180292014080

https://gregoryszorc.com/work.html

> I am a significant contributor to the Mercurial open source version control system.

> I serve on the Mercurial Steering Committee, which is the governance group for the Mercurial Project. I also have reviewing privileges, allowing me to accept incoming patches for incorporation in the project.

So not sure, but it seems like at least one person on the team is into it?

[+] neandrake|8 years ago|reply
It looks like in their last sprint meeting (end of Sept[0][1]) there was a lot of planning and talk about moving parts of Mercurial to Rust. From the history of sprints[2] it sounds like Facebook first started experimenting with some Mercurial implementations in Rust and may be one of the big contributors spearheading this (though I also see indygreg and durin42 around the phabricator project giving Mercurial advice). I'm a fan of both Rust and Mercurial, so this is exciting news to hear.

[0] https://www.mercurial-scm.org/wiki/4.4sprint#Rust

[1] https://public.etherpad-mozilla.org/p/sprint-hg4.4-NOSPAMREM... (remove everything after-and-including the last hyphen, I left it in since it seems like they don't want a direct link that's easily scraped?)

[2] https://www.mercurial-scm.org/wiki/4.0sprint

[+] masklinn|8 years ago|reply
Augie Fackler seems surprised, but not necessarily negatively, they're a pretty big contributor (top 5~10 by commits).
[+] ngoldbaum|8 years ago|reply
This is a proposal not a plan. Right now it’s 100% vaporware.
[+] kbrosnan|8 years ago|reply
How about reaching out to GPS? You are both employed by Mozilla.
[+] bla2|8 years ago|reply
> The nice things we want to do in native code are complicated to implement in C because cross-platform C is hard. The standard library is inadequate compared to modern languages. While modern versions of C++ are nice, we still support Python 2.7 and thus need to build with MSVC 2008 on Windows. It doesn't have any of the nice features that modern versions of C++ have. Things like introducing a thread pool in our current C code would be hard. But with Rust, that support is in the standard library and "just works." Having Rust's standard library is a pretty compelling advantage over C/C++ for any project, not just Mercurial.

Sounds like the main reason for rust is that Python has a weird dep on a fixed MSVC version.

[+] neandrake|8 years ago|reply
There was also this point, which was phrased as a comparison to Python but I think works equally well for choosing Rust over C:

> In addition to performance concerns, Python is also hindering us because it is a dynamic programming language. Mercurial is a large project by Python standards. Large projects are harder to maintain. Using a statically typed programming language that finds bugs at compile time will enable us to make wide-sweeping changes more fearlessly. This will improve Mercurial's development velocity.

[+] Niten|8 years ago|reply
Which they mention they'd still have to implement workarounds for if they adopt Rust, so I'm not sure I understand that selling point.
[+] kyrra|8 years ago|reply
For those that don't want to deal with the Python startup time, the Mercurial team already has an attempt at fixing this with a tool called CHg[0]. It is a C binary that interacts with the Mercurial CommandServer[1], which is just a long running version of Mercurial CLI that you can interact with over a pipe.

Using Rust as the primary client would simplify this a lot, but is a lot more work than what CHg accomplished.

[0] https://www.mercurial-scm.org/wiki/CHg

[1] https://www.mercurial-scm.org/wiki/CommandServer
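
To make "interact with over a pipe" a bit more concrete, here is a minimal Python sketch of the message framing, assuming the format described on the CommandServer wiki page [1] as I understand it: the server writes frames consisting of a 1-byte channel, a 32-bit big-endian length, and then the data, and a client requests a command by writing `runcommand\n` followed by a length-prefixed, NUL-separated argument list. The helper names `encode_runcommand` and `decode_frame` are made up for illustration and are not part of Mercurial or any client library. The sketch only builds and parses bytes; it does not start a server (which would typically be done with `hg serve --cmdserver pipe`).

```python
import struct

# Assumed frame header: 1-byte channel id + unsigned 32-bit big-endian length.
HEADER = struct.Struct(">cI")

# Assumption: on these channels the server is asking the client for input,
# so the length is the amount requested and no payload follows the header.
INPUT_CHANNELS = (b"I", b"L")


def encode_runcommand(args):
    """Build the bytes a client writes to ask the server to run one command.

    `args` is a list of bytes objects, e.g. [b"log", b"-l", b"1"].
    """
    data = b"\0".join(args)
    return b"runcommand\n" + struct.pack(">I", len(data)) + data


def decode_frame(buf):
    """Split one server frame off the front of `buf`.

    Returns (channel, length, payload, rest), where `rest` is whatever
    follows the frame in `buf`.
    """
    channel, length = HEADER.unpack_from(buf)
    body = buf[HEADER.size:]
    if channel in INPUT_CHANNELS:
        # Input request: nothing to read, the client is expected to reply.
        return channel, length, b"", body
    if len(body) < length:
        raise ValueError("incomplete frame")
    return channel, length, body[:length], body[length:]
```

A real client would loop on `decode_frame` over the server's stdout until it sees the result channel, which is roughly the work CHg does in C to avoid paying Python startup time on every invocation.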

[+] ngrilly|8 years ago|reply
This is already explained in the linked article.
[+] johnny_1010|8 years ago|reply
"Rewrite everything in Rust, exactly the same way but in Rust it solve all problem." - some Rust programmer
[+] the_mitsuhiko|8 years ago|reply
It’s not being rewritten in Rust. They want to use rust instead of C for extension modules.
[+] sctb|8 years ago|reply
We've updated the submission title, which was “Mercurial (hg and C extensions) being rewritten in Rust” to that of the article. It's surprisingly easy to mislead, which is one of the reasons we ask submitters to use the original title and not to editorialize.
[+] jxcl|8 years ago|reply
One of us misread the article. I understood it to say that they want to rewrite the core of the Python code to Rust so that they don't have to wait for the Python interpreter to start every time someone uses the hg command. The Rust binary will be able to call Python code, which is basically a complete reversal of what they have now.
[+] m12k|8 years ago|reply
> Desired End State: hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)

I'd say it seems like they want to also flip the ratio of the core around, so it's Rust for the central parts and Python only as needed (whereas today it's Python for the core and C for certain performance critical things).

[+] dragonwriter|8 years ago|reply
No, they want to flip the relationship of Python and native code; instead of hg being Python with C extensions, they want it to be Rust with an embedded Python interpreter. It's not a total rewrite in Rust, but it's also a more significant change than just replacing C extensions with Rust extensions while maintaining a Python core. Quoting the source material on the “Desired End State”: “hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)”
[+] ahupp|8 years ago|reply
From the page:

"Desired End State: hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)"

[+] oconnor663|8 years ago|reply
Could we get a mod edit on the title here? The article itself is called "Using Rust in Mercurial".
[+] gcb0|8 years ago|reply
really wonder why not Go.

original code was python with lots of C. the same can be done with Go, while keeping much of the same philosophy of python.

This change will probably alienate most of the contributors since rust and python or C are worlds apart.

[+] seba_dos1|8 years ago|reply
I would wonder why Go if Rust is available.

Rust isn't hard. The only somewhat hard thing in Rust is working with the borrow checker, and honestly, if you want to write C seriously, you really need to go through that experience and understand it well enough to feel somewhat comfortable with it.

And even if that would really be a thing, giving up compiler checks in order to allow more low quality code to be contributed doesn't really seem like a good trade off to me.

[+] jhugg|8 years ago|reply
One reason might be that it seems easier to mix languages with Rust vs. Go. The C interop is pretty sane and Go has a few more gotchas in that department.

The rest of the Rust v Go debate may be similar to the 1000 times people have had it before.

[+] oconnor663|8 years ago|reply
My understanding is that Go would be a fine candidate if you wanted to rewrite the entire codebase in it, but not for writing compiled extensions that Python can dynamically link, and they want one language to do both of those things?
[+] rubber_duck|8 years ago|reply
One thing that comes to mind is having to bridge two GC/runtimes in the same binary, where Rust is a drop in replacement for C in that regard (systems programming language)
[+] durin42|8 years ago|reply
Go isn't really a huge win for hg, and it doesn't support being used as a shared library, so we couldn't easily use it as a replacement for C for native extensions. main() has to be in Go, and even then you've got two GCs in the mix which is not great.
[+] chme|8 years ago|reply
There are places where C makes more sense and other places where C++ does; the same is true of Rust and Go.

When it comes to low-level, high-performance, no-overhead stuff with less unexpected behaviour, C and Rust fit better.

Where high implementation speed and fast-changing, experimental, highly modular architectures are more important, C++ and Go are better.

For Mercurial and Git Rust or C makes more sense. Especially if you put experimental and fast changing stuff into a scripting language. IMO.

[+] rkangel|8 years ago|reply
I'm curious, why is Go+Python easier than Rust+Python?
[+] erpellan|8 years ago|reply
Given a situation where (pre-rust) the usual choice would be to use C or C++, it's totally appropriate to choose Rust. That's one of Rust's express design goals.

Go is intended to make it very easy for junior devs to build concurrent network applications. That's its design goal.

It's not a zero-sum game. Choosing rust doesn't weaken go or vice versa.

[+] panic|8 years ago|reply
Forget Rust and Go, why not ATS? You get C compatibility with dependent types!
[+] petre|8 years ago|reply
I wonder why not Dlang. It has better C/C++ interop than Rust; C parts could be used/wrapped before they're rewritten.
[+] Varriount|8 years ago|reply
Or Swift, or Nim.

Really, there are so many other languages that are both performant and more readable than Rust (I consider Rust readability to be on par with C++).