So my understanding is the hg developers are planning to rewrite a large part of hg in Rust. If so, this is an unfortunate blow to Python, because I often hear Python developers cite hg (and I do as well) as one of the largest Python-based applications (regardless of the C extensions). I'd certainly feel sad if that is the case.
I studied hg in some depth for a semester as an undergraduate, when I was implementing my own "bitbucket". To be really honest, the codebase was easy to navigate, and function names were pretty consistent with the actual hg commands/internal spec. It is probably hard to write true unit tests for the code itself (you'd have to monkeypatch like crazy using mock), which suggests the functions carry a lot of code, but overall the codebase quality was pretty good for complex software. I just had to learn the variable name abbreviations, get used to them, and refer back to the Hg paper.
Anyhow, whatever the decision is, I'd learn a bit of Rust to help out :-)
P.S. I am still trying to find an answer to this: if FB uses Hg, then what about their git code?
That's not really the right story. The motivating factor is that we'd like to write /less C/, not /less Python/. It's likely in the long-term that parts of the code that are performance sensitive will move out of Python, but we've been doing that for years, and we just keep finding new things to optimize as the scale of Mercurial repositories keeps increasing.
As both a Rust and Python user, I don't necessarily agree. This isn't a rewrite to purge Python; it's clear from the post that Mercurial values Python's flexibility, and wants to bend over backwards to ensure as little of that flexibility is lost in whatever transition may occur. And Python's credentials would be secure with or without Mercurial being written in it. Frankly, I think that swapping out critical bits for a low-level language is a great way to scale a dynamic-language codebase without completely sacrificing the usual ergonomics of dynamism.
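To make the "swap out the critical bits" idea a little more concrete, here is a minimal sketch of the usual pattern: keep a pure-Python implementation, and prefer a compiled one when it can be imported. The module name `_hypothetical_fastlines` and the helper `load_impl` are made up for illustration; this is not Mercurial's actual module layout or loading code.

```python
import importlib


def load_impl(native_name, pure_func, attr):
    """Return the native implementation if it can be imported, else the pure one."""
    try:
        module = importlib.import_module(native_name)
    except ImportError:
        # No compiled extension available: fall back to pure Python.
        return pure_func
    return getattr(module, attr, pure_func)


def count_lines_pure(data: bytes) -> int:
    """Pure-Python reference implementation: count newline bytes."""
    return data.count(b"\n")


# "_hypothetical_fastlines" stands in for a compiled extension (C or Rust).
# It is an assumption for this sketch and does not exist, so the fallback is used.
count_lines = load_impl("_hypothetical_fastlines", count_lines_pure, "count_lines")
```

Callers only ever see `count_lines`, so the hot path can move to native code later without touching them, and the Python version stays around as a readable reference.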
My initial reaction to the headline was that it sounded like (a) no more Python, and (b) this is a decided future direction.
Instead, it sounds like this is a proof-of-concept for flipping the main 'hg' command from being Python + C extensions to being a Rust binary with an embedded Python interpreter. Part of the rationale appears to be performance, but also smoothing out the cross-platform experience, especially on Windows.
Pulling out some related snippets:
-----
While Python is still a critical component of Mercurial and will be
for the indefinite future, I'd like Mercurial to pivot away from
being pictured as a "Python application" and move towards being
a "generic/system application." In other words, Python is just
an implementation detail.
-----
Desired End State
hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today).
Python code seamlessly calls out to functionality implemented in Rust.
Fully self-contained Mercurial distributions are available (Python is an implementation detail / Mercurial sufficiently independent from other Python presence on system)
-----
"Standalone Mercurial" is a generic term given to a distribution
of Mercurial that is standalone and has minimal dependencies on
the host (typically just the C runtime library). Instead, most of
Mercurial's dependencies are included in the distribution. This
includes a Python interpreter.
-----
This patch should be considered early alpha and RFC quality.
Leaving aside the actual project at hand, this is a great example of a well thought out project plan. There is a clear rationale, clear end state, a bunch of known problems to tackle, a front loading of risk, and it delivers incremental value along the way.
> I am a significant contributor to the Mercurial open source version control system.
> I serve on the Mercurial Steering Committee, which is the governance group for the Mercurial Project. I also have reviewing privileges, allowing me to accept incoming patches for incorporation in the project.
So not sure, but it seems like at least one person on the team is into it?
It looks like in their last sprint meeting (end of Sept[0][1]) there was a lot of planning and talk about moving parts of Mercurial to Rust. From the history of sprints[2] it sounds like Facebook first started experimenting with some Mercurial implementations in Rust and may be one of the big contributors spearheading this (though I also see indygreg and durin42 around the Phabricator project giving Mercurial advice). I'm a fan of both Rust and Mercurial, so this is exciting news to hear.
> The nice things we want to do in native code are complicated to implement in C because cross-platform C is hard. The standard library is inadequate compared to modern languages. While modern versions of C++ are nice, we still support Python 2.7 and thus need to build with MSVC 2008 on Windows. It doesn't have any of the nice features that modern versions of C++ have. Things like introducing a thread pool in our current C code would be hard. But with Rust, that support is in the standard library and "just works." Having Rust's standard library is a pretty compelling advantage over C/C++ for any project, not just Mercurial.
Sounds like the main reason for rust is that Python has a weird dep on a fixed MSVC version.
There was also this point, which was phrased as a comparison to Python but which I think works equally well for choosing Rust over C:
> In addition to performance concerns, Python is also hindering us because it is a dynamic programming language. Mercurial is a large project by Python standards. Large projects are harder to maintain. Using a statically typed programming language that finds bugs at compile time will enable us to make wide-sweeping changes more fearlessly. This will improve Mercurial's development velocity.
For those who don't want to deal with the Python startup time, the Mercurial team already has an attempt at fixing this with a tool called CHg[0]. It is a C binary that talks to the Mercurial CommandServer[1], which is essentially a long-running instance of the Mercurial CLI that you can interact with over a pipe.
Using Rust as the primary client would simplify this a lot, but is a lot more work than what CHg accomplished.
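For anyone who hasn't seen the idea before, here is a toy sketch of "one long-running process that you talk to over a pipe", so the interpreter startup cost is paid once rather than per command. Everything in it (`SERVER_SOURCE`, `ToyClient`, the line-based messages) is made up for illustration; it is not the real CommandServer wire format, and it does not run hg at all.

```python
import subprocess
import sys

# Toy server: stays alive and answers one request per line read from stdin.
SERVER_SOURCE = r"""
import sys
for line in iter(sys.stdin.readline, ""):
    command = line.strip()
    if command == "quit":
        break
    # Pretend to run the command; a real server would dispatch to actual work here.
    sys.stdout.write("ran: " + command + "\n")
    sys.stdout.flush()
"""


class ToyClient:
    """Thin client that reuses a single long-lived server process."""

    def __init__(self):
        # The interpreter is started once, here, not once per command.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", SERVER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, command):
        """Send one command down the pipe and read back one line of reply."""
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        """Ask the server to exit and wait for it."""
        self.proc.stdin.write("quit\n")
        self.proc.stdin.flush()
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

With this, `client = ToyClient()` followed by several `client.run(...)` calls all hit the same process. The thin client is the part that has to start fast, which is presumably why it makes sense to write it in C (or Rust).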
We've updated the submission title, which was “Mercurial (hg and C extensions) being rewritten in Rust” to that of the article. It's surprisingly easy to mislead, which is one of the reasons we ask submitters to use the original title and not to editorialize.
One of us misread the article. I understood it to say that they want to rewrite the core of the Python code to Rust so that they don't have to wait for the Python interpreter to start every time someone uses the hg command. The Rust binary will be able to call Python code, which is basically a complete reversal of what they have now.
> Desired End State: hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)
I'd say it seems like they want to also flip the ratio of the core around, so it's Rust for the central parts and Python only as needed (whereas today it's Python for the core and C for certain performance critical things).
No, they want to flip the relationship of Python and native code; instead of hg being Python with C extensions, they want it to be Rust with an embedded Python interpreter. It's not a total rewrite in Rust, but it's also a more significant change than just replacing C extensions with Rust extensions while maintaining a Python core. Quoting the source material on the “Desired End State”: “hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)”
Rust isn't hard. The only somewhat hard thing in Rust is working with the borrow checker, and honestly, if you want to write C seriously, you really need to go through that experience and understand it well enough to feel somewhat comfortable with it.
And even if that really were a thing, giving up compiler checks in order to allow more low-quality code to be contributed doesn't seem like a good trade-off to me.
One reason might be that it seems easier to mix languages with Rust vs. Go. The C interop is pretty sane and Go has a few more gotchas in that department.
The rest of the Rust v Go debate may be similar to the 1000 times people have had it before.
My understanding is that Go would be a fine candidate if you wanted to rewrite the entire codebase in it, but not for writing compiled extensions that Python can dynamically link, and they want one language to do both of those things?
One thing that comes to mind is having to bridge two GCs/runtimes in the same binary, whereas Rust is a drop-in replacement for C in that regard (it's a systems programming language).
Go isn't really a huge win for hg, and it doesn't support being used as a shared library, so we couldn't easily use it as a replacement for C for native extensions. main() has to be in Go, and even then you've got two GCs in the mix which is not great.
Given a situation where (pre-rust) the usual choice would be to use C or C++, it's totally appropriate to choose Rust. That's one of Rust's express design goals.
Go is intended to make it very easy for junior devs to build concurrent network applications. That's its design goal.
It's not a zero-sum game. Choosing rust doesn't weaken go or vice versa.
yeukhon | 8 years ago
I studied hg in some depth for a semester as an undergraduate, when I was implementing my own "bitbucket". To be really honest, the codebase was easy to navigate, and function names were pretty consistent with the actual hg commands/internal spec. It is probably hard to write true unit tests for the code itself (you'd have to monkeypatch like crazy using mock), which suggests the functions carry a lot of code, but overall the codebase quality was pretty good for complex software. I just had to learn the variable name abbreviations, get used to them, and refer back to the Hg paper.
Anyhow, whatever the decision is, I'd learn a bit of Rust to help out :-)
P.S. I am still trying to find an answer to this: if FB uses Hg, then what about their git code?
durin42 | 8 years ago
That's not really the right story. The motivating factor is that we'd like to write /less C/, not /less Python/. It's likely in the long-term that parts of the code that are performance sensitive will move out of Python, but we've been doing that for years, and we just keep finding new things to optimize as the scale of Mercurial repositories keeps increasing.
kibwen | 8 years ago
As both a Rust and Python user, I don't necessarily agree. This isn't a rewrite to purge Python; it's clear from the post that Mercurial values Python's flexibility, and wants to bend over backwards to ensure as little of that flexibility is lost in whatever transition may occur. And Python's credentials would be secure with or without Mercurial being written in it. Frankly, I think that swapping out critical bits for a low-level language is a great way to scale a dynamic-language codebase without completely sacrificing the usual ergonomics of dynamism.
cat199 | 8 years ago
I doubt this will have much impact - bazillions of other things use python
etanol | 8 years ago
Just as unfortunate as how Python turned its back on Mercurial
nindalf | 8 years ago
Could you elaborate?
nixpulvis | 8 years ago
puddums | 8 years ago
Instead, it sounds like this is a proof-of-concept for flipping the main 'hg' command from being Python + C extensions to being a Rust binary with an embedded Python interpreter. Part of the rationale appears to be performance, but also smoothing out the cross-platform experience, especially on Windows.
Pulling out some related snippets:
-----
While Python is still a critical component of Mercurial and will be for the indefinite future, I'd like Mercurial to pivot away from being pictured as a "Python application" and move towards being a "generic/system application." In other words, Python is just an implementation detail.
-----
Desired End State
hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today). Python code seamlessly calls out to functionality implemented in Rust. Fully self-contained Mercurial distributions are available (Python is an implementation detail / Mercurial sufficiently independent from other Python presence on system)
-----
"Standalone Mercurial" is a generic term given to a distribution of Mercurial that is standalone and has minimal dependencies on the host (typically just the C runtime library). Instead, most of Mercurial's dependencies are included in the distribution. This includes a Python interpreter.
-----
This patch should be considered early alpha and RFC quality.
tiles | 8 years ago
puddums | 8 years ago
bulldoa | 8 years ago
krschultz | 8 years ago
steveklabnik | 8 years ago
https://gregoryszorc.com/work.html
> I am a significant contributor to the Mercurial open source version control system.
> I serve on the Mercurial Steering Committee, which is the governance group for the Mercurial Project. I also have reviewing privileges, allowing me to accept incoming patches for incorporation in the project.
So not sure, but it seems like at least one person on the team is into it?
neandrake | 8 years ago
[0] https://www.mercurial-scm.org/wiki/4.4sprint#Rust
[1] https://public.etherpad-mozilla.org/p/sprint-hg4.4-NOSPAMREM... (remove everything after-and-including the last hyphen, I left it in since it seems like they don't want a direct link that's easily scraped?)
[2] https://www.mercurial-scm.org/wiki/4.0sprint
masklinn | 8 years ago
ngoldbaum | 8 years ago
kbrosnan | 8 years ago
bla2 | 8 years ago
Sounds like the main reason for rust is that Python has a weird dep on a fixed MSVC version.
neandrake | 8 years ago
> In addition to performance concerns, Python is also hindering us because it is a dynamic programming language. Mercurial is a large project by Python standards. Large projects are harder to maintain. Using a statically typed programming language that finds bugs at compile time will enable us to make wide-sweeping changes more fearlessly. This will improve Mercurial's development velocity.
Niten | 8 years ago
kyrra | 8 years ago
Using Rust as the primary client would simplify this a lot, but is a lot more work than what CHg accomplished.
[0] https://www.mercurial-scm.org/wiki/CHg
[1] https://www.mercurial-scm.org/wiki/CommandServer
ngrilly | 8 years ago
ruke | 8 years ago
And yes.. it's for extension modules (maybe just the beginning)
agentgt | 8 years ago
nnethercote | 8 years ago
johnny_1010 | 8 years ago
bedros | 8 years ago
masklinn | 8 years ago
[0] https://docs.python.org/3/extending/embedding.html
merb | 8 years ago
the_mitsuhiko | 8 years ago
sctb | 8 years ago
jxcl | 8 years ago
m12k | 8 years ago
I'd say it seems like they want to also flip the ratio of the core around, so it's Rust for the central parts and Python only as needed (whereas today it's Python for the core and C for certain performance critical things).
dragonwriter | 8 years ago
ahupp | 8 years ago
"Desired End State: hg is a Rust binary that embeds and uses a Python interpreter when appropriate (hg is a Python script today)"
oconnor663 | 8 years ago
gcb0 | 8 years ago
The original code was Python with lots of C. The same can be done with Go, while keeping much of the same philosophy as Python.
This change will probably alienate most of the contributors, since Rust and Python or C are worlds apart.
seba_dos1 | 8 years ago
Rust isn't hard. The only somewhat hard thing in Rust is working with the borrow checker, and honestly, if you want to write C seriously, you really need to go through that experience and understand it well enough to feel somewhat comfortable with it.
And even if that really were a thing, giving up compiler checks in order to allow more low-quality code to be contributed doesn't seem like a good trade-off to me.
jhugg | 8 years ago
The rest of the Rust v Go debate may be similar to the 1000 times people have had it before.
oconnor663 | 8 years ago
rubber_duck | 8 years ago
durin42 | 8 years ago
chme | 8 years ago
When it comes to low-level, high-performance, no-overhead stuff with less unexpected behaviour, C and Rust fit better.
Where high implementation speed and fast-changing, experimental, highly modular architectures are more important, C++ and Go are better.
For Mercurial and Git, Rust or C makes more sense, especially if you put the experimental and fast-changing stuff into a scripting language. IMO.
rkangel | 8 years ago
erpellan | 8 years ago
Go is intended to make it very easy for junior devs to build concurrent network applications. That's its design goal.
It's not a zero-sum game. Choosing rust doesn't weaken go or vice versa.
panic | 8 years ago
petre | 8 years ago
Varriount | 8 years ago
Really, there are so many other languages that are both performant and more readable than Rust (I consider Rust's readability to be on par with C++).