
Case Study: Npm uses Rust for its CPU-bound bottlenecks [pdf]

369 points | yarapavan | 7 years ago | rust-lang.org

292 comments

[+] javagram|7 years ago|reply
Title could be improved, as I was wondering if the command line tool npm had itself been rewritten.

It’s actually one of npm’s web services that was rewritten.

FTA: “Java was excluded from consideration because of the requirement of deploying the JVM and associated libraries along with any program to their production servers. This was an amount of operational complexity and resource overhead that was as undesirable as the unsafety of C or C++.”

I find this comparison a bit odd. Even if not using containers, the JVM isn’t hard to deploy, as distro package managers include it. Unless a team is managing servers manually rather than with an automated tool, this doesn’t seem that complex. Am I missing something here?

[+] bloopernova|7 years ago|reply
Well, speaking from experience with a JBoss application layer in a recent software project I worked on:

New Java versions do break existing libraries or apps, and need to be tested thoroughly. When the company hasn't budgeted for that expense, it becomes difficult to update.

Often an architect or software team will insist on using the Oracle JVM rather than the included OpenJDK. That adds extra steps to download, store as an artifact, distribute, verify, etc.

The people who wrote the build pipeline have since been laid off, and an updated set of libraries requires a lot of work to trace back through poorly documented and understood code to make changes.

(Not to disagree with you here, it's more that I'm trying to illustrate how, with poor foresight, Java dependencies can get difficult to manage)

[+] tyingq|7 years ago|reply
I just like that someone at npm would avoid something because it has lots of dependencies and overhead. The irony is strong with this one.
[+] brazzledazzle|7 years ago|reply
If I have a choice between dealing with it and not I’ll choose not every time. It’s an annoying dependency that got even more annoying when Oracle decided to be a pain in the ass. Just because something is “easy” doesn’t mean it’s easier than the alternative.
[+] vbezhenar|7 years ago|reply
I just unpack the JDK; I have no idea what is so complex about it. It was complex on Windows because I used a VM to install Java and then copied the folder, but they made it easier with 11.
[+] strictfp|7 years ago|reply
To me it reads as "We didn't want to use Java".
[+] jrs95|7 years ago|reply
Yeah. I can see the Java issue as a minor annoyance, but to put it on the same level as the lack of safety with C/C++ seems hyperbolic to me.
[+] brown9-2|7 years ago|reply
It reads like an innocent misperception due to inexperience with Java.

Very few people would choose to install libraries (JARs) used by their code via their OS package manager for instance.

[+] feikname|7 years ago|reply
JVM tuning (especially the GC and memory allocation scaling) can be a huge PITA.
[+] rmrfrmrf|7 years ago|reply
I'm not a fan of relying on distro package managers for installation of runtime dependencies on servers. Too many opportunities for variables to creep in if the version isn't locked, and then having to make sure all the package manager dependencies and config themselves. Even if you automate, you're still at the mercy of the repo to have the version you need, etc., and oftentimes you need to customize the install for a highly-available and/or virtualized environment. Bad times all around.
[+] oweiler|7 years ago|reply
Also with GraalVM you may get away with building a native image which doesn't require a JVM at all.
[+] sheeshkebab|7 years ago|reply
Deploying jvm app/service is now even more complex than figuring out which version of node to use to run a service. Is it supposed to be oracle java? Openjdk? Adoptopenjdk or one of a half dozen more? Which version of it? Anything to tweak in gc or startup settings for it/the version? Do we need to regression test the service on minor jdk upgrade? Is that jvm compatible with some os version we are running that has some security patches and other settings?
[+] paulddraper|7 years ago|reply
Yeah, "npmjs.com" would have been more enlightening.
[+] GordonS|7 years ago|reply
The issue with the JVM is keeping it updated, in the context of a constant stream of security issues that need patching.
[+] steveklabnik|7 years ago|reply
Hey folks! This is part of our whitepaper series. This means that the audience is CTOs of larger organizations, and so the tone and content are geared for that, more than HN. Please keep that in mind!
[+] breatheoften|7 years ago|reply
Are CTOs of largish organizations still not part of the Hacker News audience...? That seems a bit of a damning statement to make about the population of CTOs...
[+] phillipcarter|7 years ago|reply
Good overall whitepaper and I like to see efforts like these.

One area where I felt some data was missing is here:

"It keeps resource usage low without the possibility of compromising memory safety. "

How did the resource usage compare with the Go and node rewrites? What metrics were used under which workload? Benchmarks are never perfect but I think a CTO-level person would like to see a table of results like that.

[+] the_duke|7 years ago|reply
A better title would be the subtitle of the article: "The npm Registry uses Rust for its CPU-bound bottlenecks".

Note that only one service (authorization) was rewritten from Node to Rust.

[+] steveklabnik|7 years ago|reply
That’s what I titled it when I submitted it after we first published this.

There’s also a second service we know about in Rust, and that’s the one that renders package README pages.

[+] aboutruby|7 years ago|reply
Or title could use "npm, Inc" as it's referring to the organization
[+] BlackFly|7 years ago|reply
> Java was excluded from consideration because of the requirement of deploying the JVM and associated libraries along with any program to their production servers. This was an amount of operational complexity and resource overhead that was as undesirable as the unsafety of C or C++.

Just so everyone here is aware, this is by now an outdated complaint against Java.

https://vertx.io/blog/eclipse-vert-x-goes-native/

I'm choosing Vert.x as an example since it already competes with Rust- and C-based applications over at https://www.techempower.com/benchmarks but you ought to be able to compile general programs ahead of time.

[+] ilovecaching|7 years ago|reply
I'd like to point out that even though it took them a week compared to an hour, a week is actually incredibly fast to learn Rust and build something useful with it. Learning C++ can take more than a month of training, and it's only because most people learn it over an entire semester at school that they learn it at all. This is also the time it takes to learn Rust, and presumably now that they've written one program, writing the next program will take them a fraction of a week.

The amount of time it takes to learn something is often indicative of its power. Anyone who has learned a foreign language or a musical instrument knows that the time spent investing up front pays huge dividends down the road when you have the skills and tools to richly express yourself. The reason that Go takes two days to learn is because it artificially limits the amount of up front investment at the cost of limiting expressiveness over the lifetime of your use of the language.

[+] faitswulff|7 years ago|reply
I like how this whitepaper sidesteps the "but the rewrite is the real improvement!" by also rewriting the service in Node.js along with Go and Rust.
[+] adamnemecek|7 years ago|reply
I’ve been playing around with Rust since it came out, but only recently did I decide to use it for a part of a project. It’s a very pleasant language. I didn’t fight the borrow checker much (maybe due to prior experience).

The language is nuts. It’s true what they say: Cargo is even better than the language. It’s just so easy to add packages to your project or to split your project into packages.

Cargo is an amazing investment, as this will help people write non-duplicated code. Like, how many string implementations are there across C code bases? Each C project has so much code that’s the most boring, repetitive shit you can imagine. Cargo lets you concentrate on writing your code without hassle.

I have experience with a lot of package managers, gems, go, cocoapods, sbt, cabal, pip, spm, npm, you name it but cargo is on a different plane of existence. Cargo makes the whole internet your standard library.

I also like cargo workspaces. Modern development needs a workflow where you pull in a dependency, and work on it in tandem with your code. Achieving a good workflow for this is surprisingly hard.

[+] spricket|7 years ago|reply
While I'm a big fan of Rust, excluding Java because "JVM" is kinda laughable. It's not hard to run at all. You package everything into a jar then run a single command. As easy to get working as a JS backend.

If their complaints are about GC tuning, is it not the same thing as tuning the GC in JS/Go? Java still has arguably the most mature GC of any language.

[+] StreamBright|7 years ago|reply
No surprise here, good language design pays off for real world applications especially at scale like the NPM infra.
[+] nothrabannosir|7 years ago|reply
This entire article is a pretty damning report on JavaScript in general, but this sentence takes the cake (emphasis mine):

> The process of deploying the new Rust service was straight-forward, and soon they were able to forget about the Rust service because it caused so few operational issues. At npm, the usual experience of deploying a JavaScript service to production was that the service would need extensive monitoring for errors and excessive resource usage necessitating debugging and restarts.

Is this satire?

[+] twiss|7 years ago|reply
They also state that writing the service in Node took them an hour, two days for Go, and a week for Rust. Even taking into account their unfamiliarity with the language, it's probably fair to say that when switching to Rust, you'll usually spend more time writing and less time debugging. Whether that trade-off is worth it depends on the project.
[+] atoav|7 years ago|reply
Rust's package management (Cargo) is the best thing I have ever seen of its kind. The very basic thing you can do is: cargo new funkyproject

Which creates a new barebones Rust project called "funkyproject". Every dependency specified in its Cargo.toml will be automatically downloaded at build (if there is a new version).

Once the dependencies are resolved, their exact versions are saved into a Cargo.lock file. This means if it compiles for you, it should compile on every other machine too.

Cargo.toml also allows you to use (public or private) repositories as a source for a library, specify wildcard or range version requirements to select e.g. only versions newer than 1.0.3 and older than 1.0.7, etc.

Because the compiler will show you unused dependencies, you never really end up including anything you don't use. In practice this system not only works incredibly well, but is also very comfortable to use and isolates itself from the system it is running on quite well.

I really wish Python also had something like this. Pipenv is sort of going into that direction, but it is nowhere near cargo in functional terms.
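
To make the "newer than 1.0.3 and older than 1.0.7" idea concrete: in Cargo.toml such a requirement is written as a comparison string like ">1.0.3, <1.0.7". The sketch below only illustrates what that kind of open range means; it is not how Cargo resolves versions, it ignores pre-release tags, and the helper names `parse_version` and `in_open_range` are made up for the example.

```rust
/// A (major, minor, patch) triple. Tuples compare lexicographically,
/// which matches version ordering when there are no pre-release tags.
type Version = (u64, u64, u64);

/// Parse "1.0.5" into (1, 0, 5). Returns None on malformed input.
/// (Hypothetical helper, not part of Cargo.)
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = (parts.next()??, parts.next()??, parts.next()??);
    // Reject trailing components such as "1.0.5.2".
    if parts.next().is_some() {
        None
    } else {
        Some(v)
    }
}

/// Some(true) if `candidate` is strictly newer than `low` and strictly
/// older than `high`; None if any of the three strings fails to parse.
fn in_open_range(candidate: &str, low: &str, high: &str) -> Option<bool> {
    let c = parse_version(candidate)?;
    let lo = parse_version(low)?;
    let hi = parse_version(high)?;
    Some(lo < c && c < hi)
}

fn main() {
    for v in ["1.0.3", "1.0.5", "1.0.7"] {
        println!("{}: {:?}", v, in_open_range(v, "1.0.3", "1.0.7"));
    }
}
```

Both endpoints are excluded here, so only versions strictly between the two bounds match.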

[+] fpgaminer|7 years ago|reply
I wrote and deployed a production service written in pre-1.0 Rust. In over three years of being deployed I never once had to touch that code. The infrastructure around it evolved several times, we even moved cloud providers in that time, but that particular service didn't need any changes or maintenance. It just kept chugging along.

Perhaps Rust's name is apropos: your code will be so reliable that you won't need to look at it again until it has collected rust on its thick iron framework.

[+] sheeshkebab|7 years ago|reply
Deploying a service written in any language into production environment at scale of npmjs is far from straightforward.

I think the satire here is that internet got so centralized lately that even a simple piece of code in JavaScript requires such a huge behemoth of an org running and maintaining all this monstrous infrastructure.

[+] faitswulff|7 years ago|reply
The whitepaper notes that almost 9 billion NPM packages are downloaded per week, so I don't see anything laughable about needing good monitoring.
[+] masklinn|7 years ago|reply
I would expect that to be par for the course for most languages. The more dynamic the more problematic, but it stands to reason that the less you can check for and enforce statically the more will eventually blow up at runtime.

Resource usage is similar though not exactly aligned e.g. Haskell has significant ability to statically enforce invariants and handle error conditions, but the complex runtime and default laziness can make resource usage difficult to predict.

I'd guess OCaml would also have done well in the comparison, as it too combines an extensive type system which is difficult to bypass with an eager execution model.

[+] echelon|7 years ago|reply
This is my experience, too.

I wrote trumped.com and deployed it prior to the last presidential election. The frontend and assets have been redeployed, but the core rust service for speech generation hasn't been touched. I've never had a service this reliable, and it took so little effort!

Rust is the best language I've ever used, bar none, period. And I've used a countless many of them.

The only places where I won't write Rust are for small one-off scripts and frontend web code. (Even with wasm, Typescript would be tough to beat.)

[+] austincheney|7 years ago|reply
> This entire article is a pretty damning report on JavaScript in general

How so?

[+] ddebernardy|7 years ago|reply
Anecdotal, but I recently used Gulp in a project to run some css clean-up tasks as part of a build process.

The JS dependencies:

    "gulp"
    "gulp-clean-css"
    "gulp-postcss"
    "gulp-uglify"
    "autoprefixer"
    "postcss-uncss"
    "uncss"
The number of node modules: just over 400.

So I'm not at all surprised that this might create surprises when deploying JS services in production.

[+] cryptica|7 years ago|reply
The idea that some programming languages can solve scalability issues is a myth. A language cannot solve scalability issues; all it can do is push the needle a tiny little bit further in terms of performance, but this is completely meaningless.

Scalability is an architectural concern which cannot be ignored by system developers. This is because scalability is not about speed or performance, it's all about figuring out which workloads can be split up and executed in parallel; in order to do this, you need to understand the real-world problem which the software is trying to solve; this is not something that you can delegate to a compiler.

The best that a language can offer in terms of scalability is to make it easier to reason about parallel workloads and make the difference between serial and parallel workloads as explicit as possible. Whenever a language tries to hide the complexity of parallelization behind thread pools, it's not solving any real scalability issue; it's just delaying it some more.
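
As a rough illustration of "making the split explicit", here is a minimal sketch in Rust using only the standard library. It assumes the workload (summing a slice) really can be divided into independent chunks; that decision is made by the programmer, not the compiler. The function name `parallel_sum` is made up for the example.

```rust
use std::thread;

/// Sum `data` by splitting it into roughly `workers` independent chunks,
/// one scoped thread per chunk. The split is visible in the code: the
/// programmer decided this workload can be divided this way.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division so every element lands in some chunk;
    // at least 1 because `chunks(0)` would panic.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn all workers first (collect forces the lazy iterator)...
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // ...then combine the partial results serially.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("total = {}", parallel_sum(&data, 4));
}
```

The serial part (combining partial sums) and the parallel part (summing each chunk) are separate, visible steps, which is the kind of explicitness the comment argues for.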

[+] mavelikara|7 years ago|reply
> The idea that some programming languages can solve scalability issues is a myth.

True.

> A language cannot solve scalability issues; all they can do is push the needle a tiny little bit further in terms of performance but this is completely meaningless.

> Scalability is an architectural concern which cannot be ignored by system developers.

A language can prevent or delay such architectural concerns from being addressed by not offering sufficient capabilities.

[+] imtringued|7 years ago|reply
I don't understand the criticism of thread pools. Their only purpose is to avoid the expensive creation of threads. They don't do anything by themselves.
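
For reference, a thread pool in this sense can be sketched in a few dozen lines of standard-library Rust: a fixed set of threads is created once and then pulls jobs from a shared queue. This is a minimal illustration under that assumption, not any particular runtime's implementation; `ThreadPool`, `Job`, and `run_squares` are names invented for the example.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// A fixed set of worker threads, created once, that pull jobs from a shared queue.
struct ThreadPool {
    sender: Option<mpsc::Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl ThreadPool {
    fn new(size: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers: Vec<thread::JoinHandle<()>> = (0..size.max(1))
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock guard is a temporary that is dropped at the end
                    // of this statement, so the job itself runs unlocked.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // all senders dropped: shut down
                    }
                })
            })
            .collect();
        ThreadPool { sender: Some(sender), workers }
    }

    /// Queue a job; some idle worker will pick it up.
    fn execute<F: FnOnce() + Send + 'static>(&self, f: F) {
        self.sender
            .as_ref()
            .expect("pool is shut down")
            .send(Box::new(f))
            .expect("workers are gone");
    }
}

impl Drop for ThreadPool {
    fn drop(&mut self) {
        // Closing the channel lets each worker drain the queue and exit.
        drop(self.sender.take());
        for w in self.workers.drain(..) {
            let _ = w.join();
        }
    }
}

/// Compute 0^2 + 1^2 + ... + (n-1)^2 using `workers` pooled threads.
fn run_squares(workers: usize, n: u64) -> u64 {
    let (tx, rx) = mpsc::channel();
    {
        let pool = ThreadPool::new(workers);
        for i in 0..n {
            let tx = tx.clone();
            pool.execute(move || tx.send(i * i).unwrap());
        }
    } // pool dropped here: queued jobs finish before we continue
    drop(tx);
    rx.iter().sum()
}

fn main() {
    println!("sum of squares = {}", run_squares(4, 8));
}
```

Nothing here decides how work is divided; the pool only reuses threads, which matches the point that it does not do anything by itself.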
[+] megous|7 years ago|reply
This is no study. It reads more like a conclusion to one.
[+] makkesk8|7 years ago|reply
Interested to know why they didn't consider .net core.
[+] twiss|7 years ago|reply
For all this talk about performance, are there any benchmarks anywhere? Also, is there any blog post or anything by npm itself that this is sourced from?
[+] truth_seeker|7 years ago|reply
I have my doubts and confusion about this problem statement

> Most of the operations npm performs are network-bound and JavaScript is able to underpin an implementation that meets the performance goals. However, looking at the authorization service that determines whether a user is allowed to, say, publish a particular package, they saw a CPU-bound task that was projected to become a performance bottleneck.

Oh, Really ???

So it is essentially an authorization service, and I doubt the security algorithm computations are the main cause.

What I don't understand here is why it is not possible to write lower-level JS or asm code to craft well-optimized code which V8 can compile down to the minimum CPU instructions required.

[+] tekknik|7 years ago|reply
The statement about Go using global dependencies being the standard is just someone’s opinion on the Go team. I’ve written Go for a few years now and never once shared a dep across projects. Create a new folder, set your GOPATH (use direnv) and pull your deps. I very much doubt they’d be more productive in Rust than in Go had they actually given Go a chance.
[+] z3t4|7 years ago|reply
You shouldn't be afraid to make independent microservices. Using a different language makes it more likely that a service can be deleted, and more easily rewritten. If you are forced into a monolith, it usually means the architecture has a complexity problem.