Alert: NPM modules hijacked

[+] jimrandomh|10 years ago|reply

There's a lesson to be drawn here about dependency hygiene. On one hand, code-reuse is a good thing, and incorporating other projects by reference is a way to make it happen. On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

Most programming-language communities manage the code-reuse/dependency-hygiene tradeoff by concentrating code into a relatively small number of large libraries, and when people want a function or two that isn't in one of the libraries they're using, they incorporate it by copy-paste. In the Javascript/NPM world, on the other hand, I see a lot of projects with dependency references to huge numbers of tiny dependencies.

Today, we're seeing one of the reasons why that's a liability. Most people will take away lessons about code signing and signatures, and presumably the NPM software is going to be improved in that regard. But the other lesson is that projects should have fewer dependencies. Using a library may be as simple as adding one line to a package configuration file, but using a library properly requires substantial due diligence on the library, its license, its bug tracker and its author.

[+] kbenson|10 years ago|reply

> Most programming-language communities manage the code-reuse/dependency-hygiene tradeoff by concentrating code into a relatively small number of large libraries, and when people want a function or two that isn't in one of the libraries they're using, they incorporate it by copy-paste.

I don't think it's necessarily "most programming languages". I know Perl, and probably Python and Ruby, have very large ecosystems of small modules. I suspect it's more along the lines of static vs dynamic, or compiled vs interpreted.

> On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

Those specific risks are not necessarily from including a module, but are from subscribing to a module. If you include a specific version of a module and can confirm it has not changed to a degree that satisfies you (whether that's keeping a local copy to build from, or trusting the distribution system to be immutable and pegging a specific version to install), then your risk is fairly well defined. It's not zero, but it is limited to the problems the module existed with at the point you reviewed and included it.

Subscribing to a module, that is having the system download and build "the latest and greatest of whatever thing you call $foo" is not very safe, and if the build mechanism can execute arbitrary code and is automated, is insanity.

[+] maxthegeek1|10 years ago|reply

The difference here is that many people in the npm world are working on the web. Minimizing application size is of absolute importance on the web.

If we copy and paste snippets of code between packages, we lose out on potential code de-duplication from dependency flattening.

If you're really worried about your dependencies disappearing/changing beneath you just check them into git.

[+] pc86|10 years ago|reply

> On the other hand, each dependency you add creates a little bit of risk: that the package will be updated in a way that breaks your application, or the package maintainer will go rogue, or the package will get hijacked.

If your development environment is such that you can't test these things before going into production, and production updates its packages on its own, then you have much bigger (and harder to fix) issues than dependency hygiene.

[+] sotojuan|10 years ago|reply

What about using "trustable" dependencies? For small utilities like left pad, you could use lodash. You can import just the function, but still get the benefit of knowing that it's lodash, so it won't go down, it'll be fast, and it's well tested.

[+] igorgue|10 years ago|reply

Isn't the bigger lesson here that Open Source projects are for the people and not for the interest of a corporation?

[+] unknown|10 years ago|reply

[deleted]

[+] _Marak_|10 years ago|reply

I've been on the receiving end of a NPM name dispute violation handed down by Izs. https://docs.npmjs.com/misc/disputes

Was on vacation and found out that I had lost a published package name within 24 hours of the dispute request. Broke a few production systems. Really messed up my day.

Ended up having to beg with the person who filed the original request and they eventually gave me the package back.

Honestly, the whole process was a bit personal and I felt like I was being singled out as an individual by NPM, rather than being treated like a developer who was using the service. Not a nice feeling.

[+] SlashmanX|10 years ago|reply

It's insane that NPM allows anybody to just take over a module name if the original is unpublished, without even a warning to users.

[+] na85|10 years ago|reply

Insane, yes, but not unexpected. This is the sort of amateur behaviour that is emblematic of the entire nodejs community.

[+] choward|10 years ago|reply

It's insane that NPM allows anybody to un-publish modules to begin with.

Even if I have a dependency locked down in my NPM shrinkwrap file, it can change underneath me? That is pretty absurd and gives me zero confidence in my packages. It means I MUST commit them to source control or risk having my project completely broken some day.

I thought for sure that since NPM removed the ability to re-publish the same version of a package with different content that they also wouldn't let you remove versions of a package. It also means you should never use the "^" version specifier or risk downloading some completely different project.
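For reference, here is a minimal sketch of which versions a caret range lets in under semver rules. It assumes plain x.y.z versions only, and `caretRange` is a hypothetical helper written for illustration, not something npm exposes:

```javascript
// Sketch: compute the half-open interval [min, max) accepted by a caret
// range such as "^1.2.3". Handles plain x.y.z versions only (no pre-release
// tags, no partial versions like "^1.2").
function caretRange(spec) {
  const match = /^\^(\d+)\.(\d+)\.(\d+)$/.exec(spec);
  if (!match) {
    throw new Error(`unsupported spec: ${spec}`);
  }
  const [major, minor, patch] = match.slice(1).map(Number);

  // The caret allows changes that do not modify the left-most non-zero part.
  let max;
  if (major > 0) {
    max = `${major + 1}.0.0`;
  } else if (minor > 0) {
    max = `0.${minor + 1}.0`;
  } else {
    max = `0.0.${patch + 1}`;
  }
  return { min: `${major}.${minor}.${patch}`, max };
}

console.log(caretRange("^1.2.3")); // { min: '1.2.3', max: '2.0.0' }
console.log(caretRange("^0.2.3")); // { min: '0.2.3', max: '0.3.0' }
```

So with "^1.2.3" a fresh install may resolve to any 1.x.y release at or above 1.2.3 that exists at install time, whoever published it.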

They do have a "warning" at https://docs.npmjs.com/cli/unpublish:

> WARNING
>
> It is generally considered bad behavior to remove versions of a library that others are depending on!

Seriously, what good does that do? Nobody takes warnings seriously.

[+] henrikfr|10 years ago|reply

Insane, yes, but I guess it's just never really been that much of a problem until now. Hopefully something good will come out of this..

[+] drdaeman|10 years ago|reply

Link rot and domain squatting are well-known, decades-old issues. NPM identifiers can be considered just another weird form of pseudo-URI and aren't special here.

It is insane that modules don't have signatures (that are actually verified, of course). Because npm can feed you basically anything.

It's not a problem that something else gets published under the old address. It's perfectly natural. The real problem is the trust model - that new content's accepted without even warnings.

[+] viperscape|10 years ago|reply

https://twitter.com/_scape/status/700453739182817281

My semi relevant tweet, it's just asking for it with that warning

[+] zamalek|10 years ago|reply

Edit: I'm definitely wrong. The person who published these was merely parking the packages. Leaving the existing (inaccurate, see response) message as one example of how badly this could have turned out.

> and the content of the files is suspicious

The script seems as though it might publish your entire codebase to NPM. Sorting NPM packages by creation date[1] reveals[2] a[3] few[4] potential victims. Filtering by the ISC license also seems to work.

[1]: https://libraries.io/search?order=desc&platforms=NPM&sort=cr...

[2]: https://libraries.io/npm/alaska-dev - internally hosted repo, edit: license doesn't match other projects by the same developer

[3]: https://libraries.io/npm/b3app-prototype - private bitbucket repo, edit: deleted from npmjs

[4]: https://libraries.io/npm/nodework - private bitbucket repo

[+] zwily|10 years ago|reply

Installing one of these packages doesn't execute the script though, you'd have to do that yourself. (It could be updated to do that though, which is terrifying.)
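A rough sketch of how one might check a package manifest for scripts that npm would run automatically at install time. `findInstallHooks` and the example manifest are made up for illustration. It only looks at explicitly declared `preinstall`, `install` and `postinstall` entries:

```javascript
// Sketch: report lifecycle scripts that npm runs automatically during
// `npm install`. `findInstallHooks` is a hypothetical helper, not an npm API.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

function findInstallHooks(pkg) {
  const scripts = pkg.scripts || {};
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === "string")
    .map((hook) => ({ hook, command: scripts[hook] }));
}

// Example manifest, invented for this sketch.
const suspectManifest = {
  name: "example-package",
  version: "1.0.0",
  scripts: { test: "node test.js", postinstall: "sh x.sh" },
};

console.log(findInstallHooks(suspectManifest));
// [ { hook: 'postinstall', command: 'sh x.sh' } ]
```

Running `npm install --ignore-scripts` skips these hooks entirely, at the cost of breaking packages that genuinely need a build step.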

[+] doublerebel|10 years ago|reply

Yes, npm will publish any package.json without `private: true` set, and will publish all files in the current directory by default (anything not listed in .gitignore or .npmignore). There is a great CLI tool called "irish-pub" I use, it shows a publish dry-run to avoid these simple mistakes.

[+] nmjohn|10 years ago|reply

x.sh was just the script I used to automatically register all the packages.

It takes one argument, a package name, then attempts to publish that package.

So

    cat list | xargs -I{} ./x {}

was what I used to publish the whole list.

[+] unknown|10 years ago|reply

[deleted]

[+] dustinmoorenet|10 years ago|reply

As shitty as unpublishing is, I think the best thing the npm community needs to take away from this is auto updating is bad. Remove the ~ and ^ from your dependencies so that only a package of a specific version can be installed. "This or better" thinking doesn't work if "better" is unknown. I know that the version I am using currently is fine but I don't know about future updates. Even if we had signed packages, we are still installing unknown software if we just trust other devs. I lock my dependency version and then use https://www.npmjs.com/package/npm-check-updates to figure out what needs updating, then I test my code. This should not be done as an automatic part of the build process.
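A minimal sketch of the "remove the ~ and ^" check, assuming a parsed package.json object. `findFloatingDeps` and the example manifest are hypothetical, and anything that is not a bare x.y.z string is treated as floating:

```javascript
// Sketch: list dependencies whose version spec is not an exact x.y.z pin.
// Ranges (^, ~, >=), tags ("latest"), wildcards and URLs are all reported.
// Pre-release versions such as "1.0.0-beta.1" are reported too, conservatively.
const EXACT_VERSION = /^\d+\.\d+\.\d+$/;

function findFloatingDeps(pkg) {
  const floating = [];
  for (const section of ["dependencies", "devDependencies"]) {
    for (const [name, spec] of Object.entries(pkg[section] || {})) {
      if (!EXACT_VERSION.test(spec)) {
        floating.push({ section, name, spec });
      }
    }
  }
  return floating;
}

// Example manifest, invented for this sketch.
const examplePkg = {
  name: "my-app",
  dependencies: { "pinned-lib": "1.4.2", "floating-lib": "^2.0.0" },
  devDependencies: { "some-tool": "~0.3.1" },
};

console.log(findFloatingDeps(examplePkg));
```

In practice you would feed it `JSON.parse(fs.readFileSync("package.json", "utf8"))` and fail the build if the list is non-empty.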

*edited npm capitalization

[+] Klathmon|10 years ago|reply

No need to muck about with your package.json; actually fix the problem by just checking in your node_modules folder.

Then continue working like you always have.

There are some changes that you'll need if you are using native modules, but they are simple and easy to do.

[+] spankalee|10 years ago|reply

Dependency version ranges should be as _wide_ as possible, so that users of your libraries have flexibility when selecting versions.

Rather than artificially constrict your version ranges, NPM should support real version locking, and applications should check in their lock file.

[+] kylehotchkiss|10 years ago|reply

By adding `save-exact=true` to .npmrc in a project NPM saves the exact version.

Learned this trying to fix Shrinkwrap a few weeks back, but it seems like good practice for security's sake now.

[+] AgentME|10 years ago|reply

This doesn't lock the versions of your sub-dependencies.

`npm shrinkwrap` exists to do that. Applications should use it to pin the versions of all of their dependencies and sub-dependencies.
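To make "all of their dependencies and sub-dependencies" concrete, here is a sketch that walks a shrinkwrap-style tree and lists every pinned version. It assumes the nested `dependencies` layout of npm-shrinkwrap.json; `listPinned` and the example data are invented for illustration:

```javascript
// Sketch: walk a shrinkwrap-style tree and collect every pinned name@version,
// including nested sub-dependencies. Assumes the nested layout
// { dependencies: { <name>: { version, dependencies } } }.
function listPinned(node, out = []) {
  for (const [name, dep] of Object.entries(node.dependencies || {})) {
    out.push(`${name}@${dep.version}`);
    listPinned(dep, out); // recurse into this dependency's own dependencies
  }
  return out;
}

// Example tree, invented for this sketch.
const exampleShrinkwrap = {
  name: "my-app",
  version: "1.0.0",
  dependencies: {
    "example-lib": {
      version: "2.1.0",
      dependencies: { "example-util": { version: "0.0.3" } },
    },
  },
};

console.log(listPinned(exampleShrinkwrap));
// [ 'example-lib@2.1.0', 'example-util@0.0.3' ]
```

Note that `example-util` never appears in the app's own package.json, which is why `save-exact` alone doesn't cover it.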

[+] heavenlyhash|10 years ago|reply

I challenge someone to tell me the difference between npm and an RCE.

"npm" is two things in this picture: the server and centralized service, and the software tool on everyone's build/dev machines.

Folks have historically been happy trusting the centralized npm server to behave consistently and pleasantly, and some opinions have recently shifted on that. But frankly, the server doesn't matter, in the big picture. The npm tool on your computer does. It's the one executing code on your computer with all of your local user's privileges.

This tool starts executing new code from a new author on my host as my user without any authentication except "trust the server". This is exactly the same words we would use to describe behavior of $script_kiddie_virus_of_the_week:

> "download code from C&C server; run it, thanks:D"

What's the difference here? "good intentions"?

I'd rather have something more than "good intentions" controlling what code ends up running on my computer. Wouldn't you?

[+] drinchev|10 years ago|reply

Author of the post here. According to a tweet [1], the user @nj48 seems to be non-malicious.

I updated the blog post.

Nevertheless I find his actions dangerous and irresponsible.

[1] https://twitter.com/seldo/status/712673227630313472

[+] callmevlad|10 years ago|reply

Given the conversation in yesterday's thread [1], I think a known (and personally identified) non-malicious person defensively reserving these packages is less dangerous than some unknown entity taking over these packages later.

I personally think the best course of action would have been for the NPM team to immediately blacklist these names (aside from left-pad, that's a separate conversation) after the entire list of unpublished packages was shared, and then make them available on a case-by-case basis.

[1] https://news.ycombinator.com/item?id=11340510

[+] tombrossman|10 years ago|reply

Curious, should those of us using their Linux PPAs disable them until this is sorted? The idea of re-using disabled package names is out there now and I wonder how long it is before someone malicious tries this. Is this an unreasonable concern?

[+] dontscale|10 years ago|reply

For those of us who are pissed that this is going down but need the status to get on with our lives:

nj48 is a known friendly who has identified himself to us. We're going to clarify later today.

https://twitter.com/seldo/status/712673227630313472

[+] newman314|10 years ago|reply

I bet the orig dev was considered a known friendly until he decided to unpublish.

Relying on the notion of a "known friendly" to protect packages and namespace does not strike me as a sound practice.

As others may have mentioned, publishing packages really should be fire and forget. If something bad goes out, a replacement should be sent out. And for the life of me, I don't understand why they did not go with the <author/package> scheme.

[+] callmevlad|10 years ago|reply

Yes, as far as I've heard, this was done defensively to prevent malicious actors from claiming these packages.

Also, relevant conversation here in yesterday's thread: https://news.ycombinator.com/item?id=11340510

[+] gmisra|10 years ago|reply

It's worth noting that the official NPM dispute resolution policy makes the following very clear (https://docs.npmjs.com/misc/disputes)

> Some things are not allowed, and will be removed without discussion if they are brought to the attention of the npm registry admins, including but not limited to:

...

4. "Squatting" on a package name that you plan to use, but aren't actually using. Sorry, I don't care how great the name is, or how perfect a fit it is for the thing that someday might happen. If someone wants to use it today, and you're just taking up space with an empty tarball, you're going to be evicted.

5. Putting empty packages in the registry. Packages must have SOME functionality. It can be silly, but it can't be nothing. (See also: squatting.)

[+] tomku|10 years ago|reply

Looks like the author of the modules wrote a shell script to generate a package.json file and publish empty modules to npm to grab up all the unpublished names. However, when they ran it, they ran it in the same folder with their shell script and list of available modules and `npm publish` included them in the published modules as it does by default.

[+] raesene4|10 years ago|reply

A big problem with software repositories that don't allow for/enforce cryptographic signing by the developer is that this can happen...

Ideally the developer would sign before publishing and the consumer could check the signature to validate before using.

Whilst not a silver bullet this is a kind of essential part of a secure package management solution.
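A minimal sketch of that sign-then-verify flow using Node's built-in `crypto` module with an Ed25519 key pair. This is not how the npm registry works; the tarball bytes and `isAuthentic` are stand-ins, and the hard part (getting the publisher's real public key to the consumer) is assumed away:

```javascript
const crypto = require("crypto");

// Sketch of publisher-side signing and consumer-side verification.
// Key distribution and revocation are out of scope here.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Stand-in for the bytes of a package tarball.
const tarball = Buffer.from("contents of example-package-1.0.0.tgz");

// Publisher: sign before publishing. Ed25519 takes no digest name, hence null.
const signature = crypto.sign(null, tarball, privateKey);

// Consumer: check the signature against the publisher's known public key
// before using the package.
function isAuthentic(data, sig, key) {
  return crypto.verify(null, data, key, sig);
}

console.log(isAuthentic(tarball, signature, publicKey)); // true

// Same name, different content (e.g. republished by someone else).
const tampered = Buffer.from("contents replaced by someone else");
console.log(isAuthentic(tampered, signature, publicKey)); // false
```

A new owner of the name could not produce a valid signature without the original developer's private key, so the consumer would at least get a hard failure instead of silently accepting new content.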

[+] pvg|10 years ago|reply

Plenty of repositories require signatures.

[+] pluma|10 years ago|reply

Overly dramatic. The content is not suspicious at all, it's clearly a script to generate the package.json for an arbitrary package name passed in as an argument to the script.

I'm guessing @nj48 used a script to go over the official list of unpublished modules and attempt to generate a placeholder for each of them.

The same user also published some actual (unmanipulated) forks of the original modules, so I'm guessing this is mostly a quick move to preempt any malicious hijacking by others.

There is no reason to assume the "hijacking" is malicious. Certainly not in the scripts. The user is active on GitHub and shows no indication of malicious intent.

However it IS worth pointing out that the unpublished modules should now be treated with caution if you still rely on them because even if the replacements are identical and benevolent you probably need to take action.

[+] askyourmother|10 years ago|reply

There are some very shaky bits in the JS dev arena. Just a little earlier, we read about how a developer unpublished his libraries from NPM, the fallout from this affected numerous high profile projects like node.js, which were relying on a left pad "library" that basically was a small string padding function.

[+] SlashmanX|10 years ago|reply

This is the same story... It's a module that the developer unpublished

[+] kylehotchkiss|10 years ago|reply

Noticed another random package was uninstalled from NPM. Please oh please don't let this thing become a trend.

[+] spriggan3|10 years ago|reply

Sorry but the elephant in the room IS the lack of namespacing. There is no namespacing on NPM. If there were something like that this issue would have never happened. This is not a problem of distributed vs non distributed, NPM doesn't need a "blockchain" or whatever. Packages SHOULD be namespaced, period, and any serious package manager uses namespaces. I (and many others) called it from the very beginning and warned NPM authors that the lack of namespaces would eventually lead to this kind of issue. NPM authors didn't give a damn (there are other issues that have been known for years but they fell on deaf ears). Packages should be resolved by a namespace + the name of the package. Now everybody is in panic mode because nobody knows what they are fetching from NPM anymore.

Now I hope NPM authors will come to their senses and change the way NPM works. But the trust is broken, no question. Between stuff like that, people selling "real estate" on NPM for real money... Nodejs has been the least professional platform I have ever used. Everybody's out there to make a quick buck, nobody gives a damn, and this whole thing will collapse sooner or later.

Finally people need to stop with the "unix philosophy" excuse. Importing 10 lines of code from a random package is not the "unix philosophy". Splitting a 100 loc module into 20 packages is not the "unix philosophy". Packages have so many dependencies it's getting ridiculous.

edit: corrected

[+] odinduty|10 years ago|reply

Name squatting in those package managers is a real problem. I found this on pypi and it was a very unpleasant surprise. https://sourceforge.net/p/pypi/support-requests/571/

[+] BinaryBullet|10 years ago|reply

In the root project folder (containing package.json and ./node_modules), you can run the following:

    comm -12 <(ls ./node_modules | sort) <(curl -s https://gist.githubusercontent.com/azer/db27417ee84b5f34a6ea/raw/50ab7ef26dbde2d4ea52318a3590af78b2a21162/gistfile1.txt | sort)

It will output any of the modules that @azer unpublished yesterday (that are being used by your project).

[+] ErikAugust|10 years ago|reply

Anything wrong with checking your dependencies into your repo? That way, you know, they can't go away or be hijacked.

[+] ktRolster|10 years ago|reply

As a rule of thumb, if a dependency would take you less than two days to rewrite (one day for writing, one day for testing), it's better to do it yourself and avoid the problems of dependencies.
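To give a sense of scale, here is a sketch of a hand-written left-pad. `leftPad` is a hypothetical name and this is not the code of the unpublished package, just one way to write the same kind of utility:

```javascript
// Sketch: pad `value` on the left with `fill` until it is `length` characters
// long. Only the first character of `fill` is used; default is a space.
function leftPad(value, length, fill = " ") {
  const text = String(value);
  const padChar = String(fill).charAt(0) || " ";
  const missing = Math.max(0, length - text.length);
  return padChar.repeat(missing) + text;
}

console.log(leftPad(5, 3, "0")); // 005
console.log(leftPad("abc", 2));  // abc
```

Writing it is the easy half; the day of testing in the rule above is for the edge cases (empty fill, non-string input, strings already longer than the target).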

[+] spriggan3|10 years ago|reply

How many nodejs developers actually audit their dependencies? In my experience, not many. It's just "npm search" without thinking, then "my problem is somebody else's now". Frankly the whole ecosystem is just scary.

[+] kofejnik|10 years ago|reply

PSA: you should be keeping your dependencies in git

[+] davexunit|10 years ago|reply

Bundling is not the answer.

141 comments