
How to take over the computer of a Maven Central user

405 points| akerl_ | 11 years ago |blog.ontoillogical.com | reply

127 comments

[+] moxie|11 years ago|reply
At Open Whisper Systems, we wrote a small open source gradle plugin called "gradle-witness" for this reason. Not just because dependencies could be transported over an insecure channel, but also because dependencies could be compromised if the gradle/maven repository were compromised:

https://github.com/whispersystems/gradle-witness

It allows you to "pin" dependencies by specifying the sha256sum of the jar you're expecting.
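
To give unfamiliar readers a feel for what "pinning" means, here is a minimal sketch in Python. It is not gradle-witness's actual code: the function name `matches_pin` and the sample bytes are made up for illustration. The idea is just to hash the bytes you downloaded and compare them with a digest that was committed alongside the build.

```python
import hashlib
import hmac


def matches_pin(artifact_bytes: bytes, pinned_sha256_hex: str) -> bool:
    """Return True only if the artifact hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    # compare_digest is a constant-time comparison; plain == would also work here.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())


# Hypothetical jar contents standing in for a real download.
jar = b"pretend this is some-library-1.0.jar"
# Recorded once, when the dependency is first vetted, and kept in version control.
pin = hashlib.sha256(jar).hexdigest()

print(matches_pin(jar, pin))                # unchanged artifact
print(matches_pin(jar + b"backdoor", pin))  # tampered artifact
```

Because the pin lives in your own repository, the check still works if either the transport or the remote repository is compromised.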

[+] heavenlyhash|11 years ago|reply
Hi moxie, might I ask if you've considered SHA-384 instead?

If I understand correctly, SHA-256 is part of the SHA-2 family of hash algorithms, and like SHA-1, when used alone it is subject to length extension attacks.

SHA-384 is also a member of the SHA-2 algorithm family, but is immune to length extension attacks because it runs with an internal state size of 512 bits -- by emitting fewer bits than its total internal state, length extensions are ruled out. (Wikipedia has a great table for clarifying all these confusing names and families of hashes: [5].) Other hashes like BLAKE2 [1], though young, also promise built-in immunity to length-extension attacks. mdm [2] is immune to this because the relevant git data structures all include either explicit field lengths as a prefix, or are sorted lists with null terminators, both of which defuse length extension attacks by virtue of breaking their data format if extended.
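
A small sketch of the size argument, in Python. The digest sizes come from `hashlib`; the internal state sizes are hard-coded from the SHA-2 definitions (eight 32-bit words for SHA-256, eight 64-bit words for SHA-384 and SHA-512), since `hashlib` does not expose them. The helper name `hidden_state_bits` is made up for this example.

```python
import hashlib

# Internal chaining-state sizes in bits, taken from the SHA-2 definitions.
# These are constants entered by hand, not values read from hashlib.
STATE_BITS = {"sha256": 256, "sha384": 512, "sha512": 512}


def hidden_state_bits(name: str) -> int:
    """Bits of internal state that the published digest does not reveal."""
    digest_bits = hashlib.new(name).digest_size * 8
    return STATE_BITS[name] - digest_bits


for name in STATE_BITS:
    print(name, "hides", hidden_state_bits(name), "bits of state")
```

Only the truncated variant keeps part of its state out of the digest, and that hidden part is what an attacker would need in order to continue the hash from where it stopped.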

Not that it's by any means easy to find a SHA-256 collision at present; but should collisions be found in the future, a length extension attack will increase the leverage for using those collisions to produce binaries that slip past this verification. An md5 Collision Demo[3] by Peter Selinger is my favourite site for concretely demonstrating what this looks like (though I think this[4] publication by CITS mentions the relationship to length extension more explicitly).

(I probably don't need to lecture you of all people about length extensions :) but it's a subject I just recently refreshed myself on, and I wanted to try to leave a decent explanation here for unfamiliar readers.)

--

I'm also curious how you handled management of checksums for transitive dependencies. I recall we talked about this subject in private back in April, and one of the concerns you had with mdm was the challenge of integrating it with existing concepts of "artifacts" from the maven/gradle/etc world -- though there is an automatic importer from maven now, mdm still requires explicitly specifying every dependency.

Have you found ways to insulate gradle downloading updates to plugins or components of itself?

What happens when a dependency adds new transitive dependencies? I guess that's not a threat during normal rebuilds, since specifying hashes ahead of time already essentially forbids loosely semver-ish resolution of dependencies at every rebuild, but if it does happen during an upgrade, does gradle-witness hook into gradle deeply enough that it can generate warnings for new dependencies that aren't watched?

This plugin looks like a great hybrid approach that keeps what you like from gradle while starting to layer on "pinning" integrity checks. I'll recommend it to colleagues building their software with gradle.

P.S. is the license on gradle-witness such that I can fork or use the code as inspiration for writing an mdm+gradle binding plugin? I'm not sure if it makes more sense to produce a gradle plugin, or to just add transitive resolution tools to mdm so it can do first-time setup like gradle-witness does on its own, but I'm looking at it!

--

Edited: to also link the wikipedia chart of hash families.

--

[1] https://blake2.net/

[2] https://github.com/polydawn/mdm/

[3] http://www.mscs.dal.ca/~selinger/md5collision/

[4] http://web.archive.org/web/20071226014140/http://www.cits.ru...

[5] https://en.wikipedia.org/wiki/SHA-512#Comparison_of_SHA_func...

[+] notthetup|11 years ago|reply
This!

I am totally happy donating $10 to Whisper Systems for this work instead of being forced to donate $10 to the Apache Foundation (although a worthy cause) to be able to get https access to Maven Central.

[+] technomancy|11 years ago|reply
For Leiningen at least the goal is eventually to be able to flip a switch that will make it refuse to operate in the presence of unsigned dependencies. We're still a ways away from that becoming a reality, but the default is already to refuse to deploy new libraries without an accompanying signature.

Edit: of course, the question of how to determine which keys to trust is still pretty difficult, especially in the larger Java world. The community of Clojure authors is still small enough that a web of trust could still be established face-to-face at conferences that could cover a majority of authors.

The situation around Central is quite regrettable though.

[+] weavejester|11 years ago|reply
Leiningen also uses Clojars over HTTPS by default, I believe, so even without a web of trust, Clojars is still more secure than Central.
[+] brianefox|11 years ago|reply
The project to offer ssl free to every user of Maven Central is already underway. Stay tuned for details.
[+] ontoillogical|11 years ago|reply
Author here.

Brian, are you speaking as a representative of Sonatype, or are you a 3rd party?

[+] needusername|11 years ago|reply
I'm pretty surprised that this article is news. Sonatype has been open about SSL for Maven Central since there has been Nexus or maybe even longer. I remember Jason van Zyl talking about this seven or more years ago.
[+] brl|11 years ago|reply
Yawn. Let me know when you're ready to announce a project to competently sign and verify artifacts.
[+] heavenlyhash|11 years ago|reply
SSL would have partially mitigated this attack, but it's not a full solution either. SSL is transport layer security -- you still fully trust the remote server not to give you cat memes. What if this wasn't necessary? Why can't we embed the hash of the dependencies we need in our projects directly? That would give us end-to-end confidence that we've got the right stuff.

This is exactly why I built mdm[1]: it's a dependency manager that's immune to cat memes getting in ur http.

Anyone using a system like git submodules to track source dependencies is immune to this entire category of attack. mdm does the same thing, plus works for binary payloads.

Build injection attacks have been known for a while now. There's actually a great publication by Fortify[2] where they even gave it a name: XBI, for Cross Build Injection attack. Among the high-profile targets even several years ago (the report is from 2007): Sendmail, IRSSI, and OpenSSH! It's great to see more attention to these issues, and practical implementations to double-underline both the seriousness of the threat and the ease of carrying out the attack.

Related note: signatures are good too, but still actually less useful than embedding the hash of the desired content. Signing keys can be captured; revocations require infrastructure and online verification to be useful. Embedding hashes in your version control can give all the integrity guarantees needed, without any of the fuss -- you should just verify the signature at the time you first commit a link to a dependency.

[1] https://github.com/polydawn/mdm/

[2] https://www.fortify.com/downloads2/public/fortify_attacking_...

[+] michaelt|11 years ago|reply

  Why can't we embed the hash of the dependencies we need 
  in our projects directly?
There's a lot of stuff in Maven, like the versions plugin and the release plugin, to update dependencies to the latest version. This stuff is useful for continuous integration and automated deployment, especially when your project is split into lots of modules to allow code reuse.

With code signing, you can (or hypothetically could, I don't know if anyone does this) check the latest version is signed by the same key as the previous version - whereas just pinning the hash wouldn't allow that.

I agree pinning the hash is useful if the signing key is captured.
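
The "same key as the previous version" check could look roughly like the Python sketch below. Everything here is hypothetical: the `check_signer` helper, the artifact coordinates, and the fingerprints are invented for illustration, and the sketch assumes the signature itself has already been verified cryptographically by some other tool. It only covers the continuity bookkeeping.

```python
from typing import Dict

# Hypothetical record of which key fingerprint signed each artifact last time.
KnownSigners = Dict[str, str]


def check_signer(known: KnownSigners, artifact: str, fingerprint: str) -> str:
    """Trust-on-first-use check of signer continuity across versions.

    Returns "new" the first time an artifact is seen, "ok" when the signer
    matches the recorded one, and "changed" when it does not.
    """
    previous = known.get(artifact)
    if previous is None:
        known[artifact] = fingerprint  # remember the first signer we saw
        return "new"
    return "ok" if previous == fingerprint else "changed"


known: KnownSigners = {}
print(check_signer(known, "org.example:lib", "AAAA1111"))  # first sighting
print(check_signer(known, "org.example:lib", "AAAA1111"))  # same key on upgrade
print(check_signer(known, "org.example:lib", "BBBB2222"))  # different key
```

A "changed" result does not prove an attack (keys get rotated), but it is the point where an automated upgrade should stop and ask a human.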

[+] femto113|11 years ago|reply
Perhaps as a stopgap Maven Central (or a concerned third party?) could publish all of the SHA1 hashes on a page that is served via HTTPS. This would at least allow tools to detect the sort of attack described in the article.
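
A rough sketch of how a tool might consume such a page, in Python. The manifest format (one `<sha1 hex>  <artifact path>` pair per line, as `sha1sum` prints) is an assumption, as are the function names and the sample path; no such page exists as far as this thread says. Fetching the manifest over HTTPS is left out so the example stays self-contained.

```python
import hashlib
from typing import Dict


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse lines of the form '<sha1 hex>  <artifact path>' (assumed format)."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        digest, path = line.split(None, 1)
        pins[path] = digest.lower()
    return pins


def is_expected(pins: Dict[str, str], path: str, data: bytes) -> bool:
    """True only if the path is listed and the bytes hash to the listed SHA-1."""
    expected = pins.get(path)
    return expected is not None and hashlib.sha1(data).hexdigest() == expected


# Hypothetical artifact and the manifest line that would describe it.
good_jar = b"pretend jar bytes"
manifest = hashlib.sha1(good_jar).hexdigest() + "  com/example/lib/1.0/lib-1.0.jar\n"
pins = parse_manifest(manifest)

print(is_expected(pins, "com/example/lib/1.0/lib-1.0.jar", good_jar))
print(is_expected(pins, "com/example/lib/1.0/lib-1.0.jar", b"swapped in transit"))
```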
[+] jontro|11 years ago|reply
This is a horrible policy on Sonatype's part. A better alternative to Maven Central should be created...
[+] finnn|11 years ago|reply
Evilgrade (https://github.com/infobyte/evilgrade) is a similar tool that works on a wider variety of insecure updaters. Perhaps a module could be written? Maybe one already exists, I haven't played with it in a while
[+] MrSourz|11 years ago|reply
I'm torn on how I feel about security being a paid feature in this case. Here the onus is being placed on the user, yet many won't be conscious of the choice they're making.

The tiff mentioned in the article was interesting to read: https://twitter.com/mveytsman/status/491298846673473536

[+] avz|11 years ago|reply
Exposing your users to MITM attacks in order to encourage donations? Pure evil.
[+] danielweber|11 years ago|reply
If you aren't paying money, you aren't the user, you are a product.

Freemium models often suck because of stuff like this[1]. But if the "users" would just consider it normal to pay money then we wouldn't have crazy things going on where people providing critical infrastructure services need to figure out how to "convert" their "users." Instead, say, every professional Java shop would pay $100 a year or so for managed access. Projects that want to use it like a CDN so their users could download would pay a fee to host it.

They have bills to pay. They'll cover them one way or the other. If we pay directly at least we know what the game is.

[1] They could be inserting advertising into the jars. Hey, at least it would still be a "free" service, right?

[+] ternaryoperator|11 years ago|reply
That's a little paranoid. Let's see, we'll completely ruin our rep and our core business activity just so you're forced to donate--not to us, but to this open source group over here. Dude, put down the pipe.
[+] jimrandomh|11 years ago|reply
My main experience with Maven has been downloading some source code, and having to use Gradle to compile it. It went and downloaded a bunch of binaries, insecurely. There were no actual unsatisfied dependencies; it was just downloading pieces of Gradle itself.

I would've much rather had a Makefile. Build scripts and package managers need to be separate.

[+] yourad_io|11 years ago|reply
> Build scripts and package managers need to be separate.

This. Especially when there's broken links, you're gonna have a bad (and long) time.

[+] taeric|11 years ago|reply
I will join the small chorus agreeing that build scripts and package managers should be separate. Most folks I work with disagree.

Curious if anyone knows of any well-done takes on this, in either direction. (If I'm actually wrong, I'd like to know.) (I fully suspect there really is no "right" answer.)

[+] dmacvicar|11 years ago|reply
Finally someone who sees the real problem.
[+] jc4p|11 years ago|reply
jCenter is the new default repository used with Android's gradle plugin, I haven't used it myself yet but it looks like the site defaults to HTTPS for everything: https://bintray.com/bintray/jcenter
[+] jbaruch_s|11 years ago|reply
Full disclosure - I am a Developer Advocate with JFrog, the company behind Bintray.

So, jcenter is a Java repository in Bintray (https://bintray.com/bintray/jcenter), which is the largest repo in the world for Java and Android OSS libraries, packages and components. All the content is served over a CDN, with a secure https connection. JCenter is the default repository in Groovy Grape (http://groovy.codehaus.org/Grape), built into Gradle (the jcenter() repository), and very easy to configure in every other build tool (maybe except Maven), and it will become even easier very soon.

Bintray has a different approach to package identification than the legacy Maven Central. We don't rely on self-issued key-pairs (which can be generated to represent anyone, actually, and are never verified in Maven Central). Instead, similar to GitHub, Bintray gives a strong personal identity to any contributed library.

If you really need to get your package to Maven Central (for supporting legacy tools) you can do it from Bintray as well, in a click of a button or even automatically.

Hope that helps!

[+] sgarman|11 years ago|reply
Thanks for pointing this out, this is REALLY new, perhaps added in the most recent build. I didn't see notes for this either.
[+] tdicola|11 years ago|reply
Nice, I was going to ask if maybe Google or someone invested heavily in Android could step up and provide a secure source of dependencies for everyone.
[+] tensor|11 years ago|reply
The biggest problem with this policy is that new users, or even experienced ones, are likely not aware of it. This is a very serious problem that should be addressed quickly.

edit: and with websites everywhere routinely providing SSL, it seems crazy that it has to be a paid feature for such a critical service.

[+] sitkack|11 years ago|reply
Funny thing is that CERT doesn't have a problem with shenanigans like this. They are more concerned with buffer overflows than by-design stupidity.
[+] clarkm|11 years ago|reply
So in principle, it's doing the same thing as:

    $ curl http://get.example.io | sh
which we all know is bad. But in this case, it's hidden deep enough that most people don't even know it's happening.
[+] akerl_|11 years ago|reply
Sort-of. That has the additional non-malicious risks. A broken connection turns "rm -r /var/lib/cool/place" into "rm -r /var/" and the shell processes that.
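
A tiny Python illustration of that truncation risk. The script text is the example from this comment; the `safe_to_run` guard is a hypothetical helper, and it assumes the expected length is learned out of band. Comparing against a known checksum would be a stronger version of the same idea.

```python
script = "rm -r /var/lib/cool/place\n"

# A connection dropped after 11 bytes leaves a different, still valid command.
truncated = script[:11]
print(repr(truncated))  # 'rm -r /var/'


def safe_to_run(received: str, expected_length: int) -> bool:
    """Refuse to execute unless the whole script arrived."""
    return len(received.encode()) == expected_length


print(safe_to_run(script, len(script)))     # complete download
print(safe_to_run(truncated, len(script)))  # cut off mid-line
```

The point is that the script has to be fully received and checked before anything is handed to the shell, which piping straight into `sh` never allows.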
[+] yonran|11 years ago|reply
Downloading over HTTP would not be an issue (as far as integrity is concerned) if maven validated the downloads against some chain of trust. But apparently it does not.

Now I am wondering what tool actually uses those .asc files that I have to generate using mvn gpg:sign-and-deploy-file when I upload new packages to sonatype...

[+] chetanahuja|11 years ago|reply
If I understand this correctly, maven-based builds can contain dependencies on libraries hosted on remote servers. The golang build system has (or had) something similar too. Witnessing this trend take hold is astonishing and horrifying in equal parts. It is not just a security problem (which is clearly obvious) but also a huge hole in software engineering practices. How can anyone run a production build where parts of your build are being downloaded from untrusted third party sources in real time? How do you ensure repeatable, reliable builds? How do you debug production issues with limited knowledge of what versions of various libraries are actually running in production?
[+] dmacvicar|11 years ago|reply
Java developers kind of laugh when I explain to them that Linux distros struggle to bootstrap Maven from source because it is a non-trivial tool that depends on hundreds of artifacts to build.

The point is, why do you care that your repo is local, or that your jars are secured, if you got the tool, maven itself, in binary form from a server you don't control?

That is the whole point of Linux distros' package managers. It is not only about dependencies. It is about securing the whole chain and ensuring repeatability.

Maven's design, unlike ant's, forces you to bootstrap it from binaries. Even worse, maven itself can't handle building a project _AND_ its dependencies from source. Why would the rest of the infrastructure matter then?

Yes, Linux distros build gcc and ant using a binary gcc and a binary ant. But it is always the previous build, so at some point in the chain it ends with sources and not with binaries.

And this is not about Maven's idea and concept. It would be fine if it depended on a few libraries and had a simple way of building itself, instead of needing the binaries of half of the stuff it is supposed to build in the first place (hundreds) just to build itself.

[+] buerkle|11 years ago|reply
It's fairly easy to set up a local server containing all your jars and still use maven or ivy. I do that at my current employer.
[+] watwut|11 years ago|reply
"How can anyone run a production build where parts of your build are being downloaded from untrusted third party sources in real time? How do you ensure repeatable, reliable builds?"

By not downloading everything from maven central in real time. Companies usually run their own repository and builds query that one. Central is queried only if the company-run repository is missing some artifact or they want to update libraries. How much bureaucracy stands between you and upgrades to the company-run repository depends on company and project needs.

As for production, does anyone compile stuff on production? I thought everyone sends compiled jars there. You know exactly which libs are contained in that jar; no information is missing.

[+] SoftwareMaven|11 years ago|reply
Hosting your own makes sense for multiple reasons: you can be assured what code you are getting, you aren't limited by bandwidth rates of remote providers, and you get to control up/down time. The first is a must; the second and third make life more tolerable.
[+] eikenberry|11 years ago|reply
In golang-land it is popular to deal with this by vendoring all the packages you depend on. There are several tools to manage this like godep. This is my preferred method as it allows for the reliable, repeatable build you are talking about.

There are other schools of thought, like pinning the remote repos to a specific commit id. These are better than nothing, but still depend on 3rd party repos, which I think is too risky for production code. It is great for earlier stages of a project when you are trying to work out the libraries you will use and also need to collaborate.

[+] jwhitlark|11 years ago|reply
A couple of years ago we were trying to use BigCouch in a product. The erlang build tool was happy to have transitive dependencies that were just pointing at github:branch/HEAD. It got to the point where we'd build it on a test machine, and then just copy the binary around.
[+] ShardPhoenix|11 years ago|reply
With Maven you specify the versions of libraries, jars are cached locally, and you can run your own local Maven server if you need to.
[+] jnbiche|11 years ago|reply
npm has the same problem of sending packages over http, but it's even worse since on average each node package uses about a billion other packages and because injecting malicious code in JavaScript is incredibly easy.

And to be clear, just http here is not the issue. It's http combined with lack of package signing. apt runs over http, but it's a pretty secure system because of its efficient package signing. Package signing is even better than https alone since it prevents both MITM attacks and compromise of the apt repository.

In fact, apt and yum were pretty ahead of their time with package signing. It's a shame others haven't followed their path.

[+] seldo|11 years ago|reply
npm by default uses HTTPS, and has for more than 3 years. It's a little confusing because the loglines all say "http" in green, but if you actually look at the URLs being downloaded they are all to https://registry.npmjs.org/
[+] sitkack|11 years ago|reply
luarocks has the same problem. You don't need SSL, you need the packages to be signed.
[+] 0x0|11 years ago|reply
I wonder how many enterprise apps have been backdoored through this flaw over the years by now.
[+] sitkack|11 years ago|reply
I'd immediately backdoor the rt.jar and the compiler so that future binaries have the backdoor. Trusting trust ...