Google open-sources Rust crate audits

SilverBirch|2 years ago

>Before a project starts using a new crate, members usually perform a thorough audit to measure it against their standards for security, correctness, testing, and more.

Do they? I mean really? Let's lay aside the fact that it's almost impossible to eyeball security. I just cannot imagine that Google works so differently to every company I've ever worked at that they actually carefully check the stuff they use. Every company I've worked at has had a process for using external code. Some have been stricter than others, none have meaningfully required engineers to make a judgement on the security of code. All of them boil down to speed-running a pointless process.

And that leaves apart the obvious question: I want to use a crate, I check it 'works' for what I need. Some middle manager type has mandated I have to now add it to crate audit (FYI, this is the point I dropped importing the library and just wrote it myself) so I add it to crate audit. Some other poor sap comes along and uses it because I audited it, but he's working on VR goggles and I was working on in-vitro fertilization of cats and he's using a whole set of functions that I didn't even realise were there. When his VR Goggles fertilize his beta testers eyes with cat sperm due to a buffer overflow, which of us get fired?

yablak|2 years ago

Some useful context here:

https://chromium.googlesource.com/chromiumos/third_party/rus...

Seems there are 3-4 folks who helped build this and spent a lot of time doing initial audits; they outsource crypto algorithm audits to specialists.

zemnmez|2 years ago

Before the layoffs I worked on a security checks team (“ISE Hardening”) at Google. Google requires for almost all projects that code is physically imported into the SCS; when this code touches anything at all, extremely stringent security checks run at build-time.

These checks often don’t attempt to detect actual exploit paths, but for usage of APIs that simply may lead to vulnerability. These checks can only be disabled per file or per symbol and per check by a member of the security team via an allowlist change that has to be in the same commit.

This is not perfect but is by far the most stringent third party policy I’ve seen or worked with. The cost of bringing 3p code into the fold is high.

The flipside of this is that Google tech ends up with an insular and conservative outlook. I’d describe the Googl stack as ‘retro-futuristic’. It is still extremely mature and effective.

dboreham|2 years ago

Like many here I haven't seen the Google sausage being made, but I've had many Googler coworkers and friends over the years. I've learned that they may really be in another universe (e.g. put every single line of code over all space and time in the same SCCS, oh and write a new kind of build system while you're at it because otherwise that...doesn't work). So possibly they just don't use external dependencies, and the small number they do use really are "properly" audited?

But meanwhile in the regular universe, yes it happens the way you say.

zeroxfe|2 years ago

> All of them boil down to speed-running a pointless process.

There's a pretty large gap between auditing every line of code and doing nothing. Google does a good job managing external dependencies within their monorepo. There's dedicated tooling, infrastructure, and processes for this.

neilv|2 years ago

Starting over a decade ago, I instituted auditing packages used from a Cargo-like network package manager, in an important system that handled sensitive data.

I set up the environment to disable the normal package repo access. Every third-party package we wanted to use had to be imported into a mirror in our code repo and audited. (THe mirror also preserved multiple versions at once, like the package manager did.) New versions were also audited.

One effect of this was that I immediately incurred a cost when adding a new dependency on some random third party, which hinted at the risk. For example, if a small package had pulled in a dozen other dependencies, which I also would've had to maintain and audit, I would've gone "oh, heck, no!" and considered some other way.

At a later company, in which people had been writing code pulling on the order of a hundred packages from PyPI (and not tracking dep versions), yet it would have to run in production with very-very sensitive customer data... that was interesting. Fortunately, by then, software supply chain attacks were a thing, so at least I had something to point to, that my concern wasn't purely theoretical, but a real active threat.

Now that I have to use Python, JavaScript, and Rust, the cavalier attitudes towards pulling in whatever package some Stack Overflow answer used (and whatever indirect dependencies that package adds) are a source of concern and disappointment. Such are current incentives in many companies. But it's nice to know that some successful companies, like Google, take security and reliability very seriously.

pornel|2 years ago

Yes, some people review literally every line. Cargo-crev has a field for thoroughness. Many reviews are just "LGTM", but some reviewers really take time to check for bugs and have flagged dodgy code.

vasco|2 years ago

> When his VR Goggles fertilize his beta testers eyes with cat sperm due to a buffer overflow, which of us get fired?

The PM gets promoted for encouraging fast experimentation!

lathiat|2 years ago

At least sometimes: https://cloud.google.com/assured-open-source-software

Only 1000 packages but certainly seems they do that for a subset.

Conscat|2 years ago

> When his VR Goggles fertilize his beta testers eyes with cat sperm due to a buffer overflow

Ahh, classic undefined behavior.

afranchuk|2 years ago

Well just today I found unsoundness in a crate I was auditing. It turned out that the crate had since removed the entire module of functionality in question so I couldn't submit a bug, but it led me to take steps to remove use of the crate entirely.

jupp0r|2 years ago

Don't forget that you need to do this not only for the crate you depend on, but the whole dependency subtree that comes with it as well.

progbits|2 years ago

Can someone explain why cargo-vet doesn't include a cryptographic hash of the crate contents?

My understanding is that this repository, and similar ones from Mozilla and others, says: "I, person X from trustworthy organization Y, have reviewed version 1.0 of crate foo and deemed it legit" (for a definition of trustworthy and legit).

But now how does that help me if I want to be careful about what I depend on and supply-chain attacks? I ask for version 1.0 of crate foo but might get some malicious payload without knowing it.

hobofan|2 years ago

That's already prevented by the checksum which is present for all crate versions in the registry index, which is set in stone on publish and verified by cargo on download. See e.g. https://github.com/rust-lang/crates.io-index/blob/74f1b1e064...

pornel|2 years ago

https://lib.rs/cargo-crev does this, with the entire chain from the crate data to the reviewer's trusted identity. However, this adds a lot of complexity.

cargo-vet went for the other extreme of being super simple. To fill in their review report you don't even need any tooling.

mr_00ff00|2 years ago

Curious if any senior devs on HN can comment on the importance/effectiveness of audits for crates?

I’m a junior C++ dev that dabbles with rust in my free time, and I always feel a bit nervous when pulling huge dependency trees with tons of crates into projects.

I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

Again I am junior, so maybe my worries are way over blown.

pclmulqdq|2 years ago

I think in a lot of C++ and ex-C++ orgs you see this sentiment a lot, and sometimes for good reason. Sometimes that code has security or performance reasons to worry about this. On the other hand, it often doesn't.

On the other hand, Python folks and JavaScript users (which make up a lot of emigres to Rust) probably don't care enough about their supply chain. That's how you end up with misspelled packages causing viruses in production and other disasters.

The short answer to this is that it actually depends a lot on what you are doing.

nemothekid|2 years ago

The "node.js" way of doing things, and it's dysfunction, is nearly exclusive to node because Javascript lacks a standard library and npm's haphazard way of running things. Java, Ruby, Python, even my grandfather's Perl have had "modules" for years with none of the fear that is typically associated with Node.

Personally, C++ aversion to sane dependency management is more about C++'s "I know better than you" culture and legacy cruft (packages are usually managed by the distro, not the language) than actually having any serious security implications.

lesuorac|2 years ago

> I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

Incorrect assumption, look up the left pad fiasco [1]. Its importance is really a personal opinion; convince nearly always trumps security so if the NPM way allows you to increase sales by ~10% you'll see people continuing to do it.

Google is fairly principled though, all of the 3p code is internally vendored and supposed to be audited by the people pulling in that code/update.

[1]: https://www.google.com/search?q=leftpad+broke+the+internet

wongarsu|2 years ago

Writing your own version of everything means it's probably more tuned to your needs. But unless it's a core part of your software it will also be worse because you can't justify putting many resources into it. It also means new hires will have to learn a lot more. It's one of the (many) reasons why it's so hard to onboard into C/C++ projects, because every standard building block is bespoke and somehow different than what everyone else does. Of course if you are really big you just have those resources, which is why Meta or Google can have bespoke everything.

On security it's a tradeoff. The open-source version is an easier target for attackers, but might be much more battle-tested and thus more bug-free. Audits are the attempt to have the best of both worlds here, and since they again can be crowd-sourced (with cargo-vet and cargo-cev both working on this) it scales even for companies that aren't Google-sized.

pornel|2 years ago

I've reviewed hundreds of Rust crates. It's tedious and boring. The results are boring too — their code is mostly good! Big dependency trees have a reputation for being hot garbage, but that's not my experience. In Rust the small focused crates tend to do one thing, and do it well.

bombolo|2 years ago

> I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

I assume most places don't care.

fcantournet|2 years ago

Dependencies are dependencies in rust as in C++. I found it's extremely rare that homegrown library that have similar functionality to (used) open-source libraries are better from a security stand-point.

At least in Rust a large part of the security issues that would be VERY time consuming to audit at scale through your dependency tree (whether internal or public) are covered by the compiler/borrow checker/type-system.

In that sense I would take on an larger amount of dependency in Rust than I would in C++ while sleeping better.

vitorsr|2 years ago

See also Google Cloud’s Assured Open Source Software service:

https://cloud.google.com/assured-open-source-software

baby|2 years ago

Interesting stuff! I think everyone seems to come up with their own solutions. I think security in general is a matter of who you trust, and things only work when we build a network of trust.

Imagine if all companies and rust developers started sharing what crates they were confident in + what other organizations they trust as well. If you could then create your own set of such companies, and then choose a dependency depth you were willing to go down to, you might be able to quickly vet a number of crates this way, or at least see the weird crates that demand a bit more attention.

If this could be added to whackadep[1] then you'd be able to monitor your Rust repo pretty solidly!

[1]: https://www.cryptologie.net/article/550/supply-chain-attacks...

sneed_chucker|2 years ago

What's the state of supply chain auditing in general these days?

adolph|2 years ago

SE Daily podcast episode "CAP Theorem 23 Years Later with Eric Brewer" touches on it a bit. On the whole an excellent episode.

https://softwareengineeringdaily.com/2023/05/12/cap-theorem-...

pabs3|2 years ago

I wonder why they went with cargo vet instead of the more cross-language crev:

https://github.com/crev-dev/

wly_cdgr|2 years ago

Hmmm, is it really a good idea to de-duplicate audits? This is a situation where I want multiple parties to each separately do their own audits