SourceFS: A 2h+ Android build becomes a 15m task with a virtual filesystem

154 points| cdesai | 4 months ago |source.dev

62 comments

ongy|4 months ago

While it looks like at least some of the team are ex-googlers, this isn't the srcfs we know from piper (Google internal tools).

Looks like it's similar in some ways. But they also don't tell too much and even the self-hosting variant is "Talk to us" pricing :/

7e|4 months ago

Google or Meta needs to open source their magic VFSes. Maybe Meta is closest with EdenFS.

jonnrb|4 months ago

WDYM this seems very familiar. At commit deadbeef I don't need to materialize the full tree to build some subcomponent of the monorepo. Did I miss something?

And as for pricing... are there really that many people working on O(billion) lines of code that can't afford $TalkToUs? I'd reckon that Linux is the biggest source of hobbyist commits and that checks out on my laptop OK (though I'll admit I don't really do much beyond ./configure && make there...)

Ericson2314|4 months ago

The headline times are a bit ridiculous. Are they trying to turn https://github.com/facebook/sapling/blob/main/eden/fs/docs/O... or some git fuse thing into a product?

zokier|4 months ago

Well they also claim to be able to cache build steps somehow build-system independently.

> As the build runs, any step that exactly matches a prior record is skipped and the results are automatically reused

> SourceFS delivers the performance gains of modern build systems like Bazel or Buck2 – while also accelerating checkouts – all without requiring any migration.

Which sounds way too good to be true.

jonnrb|4 months ago

It seems like that plus some build output caching?

sudahtigabulan|4 months ago

This reminds me of ClearCase and its MVFS.

Builds were audited by somehow intercepting things like open(2) and getenv(3) invoked by a compiler or similar tool, and each produced object had an associated record listing the full path to the tool that produced it, its accurate dependencies (exact versions), and environment variables that were actually used. Anything that could affect the reproducibility was captured.

If an object was about to be built with the exact same circumstances as those in an existing record, the old object was reused, or "winked-in", as they called it.
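The "wink-in" idea above can be sketched in a few lines. This is only an illustration of the concept, not ClearCase's or SourceFS's actual mechanism; `step_key` and `WinkInCache` are hypothetical names, and the sketch assumes the audited inputs (tool path, arguments, input content hashes, environment variables actually read) have already been collected.

```python
import hashlib
import json


def step_key(tool_path, args, input_digests, env_used):
    """Derive a cache key from everything that could affect a build step."""
    record = {
        "tool": tool_path,
        "args": list(args),
        # Input path -> content hash, sorted so the encoding is stable.
        "inputs": sorted(input_digests.items()),
        # Only the environment variables the tool actually read.
        "env": sorted(env_used.items()),
    }
    blob = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


class WinkInCache:
    """In-memory stand-in for a shared store of build records."""

    def __init__(self):
        self._store = {}

    def run(self, key, build):
        """Return (output, reused); call build() only on a cache miss."""
        if key in self._store:
            return self._store[key], True
        output = build()
        self._store[key] = output
        return output, False
```

Any change to an input hash, an argument, or a recorded environment variable yields a different key, so the step is rebuilt instead of reused.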

It also provided versioning at filesystem level, so one could write something like file.c@@/trunk/branch/subbranch/3 and use it with any program without having to run a VCS client. The version part of the "filename" was seen as regular subdirectories, so you could autocomplete it even with ancient shells (I used it on Solaris).

bityard|4 months ago

Meh, content marketing for a commercial biz. There are no interesting technical details here.

I was a build engineer in a previous life. Not for Android apps, but some of the low-effort, high-value tricks I used involved:

* Do your building in a tmpfs if you have the spare RAM and your build (or parts of it) can fit there.

* Don't copy around large files if you can use symlinks, hardlinks, or reflinks instead.

* If you don't care about crash resiliency during the build phase (and you normally should not, each build should be done in a brand-new pristine reproducible environment that can be thrown away), save useless I/O via libeatmydata and similar tools.

* Cross-compilers are much faster than emulation for a native compiler, but there is a greater chance of missing some crucial piece of configuration and silently ending up with a broken artifact. Choose wisely.
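As a rough illustration of the "don't copy large files" tip above, here is a minimal sketch that tries a hardlink first and only copies when linking fails (for example across filesystems). `place_artifact` is a hypothetical helper, not part of any build tool mentioned in this thread.

```python
import os
import shutil


def place_artifact(src, dst):
    """Put src at dst, preferring a hardlink over a byte-for-byte copy."""
    if os.path.lexists(dst):
        os.remove(dst)
    try:
        # A hardlink adds a second name for the same data; nothing is copied.
        os.link(src, dst)
        return "hardlink"
    except OSError:
        # Different filesystem or links unsupported: fall back to copying.
        shutil.copy2(src, dst)
        return "copy"
```

Note that a hardlinked file shares its contents with the original, so this only suits artifacts that later steps replace rather than modify in place.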

The high-value high-effort parts are ruthlessly optimizing your build system and caching intermediate build artifacts that rarely change.

7e|4 months ago

That’s all basic stuff, and none of it solves what this product claims to.

serbancon|4 months ago

Hey everyone. I’m Serban, co-founder of Source.dev. Thanks for the upvotes and thoughtful discussion. I’ll reply to as many comments as I can. Nothing means more to an early-stage team than seeing we’re building something people truly value - thanks from all of us at Source.dev!

CJefferson|4 months ago

While I’m sure it’s much more advanced, out of interest: is this similar to the Python tool ‘fabricate’, which would use strace to track all files a program read and wrote?
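For readers unfamiliar with the strace approach, here is a simplified sketch of the general idea: scan trace lines for successful `open`/`openat` calls and sort the paths into reads and writes. It is not fabricate's actual code; `files_touched` is a hypothetical name, and it assumes input lines shaped roughly like what `strace -f -e trace=open,openat` prints, ignoring escaped quotes and interrupted calls.

```python
import re

# Matches e.g.: openat(AT_FDCWD, "foo.c", O_RDONLY|O_CLOEXEC) = 3
_OPEN_RE = re.compile(
    r'\b(?:open|openat)\((?:AT_FDCWD, )?"(?P<path>[^"]*)", (?P<flags>[A-Z_|]+)'
    r'.*\) = (?P<ret>-?\d+)'
)


def files_touched(strace_lines):
    """Return (reads, writes): sets of paths opened successfully."""
    reads, writes = set(), set()
    for line in strace_lines:
        m = _OPEN_RE.search(line)
        # Skip unrelated lines and failed opens (negative return value).
        if m is None or int(m.group("ret")) < 0:
            continue
        flags = set(m.group("flags").split("|"))
        if flags & {"O_WRONLY", "O_RDWR"}:
            writes.add(m.group("path"))
        else:
            reads.add(m.group("path"))
    return reads, writes
```

The read set gives a step's real dependencies and the write set its outputs, which is the information a build cache needs to decide whether a step can be skipped.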

MarsIronPI|4 months ago

> Fast builds are what truly makes a difference to developer productivity. With SourceFS builds complete over 9x faster on a regular developer machine. This sets a new standard as it enables developers to get their sword fighting time back and speeds-up the lengthy feedback loop on CI pipelines.

Objection! Long build times are better for sword-fighting time. The longer it takes, the more sword-fighting we have time for!

DuckConference|4 months ago

Their performance claims are quite a bit ahead of the distributed android build systems that I've used, I'm curious what the secret sauce is.

cogman10|4 months ago

Is it going to be anything more than just a fancier ccache?

theossuary|4 months ago

Why tf does an electric vehicle need 500m+ lines of code

jeffbee|4 months ago

Some people actually write tests.

vzaliva|4 months ago

It sounds from the page that it is Android-source-code specific. Why? Could this work with any source code base?

everlier|4 months ago

If my understanding is correct, this only makes sense for codebases that do not fit in the memory of the largest build box an organisation can run

rs186|4 months ago

I think the page itself answers your question pretty well.

forrestthewoods|4 months ago

The world desperately needs a good open source VFS that supports Windows, macOS, and Linux. Waaaaay too many companies have independently reinvented this wheel. Someone just needs to do it once, open source it, and then we can all move on.

7e|4 months ago

This. Such a product also solves some AI problems by letting you version very large amounts of training data in a VCS like git, which can then be farmed out for distributed unit testing.

ctoth|4 months ago

Once builds are "fast enough," there's no business case for the painful work of making the codebase comprehensible.

We're going to 1 billion LoC codebases and there's nothing stopping us!

yencabulator|4 months ago

Vagueposts from the marketing department are not appreciated.

_1tan|4 months ago

I want this but self hosted/integrated into our CI (Gitlab in our case).

serbancon|4 months ago

Please fill in this form: https://www.source.dev/demo . We’re prioritizing cloud deployments but are keen to hear about your use case and see what we can do.

api|4 months ago

Could you just do the build in /dev/shm?

ongy|4 months ago

No. `/dev/shm` would just be a build in `tmpfs`.

Though from what I gather from the story, part of the speedup comes from how Android composes their build stages.

I.e. speeding up by not downloading everything only helps if you don't need everything you download. And adds up when you download multiple times.

I'm not sure they can actually provide a speedup in a tight developer cycle with a local git checkout and a good build system.

zar22|4 months ago

[deleted]

jeffrallen|4 months ago

Tldr: your build system is so f'd that you have gigs of unused source and hundreds of repeated executions of the same build step. They can fix that. Or, you could, I dunno, fix your build?

jayd16|4 months ago

You could just have a mono-repo with a large amount of assets that aren't always relevant to pull.

Incremental builds and diff-only pulls are not enough in a modern workflow. You either need to keep a fleet of warm builders or you need to store and sync the previous build state to fresh machines.

Games and I'm sure many other types of apps fall into this category of long builds, large assets, and lots of intermediate build files. You don't even need multiple apps in a repo to hit this problem. There's no simple off the shelf solution.