Buck2: Our open source build system

[+] ihnorton|3 years ago|reply

The fact that Buck2 is written in a statically-compilable language is compelling, compared to Bazel and others. It's also great that Windows appears to be supported out of the box [1,1a] -- and even tested in CI. I'm curious how much "real world" usage it's gotten on Windows, if any.

I don't see many details about the sandboxing/hermetic build story in the docs, and in particular whether it is supported at all on Linux or Windows (the only mention in the docs is Darwin).

It's a good sign that the Conan integration PR [2] was warmly received (if not merged, yet). I would hope that the system is extensible enough to allow hooking in other dependency managers like vcpkg. Using an external PM loses some of the benefits, but it also dramatically reduces the level of effort for initial adoption. I think bazel suffered from the early difficulties integrating with other systems, although IIUC rules_foreign_cc is much better now. If I'm following the code/examples correctly, Buck2 supports C++ out of the box, but I can't quite tell if/how it would integrate with CMake or others in the way that rules_foreign_cc does.

(one of the major drawbacks of vcpkg is that it can't do parallel dependency builds [3]. If Buck2 was able to consume a vcpkg dependency tree and build it in parallel, that would be a very attractive prospect -- wishcasting here)

[1] https://buck2.build/docs/developers/windows_cheat_sheet/ [1a] https://github.com/facebook/buck2/blob/738cc398ccb9768567288... [2] https://github.com/facebook/buck2/pull/58 [3] https://github.com/microsoft/vcpkg/discussions/19129

[+] fanzeyi|3 years ago|reply

One side effect of all the Metaverse investment is that Meta now has a lot more engineers working on Windows. You bet there will be real world usage. ;)

[+] e4m2|3 years ago|reply

> There are also some things that aren't quite yet finished:

> There are not yet mechanisms to build in release mode (that should be achieved by modifying the toolchain).

> Windows/Mac builds are still in progress; open-source code is mostly tested on Linux.

Source: https://buck2.build/docs/why.

[+] yencabulator|2 years ago|reply

> I don't see many details about the sandboxing/hermetic build story in the docs, [...]

Looks like local mode just inherits whatever environment the buck daemon was spawned in.

The remote execution thing is configured with a docker image to run things in, and only specified files are coped into the container instance, so it's somewhat hermetic. Docker containers aren't really reproducible, and there's only one image per remote execution backend, so that's kinda the weakest link (especially compared to something like Nix's hermetic builds, where the build-visible filesystem only contains the things you declared as dependencies).

[+] unknown|2 years ago|reply

[deleted]

[+] oggy|2 years ago|reply

Great to see this. I hope it takes off - Bazel is useful but I really like the principled approach behind it (see the Build Systems a la Carte paper), and Neil is scarily good from my experience of working with him so I'd expect that they've come up with something awesome.

One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language. So while you can usually relatively easily build a Rust project that uses crates.io dependencies, or a Python project with PyPi dependencies, it seems hard to make a library built using Bazel/Buck available to non-Bazel/Buck users (i.e., build something available on crates.io or PyPi). Does anyone know of any tools or approaches that can help with that?

[+] marcyb5st|2 years ago|reply

Regarding bazel, the rules_python has a py_wheel rule that helps you creating wheels that you can upload to pypi (https://github.com/bazelbuild/rules_python/blob/52e14b78307a...).

If you want to see an approach of bazel to pypi taken a bit to the extreme you can have a look at tensorflow on GitHub to see how they do it. They don't use the above-mentioned building rule because I think their build step is quite complicated (C/C++ stuff, Vida/ROCm support, python bindings, and multiOS support all in one before you can publish to pypi).

[+] kccqzy|2 years ago|reply

I have a lot of respect for Neil, but I've been burned by the incompleteness and lack of surrounding ecosystem for his original build system Shake (https://shakebuild.com/). This was in a team where everyone knows Haskell.

I'm cautiously optimistic with this latest work. I'm glad at least this isn't some unsupported personal project but something official from Meta.

[+] klodolph|2 years ago|reply

The “citizenship” point is really interesting. I’ve found these build systems to be really useful for solving problems in multi-language repos. They make it super easy to create all the build artifacts I want. However, in many ways, they make the source more difficult to consume for people downstream.

[+] jpdb|2 years ago|reply

Bazel now has a module system that you can use.

https://bazel.build/external/module

This means your packages are just Git repos + BUILD files.

[+] rvcdbn|2 years ago|reply

These kinds of tools are designed to work in monorepos so you don’t really rely on package management like you do with separate repos. This works really well for sharing close inside companies/entities. Doesn’t work as well for sharing code between entities.

[+] dnsco|2 years ago|reply

If I'm understanding, for the rust specific case, this generates your BUCK files from your Cargo.toml:

https://github.com/facebookincubator/reindeer

[+] lopkeny12ko|2 years ago|reply

> One thing I find annoying with all of these general, language-agnostic build systems though is that they break the "citizenship" in the corresponding language

I mean, this is kind of the whole point. A language agnostic build system needs a way to express dependencies and relationships in a way that is agnostic to, and abstracts over, the underlying programming language and its associated ecosystem conventions.

[+] chubot|2 years ago|reply

The linked paper is pretty interesting, and short at 4 pages.

In plainer language, I'd say the observation/motivation is that not only do compiling and linking benefit from incrementality/caching/parallelism, but so does the build system itself. That is, the parsing of the build config, and the transformation of the high level target graph to the low level action graph.

So you can implement the build system itself on top of an incremental computation engine.

Also the way I think about the additional dependencies for monadic build systems is basically #include scanning. It's common to complain that Bazel forces you to duplicate dependency info in BUILD files. This info is already present (in some possibly sloppy form) in header files.

So maybe they can allow execution of the preprocessor to feed back into the shape of the target graph or action graph. Although I wonder effect that has on performance.

---

The point about Java vs. Rust is interesting too -- Java doesn't have async/await, or coroutines.

I would have thought you give up some control over when things run with with async/await, but maybe not... I'd like to see how they schedule the tasks.

Implementing Applicative Build Systems Monadically

https://ndmitchell.com/downloads/paper-implementing_applicat...

[+] RcouF1uZ4gsC|3 years ago|reply

> Buck2 is an extensible and performant build system written in Rust

I really appreciate tooling that is written in Rust or Go that produce single binaries with minimal runtime dependencies.

Getting tooling written in for example Python to run reliably can be an exercise in frustration due to runtime environmental dependencies.

[+] crabbone|3 years ago|reply

Your problem is that Python sucks, especially it's dependency management. It sucks not because it ought to suck, but because of the incompetence of PyPA (the people responsible for packaging).

There are multiple problems with Python packaging which ought not exist, but are there and make lives of Python users worse:

* Python doesn't have a package manager. pip can install packages, but installing packages iteratively will break dependencies of packages installed in previous iterations. So, if you call pip install twice or more, you are likely to end up with a broken system.

* Python cannot deal with different programs wanting different versions of the same dependency.

* Python version iterates very fast. It's even worse for most of the Python packages. To stand still you need to update all the time, because everything goes stale very fast. In addition, this creates too many versions of packages for dependency solvers to process leading to insanely long installation times, which, in turn, prompts the package maintainers to specify very precise version requirements (to reduce the time one has to wait for the solver to figure out what to install), but this, in turn, creates a situation where there are lots of allegedly incompatible packages.

* Python package maintainers have too many elements in support matrix. This leads to quick abandonment of old versions, fragmented support across platforms and versions.

* Python packages are low quality. Many Python programmers don't understand what needs to go into a package, they either put too little or too much or just the wrong stuff altogether.

All of the above could've been solved by better moderation of community-generated packages, stricter rules on package submission process, longer version release cycles, formalizing package requirements across different platforms, creating tools s.a. package manager to aid in this process... PyPA simply doesn't care. That's why it sucks.

[+] TikolaNesla|3 years ago|reply

Yes, just what I thought when I installed the Shopify CLI (https://github.com/Shopify/cli) a few days ago because they force you to install Ruby and Node

[+] rektide|3 years ago|reply

Personally it seems like a huge waste of memory to me. It's the electron of the backend. It's absolutely done for convenience & simplicity, with good cause after the pain we have endured. But every single binary bringing the whole universe of libraries with it offends.

Why have an OS at all if every program is just going to package everything it needs?

It feels like we cheapened out. Rather than get good & figure out how to manage things well, rather than driver harder, we're bunting the problem. It sucks & it's lo-fi & a huge waste of resources.

[+] bogwog|3 years ago|reply

I feel so lucky that I found waf[1] a few years ago. It just... solves everything. Build systems are notoriously difficult to get right, but waf is about as close to perfect as you can get. Even when it doesn't do something you need, or it does things in a way that doesn't work for you, the amount of work needed to extend/modify/optimize it to your project's needs is tiny (minus the learning curve ofc, but the core is <10k lines of Python with zero dependencies), and doesn't require you to maintain a fork or anything like that.

The fact that the Buck team felt they had to do a from scratch rewrite to build the features they needed just goes to show how hard it is to design something robust in this area.

If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck? I know FB's scale makes their needs unique, but at least at a surface level, it doesn't seem like Buck offers anything that couldn't have been implemented easily in waf. Adding Starlark, optimizing performance, implementing remote task execution, adding fancy console output, implementing hermetic builds, supporting any language, etc...

[1]: https://waf.io/

[+] klodolph|2 years ago|reply

> If there are any people in the Buck team here, I would be curious to hear if you all happened to evaluate waf before choosing to build Buck?

There’s no way Waf can handle code bases as large as the ones inside Facebook (Buck) or Google (Bazel). Waf also has some problems with cross-compilation, IIRC. Waf would simply choke.

If you think about the problems you run into with extremely large code bases, then the design decisions behind Buck/Bazel/etc. start to make a lot of sense. Things like how targets are labeled as //package:target, rather than paths like package/target. Package build files are only loaded as needed, so your build files can be extremely broken in one part of the tree, and you can still build anything that doesn’t depend on the broken parts. In large code bases, it is simply not feasible to expect all of your build scripts to work all of the time.

The Python -> Starlark change was made because the build scripts need to be completely hermetic and deterministic. Starlark is reusable outside Bazel/Buck precisely because other projects want that same hermeticity and determinism.

Waf is nice but I really want to emphasize just how damn large the codebases are that Bazel and Buck handle. They are large enough that you cannot load the entire build graph into memory on a single machine—neither Facebook nor Google have the will to load that much RAM into a single server just to run builds or build queries. Some of these design decisions are basically there so that you can load subsets of the build graph and cache parts of the build graph. You want to hit cache as much as possible.

I’ve used Waf and its predecessor SCons, and I’ve also used Buck and Bazel.

[+] jsgf|3 years ago|reply

I don't know if they considered waf specifically, but the team is definitely very familiar with the state of the art: https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

One of the key requirements is that Buck2 had to be an (almost) drop-in replacement for Buck1 since there's no way we could reasonably rewrite all the millions of existing build rules to accommodate anything else.

Also Buck needs to support aggressive caching, and doing that reliably puts lots of other constraints on the build system (eg deterministic build actions via strong hermeticity) which lots of build systems don't really support. It's not clear to me whether waf does, for example (though if you squint it does look a bit like Buck's rule definitions in Starlark).

[+] xxpor|3 years ago|reply

I truly believe any build system that uses a general-purpose language by default is too powerful. It lets people do silly stuff too easily. Build systems (for projects with a lot of different contributors) should be easy to understand, with few, if any, project specific concepts to learn. There can always be an escape hatch to python (see GN, for example), but 99% of the code should just be boring lists of files to build.

[+] PaulDavisThe1st|2 years ago|reply

And the best part about waf? The explicit design intent that you include the build system with the source code. This gets rid of all the problems with build systems becoming backwards/forwards incompatible, and trying to deal with the issues when a developer works on one project using build system v3.9 and another that users build system v4.6

With waf, the build system is trivially included in the source, and so your project always uses the right version of waf for itself.

[+] softfalcon|3 years ago|reply

I could be wrong as I haven't dug into the waf docs too too much, but I think the major difference between waf and Buck is the ability to handle dependency management between various projects in a large org.

The documentation and examples for waf seem to be around building one project, in one language, with an output of statistics and test results. I am sure this is a simplification for education and documentation purposes, but it does leave a vague area around "what if I have more than 1 or 2 build targets + 5 libs + 2 apps + 3 interdependent helper libraries?"

Buck seems to be different in that it does everything waf does but also has clear `dep` files to map dependencies between various libraries within a large repository with many, many different languages and build environments.

The key thing here being, I suspect that within Meta's giant repositories of various projects, they have a tight inter-linking between all these libraries and wanted build tooling that could not only build everything, but be able to map the dependency trees between everything as well.

Pair that with a bunch of consolidated release mapping between the disparate projects and their various links and you have a reason why someone would likely choose Buck over waf purely from a requirements side.

As for another reason they likely chose Buck over waf. It would appear that waf is a capable, but lesser known project in the wider dev community. I say this because when I look into waf, I mostly see it compared against CMake. Its mental state resides mostly in the minds of C++ devs. Either because of NIHS (not invented here syndrome) or fear that the project wouldn't be maintained over time, Meta may have decided to just roll their own tooling. They seem to be really big on the whole "being the SDK of the internet" as of late. I could see them not wanting to support an independent BSD licensed library they don't have complete control over.

These are just my thoughts, I could be completely wrong about everything I've said, but they're my best insights into why they likely didn't consider waf for this.

[+] scrollaway|2 years ago|reply

Waf bills itself as "the meta build system". But Buck2 is "the Meta build system". :)

[+] rtpg|2 years ago|reply

waf looks pretty nice but does it have a remote cache? For me the biggest argument for Bazel is the remote caching, and not having it is a bit of a deal breaker IMO

[+] thomasahle|3 years ago|reply

It's probably more about better caching, but using buck2 internally at Meta reduced me buildtimes from minutes to seconds. A very welcome upgrade.

[+] Dowwie|2 years ago|reply

For what language?

[+] lopkeny12ko|2 years ago|reply

I'm missing some historical context here. This article goes out of its way to compare and contrast with Bazel. Even the usage conventions, build syntax (Starlark), and RBE API are the same as in Bazel.

Did FB fork Bazel in the early days but retain basically everything about it except the name? Why didn't they just...adopt Bazel, and contribute to it like any other open source project?

[+] 0xcafefood|2 years ago|reply

One thing you might be missing is that this is Buck2.

Buck (https://github.com/facebook/buck) has been open sourced for nearly 10 years now.

The lore I've heard is that former Googlers went to Facebook, built Buck based on Blaze, and Facebook open sourced that before Google open sourced Blaze (as Bazel).

The first pull to the Buck github repo was on May 8, 2013 (https://github.com/facebook/buck/pulls?q=is%3Apr+sort%3Acrea...). The first to Bazel was Sep 30, 2014 (https://github.com/bazelbuild/bazel/pulls?q=is%3Apr+sort%3Ac...).

[+] esprehn|2 years ago|reply

Blaze is very old (from 2006), the history is described here: https://mike-bland.com/2012/10/01/tools.html#blaze-forge-src...

In the years that followed folks left Google and joined other companies and created similar build systems because blaze had a lot of advantages at scale. Facebook made Buck, Twitter made Pants. Blaze was still closed source inside Google. They all used the same python looking language.

In 2012 Twitter open sourced Pants: https://blog.twitter.com/engineering/en_us/a/2016/the-releas...

In 2013 Facebook open sourced Buck: https://en.m.wikipedia.org/wiki/Buck_(software)

In 2015 Google finally open sourced most of blaze, but renamed it bazel for copyright reasons. Some might argue they waited too long because clearly there was a lot of demand for such a system. :)

After that Twitter (mostly?) migrated to bazel and Facebook sort of stalled out on Buck. But then recently they decided to rewrite it from scratch to fix a lot of the architecture problems resulting in Buck2.

Buck2 looks pretty impressive and hopefully it gets the bazel folks moving faster. For example the analysis phase in bazel is very slow even inside Google, and Buck2 shows an alternative design that's much faster.

[+] krschultz|2 years ago|reply

At the time that FB started writing Buck, Bazel was not open source. I believe it did exist as Blaze internally at Google before FB started writing Buck. Facebook open sourced Buck before Google open sourced Blaze as Bazel.

Over time Facebook has been working to align Buck with Bazel, e.g. the conversion to Starlark syntax so tools such as Buildozer work on both systems. I believe Buck2 also now uses the same remote execution APIs as Bazel, but don't quote me on that.

[+] ynx|2 years ago|reply

Buck far predates Bazel, and was built by ex-googlers replicating Blaze.

Skylark was a later evolution, after the python scripts grew out of control, and a cue that fb took from Google long after Buck had been near-universally deployed for several years.

[+] bdittmer|2 years ago|reply

Remote Execution is just a gRPC protocol -- bazel, buck1 and others implement it.

[+] jeffbee|3 years ago|reply

Hrmm, it makes performance claims with regard to Buck1 but not to Bazel, the obvious alternative. Hardly anyone uses Buck1 so you'd think it would be relevant.

[+] rockwotj|2 years ago|reply

Does anyone know how IDE support for Buck2 is? I couldn't find anything except some xcode config rules. Half the battle with Bazel/Buck/etc is that getting and IDE or LSP to work for C++/Java/Kotlin/Swift/etc is always a pain because those tools don't really work out of the box.

[+] evmar|3 years ago|reply

How do the "transitive-sets (tsets)" mentioned here compare to Bazel depsets[1]? Is it the same thing with a different name, or different in some important way?

[1] https://bazel.build/rules/lib/depset

[+] yurodivuie|2 years ago|reply

Do smaller companies (smaller than Meta and Google) use these kinds of build tools much? It seems like a system that rebuilds everything whenever a dependency changes is more suited an environment that has very few, if any, external dependencies.

Is anyone using Buck/Bazel and also using frameworks like Spring, or React, for example?

[+] umanwizard|2 years ago|reply

> In our internal tests at Meta, we observed that Buck2 completed builds 2x as fast as Buck1.

In my experience Buck was spending a huge amount of time in GC, so this doesn’t surprise me. It must have been (ab)using Java in such a way that massive amounts of stuff were sprayed across the heap.

[+] rtpg|2 years ago|reply

The dynamic dependency stuff looks very nice! It feels like a good entrypoint for systems that are "merely" wanting good build caching, and not being "so huge git falls apart" big.

My biggest gripe with Bazel is how when you're off the beaten path suddenly it feels like the ecosystem really doesn't want you to just solve problems yourself. Meanwhile in this Buck2 documentation, directly talking about adding good support for tools outsides of community-provided things.

I still am not a superfan of the awkward way that custom implementations get declared (which I think comes from needing to support super-giant projects? But it's jsut awkward) and all the naming suffers from Google-like "we cannot call them functions but must call them factories" NIH things... but at least there are clear docs.

[+] orthoxerox|2 years ago|reply

I really hope the team responsible for it is called Timbuktu.

[+] LegNeato|2 years ago|reply

Congrats to the team! Very excited to finally get to use this.

[+] kylecordes|2 years ago|reply

The essential characteristics of Buck2 look very appealing - but it's hard to see this catching up with the substantial ecosystem of language support rules for Bazel.

[+] jmmv|2 years ago|reply

As a former Bazel developer and current Bazel user, I very much like the design principles that they outline for Buck2. In particular:

* The fact that it is written in a compiled safe language is a breath of fresh air. I personally like Java the language and understand why Bazel was originally written in Java and how it has done a great job at "hiding" it, but it's still there. In particular, Java's memory and threading models has been problematic for certain scenarios. (I haven't kept up with the language advances and I believe there are new ways to fix this, but adopting them would require a major overhaul of Bazel's internals.) Plus Bazel being written in Java prevents it from being adopted in smaller projects that are /not/ written in Java--a bummer for the whole open source ecosystem.

* The complete separation of language rules from the core is great. This is something that Bazel has wanted to achieve for a long time, but they are still stuck with native C++ and Java rules (it's really hard to rewrite them apparently). Not a huge deal, but in Buck2's case, their design highlights that it's clean enough to support this from day one.

* The "single" phase execution is also nice to see. Bazel used to have three phases (loading, analysis, and execution) and later managed to interleave the first two. However, the separation is still annoying from a performance perspective, and also introduces some artifacts in the memory model.

* It's good that various remote execution facilities as well as virtual file systems have been considered from day one. These do not matter much... until they do, at which point you want the flexibility to incorporate them. Bazel used to have this in the Google-internal version (there is that ACM paper that explains this), but the external version doesn't. For example, there is a patch to support an output tree on a lazy file system courtesy of the bb-clientd project, but after years it hasn't been upstreamed yet.

* And lastly, it's also great to see that what they open sourced is what they use internally. Bazel isn't like that: Google tried to open source a "cleaner version" by removing certain warts that were considered dead ends... and that has been both good and bad. On the one hand, this has been key to developing Starlark to where it is today, but on the other, this has made it hard for certain communities to adopt Bazel (e.g. the Python rules were mostly unusable for a really long time).

Now, a question: Buck2 uses the Starlark language, but that does not imply that they implement the same Build APIs to support the rules that Bazel has. Does anyone know to what extent the rules are compatible between the two? If Buck2 supported the Bazel rules ecosystem or with minor changes, that'd be huge!

[+] ndmitchell|2 years ago|reply

Thanks for the comments! There are two levels at which you could make Buck2/Bazel compatible:

* At the BUILD/BUCK file level. I imagine you can get close, but there are details between each other that will be hard to overcome. Buck2 doesn't have the WORKSPACE concept, so that will be hard to emulate. Bazel doesn't have dynamic dependencies which means that things like OCaml are written as multiple rules for a single library, while Buck2 just has a single rule. I think it should be possible to define a macro layer that was common enough so that it could be implemented by both Bazel and Buck2, and then open source users could just support the unified Bazck build system. * At the rules level. These might be harder, as the subtle details tend to be quite important. Things like tsets vs depsets is likely to be an issue, as they are approximately the same mechanism, but one wired into the dependency graph and one not, which is going to show up everywhere.

[+] 1MachineElf|2 years ago|reply

There are a few references to NixOS on the code/issues.[0] I wonder what Meta's use case is for NixOS.

[0] https://github.com/facebook/buck2/search?q=nixos&type=issues

[+] noisy_boy|2 years ago|reply

Wonder if they have examples for Java where maven and groovy are the main two tools.

Also, in case of our builds, we can benefit only so much from being faster during build phase because it is all the other bits like SonarQube scans, pushing artifacts to Artifactory, misc housekeeping bits, annoyingly slow Octopus deployments etc, that add most of the time to the long deployment cycle. Sometimes I think if a dedicated Go utility that takes care everything build related (parallelizing when possible) would make things faster; it will have the full picture after all. But then we will be reimplementing all the features of these various tools which is maybe ok at FB scale but would be too much for a smaller shop.

[+] djha-skin|2 years ago|reply

Everyone says buck and bazel are so amazing but honestly, mono repos are unicorns. No one does this. It's useful to no one I know. I keep hearing it's useful to somebody, so it must be really useful when it is, but I've never ever seen buck, bazel or monorepos in real life. And it's been my career to build stuff.

[+] 5Qn8mNbc2FNCiVV|2 years ago|reply

I'm sorry to break it to you, but monorepos are extremely common. Doesn't mean they have to be as large but every company I've been at had a monorepo.

And as soon as you have to manage PRs for multiple repos with a new cross-cutting feature or scheduling changes in the correct order you understand why they are so appealing.

[+] __float|2 years ago|reply

There's quite a few well-known places listed on https://bazel.build/community/users across many industries. I think Buck and Pants and Please (and ...) are not as widely used, but if they had a list to add we'd have even more examples.

271 comments