Everything you need to know about monorepos

[+] TekMol|4 years ago|reply

"Ask HN: Is there any way to detect websites that are SEO-optimized on Google?"

Unfortunately, those seem to even get into HN. This page is a perfect example. This is how the page starts:

    Everything you need to know about monorepos,
    and the tools to build them.

    Understanding Monorepos

    Monorepos are hot right now, especially among Web
    developers. We created this resource to help developers
    understand what monorepos are, what benefits they can
    bring, and the tools available to make monorepo
    development delightful.

    There are many great monorepo tools, built by great
    teams, with different philosophies...

I got so tired at this point that I stopped reading.

In my mind, I see the job description on Fiverr "Fast SEO writer wanted! Please write a 3000 word page about monorepos. Make sure to mention 'monorepos' and related terms like 'web developers', 'tools', 'development' etc frequently."

[+] jdlshore|4 years ago|reply

What a terrible take. As someone with an interest in monorepos, who's currently working with a company adopting Nx, I found the page interesting and compelling. I spent a good 30-60 minutes following links and investigating deeper.

You didn't even read past the fourth paragraph. You should be ashamed for derailing what could have been a productive discussion about an interesting topic with your shallow dismissal.

[+] teekert|4 years ago|reply

You must be primed to detect this; I did not find it at all bothersome. I actually like the page. I always wondered if a monorepo is best for us, and now I have some extra arguments to say yes if the discussion arises.

[+] scoutt|4 years ago|reply

I felt exactly the same. So I left the page and went to https://en.wikipedia.org/wiki/Monorepo

[+] skylanh|4 years ago|reply

I think... it has been. OTOH, if you cycle to the end, it's a collaboration by the developers or "community outreach" of those tool chains:

"The tools we'll focus on are: Bazel (by Google), Gradle Build Tool (by Gradle, Inc), Lage (by Microsoft), Lerna, Nx (by Nrwl), Rush (by Microsoft), and Turborepo (by Vercel). We chose these tools because of their usage or recognition in the Web development community."

Contributors:

- Alex Eagle / Bazel

- Kenneth Chau / Lage

- Jeff Cross / Nx

- Victor Savkin / Nx

- Pete Gonzalez / Rush

- Justin Reock / Gradle

So, yeah, it would be a focused website.

[+] InGoodFaith|4 years ago|reply

You can scroll directly to the bottom and see the final comparison table.

Here's a screenshot for your convenience [1]

I agree with you that I prefer to get straight to the point, but this pet-peeve tangent doesn't seem to be a productive discussion of the actual merits of the tooling.

1: https://i.imgur.com/8Vzbh3c.png

[+] rfoo|4 years ago|reply

There should be a big warning on the top:

A monorepo is a way to morph a dependency management problem into a source control problem within your organization. Currently, FOSS tools solve none of them.

[+] bfung|4 years ago|reply

Agree - the site focuses a lot on build, but ignores scm tooling. At a certain size, git no longer works well as a pure monorepo w/o submodules and these mega companies have teams of people optimizing code time vs build time checkouts of these monorepos to handle subsections.

[+] SantiagoElf|4 years ago|reply

Ahahaha, good one.

But, this is what you get when the barrier of entry for software development has been set so low :)

Millions of developers using tools that they don't fully understand to produce sub-par solutions.

I am so happy that this is the state, so people who can build working solutions, delivered on time, are getting paid very well.

[+] dataangel|4 years ago|reply

Could you elaborate? I use a monorepo at work and if anything dealing with 3rd party dependencies is easier because you don't have to coordinate upgrading versions across teams. For 1st party stuff in the repo we don't have a need to version libraries at all, if it all builds and passes all the tests everything is good. The whole point is to use the whole tree from a consistent snapshot as a release, so you never worry about using a new first party library with an old first party binary.

[+] renke1|4 years ago|reply

When I talk about monorepos in our company, I always try to make the distinction between JS monorepo tooling (say nx, turbo or more low-level pnpm/npm/yarn workspaces) and real (?) monorepo tooling (say Bazel). Whereas the latter focuses more on dealing with a wide variety of source code types and artifacts, the former deals exclusively with NPM packages (which may sometimes include other stuff like Go/Rust). Does this distinction even make sense?

[+] n42|4 years ago|reply

I don't know what it is, but it feels like the JS tooling community so often pretends that the rest of the world does not exist in their marketing. I find myself having to dig into docs or the GitHub repo before I figure out what language or ecosystem I'm even reading about.

Somehow, the authors of this website neglect to even mention Nix. Maybe that has something to do with the fact that this is a marketing page for the tool they named Nx (seriously?).

[+] politelemon|4 years ago|reply

Title should reflect what the website has used, "Monorepo explained". It certainly doesn't cover "everything you need to know about monorepos" and glosses over its disadvantages and the things you need to watch out for.

The most important one being that you need to have an org/team structure that is set up to support it. You cannot say that it will make the org more efficient as organizations are not all the same. In order to push monorepos, the decision makers ought to know what those caveats and tradeoffs are, or they're going to be in for a sad time.

Site does do a good job of going over the tooling around it. Now this might be a matter of perception, it seems that the tooling is getting better, though not yet very mature. I see a few instances of "write your own" where the tooling is lacking, which is not a great way to go about things, and once again, makes assumptions about the nature of the orgs.

[+] rix0r|4 years ago|reply

Something very important not covered by the article:

Is the tool going to help me detect when I accidentally bypass the declared dependencies?

For example, in a basic monorepo it's very easy to accidentally rely on the file layout on disk: require'ing a dependency that is not in your package.json succeeds by accident when it has been hoisted as a dependency of a different package, and cp'ing files from `../some-other-project` should not be allowed but is possible. All of these invalidate some optimizations that monorepo tools want to make.

At scale with many contributors, it's HARD to teach and remember and apply all these rules, and so the monorepo tool really should help you detect and fix them (basically: fail the build if you mess up).

The article doesn't really make it clear which tools will do that for you. Pretty sure that Bazel does, Nx probably does, and lerna and turborepo don't.
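
To make the "fail the build if you mess up" idea concrete, here is a minimal shell sketch of such a check. It is not how Bazel or Nx do it. It is a crude grep-based stand-in that only understands single-quoted `require('...')` calls, and the demo package, file names and the `missing`/`status` variables are all made up for illustration. It would also flag Node built-ins such as `fs`, which a real check would have to allow.

```shell
set -eu
pkg=$(mktemp -d)

# Demo package (hypothetical): declares lodash only, but the code also requires chalk.
cat > "$pkg/package.json" <<'EOF'
{
  "name": "demo",
  "dependencies": { "lodash": "^4.17.21" }
}
EOF
cat > "$pkg/index.js" <<'EOF'
const _ = require('lodash');
const chalk = require('chalk');
const local = require('./local');
EOF

# Collect bare module names passed to require('...'); relative and absolute paths are skipped.
required=$(grep -rhoE --include='*.js' "require\('[^./'][^']*'\)" "$pkg" |
  sed -E "s/^require\('([^']*)'\)$/\1/" | sort -u)

# Record every required name that does not appear as a key in package.json.
missing=""
for name in $required; do
  if ! grep -qF "\"$name\":" "$pkg/package.json"; then
    missing="$missing $name"
  fi
done
missing=${missing# }

status=0
if [ -n "$missing" ]; then
  echo "undeclared dependencies: $missing"
  status=1
fi
# In CI you would finish with: exit "$status"
```

Run per package, a check like this turns an accidental reliance on hoisting into a visible build failure instead of something that happens to work on one machine.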

[+] withinboredom|4 years ago|reply

In our mono repo at work, we have a few hundred devs in there daily working just fine. Linters check that there are no relative paths allowed (so you can’t rely on directory structure) and no absolute paths either. If you want to load a file in the tree, you must use a “blessed” constant or function to get the base path of your current code or some other code.

TBF, if you have centralized dependencies or your dependency on another module affects your dependencies, you are probably doing it wrong. APIs between parts should be well defined and not require the entire dependency runtime to be loaded to interact with it.

[+] Aeolun|4 years ago|reply

pnpm definitely doesn’t do that hoisting (unless you specifically ask it to).

It’s nice to suddenly see 10 missing explicit dependencies simply by virtue of running ‘pnpm install’ instead of ‘npm install’.

[+] echelon|4 years ago|reply

I'm building a pretty big service that has four user-facing websites and even more backends (HTTP servers, highly bespoke job queues to run ML workloads, etc.)

This was an absolute nightmare to try managing in separate repos. I've finally settled on two monorepos: a Yarn/TypeScript/React frontend monorepo, and a Rust/Docker backend monorepo.

Does anyone have any advice on these? I sort of stumbled into this pattern on my own and haven't optimized any of it yet.

For Rust, I'm curious if folks have used Bazel for true monorepo build optimization. I don't want to rebuild the world on every push to master.

Likewise for the frontend, is there any way to not trigger Netlify builds for all projects if only one project (or its dependencies) change?

Would super appreciate any advice.

[+] cies|4 years ago|reply

If the (web) API surface between your BE and FE is based on a schema (a.k.a. typed API, like with OpenAPIv3 or GraphQL) then I'd put them in a monorepo. This way you can recompile the FE automatically if the schema changes (usually an FE client lib is generated from the API schema). This helps you discover errors at compile time.

If your API is not schema-based, you have no way of knowing something broke without FE/UI testing.

[+] marcyb5st|4 years ago|reply

Bazel should be smart enough to build only what changed. Is it possible that your CI doesn't cache previous runs? With Bazel I successfully used Google Cloud Build to achieve that by storing the bazel-* folders to Google Cloud Storage as the last step of every build and downloading them as the first step.

The target bucket I use has a very short object lifecycle setting so I don't even have to clean up old artifacts manually.

[+] mvkel|4 years ago|reply

The way this is written sets off my bs detectors that were built from the eras of "serverless solves everything" and "mvc solves everything."

The only thing I'm convinced of these days is: whatever way you choose is the right way.

[+] wdb|4 years ago|reply

I wish they explained how to merge existing repos into one new (mono)repo while keeping git history. Still haven’t cracked that problem.

[+] duijf|4 years ago|reply

Here's a way you can do this with git. This trick relies on `git merge --allow-unrelated-histories`.

Assuming you have repos `foo` and `bar` and want to move them to the new repo `mono`.

    $ ls
    foo
    bar
    
    # Prepare for import: we want to move all files into a new subdir `foo` so
    # we don't get conflicts later. This uses Zsh's extended globs. See
    # https://stackoverflow.com/questions/670460/move-all-files-except-one for
    # bash syntax.
    $ cd foo
    $ setopt extended_glob
    $ mkdir foo
    $ mv ^foo foo
    $ git add .
    $ git commit -m "Prepare foo for import"
    
    # Follow those "move to subdir" steps for `bar` as well.
    
    # Now make the final monorepo
    $ cd ..
    $ mkdir mono
    $ cd mono
    $ git init
    $ touch README.md
    $ git add README.md
    $ git commit -m "Initial commit in mono"
    
    $ git remote add foo ../foo
    $ git fetch foo
    $ git remote add bar ../bar
    $ git fetch bar
    
    # Substitute `main` for `master` or whatever branch you want to import.
    $ git merge --allow-unrelated-histories foo/main
    $ git merge --allow-unrelated-histories bar/main

    # Inspect the final history:
    $ git log --oneline --graph
    *   8aa67e5 (HEAD -> main) Import bar
    |\
    | * eec0abd (bar/main) Prepare bar for import
    | * 9741d6d More stuff in bar
    | * 634ba3d Initial commit bar
    *   43be6e9 Import foo
    |\
    | * d4805a0 (foo/main) Prepare foo for import
    | * 4d2ca10 More stuff in foo
    | * 72072a1 Initial commit foo
    * bfcb339 Initial commit in mono
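
For anyone not on Zsh, the same recipe can be sketched as one self-contained POSIX-style script. This is an illustration under a few assumptions, not a drop-in tool: the demo repos, the `make_repo` and `move_into_subdir` helpers and the throwaway commit identity are invented here, and the "move to subdir" step uses `git ls-tree` plus `git mv` in place of the extended glob, so it only moves tracked top-level entries.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Create a demo repo named $1 with a single committed file on branch `main`.
make_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/main
  echo "hello from $1" > "$1/file.txt"
  git -C "$1" add file.txt
  git -C "$1" commit -q -m "Initial commit $1"
}

# Move every tracked top-level entry of repo $1 into a subdirectory also named $1.
move_into_subdir() {
  (
    cd "$1"
    mkdir "$1"
    git ls-tree --name-only -z HEAD | xargs -0 -I{} git mv -- {} "$1/"
    git commit -q -m "Prepare $1 for import"
  )
}

make_repo foo
make_repo bar
move_into_subdir foo
move_into_subdir bar

# Build the monorepo and merge each prepared history into it.
git init -q mono
cd mono
git symbolic-ref HEAD refs/heads/main
touch README.md
git add README.md
git commit -q -m "Initial commit in mono"

for r in foo bar; do
  git remote add "$r" "../$r"
  git fetch -q "$r"
  git merge -q --allow-unrelated-histories -m "Import $r" "$r/main"
done

git log --oneline --graph
```

Because each repo's files were moved under their own directory before the merge, the two histories touch disjoint paths and the merges go through without conflicts.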

[+] tazjin|4 years ago|reply

There are several ways to do this. Having extensively experimented with all of them I can say that the best are josh[0] (if you need external history continuity) and git subtree[1] (if you just need the commits to remain valid within your repository).

[0]: https://github.com/josh-project/josh

[1]: https://manpages.debian.org/testing/git-man/git-subtree.1.en...

[+] sixstringtheory|4 years ago|reply

Check out https://github.com/newren/git-filter-repo/

[+] contravariant|4 years ago|reply

I'm curious about that as well. Maybe it'd be possible to start a repo with a single empty commit, rebase everything on that in a separate branch for each of the git repos and then merge them all into the master branch? Although some file renaming may be in order, otherwise everything ends up in the same folder.

[+] siscia|4 years ago|reply

It is possible and I did it.

In target repo you create a folder and in that folder you rebase your dependency repository.

Maybe I can find better documentation; I remember writing it down somewhere.

[+] onox|4 years ago|reply

You can use git pull with the --allow-unrelated-histories option.

[+] coryrc|4 years ago|reply

git subtree
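
A minimal sketch of what that could look like, assuming your git installation ships the `git subtree` command (it lives in git's contrib area, so some distributions package it separately). The demo repo, file names and commit identity below are invented for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Demo source repo `foo` with one commit on `main`.
git init -q foo
git -C foo symbolic-ref HEAD refs/heads/main
echo "hello from foo" > foo/file.txt
git -C foo add file.txt
git -C foo commit -q -m "Initial commit foo"

# Target monorepo, created with an initial commit so there is a HEAD to merge into.
git init -q mono
cd mono
git symbolic-ref HEAD refs/heads/main
touch README.md
git add README.md
git commit -q -m "Initial commit in mono"

# Import foo's history and place its files under the foo/ prefix.
git subtree add --prefix=foo ../foo main

git log --oneline --graph
```

Unlike the merge-based recipe, this needs no "move to subdir" commit in the source repo: the prefix is applied at import time.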

[+] surrTurr|4 years ago|reply

Great information. I've been building a monorepo of my own for a system that consists of RESTful and event-driven microservices. They are defined via OpenAPI and AsyncAPI respectively. Does anyone know a tool to generate documentation for each type of service and put it all together into one cohesive set of docs?

[+] deliriousferret|4 years ago|reply

I don't know about AsyncAPI, but with https://readme.com/ you can upload an OpenAPI file and it generates nice documentation.

[+] revskill|4 years ago|reply

When you use monorepo to solve one problem, you have two problems to solve.

[+] cies|4 years ago|reply

Without reasoning your statement is baseless.

[+] adrianomartins|4 years ago|reply

This website is amazing. It's doing a much-needed job on the internet. I feel there should also be a trunkbaseddevelopment.tools and a pairprogramming.tools :)

[+] PufPufPuf|4 years ago|reply

https://trunkbaseddevelopment.com/

https://www.pairprogramwith.me/
