Automating Git Bisect with Ephemeral Environments

[+] cortesoft|1 year ago|reply

The tricky thing with git bisect run is that you usually have to have your test script not be part of your git checkout for the process to work.

This is because of the particular situation in which bisect is valuable: you find a regression that doesn’t have a test, and you aren’t sure when it was committed.

First, it has to be a regression; if it is just a bug, then there was no previous version that didn’t have the bug, it just hadn’t been found until now. It has to be something that worked before and now doesn’t.

Second, it has to not have had a test before the regression was found. If you had a test for it already, it would have been found by CI as soon as it was committed, and you wouldn’t need to figure out which commit broke it.

So if you try to add a test to the normal test suite of the code and commit it, git bisect run is not going to work; as soon as you check out the older code, your new test won’t be there and the tests will pass because your new test of the breakage doesn’t run. You have to have the new test persist across git checkouts. This is not trivial, because you can’t just exclude your test files from being updated by git bisect, since other tests will also be changing through versions. You need to have your tests always include some non-version controlled file, and you need to have added that include PRIOR to your last known good version.

The only other use case would be if you are making a lot of commits without running tests, so you actually could break a pre-existing test and not know which commit broke it. If that is your situation, you should probably change your workflow to test every commit instead of trying to get git bisect to work.

For these reasons, I have never found ‘git bisect run’ to be as valuable as it seemed when I first learned about it.

[+] jakub_g|1 year ago|reply

As long as your test suite just finds and runs all files in a given folder (without needing to explicitly "enable" them in some index file), this should work:

- create a NEW test file in `some/path/to/test.ext` (and back it up outside repo just in case)

- do NOT commit it in the repo

- `git bisect`

That way, bisect would check out different commits, but without touching `some/path/to/test.ext` because it's not tracked by git.

It could also be helpful to hide some file changes from git diff/status:

`git update-index --assume-unchanged path/to/some/file`

to trick git into thinking the file didn't change. (Although when you check out a commit that did modify that file, the checkout would fail.)

[+] chw9e|1 year ago|reply

Another good case is for rolling back a single bad commit from a batch that got merged into main at the same time.

Doing batch merges with a merge queue can speed things up if you have a ton of longer-running end-to-end and integration tests. But then if a test fails you need to identify which commit out of the batch is causing it so you don’t reject the entire batch.
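As a rough illustration of narrowing a failed batch down to one commit, here is a minimal binary-search sketch. The function name `first_bad`, the commit labels, and the `is_bad` callback are all hypothetical; it assumes the last commit in the batch fails and that every commit after the culprit also fails.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and once a commit is bad every later commit is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is strictly after mid
    return commits[lo]


# Hypothetical batch where the fourth commit introduced the failure.
batch = ["a1", "b2", "c3", "d4", "e5"]
culprit = first_bad(batch, lambda c: batch.index(c) >= 3)
print(culprit)  # prints "d4"
```

In a real merge queue, `is_bad` would check out the commit and run the failing test, which is what `git bisect run` does for you.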

[+] chw9e|1 year ago|reply

Now with AI test frameworks like Stagehand, it’s actually possible to write end-to-end tests after a bug appears that remain backwards compatible, as long as changes to the DOM are not too extreme. Things like broken selectors won’t be an issue.

I wrote about that here: https://qckfx.com/blog/ai-powered-stagehand-git-bisect-findi...

[+] arccy|1 year ago|reply

git bisect run can take a script, so you can easily script adding a test case (either to an existing file or a new file) and then running the test.
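A minimal sketch of such a script, assuming a Python project tested with pytest. The paths, the file name `test_regression.py`, and the helper `bisect_exit_code` are hypothetical. Both the script and the saved test are assumed to live outside the repository so checkouts never touch them, and `tests/test_regression.py` is assumed not to be tracked in any commit in the bisect range.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Assumed locations; adjust for your project.
SAVED_TEST = Path.home() / "bisect" / "test_regression.py"  # kept outside the repo
TARGET = Path("tests") / "test_regression.py"               # where the suite would find it
TEST_CMD = [sys.executable, "-m", "pytest", "-q", str(TARGET)]

SKIP = 125  # exit code that tells git bisect run "this commit cannot be tested"


def bisect_exit_code(pytest_returncode: int) -> int:
    """Map a pytest return code onto what git bisect run expects."""
    if pytest_returncode == 0:
        return 0     # all tests passed: commit is good
    if pytest_returncode == 1:
        return 1     # tests ran and failed: commit is bad
    return SKIP      # collection error, crash, etc.: skip this commit


def main() -> int:
    # Copy the new test into the freshly checked-out tree.
    TARGET.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(SAVED_TEST, TARGET)
    try:
        result = subprocess.run(TEST_CMD)
    finally:
        # Remove the copy so the next checkout starts from a clean tree.
        TARGET.unlink(missing_ok=True)
    return bisect_exit_code(result.returncode)


# To use this as the bisect script, end the file with:
#     sys.exit(main())
```

With that last line added and the file saved as, say, `~/bisect/run_test.py`, the run would look like `git bisect start <bad> <good>` followed by `git bisect run python3 ~/bisect/run_test.py`. Mapping everything other than "passed" or "failed" to 125 is a design choice: it keeps old commits that cannot even import the test from being blamed as the first bad commit.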

[+] LegionMammal978|1 year ago|reply

I've always wished for a "git trisect", or a "git n-sect" in general, that can try multiple commits in parallel. The use case would be for testing changes in software that has a long, single-threaded component in the build process (e.g., a heavily overloaded configure script). For long-running projects where each bisect takes over a dozen steps, those components lead to lots of thumb-twiddling.

[+] andmarios|1 year ago|reply

The magic of bisect is that you rule out half of your remaining commits every time you run it. So even if you have 1000 commits, it takes about 10 runs. An n-bisect wouldn't be that much faster; it could even be slower, because you will not always be able to rule out half your commits.
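To put rough numbers on this trade-off, here is a small sketch under an idealized model: exactly one culprit, every probe gives a clean good/bad answer, and `probes_per_round` commits can be tested in parallel. The function name `rounds_needed` is hypothetical.

```python
def rounds_needed(candidates: int, probes_per_round: int = 1) -> int:
    """Worst-case number of rounds to isolate one culprit.

    Each round tests `probes_per_round` commits at once, which splits the
    remaining range into probes_per_round + 1 segments; the worst case
    keeps the largest segment.
    """
    rounds = 0
    remaining = candidates
    while remaining > 1:
        segments = probes_per_round + 1
        remaining = -(-remaining // segments)  # ceiling division
        rounds += 1
    return rounds


for k in (1, 2, 3):
    r = rounds_needed(1000, k)
    print(k, r, k * r)  # probes per round, rounds, total test runs
# prints:
# 1 10 10
# 2 7 14
# 3 5 15
```

In this model, parallel probes cut the number of sequential rounds (10 to 7 to 5) while the total number of test runs goes up (10 to 14 to 15), so an n-sect would only pay off when wall-clock time per step matters more than total compute.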

[+] flir|1 year ago|reply

I think you'd have to hack it together on a per-project basis, but could you do it with containers? You'd have to identify the point where the build process diverges, make n copies of the container...

(If I'm understanding you correctly).

But there's not much that's faster than a binary search.

[+] nlunbeck|1 year ago|reply

> Because ephemeral environments are reproducible on demand (via Docker images, Kubernetes pods, or a cloud VM), you can guarantee that each bisect step sees the same conditions. This drastically reduces "works on my machine" fiascos.

Agree on this pattern for all code changes. Hard to overstate the amount of time we've saved by testing against the full prod-like environment right away. An ephemeral env implementation makes this easy and low stakes, so diving right into E2E testing a copy of your real infra isn't wildly unreasonable. However, I work for Shipyard (https://shipyard.build) so I'm a bit biased on these processes.

[+] jaguar75|1 year ago|reply

Interesting idea! The real bottleneck here is likely not the number of test runs, but rather the overhead of environment setup and teardown. This is where ephemeral environments really shine if they can be optimized for quick startup.

[+] Ramiro|1 year ago|reply

Very cool! This is a great example of how ephemeral environments can help for a lot more than just fast inner loops or manual verification.

[+] dagelf|1 year ago|reply

TL;DR: a lazy, compute-intensive way to find what non-commit change broke your tests... if your tests are any good.

[+] daveguy|1 year ago|reply

Lazy, or just the value of human time prioritized over the value of computer time?

I'd rather use git bisect over checking a whole bunch of possibilities manually.

[+] aoeusnth1|1 year ago|reply

Is CI and presubmit testing lazy?
