1. It doesn't show off the unique capabilities of firecracker very well.
2. The comparison is not very fair.
2a. The docker-build step (which dominates the runtime) is run without any caching. Just by adding two lines to your build-push-action step, "cache-from: type=gha" and "cache-to: type=gha,mode=max", you can make it a lot faster.
2b. ~1m20s of the time is just "VM start". GitHub Actions has had a rough time recently, but you should never wait that long to get your CI running in day-to-day operation.
2c. The tests are unrealistically short at 20s, which is what lets the author get to their 10x-faster number.
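For reference, the two cache lines from 2a slot into a build-push-action step roughly like this (the action version and image tag are illustrative placeholders, not from the original post):

```yaml
# Sketch of a build-push-action step with GHA layer caching enabled.
- name: Build and push
  uses: docker/build-push-action@v2
  with:
    push: true
    tags: user/app:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max
```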
Let's say the GitHub Action starts in 5 seconds, the GitHub Actions cache reduces the build time to 2 minutes and the tests take 10 minutes to run. Now Firecracker is 20% faster ...
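As a back-of-envelope check of those hypothetical numbers (all durations in seconds):

```python
# Hypothetical timings from the example above, in seconds.
gha_total = 5 + 2 * 60 + 10 * 60   # runner start + cached build + tests = 725
firecracker_total = 10 * 60        # assume snapshotting makes start/build ~free
speedup = gha_total / firecracker_total
print(gha_total, firecracker_total, round(speedup, 2))  # 725 600 1.21
```

So under these assumptions Firecracker comes out roughly 20% faster, not 10x.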
You can also get comparable performance out of https://buildkite.com/, which lets you self-host runners on AWS, meaning you're almost guaranteed to get a hot docker cache (running against locally attached SSDs). You can then start running your tests (almost) as fast, with much more mature tooling.
> You can also get comparable performance out of https://buildkite.com/ which lets you self-host runners on AWS
You can self-host GitHub runners as well, with a few caveats, the most serious one being that you are then responsible for cleaning up the state of your self-hosted runner between runs:
https://docs.github.com/en/actions/hosting-your-own-runners/...
structural isolation guarantees of the form "build execution during run N cannot possibly impact build execution of run N+1" are tremendously helpful -- they reduce the number of weird CI failures and the cost to triage and fix each weird CI failure (by reducing the space of possible interactions). If you cannot offer similar guarantees when self hosting your own CI infrastructure then it may not be wise to self host.
I tried to get docker layer caching working within GHA for a second benchmark, but it seems like none of the approaches work particularly well for a "docker-compose build". I'd happily amend the post with a second benchmark if you wouldn't mind opening a PR based on the existing one [1]
https://github.com/webappio/livechat-example/blob/be7c9121c1...
The point still stands for 2c: you can very easily parallelize with firecracker (by taking a snapshot of the state right before the tests run, then loading it a bunch of times).
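For the curious, that snapshot-then-fan-out flow looks roughly like this against Firecracker's HTTP API (an illustrative sketch, not taken from the post; socket and file paths are made up, and it assumes a running microVM):

```shell
# Pause the microVM right before the tests would run, then snapshot it.
curl --unix-socket /tmp/fc.sock -X PATCH http://localhost/vm \
  -d '{"state": "Paused"}'
curl --unix-socket /tmp/fc.sock -X PUT http://localhost/snapshot/create \
  -d '{"snapshot_type": "Full", "snapshot_path": "vmstate", "mem_file_path": "vmmem"}'

# Each test shard restores its own copy from the same snapshot and resumes.
for i in 1 2 3 4; do
  firecracker --api-sock "/tmp/fc-$i.sock" &
  curl --unix-socket "/tmp/fc-$i.sock" -X PUT http://localhost/snapshot/load \
    -d '{"snapshot_path": "vmstate", "mem_file_path": "vmmem", "resume_vm": true}'
done
```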
There’s an even faster strategy than this, and it’s easier to set up.
You’re going to deploy 4 CI pipelines (so make sure you’re not manually putting together CI pipeline configs; use automation):
Pipeline 1: A conveyor belt of environments. All this pipeline does is spin up fresh environments and then run a short automated smoke test. Hydrate each env with the most recent mask from prod. The trigger condition is that there are fewer than <Threshold> environments available. I picked 8 on a whim and never saw a need to change it.
Pipeline 2: Normal garden-variety CI pipeline triggered on merges to main. Its output is two persisted artifacts: a built package and your unit-test evidence.
Pipeline 3: Test your automated deployment by deploying the package built in #2 into the first of the queue of free envs from #1, then trigger your end-to-end, integration, and contract tests. Don’t run your security or operability tests here.
Pipeline 4: Async pipeline triggered on a 6-hour schedule. Do your long-running stuff here: fuzz testing, security tests, etc. Keep these outside of the dev cycle.
Release candidates can only be signed after a successful run through 2, 3, and 4. That means prod deploys are on a predictable cadence, which users and ops usually appreciate more than "we drop it in when it’s ready".
The DevEx is pretty sweet: you don’t see pipeline 1 or 4 in your build loop. Only the runtime of 3 is comparable to the article’s setup, and it would be slightly faster because there’s no firecracker bring-up overhead, however small that is.
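Pipeline 1’s trigger logic can be sketched in a few lines (all names and the threshold handling here are illustrative, not from any real system):

```python
import itertools

THRESHOLD = 8  # minimum number of smoke-tested environments kept ready

def top_up(ready_envs, provision, smoke_test):
    """Provision and smoke-test fresh envs until the pool is back at THRESHOLD."""
    while len(ready_envs) < THRESHOLD:
        env = provision()       # spin up a fresh env, hydrated from the prod mask
        if smoke_test(env):     # short automated smoke test gates entry to the pool
            ready_envs.append(env)
    return ready_envs

# Toy run: three envs already ready, provisioning always succeeds.
ids = itertools.count(4)
pool = top_up(["env-1", "env-2", "env-3"],
              provision=lambda: f"env-{next(ids)}",
              smoke_test=lambda env: True)
print(len(pool), pool[-1])  # 8 env-8
```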
There are times when some corner of software development speaks a specialized language and this is an example.
1. Conveyor belt(?) of environments. Hydrate(?) the env(ironment). Mask(?) from prod(uction)
2. I think I got this. Typical "merge to main pipeline" with built product and test results as outputs.
DevEx(?). And I'm not sure why I wouldn't see pipeline #4 in my build loop, because I can't deploy unless 2, 3 and 4 pass... Maybe you mean I don't wait to see it.
I'm also not sure how it's faster, because environments still need to be brought up. Unless you are saying that the environment is already running when the merge-to-master pipeline succeeds.
May I ask what stack you employ to meet these goals?
Many tend to reach for GitLab CI or GitHub Actions, but these piles of "executable yaml" never appear to be up to the task of the complex deployment logic you describe in your post, not to mention that they don't naturally account for multi-repo or composed-artifact workflows. The state of the art, if you can call it that, is Jenkins, where you can drop into raw-ish Groovy/Java for the logic pieces when you need to. But then you run into the constant struggle of working around Jenkins's leaky abstractions and peculiarities.
You can patch together a pile of bash, python, go, et al., but you land in a worse place, where there is no guiding structure to the automation for onboarding, enhancement, and maintenance.
I'm curious about others' experiences building complex build/deployment pipelines where, up front, you have a consistent entry structure to the automation but also all the escape hatches one would need to implement custom logic when required, in a type-safe, potentially compiled, testable way (i.e. pipelines as 'actual' code).
Of course, one could write their own automation engine to avoid yaml hell and all that. However, I am not seeing any pervasive solutions that aren't "yet another (yaml | json | xml | cue | whatever) task DAG launching containers running random scripts from wherever".
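For what it's worth, the minimal shape of "pipelines as actual code" can be sketched in plain Python (a toy, not an existing tool):

```python
# Toy "pipeline as code": steps are ordinary functions wired into a DAG,
# so custom logic, escape hatches, and unit tests are all just normal code.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    name: str
    run: Callable[[dict], object]
    deps: list = field(default_factory=list)

def execute(targets):
    """Run steps depth-first in dependency order, sharing a context dict."""
    ctx = {}
    def visit(step):
        if step.name in ctx:
            return
        for dep in step.deps:
            visit(dep)
        ctx[step.name] = step.run(ctx)
    for step in targets:
        visit(step)
    return ctx

build = Step("build", lambda ctx: "app-1.0.tar.gz")
unit = Step("unit_tests", lambda ctx: "passed", deps=[build])
deploy = Step("deploy", lambda ctx: f"deployed {ctx['build']}", deps=[build, unit])

result = execute([deploy])
print(result["deploy"])  # deployed app-1.0.tar.gz
```

The appeal is that the "escape hatch" is just another function, and the DAG itself is testable with an ordinary test runner.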
Firecracker is great and all, but the core idea described here also works with plain docker; i.e. there is nothing inherently firecracker-specific about the basic technique.
The three big differences are:
1. Docker doesn't deal with running processes (like postgres or redis), only the filesystem state.
2. Docker doesn't have enough isolation, so you'd probably need to run it within qemu or firecracker for compliance in bigger teams.
3. Docker-in-docker is still pretty painful; if you need to do anything nonstandard, like changing the size of /dev/shm, accessing /dev/kvm, or loading kernel drivers, it'll take custom configuration.
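To illustrate point 3, those nonstandard needs map to flags like these on a plain docker run (the image name is a placeholder), each of which takes extra plumbing to thread through a docker-in-docker setup:

```shell
# Resize /dev/shm, expose the host's KVM device, and run privileged
# (often needed for kernel-module work).
docker run \
  --shm-size=2g \
  --device /dev/kvm \
  --privileged \
  my-ci-image
```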
Yeah, I don’t like that the article treats building the DB seed data, etc., into the Firecracker VM image as if it were impossible to do in Docker. The techniques are good things to do — but it’s very tenuous how they’re connected to Firecracker.
I’ve done all of the above using multi-layered Dockerfiles and a cron CI job that rebuilds the base integration-test image every 6 hours. Sure, if you need the isolation, Firecracker is the way to go. But if you invest primarily in container shenanigans to speed up CI with Docker, it’s not too much extra work to wrap it in a Firecracker VM, plain QEMU, or whatever once you start wanting more isolation.
Also, maybe I’m holding it wrong, but Docker-in-Docker has not bitten us yet on our GitHub Actions runners.
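A minimal sketch of that layering (image names, the seed file, and the exact split are illustrative):

```dockerfile
# Base integration-test image, rebuilt by a cron CI job every ~6 hours.
# The seed data ships in a cached layer, so per-build jobs start from it warm.
FROM postgres:14 AS seeded-db
COPY seed.sql /docker-entrypoint-initdb.d/

# Per-build test image only adds the small, frequently changing app layer.
FROM seeded-db
COPY ./app /app
```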
We find the dominating factor in (our) incremental builds/CI to be network/IO caching, which has less to do with firecracker/docker and more with the surrounding hw/sw (GHA topology and smarts, IO speed, ...). It's a real problem in GPU/AI CI, where we get monster image sizes. There were some cool blog posts ~last year on caching and routing tricks happening at GH (joint with MSR?), but they've seemingly gone silent.
If you're a cloud host, you need a way to sandbox hostile code. Firecracker lets you do that (it is a configuration of the traditional KVM virtualization system, except lighter and faster: instead of booting a VPS, which can take minutes, you can now spawn one in under a second).
Because process isolation under unix is pretty lax. By default, processes have all the rights of the user, and you might end up with a system different from the initial state.
I used to do something similar with vSphere a while back. The servers took ages to get into the right state for testing, so it was much easier to just revert to a snapshot to get a clean state.
It always amazes me to see the new trend of DevOps happily following such a tutorial, wget-ing and running random code from the internet in production...
I don't think this is production; this is for running your tests. Your code in the "tests haven't run yet" state could probably leak all the secrets it has access to and destroy the machine it's running on, so you don't let it have any secrets, and you create a new machine each time. "curl | bash" here just injects potential flakiness (as does "npm install" when npm dies, etc.)
Obviously a lot of people treat their CI system as their CD system, and do things like letting tests have highly privileged access to their production k8s cluster. That's a terrible idea even if you aren't installing software with "curl | bash".
So overall, I don't think this is worth a HN comment to complain about. People are going to install software in non-auditable non-reproducible ways.
Sure, it’s faster startup, but the rest is nonsense.
I don't understand why you need to rebuild the docker image on every app build; that seems really wasteful.