If you're interested in having this kind of setup for your own team, without having to muck with servers, you can use something like https://circleci.com. We use LXC, and pay a lot of attention to having fast I/O. [disclosure: I'm founder/CTO of Circle]
We switched to CircleCI from our own, painfully maintained Jenkins box to test our Rails apps. Between their clearly highly tuned infrastructure, nice little tweaks like caching the installed gemset between builds, and automatic parallelization of the test suite, our average build times dropped by 50-75% and became much more consistent. We did give up some flexibility in choosing which branches automatically trigger builds, or whether builds run only once a pull request is opened, but it's well worth it.
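The gem-caching tweak mentioned above can be approximated on any CI box: key the cache on a hash of Gemfile.lock, so the bundle is reinstalled only when dependencies actually change. A minimal sketch (the cache layout and function names here are invented for illustration, not CircleCI's actual implementation):

```python
import hashlib
import os
import shutil

def cache_key(lockfile_path):
    """Derive a cache key from the dependency lockfile's contents."""
    with open(lockfile_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def restore_or_install(lockfile_path, cache_dir, bundle_dir, install):
    """Reuse a cached bundle when the lockfile is unchanged, else install.

    `install` stands in for whatever actually runs `bundle install`.
    """
    key = cache_key(lockfile_path)
    cached = os.path.join(cache_dir, key)
    if os.path.isdir(cached):
        shutil.copytree(cached, bundle_dir)   # cache hit: skip installation
        return "hit"
    install(bundle_dir)                       # cache miss: install from scratch
    shutil.copytree(bundle_dir, cached)       # save for the next build
    return "miss"
```

The key property is that an unchanged Gemfile.lock always maps to the same cache entry, so consecutive builds on the same dependencies skip the install entirely.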
It looks like they're not leveraging LXC via Docker. I wonder if that's because they've been doing it this way pre-Docker, or if there are some technical reasons why it made sense to skip it.
They've probably been doing it long before Docker, as most people who use LXC have been. Also, it makes sense to skip it because Docker adds relatively little for their use case.
Drawbacks are that it's nontrivial to set up and requires some rigid formalism in developer output that sometimes demands training and/or cultural change. But it's definitely something everyone should consider.
In my currently internal, heavily LXC-utilizing (but infrastructure- and OS-neutral) project, I am looking at specifically this sort of automation, but for complex topologies of interdependent services, HA clustering layers, complex emulated network topologies (bonded links, multiple VLANs), etc. Plans are to include failure testing at the communications level (slow links, lossy links, cable snaps, switch failures, etc.) in addition to resource levels (disk, memory, etc.).
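For the communications-level failures mentioned (slow links, lossy links), Linux's `tc` with the netem qdisc is the usual building block. A sketch that only assembles the `tc` argument list for a given link profile rather than executing it; the profile names are invented for illustration:

```python
def netem_args(iface, delay_ms=0, loss_pct=0.0, rate_kbit=None):
    """Build a `tc qdisc` argument list imposing a netem link profile."""
    args = ["tc", "qdisc", "add", "dev", iface, "root", "netem"]
    if delay_ms:
        args += ["delay", f"{delay_ms}ms"]
    if loss_pct:
        args += ["loss", f"{loss_pct}%"]
    if rate_kbit:
        args += ["rate", f"{rate_kbit}kbit"]
    return args

# Example profiles for the failure modes described above (names invented):
PROFILES = {
    "slow_link":  dict(delay_ms=200, rate_kbit=256),
    "lossy_link": dict(delay_ms=20, loss_pct=5.0),
}
```

Running these inside each container's network namespace lets one emulated link degrade without touching the rest of the topology.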
Outputs of a successful automated testing environment can include amazingly detailed information for capacity planning and automatically generated security policies (for container-side, host-side, and infrastructure-side deployment).
It's a fascinating area and one that is ripe for great change. Many people have needs here, the question is how to meet them at the intersection of current infrastructure and codebases, existing teams, business level concerns, varying hardware availability, etc. Both pre-commit and post-commit hooks are useful for different types of automation. IMHO LXC's blazing speed broadens significantly what can be tested with pre-commit.
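As a concrete version of the pre-commit idea: a hook can select only the tests affected by the staged files, keeping a pre-commit run fast enough to tolerate even before containers enter the picture. A sketch, where the source-to-test mapping and file names are hypothetical:

```python
import subprocess

def changed_files():
    """Files staged for commit, as reported by git."""
    out = subprocess.check_output(
        ["git", "diff", "--cached", "--name-only"], text=True)
    return [line for line in out.splitlines() if line]

def select_tests(changed, mapping):
    """Pick the test files covering the changed source files.

    `mapping` is a dict of source path -> list of test paths; how it is
    produced (static analysis, coverage data, convention) is up to you.
    """
    selected = set()
    for path in changed:
        selected.update(mapping.get(path, []))
    return sorted(selected)
```

A pre-commit hook would then launch just `select_tests(changed_files(), mapping)` inside a fresh container, which is where LXC's startup speed pays off.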
You say your dev's local development environment is different to prod. Are you guys letting each dev set up their own environment by hand, or have you provided a Puppet or Chef repo that they can clone and have an exact replica up and running within minutes with Vagrant?
It's only somewhat different. We try to keep everything as similar as possible. We use Chef to bring new and old VMs up to date with our current setup, reusing recipes from prod in dev whenever possible. That said, every dev is allowed to modify their VM in any way they see fit. We do recommend speaking with us before making any wild configuration changes that might cause Chef runs to start failing or make your VM a poorer representation of production.
We also allow our developers to connect to a proxy to our production MySQL shards from their development environments in a read only mode. This allows them to leverage the large data sets that are quite hard to replicate in our development architecture. There is also a limited read/write mode that we are working on (with the proxy filtering dangerous queries). But all that is another blog post for another day.
We also do not use Vagrant, opting for QEMU/KVM on physical hardware. The same tooling you saw in part 2 of my blog post creates our development VMs as well.
It doesn't matter. Developers almost never develop/test in an environment that mimics production - multiple load balancers, multiple app servers, multiple database servers, failover to a second data center, etc.
If your dev environment is not different from prod, you're either insanely rich or your server setup is trivial.
They have two use cases. This test one has a rationale linked from 'workload' in the article:
"Run end-to-end, the 7,000 trunk tests would take about half an hour to execute. We split these tests up into subsets, and distribute those onto the 10 machines in our Jenkins cluster, where all the subsets can run concurrently..."
http://codeascraft.com/2011/04/20/divide-and-concur/
...so clearly running these tests on a single dev's machine would be a bottleneck. The other use case is the dev env: in a previous blog post they described how they use their own internal cloud to run the dev VMs faster on dedicated hardware (with easy, one-click provisioning):
http://codeascraft.com/2012/03/13/making-it-virtually-easy-t...
which makes sense. Why emulate prod on devs' own boxes when you can pool the hardware, get better utilisation, and at the same time run them faster?
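The arithmetic is simple: ~30 minutes of serial tests dealt across 10 machines gives roughly 3-minute builds if the subsets stay balanced. A minimal round-robin splitter shows the idea (a sketch only; the blog post's actual splitting scheme may weight subsets by historical test timings):

```python
def split_round_robin(tests, machines):
    """Deal tests out across machines; subset sizes differ by at most one."""
    subsets = [[] for _ in range(machines)]
    for i, test in enumerate(tests):
        subsets[i % machines].append(test)
    return subsets
```

Round-robin balances counts but not durations; a few slow tests landing on one machine makes that subset the bottleneck, which is why timing-weighted splitting is the usual refinement.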
Maybe worth looking at something like a small Fusion-io or other PCIe flash memory card.