Bootsnap: Optimizing Ruby App Boot Time

285 points| Finbarr | 8 years ago |engineering.shopify.com

74 comments

[+] JohnBooty|8 years ago|reply
Wow! This is potentially a real gain for us. We have a big in-house, monolithic Rails app.

My initial experiment was encouraging. Boot time in development mode went from ~23sec to ~16sec, and I only enabled it for the main engine that comprises about 85% of our codebase so the real gains might be larger.

Looking forward to seeing what it can do in production mode - our boot times there are horrendous and it's a big deal for things like cron jobs. Thank you to all those who worked on this.

[+] chrisseaton|8 years ago|reply
In the implementation of Ruby that I work on, TruffleRuby, we've been exploring lazy parsing, where the parser will find a method but not fully parse it until the method is called for the first time. I wonder if there are any other modifications you could make to the VM itself to improve startup time.
[+] burke|8 years ago|reply
The particularly frustrating thing, when I've started thinking about optimizing boot time at a VM level, is that it's near-impossible to "understand" what loading a file actually does, since it's all just evaluated in a single namespace.

It would be great if we somehow had a way to load a module-as-file without unknown side-effects, and without depending so deeply on the other contents of the global namespace.

But this is basically describing a complete overhaul of most of what makes ruby ruby, so... ¯\_(ツ)_/¯

[+] rurban|8 years ago|reply
I'm also planning to add lazy parsing to perl. Do you store the whole string of the method body or do you mmap your source files, and store only the mmap ptr, offset and length for the body?
[+] tomstuart|8 years ago|reply
How difficult is lazy parsing for Ruby? How much parsing do you need to do just to find where the method body ends?
[+] burke|8 years ago|reply
I'm the primary author, can answer questions if you have any.
[+] johne20|8 years ago|reply
Nice work Burke! I saw the title, and I immediately thought that has to be Burke. Sure enough... :)
[+] wolco|8 years ago|reply
Curious: when did Shopify go with Ruby/Rails? If I remember correctly, when the company initially started they were looking for PHP developers. Was the original stack built in Ruby/Rails?
[+] daviding|8 years ago|reply
Does it work on a Heroku stack?
[+] dobs|8 years ago|reply
Gave this a quick shot on my own monolithic app and it cut startup time almost in half. Impressive considering how easy it was to configure!

Startup time was one reason we started migrating away from Rails in a previous workplace, between frustrating startup time in development and test and occasional quirkiness of zeus and spring. Bootsnap would have been a godsend.

[+] burke|8 years ago|reply
I've tossed around the idea of writing zeus again now that I actually understand the language I wrote it in. Spring is much simpler, but because of the manner in which it's loaded, it isn't capable of detecting certain types of file change, which reduces developer confidence in it.

Zeus is capable of detecting any sort of invalidating file change, but is pretty buggy (or at least was historically -- the Stripe guys improved it a lot after I stopped working on it).

[+] jitl|8 years ago|reply
This is awesome, and I'd love to use it for the command-line dev tools that I write. Unfortunately this gem requires Ruby 2.3+, but macOS built-in Ruby, which is the Ruby we target, is only 2.0.0.

Does anyone know of a good solution for prebuilt, relocatable Rubies on macOS that I could easily bundle with my tool? I'm reluctant to use Homebrew or another package manager like rbenv, where I'd have to implement a non-trivial bootstrap process. Phusion's travelling-ruby project would be perfect, but it's unmaintained.

I just want my CLI to boot in 0.05s without needing to change languages. Love Ruby, but getting decent perf takes a bit of effort.

[+] burke|8 years ago|reply
Sadly, the 2.3 requirement is inherent, since the RubyVM::InstructionSequence dump/load API was introduced in 2.3.0. However, you could probably still benefit from http://github.com/byroot/bootscale.
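
For readers unfamiliar with the API mentioned above, here is a minimal sketch of the dump/load round trip on MRI 2.3+. The source string and the variable names (`source`, `binary`, `loaded`) are illustrative assumptions; this shows only the core mechanism, not bootsnap's actual caching code.

```ruby
# Compile Ruby source into an instruction sequence (MRI only).
source = "[1, 2, 3].map { |x| x * 2 }"
iseq = RubyVM::InstructionSequence.compile(source)

# Serialize the compiled bytecode to a binary String (available since Ruby 2.3).
binary = iseq.to_binary

# Later: rebuild the instruction sequence from the binary, skipping the
# parse/compile step, and evaluate it.
loaded = RubyVM::InstructionSequence.load_from_binary(binary)
result = loaded.eval

puts result.inspect
```

A cache built on this would presumably need to key entries on the Ruby version and the source file's contents or mtime, since the binary format is not portable across Ruby versions or architectures.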
[+] jacobevelyn|8 years ago|reply
As a fellow Ruby CLI developer, I feel your pain exactly. I've been planning on exploring Traveling Ruby[^1] for exactly this reason (as well as the fact that telling users they need to `sudo gem install` something is non-ideal) but hadn't yet gotten around to it.

Out of curiosity, what's your tool(s)?

[^1]: https://github.com/JacobEvelyn/friends/issues/160

[+] dismantlethesun|8 years ago|reply
I'm kinda shocked that Ruby boot times can be up to 25 seconds for a monolithic app.

A Python project I work on has 279,124 lines of code and boots up in 2.5 seconds.

Without downloading it, all I can find is Discourse had 60,000 lines of code 3 years ago [1]. Assuming as an extreme estimate they tripled their code size in 3 years, we have 180,000 LOC taking 6 seconds to boot up according to the article.

Is this normal for Ruby? Is the author using a spinning disk drive rather than an SSD?

[1] https://github.com/bleonard/rails_stats

[+] burke|8 years ago|reply
The largest culprit for slow ruby boot times is an O(n) number of syscalls over the LOAD_PATH each time `require` is called, so the number of syscalls is essentially O(n^2) in the number of gems. The load-path-caching feature of bootsnap (cf. bootscale) fixes this, and accounts for a reduction from 25 to ~9.5 seconds. The iseq/yaml caching only accounts for the last ~3.5 seconds.
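
A rough sketch of why the load-path scan is quadratic, and what a cache buys you. Everything here is a simplified assumption for illustration: `naive_resolve` and `build_index` are hypothetical helpers, not bootsnap or bootscale internals, and real `require` also probes native-extension suffixes, which this ignores.

```ruby
require "tmpdir"
require "fileutils"

# Naive lookup: probe each load-path directory in order until the file is found.
# Returns [resolved_path, number_of_filesystem_probes].
def naive_resolve(load_path, feature)
  probes = 0
  load_path.each do |dir|
    probes += 1
    candidate = File.join(dir, "#{feature}.rb")
    return [candidate, probes] if File.file?(candidate)
  end
  [nil, probes]
end

# Cached lookup: scan every directory once and remember where each feature lives.
def build_index(load_path)
  index = {}
  load_path.each do |dir|
    Dir.glob(File.join(dir, "*.rb")).each do |path|
      # Keep the first match so earlier load-path entries take precedence.
      index[File.basename(path, ".rb")] ||= path
    end
  end
  index
end

total_probes, index_size = Dir.mktmpdir do |root|
  # Simulate 50 gems, each adding one directory containing one file.
  load_path = (0...50).map do |i|
    dir = File.join(root, "gem#{i}", "lib")
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "feature#{i}.rb"), "")
    dir
  end

  probes = (0...50).sum { |i| naive_resolve(load_path, "feature#{i}").last }
  [probes, build_index(load_path).size]
end

puts "naive probes for 50 requires: #{total_probes}"
puts "index entries after one scan: #{index_size}"
```

With 50 directories and one file in each, the naive approach makes 1 + 2 + ... + 50 = 1275 probes, while the index is built from a single pass over the 50 directories and each later lookup is a hash read.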
[+] EvilTrout|8 years ago|reply
Discourse co-founder here.

I'm not sure about those stats you posted from 3 years ago since they aren't using the same `rake stats` numbers that are built in to Rails. Discourse's Rails app is currently 63k SLOC not including tests.

On my relatively fast computer booting takes 4s without bootsnap and 2.5s with it, which is a nice quality of life improvement.

[+] jitl|8 years ago|reply
In this case, you need to analyze not only the application's source code, but also the size and quantity of its dependencies, which inflate Ruby's LOAD_PATH, which as discussed makes `require` slow. The issues raised here are typical for a large Ruby application with many gem dependencies.

I think it's safe to assume the author is using reasonable SSDs on a MacBook Pro, given that the iseq cache targets only macOS.

[+] choward|8 years ago|reply
Does your Python app load all at once or lazily load as you hit different parts of the app?
[+] burke|8 years ago|reply
FWIW, the machine that generated all of those times:

* macOS Sierra

* 2.6 GHz Intel Core i7

* 16GB 2133 MHz LPDDR3

* 500GB SSD, whatever Apple ships.

[+] est|8 years ago|reply
If you try zc.buildout, your Python code's start time will drop significantly. It inserts a gazillion entries into sys.path.
[+] Cerium|8 years ago|reply
Thanks for releasing this, I gave it a try.

Starting benchmark time: 13.05 seconds. With load_path_cache: 10.01 seconds

Sadly, with compile_cache on I'm getting an error. /vendor/bundle/ruby/2.3.0/gems/bootsnap-0.2.14/lib/bootsnap/compile_cache/iseq.rb:30:in `fetch': No space left on device (Errno::ENOSPC)

Any ideas on what causes this?

[+] burke|8 years ago|reply
Yep, you're probably using linux. The cache backend for compiled artifacts is filesystem extended attributes, which have a maximum size of 64MB on darwin, but as little as 4kB on some linux configurations (if they're even enabled, which they often are not).

Practically speaking, the compilation caching features are not supported on linux. Eventually we'll change the cache backend or add a different one that does work on linux.

[+] ausjke|8 years ago|reply
Considering PHP7, Java8/Kotlin, Go, C++17, Python3, JavaScript/ES6 etc. these days, how will Ruby be doing in the long run? Is there any reason for newcomers to pick up Ruby instead of the languages on that list? I just started using PHP myself.
[+] deedubaya|8 years ago|reply
Avoiding a flame war, it depends on what your goals are.

From a language standpoint: Ruby emphasizes developer happiness at the expense of some things, like performance/concurrency for example.

From a career standpoint: There is a lot of ruby in the world today. There will be lots of applications to maintain as the years go on, which is +1 from a career perspective. Lots of people will also continue to write new ruby software, because it's effective and easy to be productive in.

All the languages you mentioned + ruby are all good languages to learn for various reasons. All have their weaknesses and strengths. None of them are an effective hammer for every nail you'll encounter.

[+] qmr|8 years ago|reply
PHP is an absolutely horrible language. I suggest learning ruby or python.
[+] quotha|8 years ago|reply
5.times { print "Odelay!" }
[+] zapt02|8 years ago|reply
I think Ruby is essentially dead in the water. It just has no unique selling point. Python is better at small scripts, machine learning and mathematical applications. PHP7 has much better tooling for HTTP, the largest CMS systems and doesn't have any boot time to speak of. JS & Node is the new kid on the block with tons of great libraries being written for it. Why would you start to learn Ruby today?
[+] omarforgotpwd|8 years ago|reply
Might have missed something, but why not just merge these changes into Rails?
[+] yxhuvud|8 years ago|reply
For starters, because it only works on mac.
[+] misterbowfinger|8 years ago|reply
Are there plans to support JRuby?
[+] burke|8 years ago|reply
No. I'm not opposed to it, but we don't use it at Shopify and I doubt the RubyVM::InstructionSequence API is compatible.

Bootscale should work, and the load-path-caching feature of bootsnap should work too, if you can get the gem to install.

[+] iagooar|8 years ago|reply
-
[+] burke|8 years ago|reply
Honestly, it works well for us. DHH may have been a little rosier than necessary: there are some downsides, to be sure, but we can mitigate them to a large extent (e.g.: TFA), and we get a lot of benefit out of the architecture.

It is definitely a net positive for us. YMMV, of course.

[+] noir_lord|8 years ago|reply
> In 2017 there is really no reason to defend a monolithic architecture.

I wonder if in 2019 I'll be seeing "In 2019 there is really no reason to defend a micro-services architecture".

The pendulum it keeps on swinging.