
Kill Your Dependencies

409 points | twampss | 10 years ago | mikeperham.com

226 comments

[+] LukeB_UK|10 years ago|reply
I disagree, and this quote I've seen floating around the internet sort of sums the idea up to me (albeit with a music analogy):

> I thought using loops was cheating, so I programmed my own using samples. I then thought using samples was cheating, so I recorded real drums. I then thought that programming it was cheating, so I learned to play drums for real. I then thought using bought drums was cheating, so I learned to make my own. I then thought using premade skins was cheating, so I killed a goat and skinned it. I then thought that that was cheating too, so I grew my own goat from a baby goat. I also think that is cheating, but I’m not sure where to go from here. I haven’t made any music lately, what with the goat farming and all.

[+] babebridou|10 years ago|reply
The idea in the OP is that if you're going to use the Amen Break, don't require "Hiphop-all", "Breakbeat-all" or, hell, if we're going by some of the Wikipedia examples, "Futurama-all".

Just import "AmenBrother-drums" or something and start from there, because obviously you're not going to use Zoidberg's leftmost tentacle in your cool new sound.

https://en.wikipedia.org/wiki/Amen_break

[+] endemic|10 years ago|reply
I don't think that's what he's advocating: he's just saying that every dependency you have is another thing you have to worry about, so why not try to limit them as much as possible? Obviously it's impractical sometimes, and that's OK, as long as you understand the consequences.
[+] gglitch|10 years ago|reply
Charming quote. The same argument could be made about making your own food "from scratch." But it's not a solid refutation of the essay. Let's say that if one end of the spectrum is unlimited dependencies and complete indifference to the complexity and size of the project, and the other end of the spectrum is raising your own goats, there must be an ideal somewhere in the middle.
[+] kibwen|10 years ago|reply
This reminds me of Objectivist-C:

http://fdiv.net/2012/04/01/objectivist-c

"In Objectivist-C, software engineers have eliminated the need for object-oriented principles like Dependency Inversion, Acyclic Dependencies, and Stable Dependencies. Instead, they strictly adhere to one simple principle: No Dependencies."

[+] BurningFrog|10 years ago|reply
'If you wish to make an apple pie from scratch, you must first invent the universe.'

  — Carl Sagan
[+] goblin89|10 years ago|reply
If we’re talking recording, it doesn’t matter that much how exactly you arrive at some fixed audio sequence. Indeed, use loops or goats, borrow $xxxxx gear from friends, pull multi-terabyte libraries of perfectly sampled cello—as long as it floats your boat (and your target audience’s. If you make stock progressive house for clubs, goat involvement probably isn’t worth it). By way of knob tweaking and some applied randomness you arrive at a take you like and then you delete everything else and let the goat roam free.

Things are different if we’re talking live performance. You’d be fine with that multi-terabyte cello library on stage, but not if sampling plugin brought opaque nth-party dependencies you can’t vouch for and have no time to properly wrap your head around. You don’t want surprises and you want to know for sure there’s enough performance leeway on your concert laptop.

There are steps you’d take to reduce the error margin while you perform that have no parallels in software engineering realm. Maybe you’d bounce your cello loops to audio to never have that computation happening in real time. Maybe you’d bring analog synths, which are bulky and pricey but also simpler conceptually, self-sufficient, well-tested and never give blue screen of death.

Your song is essentially being created every moment your program is running or being developed. You're putting even more trust in what you're pulling in—the black-box abstraction boundaries around your dependencies can contract, the input you receive from the audience is more direct, and importantly there's no "you" actually playing instruments and directing the performance, because the whole behavior is defined by algorithms that often end up partly delegated to dependencies.

[+] justaaron|10 years ago|reply
that's a silly/wrongful quote IMO... I've produced/recorded music for over 20 years and loops from other people are not only cheating, they are not your damned recordings and you sound like every other loop-arranger out there... it's not really that difficult to record real-world sounds and guess what!? you can use those same audio tools (your daw + plugins etc) to manipulate and sweeten YOUR recordings just as easily as you do with others recordings aka loops. making music with sample-pack loops is basically just dj'ing/remixing... which I also have done for over 20 years so I know the difference... Using boughten drums is not cheating, as it's generally about the way you tune them and play them that make the difference, versus loop manipulation that literally is about modifying otherwise sample-for-sample copies... Guitar tonality = 80% in the fingers and the ear/mind of the player... The pickups and amp/pre-amp combination DO make the other 20% perhaps... As for making drums from goat-skins, well it's not exactly rocket science...

now, onto the problem I have with the actual issue at hand: it's not like the author is saying to write your libraries in assembler. he's saying that maybe you don't need to include gems upon gems that themselves reference other gems, as the dependencies pile up and get ridiculous and the performance suffers, which is actually a major concern with Rails and other monolithic frameworks with single-threaded processing etc. for crying out loud, he's even giving you permission to specify your behaviors with a ridiculously high-level concise language aka Ruby... how much more direct and obvious can a point be and still be missed?

[+] AndyMcConachie|10 years ago|reply
There's always going to be a balance between reusing code someone else has written, and writing new code. They both have their pros and cons.
[+] reedlaw|10 years ago|reply
The idealized goat farmer will make a better musician than the one who simply uses others' samples. To bring the analogy back home, the programmer who understands silicon and circuits is better equipped than the one who relies on massive dependencies.
[+] ninjakeyboard|10 years ago|reply
Regardless of how perfectly this does or doesn't match the context, I approve as an electronic musician and programmer and I will use this quote forever! Thank you! www.soundcloud.com/decklyn
[+] mwcampbell|10 years ago|reply
A large number of dependencies is only a problem in environments that aren't amenable to per-function static linking or tree-shaking. These include dynamically typed languages like Python, Ruby, and JavaScript (except when using the Google Closure Compiler in advanced mode), but also platforms like the JVM and .NET when reflection is allowed. Where static linking or tree-shaking is feasible, the run-time impact of bringing in a large library but only using a small part of it is no more than the impact of rewriting the small part you actually use.

Edit: Dart is an interesting case. It has dynamic typing, but it's static enough that tree-shaking is feasible. Seth Ladd's blog post about why tree-shaking is necessary [1] makes the same point that I'm making here.

[1]: http://blog.sethladd.com/2013/01/minification-is-not-enough-...

[+] jerf|10 years ago|reply
The run time resource usage risks are only one element of deep dependency trees, and certainly the one I worry about least. The biggest risks are the fact that you've handed every person in your dependency tree developer status in your project. That includes not just potential maliciousness, though that is a factor, but potentially disappearing the project entirely, potentially taking it in an unexpected direction, potentially introducing bugs (oh that all projects merely monotonically got better), potentially introducing security vulnerabilities in deep parts of the stack you wouldn't even think to audit, diamond dependencies, difficult-to-replicate builds, etc. There are then tools that can strike back at some of these, but there's something to be said for avoiding the problem.

For that matter, there's no guarantee that tree-shaking would even have fixed the referenced issue; if the library preloaded 10MB of stuff, like a Unicode definition table, that you didn't use, but the tree shaker couldn't quite prove you never would, you'll still end up with it loaded at runtime. (For that matter, you may very well be using such code even though you don't mean to, if, for instance, you have code that attempts to load the table, and uses it if it loads for something trivial, but will just keep going without it if it is not present. The tree shaker will determine (correctly!) that you're using it.)
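The load-it-if-present pattern described above can be sketched in Ruby (`big_unicode_table` is a hypothetical gem standing in for the 10MB table; the point is that the `require` keeps the code reachable for any tree shaker):

```ruby
# A sketch of the pattern above: optionally load a big table, and
# quietly keep going without it. "big_unicode_table" is hypothetical.
def unicode_table
  @unicode_table ||= begin
    require "big_unicode_table"  # the 10MB dependency a shaker must keep
    BigUnicodeTable::DATA
  rescue LoadError
    {}  # trivial fallback: carry on without the table
  end
end

puts unicode_table.size  # 0 when the gem is absent
```

Because the `require` sits on a live code path, a tree shaker would (correctly) conclude the table is used, even though the program works fine without it.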

Basically, tree shaking only sort of kind of addresses one particular problem that deep dependencies can introduce, and that one not even necessarily reliably and well.

[+] nostrademons|10 years ago|reply
Not always. Dependencies were a huge problem at Google, even in C++ (perhaps especially in C++), because they mean that the linker has to do a lot of extra work. And unlike compiling, linking can't be parallelized, since it has to produce one binary. At one point the webserver for Google Search grew big enough that the linker started running out of RAM on build machines, and then we had a big problem.

There's still no substitute for good code hygiene and knowing exactly what you're using and what additional bloat you're bringing in when you add a library.

[+] skybrian|10 years ago|reply
Tree shaking is helpful but not enough. It makes dependencies more fine-grained and binaries smaller by removing some false sharing. But library maintainers still have to be careful about true sharing, where a function calls another function, which in turn pulls in something big (like a lot of data stored in a constant).

You need both tree shaking and a community dedicated to keeping code small.

Javascript has the latter; it's not universally true, but lots of JavaScript libraries pride themselves on small code size and few dependencies.

That's great. But you can't stop doing that work just because you have a tree-shaking compiler. For example, there's a lot of work going into making Angular 2 apps reasonably sized and dart2js doesn't magically make it go away.

[+] rlpb|10 years ago|reply
I can't disagree with you, but you're also missing other issues related with having a big pile of dependencies. Maintainability being one. Runtime efficiency isn't the only problem that can be solved here.
[+] shoover|10 years ago|reply
Yeah, but it seems like you listed most of the platforms most people actually use, all in the X column. How do we get from 100MB Electron deployment and fat, partially used jars and dlls to this magical tree-shaking world?
[+] x0x0|10 years ago|reply
The jvm isn't really penalized because the concurrency model tends to be threads, rather than processes, so you don't pay a per-worker cost for a large library. Plus the jvm will optimize only the actually used code, including de-virtualizing.
[+] outworlder|10 years ago|reply
It helps somewhat, but I feel that the "only is a problem" assertion is too strong.

Tree shaking doesn't help you when you are transitively pulling in every HTTP client in existence. It's still code that gets run, so it can't be automatically optimized away, but it's unnecessary all the same.

[+] _0w8t|10 years ago|reply
The causal link between tree-shaking-compatible languages and accepting a large number of dependencies is not that clear. It could be that those who use a large number of dependencies simply prefer less dynamic languages, where dependencies are very explicit and manageable.
[+] TazeTSchnitzel|10 years ago|reply
PHP doesn't have tree-shaking, as it isn't compiled, but it does have autoloading: the source code files for classes are only loaded from disk when they're instantiated. This is probably similarly beneficial.
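Ruby has a close analogue in `Module#autoload`, which defers the `require` until the constant is first referenced — a minimal sketch (the `HeavyParser` class and temp file are made up for the demo):

```ruby
require "tmpdir"

# write a stand-in "heavy" file whose loading we can defer
dir = Dir.mktmpdir
File.write(File.join(dir, "heavy_parser.rb"), <<~RUBY)
  class HeavyParser
    def parse(s)
      s.strip
    end
  end
RUBY
$LOAD_PATH.unshift(dir)

autoload :HeavyParser, "heavy_parser"  # registered, nothing loaded yet

# the file is only required here, at first reference:
puts HeavyParser.new.parse("  hi  ")
```

As with PHP's autoloading, the deferred cost helps startup but doesn't shrink what ultimately runs.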
[+] diggan|10 years ago|reply
As with everything, I think a bit of balance is needed.

You're doing a quick MVP to demonstrate that your idea is working? Fuck it, just throw in dependencies for everything, just care about solving the problem you're trying to solve and proving/disproving your point.

Once you've verified it, then go and kill your dependencies. But don't do it just because you want to do it. If in the end the users don't benefit from you optimizing your dependencies, why do it? (Speaking from a product side rather than an OSS project used by other projects.)

Not sure KILL ALL DEPENDENCIES is helpful, but I'm not sure that MAKE EVERYTHING A DEPENDENCY is helpful either so...

[+] st3v3r|10 years ago|reply
That'd be good advice if those MVPs didn't so often become the actual product themselves. If industry and management understood that these things were proof of concepts, and realized that the actual product is going to have to be rewritten, then I'd agree with you.
[+] badloginagain|10 years ago|reply
Ruby is great for prototyping, because it's so easy to get things up and running.

The big thing is transitioning to any kind of final production code. The rules for clean code apply as much to Ruby as they do to any other language.

But it's a good post; this can easily be something you overlook, due to gems being so damn convenient.

[+] rhapsodyv|10 years ago|reply
Perfect. Get things done fast. Prove your product. If it succeeds, you will have time to tune every aspect and invent your own wheel that fits your needs. But until then, RAM is a lot cheaper than your own time writing from scratch your version of things that are very stable and largely used.

But the advice is really important for gem writers. As a gem author, I think you need to think a little more about your dependencies, as you do with your public interface.

[+] fixermark|10 years ago|reply
s/kill/understand/, which is useful advice for software engineering in general.

As time approaches infinity, the number of magic "I use this package and it does something in my code, and then it all just works" dependencies you pull in should approach 0.

[+] allendoerfer|10 years ago|reply
To me, this seems more like an argument for optimizing beyond your own stack. Don't kill your own dependencies.

Your app uses too much memory? Improve a dependency; you have now improved other people's apps, too.

Your app uses too many dependencies in total? Try to get all your first-level dependencies to standardize on the best HTTP client. (Which he is partially doing with his post.)

Dependencies may have problems, but shared problems are better than problems only you have.

[+] sargas|10 years ago|reply
I agree 100% with this.

I used to bring in dependencies with the "don't reinvent the wheel" mentality. Then I realized how much trust I'm giving to the authors of all dependencies I pull in. Now I tend to do my best to understand the dependencies I bring so I can improve them if I can.

The only problem I find with this decision is when I make an improvement or fix a bug in a dependency, and the project is either inactive or the authors don't give a crap about the work.

[+] drunkenazi|10 years ago|reply
Couldn't agree more. I think the title is unreasonably one sided, and saying "be part of the solution" is equally one sided.

Dependencies are great for the reasons you specified, and I saw nothing in that article suggesting otherwise. The part that feels the worst to read is:

> Can I implement the required minimal functionality myself? Own it.

This is largely a judgement call; "can I" and "minimal functionality" are subject to change based on many external circumstances. "Own it" also seems to imply owning it not as a dependency, based on the context, but rather as a part of a monolithic whole.

It is also interesting that the Sidekiq product itself makes use of gem dependencies: five at the top level (excluding platform-specific ones), which (mostly due to Rails) expand out to many more. The message should not be to "kill your dependencies", because that mindset is outdated and slow.

So tired of hearing about how bad dependencies or scripting languages are. Would be much more excited to hear about how to contribute to open source dependencies, and how to write efficient scripts.

[+] simonw|10 years ago|reply
Another benefit to minimizing your dependencies is security. The less external packages you are using (especially packages without active, security-conscious maintainers) the less likely you are to suffer a surprise vulnerability due to something deep down in your dependency hierarchy.

This goes for client-side JavaScript too. XSS holes are one of the worst web app vulnerabilities out there and could easily be introduced accidentally by a simple mistake in a library. And this stuff is incredibly hard to audit these days thanks to the JavaScript community's cultural trend towards deeply nested dependencies.

[+] riffraff|10 years ago|reply
but otoh, if you try to reinvent something instead of using a tried & true library, you may well just add new bugs.

I.e. I'd 100% use libxml to sanitize XML rather than trying to reimplement XML parsing myself.
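In Ruby the same delegation can lean on the bundled REXML rather than hand-rolled string munging — a sketch only, not an endorsement of REXML over libxml bindings like Nokogiri:

```ruby
require "rexml/document"

# let a battle-tested parser handle the grammar instead of regexes
doc = REXML::Document.new("<user><name>Ada</name></user>")
puts doc.elements["user/name"].text
```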

As always, trade offs.

[+] ninjakeyboard|10 years ago|reply
I'm not 100% sure I agree with this as stated. Sure if the functionality is in core lib, use it but... it depends...

Consider these three statements:

- No code runs faster than no code.
- No code has fewer bugs than no code.
- No code is easier to understand than no code.

For a language like Scala, where there is no JSON processing in the standard lib, if there is a JSON library that is battle-tested, then by removing my own JSON code and leaning on that well-tried and tested code for serialization/de-serialization, I've removed a whole bunch of code from my own library. The whole point of having modules as abstractions is to keep concerns neatly tucked in their own places and to increase re-use. By subscribing to the idea that my module should implement all of the functionality it needs, we're losing the benefits of modularization.

I just went through this exercise myself in a library I maintain - I removed my own json code and put a library in. I removed a bunch of code and made the whole thing simpler by leaning on that abstraction.
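The abstraction-boundary point can be made concrete with a thin seam, here using Ruby's stdlib `json` (the `Serializer` module and its method names are illustrative, not from any particular library):

```ruby
require "json"

# callers depend on this tiny wrapper, never on JSON directly,
# so the backing serializer can be swapped in exactly one place
module Serializer
  def self.dump(obj)
    JSON.generate(obj)
  end

  def self.load(str)
    JSON.parse(str)
  end
end

round_tripped = Serializer.load(Serializer.dump({ "a" => 1 }))
puts round_tripped["a"]
```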

[+] rileymat2|10 years ago|reply
"The mime-types gem recently optimized its memory usage and saved megabytes of RAM. Literally every Rails app in existence can benefit from this optimization because Rails depends on the mime-types gem transitively: rails -> actionmailer -> mail -> mime-types."

It seems like this could also be cast as a major success for "semi" standard dependencies.
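That transitive chain is exactly what a Gemfile.lock records, and flattening it is mechanical — a sketch over an abbreviated, hypothetical lockfile fragment:

```ruby
# abbreviated, hypothetical Gemfile.lock content
lockfile = <<~LOCK
  GEM
    specs:
      actionmailer (7.0.0)
        mail (~> 2.5)
      mail (2.8.1)
        mime-types (>= 1.16)
      mime-types (3.5.1)
LOCK

# each 4-space-indented "name (x.y.z)" line is one resolved gem;
# deeper-indented lines are its declared dependencies
resolved = lockfile.scan(/^    (\S+) \(\d[^)]*\)$/).flatten
puts resolved.inspect
```

Every gem in `resolved` is one more project whose fixes (or regressions) every downstream app inherits.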

[+] laumars|10 years ago|reply
This article would be more accurately written as "prefer the standard library over 3rd party solutions" since all the examples given still required dependencies, but ones that are shipped as part of the language runtime (Ruby in this case).

However, when discussing languages with no specific standard library, or languages whose standard library is missing feature y, it's quite understandable to use a battle-tested 3rd party dependency. In fact I'd go further and say it would be advisable to use a respected 3rd party library when dealing with code which handles security or other complex concepts with high failure rates.

[+] Animats|10 years ago|reply
Avoid shims.

There are lots of libraries that just put one interface on top of another interface. They don't do much actual work. Pulling in shims, especially if they pull in lots of other stuff you're not using, should be avoided.

If the dependency does real work you'd otherwise have to code, then use it.

[+] dawnerd|10 years ago|reply
cough Mongoose. Been moving away from it on my projects. While it does provide a nice interface, it just creates more work down the road.
[+] peterwwillis|10 years ago|reply
Perl apps have thousands upon thousands of dependencies. It's intentional - reused code in CPAN means less downloading, more efficient code, and fewer bugs as the codebase gets refined. An app that relies mostly on dependencies is essentially an app with free support from hundreds of developers. That's the case with CPAN anyway; I don't know how Ruby people do things.

Bugs happen, though. If you see a bug in a dependency, it is your job to report it at the very least, if not make an attempt to fix it. Without this community of people helping to improve a common codebase, we'd all be writing everything from scratch, and progress would move a lot slower.

[+] nickpsecurity|10 years ago|reply
Obligatory essay from PHK on the effect the author describes:

http://queue.acm.org/detail.cfm?id=2349257

History continues to repeat itself. Fake reuse and proliferation of unnecessary bloat are two of those recurring themes. Fight it whenever you can. The old TCL, LISP, Delphi, REBOL, etc clients and servers were tiny by modern standards. They still got the job done. Strip out bloat wherever you can. Also, standardize on one tool for each given job plus hide it behind a good interface to enable swapping it out if its own interface isn't great.

[+] vinceguidry|10 years ago|reply
Gems I use fall into three categories.

A lot of my projects are just wrappers around one main gem. Rails, Nokogiri, Roo, API wrapper gems. These are 'project gems'. If they give me problems, I'll re-evaluate the scope of the project and perhaps pick another gem to orient the project around. Once the project reaches maturity, I'll default to fixing the problem rather than re-engineering it unless the problems run deep.

Sometimes I'll use gems like Phoner to handle datatypes that are too tricky to do with regular Ruby. I'll call these 'utility gems'. When I include a utility gem, generally it has one job and one job only, it's invoked in exactly one place in the code and gets included in that file. I can generally replace a utility gem with stdlib Ruby code if I really need to.
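The "replace a utility gem with stdlib" move might look like this for the phone case — a deliberately narrow, US-only sketch (Phoner does far more; `normalize_us_phone` is a made-up helper):

```ruby
# hypothetical stdlib-only stand-in for a small utility gem
def normalize_us_phone(raw)
  digits = raw.gsub(/\D/, "")                # keep digits only
  digits = digits.delete_prefix("1") if digits.length == 11
  return nil unless digits.length == 10      # reject anything else
  format("(%s) %s-%s", digits[0, 3], digits[3, 3], digits[6, 4])
end

puts normalize_us_phone("1-415-555-0100")  # => "(415) 555-0100"
```

The trade is visible: ten lines you own and understand, versus international coverage you'd have to build yourself.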

I also have what I call 'infrastructure gems'. These are gems like pry, capistrano, and thor that I tend to include in every project where it seems they would be useful. These are gems that are worth getting to know very well because they solve really hard problems that you don't want to use stdlib for. If these give me problems I will do whatever I need to to resolve them and understand why the problem exists, because the costs of migrating off of them would be steep.

The decision to use a gem should not be taken too lightly, but nor should it weigh heavily on the mind. Be quick to try it out, but also quick to take it out.

[+] EGreg|10 years ago|reply
I was just thinking about this today. But from the point of view of growing a community around a platform!

Would you want to have one namespace for "official" modules and heavily influence everyone to use them? That's centralization (of governance). But, it's not centralization of a process that requires high availability. So the "drawback" is only that you centralize control and can make certain guarantees to developers on your platform.

When you're starting an ecosystem, you can choose a "main namespace" as yum, npm etc. do, or you can choose the more "egalitarian" convention of "Vendor/product" as GitHub and Composer do. I think, in the end, the latter leads to a lot more proliferation of crap and, as the article said, multiple versions of everything existing side-by-side.

I have to deal with these issues when designing our company's platform (http://qbix.com/platform) and I think that having a central namespace is good. The platform installer etc. will make it super easy to download and install the "official" plugins. You can distribute your own "custom" plugins but the preferred way to contribute to the community would be to check what's already there first and respect that. If you REALLY want to make an alternative to something, make it good enough that the community's admins protecting the namespace will allow it into the namespace. Otherwise, promote it yourself, or fork the whole platform.

[+] BinaryIdiot|10 years ago|reply
This is a great read that can be applied to node.js very much. I've seen apps that include 10, maybe 20 dependencies but when you flatten out the full dependency tree? Thousands. It's incredible and if one of those dependencies screws up semantic versioning or just screws up in general it can be a nightmare to debug and fix.

This is why in every 1.0 product I work on I include every dependency that speeds up my development. In 2.0 the first thing to do is prune all unnecessary dependencies and start minor rewrites where a dependency can be done in house (yeah yeah, reinventing the wheel is a problem, but most npm dependencies are small and many can be recreated internally without issue).

This is even more important if you're creating a library / module. My msngr.js library uses zero dependencies and yet can make http calls in node and the browser because it was easy to implement the minimal solution I needed without bringing in dependencies to support a single way of calling http.
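The zero-dependency HTTP idea translates directly to Ruby's stdlib — a sketch assuming `Net::HTTP` suffices for the call site (the URL is a placeholder, and the actual send is left in a comment so the sketch stays offline):

```ruby
require "net/http"
require "uri"

uri = URI.parse("https://example.com/status?verbose=1")
req = Net::HTTP::Get.new(uri)  # stdlib request object, no client gem

puts req.path
# sending it for real would be:
#   Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |h| h.request(req) }
```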

[+] nextos|10 years ago|reply
This sadly also happens in some Linux repositories that add too many dependencies to a few key packages.

On NixOS, last time I tried, installing mutt ended up bringing in Python as well.

[+] yoz-y|10 years ago|reply
> No code runs faster than no code.
> No code has fewer bugs than no code.
> No code uses less memory than no code.
> No code is easier to understand than no code.

The dependencies you decide to implement yourself in a minimal fashion are code though. I generally agree with the article, but in the end It Depends™

[+] dec0dedab0de|10 years ago|reply
It sounds like this is advocating NIH syndrome. If a library is going to make my job easier I'm going to use it, unless there is a very specific benefit of doing it myself.
[+] justinator|10 years ago|reply
Perl takes a pragmatic take on this (among other takes...) with the collection of ::Tiny CPAN modules that just do one thing pretty OK. Things like Try::Tiny help immensely with exception handling - something you don't really want to roll your own version of.

It itself does not have any dependencies that aren't in core:

http://deps.cpantesters.org/?module=Try%3A%3ATiny;perl=lates...

[+] ocdtrekkie|10 years ago|reply
So, I've been writing a home automation system using the .NET Framework (with Visual Basic, I'll wait until you finish laughing).......... Okay.

I've made a point not to add any third-party references and packages I can avoid. I went ahead and got a third-party scheduling engine, and the SQLite provider, but beyond that, I'm writing everything else myself so far.

First of all, I'm learning a lot in having to write stuff myself. At the very least, it's a great educational experience. I've worked with a lot of code samples, so I'm not going totally from scratch, but they're all at the very least tailored to my needs.

But for me, the big thing is keeping everything thin. The program loads in milliseconds. Almost all of the reference data for what it's built on is in one place (the .NET Framework Reference). And the key is that the features my program supports are the features I want and need, not the features some dependency has told me to have.

The biggest dependency I have, Quartz.NET, is actually the most confusing part. It's not structured like the rest of my program, its documentation leaves some things to be desired, and it does a lot more than I need it to. There's a lot of bloat I could cut out if I wrote my own scheduler, and maybe someday I will.

[+] cdnsteve|10 years ago|reply
Double-edged sword. Deps are great! Functionality added quickly. Deps are terrible! They broke my app.

If your app has a long shelf life, the fewer deps you rely on, the easier it is to manage, from what I've seen.

For some reason Golang feels like it makes sense here. Pretty much everything you need is in core. *Disclaimer, I don't have any Golang apps in prod but I'd love to hear from those that do.