
Software reuse is more like an organ transplant than snapping Lego blocks (2011)

667 points | nilsandrey | 5 years ago | johndcook.com

228 comments

[+] _bxg1|5 years ago|reply
This makes me realize: the one example I can think of where software has been "Lego-like" is Unix piping. I'm not a Unix purist or anything, but they really hit on something special when it came to "code talking to other code".

Speculating about what made it stand apart: it seems like the (enforced) simplicity of the interfaces between pieces of code. Just text streams; one in, one out, nothing more nothing less. No coming up with a host of parallel, named streams that all have their own behaviors that need to be documented. And good luck coming up with a complicated data protocol built atop your text stream; it won't work with anything else and so nobody will use your program.

Interface complexity was harshly discouraged just by the facts-on-the-ground of the ecosystem.

Compare that with the average library interface, or framework, or domain-specific language, or REST API, etc. etc. and it becomes obvious why integrating any of those things is more like performing surgery.
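The enforced simplicity is easiest to see in a throwaway pipeline. A minimal sketch (the data is invented): each stage reads lines on stdin and writes lines on stdout, and the stages compose with no knowledge of each other.

```shell
# Four single-purpose tools chained through plain text streams.
# Find the most frequent line: sort groups duplicates, uniq -c counts
# them, sort -rn ranks by count, head keeps the winner.
printf 'apple\nbanana\napple\ncherry\napple\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1
```

None of these tools was designed with the others in mind; the "one text stream in, one text stream out" contract is the only coordination they need.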

[+] ohazi|5 years ago|reply
I think the best way to describe pipes to the uninitiated is in terms of copy & paste.

Copy & paste is the basic, ubiquitous, doesn't-try-to-do-too-much IPC mechanism that allows normal users to shovel data from one program into another. As simple as it is, it's indispensable, and it's difficult to imagine trying to use a computer without this feature.

The same applies to pipes, even though they work a little bit differently and are useful in slightly different situations. They're the "I just need to do this one thing" IPC mechanism for slightly more technical users.

[+] epr|5 years ago|reply
Rich Hickey addresses this in his famous 2011 talk, titled Simple Made Easy.

> Are we all not glad we don’t use the Unix method of communicating on the web? Right? Any arbitrary command string can be the argument list for your program, and any arbitrary set of characters can come out the other end. Let’s all write parsers.

The way I think about this is that the Unix philosophy, which this behavior undoubtedly represents, sits at one end of a spectrum, with something like strict typing at the other end. Rich, being a big proponent of what the article describes as "Lego-like" development, clearly prefers neither end of the spectrum but something in between. In my opinion as well, the future of software development is somewhere in the middle of this spectrum, although exactly where the line should be drawn is a matter of trade-offs, not absolute best and worst. My estimation is that seasoned developers who have worked in many languages and in a variety of circumstances have all internalized this.
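Hickey's "let's all write parsers" jab is about exactly this trade-off: at the flat-text end of the spectrum, every consumer has to recover the structure itself. A small illustration (the "name:score" line format is made up):

```shell
# Structured data arrives as bare text; each consumer must re-parse it.
# Here awk recovers the score field and sums it -- work a typed
# interface would have done once, at the boundary.
printf 'alice:9\nbob:7\n' | awk -F: '{ total += $2 } END { print total }'
```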

[+] nwienert|5 years ago|reply
In practice it’s totally inscrutable. I never remember or even feel comfortable guessing at anything more than the most basic. Meanwhile, any typed library in language X usually works immediately with no docs given a decent IDE.
[+] shadowgovt|5 years ago|reply
I think piping works well because it's an opinionated framework for IPC. The strong opinions it holds are:

1) data is a stream of bytes

2) data is only a stream of bytes

That's it. And it turns out that's a pretty powerful abstraction... Except it requires the developer to write a lot of code to massage the data entering and/or leaving the pipe if either end of it thinks "stream of bytes" means something different. In the broad-and-flat space where it's most useful (text manipulation) it works great, because every tool agrees on what text is (kind of... Pipe some Unicode into something that understands only ASCII and you're going to have a lousy day). When we get outside that space?

So while, on the one hand, it allows a series of processes to go from a text file to your audio hardware (neat!), on the other hand, it allows you to accidentally pipe /dev/random directly into your audio hardware, which, here's hoping you don't have your headphones turned all the way up.

This example also kind of handwaves something, in that you touched on it directly but called it a feature, not a bug: pipes are almost always the wrong tool if you do want structure. They're too flexible. They're the wrong API for anything where you can't afford mistakes, because unless you include something in the pipe chain to sanity-check your data, who knows what comes out the other end?
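The "include something in the pipe chain to sanity-check your data" idea can itself be a pipe stage. A sketch (the "key=integer" record shape is an assumption for illustration): grep acts as a validation gate, and malformed lines never reach the next stage.

```shell
# grep as a validity filter in the middle of a chain: only lines
# matching the assumed "key=integer" shape pass through; everything
# else is silently dropped before the next stage sees it.
printf 'x=1\ngarbage line\ny=22\n' \
  | grep -E '^[a-z]+=[0-9]+$'
```

Of course "silently dropped" is doing a lot of work here, which is rather the commenter's point: the pipe itself offers no way to signal that something was rejected.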

[+] taneq|5 years ago|reply
> I'm not a Unix purist or anything, but they really hit on something special when it came to "code talking to other code".

I agree, and it just hit me while reading your comment that the special thing is not just that you can plug any program into any other program. It's that if one program doesn't work cleanly with another, this enforced simplicity means that you can easily modify the output of one program to work with another program. Unix command-line programs aren't always directly composable but they're adaptable in a way that other interfaces aren't.

It's not great for infrastructure, don't get me wrong. This isn't nuts and bolts. It's putty. But often putty is all you need to funnel the flow of information this one time.
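That "putty" quality fits in one line: when producer and consumer disagree on format, a small adapter stage in between reconciles them. A sketch, with both formats invented for illustration:

```shell
# Producer emits "name:score"; consumer wants tab-separated "score name".
# tr and awk act as the putty between two programs that were never
# designed to talk to each other.
printf 'alice:9\nbob:7\n' \
  | tr ':' '\t' \
  | awk -F'\t' '{ print $2 "\t" $1 }'
```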

[+] lmilcin|5 years ago|reply
You put this very nicely.

This is why functional programming and lisps are such fantastic development environments: you can use components (functions) that are not very opinionated about what they are acting on.

[+] jb3689|5 years ago|reply
Another fascinating thing is how we as a community reacted to this simplicity. One thing in particular that I find interesting is the conventions that have been built up. Many tools don't just accept text streams but react the same way to a common set of options, and assume line separation, among other things. None of these are defined in the interface, but they were good ideas that were adopted between projects.
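Two of those unwritten conventions, sketched below: newline-separated records, and "-" as a name for stdin. Neither appears in any formal interface definition, yet most tools honor both.

```shell
# Convention 1: records are newline-separated, so wc -l counts records.
printf 'one two\nthree\n' | wc -l

# Convention 2: "-" means "read stdin", honored by cat, diff, tar
# and many others.
echo hello | cat -
```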
[+] ajuc|5 years ago|reply
It's a good analogy because you can make anything out of Lego - even a car - but it won't be any good for real use, just a toy.

BTW, if I were designing the UNIX command line today, it would use LinkedHashMaps for everything instead of text streams.

[+] jmchuster|5 years ago|reply
Isn't that the same description as a REST API? You pass in a text body and get back a text body. Everyone uses JSON instead of a complicated data protocol.
[+] crimsonalucard5|5 years ago|reply
Unix piping is basically functional programming.

If you ever wondered why some people are obsessed with functional programming this is the reason why:

Functional programming forces every primitive in your program to be a Lego Block.

A lot of functional programmers don't see the big picture. They see a sort of elegance with the functional style, they like the immutability but they can't explain the practical significance to the uninitiated.

Functional Programming is the answer to the question that has plagued me as a programmer for years. How do I organize my program in such a way that it becomes endlessly re-useable from a practical standpoint?

Functional programming transforms organ transplantation into lego building blocks.

The "lego" is the "function" and "connecting two lego blocks" is "function composition".

In short, another name for "Point free style" programming is "Lego building block style" programming.
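The pipe-as-composition claim can be made concrete in shell itself: chaining with `|` plays the role of function composition, and each stage is point-free in the sense that the data argument is never named. The function names below are made up for illustration:

```shell
# Each "function" is a filter: stream in, stream out, data never named.
first_two() { head -n 2; }
shout()     { tr 'a-z' 'A-Z'; }

# "shout . first_two" in point-free style is just: first_two | shout
printf 'one\ntwo\nthree\n' | first_two | shout
```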

[+] dmlorenzetti|5 years ago|reply
One thing I appreciate about John D. Cook's blog is that he doesn't feel the need to pad out what he wants to say.

Here, he had a thought, and he expressed it in two paragraphs. I'm sure he could have riffed on the core idea for another 10 paragraphs, developed a few tangential lines of thought, inserted pull quotes -- in short, turned it into a full-blown essay.

Given that his blog serves, at least in part, as an advertisement for his services, he even has some incentive to demonstrate how comprehensive and "smart" he can be.

His unpadded style means I'm never afraid to check out a link to one of his posts on HN. Whereas I will often forego clicking on links to Medium, or the Atlantic, or wherever, until I have looked at a few comments to see whether it will be worth my time.

[+] partyboat1586|5 years ago|reply
"I enjoy that John D. Cook doesn't pad his posts."
[+] mduncs|5 years ago|reply
It's an interesting statement, but how much discussion can we get from it as an audience?

I haven't thought about this very much, and there is a lot I'm curious about that he hasn't elaborated on.

What are the signs of rejection? What's an example of failure? Are there examples of the wonderful modular behavior that he admires?

It's a nice way to introduce a thought or observation, but I want to know more about why he thinks that, not just what he thinks.

[+] jihadjihad|5 years ago|reply
Honestly I was on the fence about clicking the link until I saw where it was from--his content is reliably interesting and straight to the point. If it was on Medium I wouldn't have even bothered and, like you, would have gone to the comments. The compression is lossy but it's a great filter for crap content.
[+] qchris|5 years ago|reply
A science teacher at my high school had a similar rule. For any kind of lab report, instead of "you must write a report of at least 3 pages", it was "your report must not be more than 2 pages long."

Not only am I sure it made it easier for him to grade, but it really forced students to write concisely about their work.

[+] groby_b|5 years ago|reply
For what it's worth, as an advertisement for his services, conciseness is better. It's easier to disagree with parts of a detailed opinion than with a vague general statement.

You can then project your own opinions into the general framework, and you find you fully agree :)

As a consultant, "I 100% agree with you, you understand me" is exactly the feeling you want.

He writes the long articles that show off his smarts in fairly specialized areas, where you need to be an expert to disagree.

It's really clever, and I'm curious if it's intentional on his part, or just his style.

[+] gmfawcett|5 years ago|reply
A few links would have been nice -- e.g. to any serious comparison of LEGO to software components.
[+] eben-ezer|5 years ago|reply
This is the first time I've seen one of his posts. It caught me really off guard, but I completely agree with your sentiments. It is refreshing to see, and I wish more blogs took this method to heart.
[+] agentultra|5 years ago|reply
Haskell is much closer to the lego blocks analogy than most languages I've tried due to the focus on composition and polymorphism.

The teetering edge that grinds some people's gears is monads, which don't compose in general but do compose in specific, concrete ways à la monad transformers. The next breakthrough here, I think, is going to be the work coming out of effect systems based on free(r) monads and delimited continuations. Once the dust settles I think we'll have a good language for composing side effects as well.

In the current state of things I think heart-surgery is an apt metaphor. The lego brick analogy works for small, delimited domains with denotational semantics. "Workflow" languages and such.

[+] jkachmar|5 years ago|reply
I like Haskell, I write Haskell at my day job (and did so at my previous day job), and I help maintain some of the community build infrastructure so I’m familiar with a large-ish graph of the Haskell ecosystem and how things fit together.[0]

I don’t really think Haskell is _meaningfully_ better than other languages at the things that OP is talking about.

Refactoring Haskell _in the small_[1] is much nicer than many other languages, I don’t disagree on that point. Despite this, Haskell applications are _just as susceptible_ to the failures of software architecture that bind components of software together as other languages are.

In some cases I would even suggest that combining two Haskell applications can be _more_ fraught than in other languages, as the language community doesn’t have much in the way of agreed-upon design patterns that provide common idioms that can be used to enmesh them cleanly.

[0] I’m mostly belaboring these points to establish that I’m not talking out of my ass, and that I’ve at least got some practical experience to back up my points.

[1] This is to say, when one refactors individual functions or small collections of interlocking abstractions.

[+] chrischen|5 years ago|reply
That’s the whole point of functional programming: composition of small things to make bigger things.
[+] dustingetz|5 years ago|reply
You don't need Haskell to make applications compose.

See the JVM (garbage collection), React, Datomic.

Functions and scalar values are probably enough.

[+] nradov|5 years ago|reply
And yet back in 1993, Visual Basic programmers were able to reuse software by literally snapping together controls like Lego blocks. There was a rich ecosystem of third-party controls. Competing tools such as Delphi had similar features. Since then the industry has gone backwards, or at best sideways, in many areas.
[+] ChrisMarshallNY|5 years ago|reply
How about modding cars?

It's neither as bad as an organ transplant, nor as easy as LEGO.

It is also highly variable, dependent upon the SDKs and API choices.

I've written SDKs for decades. Some are super simple, where you just add the dylib or source file, and call a function, and other ones require establishing a context, instantiating one (or more) instances, and setting all kinds of properties.

I think it's amusing, and accurate, in many cases; but, like most things in life, it's not actually that simple.

[+] seph-reed|5 years ago|reply
In a word: "Engineering"
[+] DivisionSol|5 years ago|reply
When I’m working with Unix utilities like grep or doing stuff with xargs... it genuinely feels like I’m playing with legos.

I feel like this is trying to argue for more “consulting surgeons” when we need more “tooling machinists” who know how to make a good LEGO block.

[+] mynegation|5 years ago|reply
This is an astute metaphor. In my experience software reuse simplicity strongly depends on the following factors:

* interface surface area (i.e. how much of an interface is exposed)

* data types of data coming in and out (static or dynamic). Static languages have an advantage here as many integration constraints can be expressed with types.

* whether it is a very focused functionality (e.g. extracting EXIF from file) vs cross-cutting concerns (e.g. logging)

The more limited the surface area, the simpler the data types and invariants, and the more localized the functionality, the more it is like LEGO as opposed to an organ transplant.

[+] jackhalford|5 years ago|reply
For reusing software source I agree. The only current way around this is with the unix pipe system where you reuse software _executables_ instead of software _source code_

It works because Unix programs agree to a simple contract: read from stdin, write to stdout. That is very limiting in terms of concurrency but unlocks a huge world of compatibility.

I wonder if we will ever get software legos without the runtime bloat from forking.

ps: to anyone countering with examples of languages that are reusable through modules, that doesn't count because you are locked in to a given language.

[+] meatmanek|5 years ago|reply
> I wonder if we will ever get software legos without the runtime bloat from forking.

In a sense, shared object files / dynamically linked libraries meet this criterion -- they can be loaded into program memory and used by a single process.

There's also flowgraph-based signal processing systems, like gnuradio, which heavily use the concept of pipes (usually a stream of numbers or a stream of vectors of numbers) but, as I understand it, don't require OS forking. (Though they do implement their own schedulers for concurrency, and for gnuradio at least, blocks are typically shipped as source so I'm not sure whether that counts as reusing executables vs. reusing source code.)

[+] criddell|5 years ago|reply
Another current way is with COM in Windows.
[+] davedx|5 years ago|reply
IMHO this conflates “systems” and “components”. Module and package management has never been better, and building “a system” from components, open source or otherwise, is extremely effective. Integrating software systems (with e.g. APIs, webhooks, and event buses) is non-trivial, complex, and difficult. They are not the same endeavor.
[+] someguy101010|5 years ago|reply
It really depends on the complexity of the software, doesn't it? Libraries and packages often snap into my projects easily and without much modification at all, especially if I planned on using them. I can see this analogy working for more complex projects, though, where you may be copy-pasting code from one project to another and trying to make it fit together.
[+] julianlam|5 years ago|reply
With significant enough abstraction and a sensible public API you can make this claim, but more often than not you end up needing to dive into the internals to hook things up properly.
[+] milansuk|5 years ago|reply
This is exactly what I'm working on right now. My approach is a little bit different. It's about building a community where people share LEGO pieces (I call them Assets), but they also share where and how in the code they've made the connections. After some time there will be enough connections that users will just re-use them. Only time will tell how well it works.

I wrote about it a few weeks ago: https://skyalt.com/blog/dsl.html

>I have a plan for how to do that on paper, but because connecting assets together can be complex, it's better if most users don't do that. The key thing is that users don't share only assets, but also how and where they are connected to each other. It's like when someone makes and publishes a library, but also example code for how to work with the library. But in SkyAlt, asset connections will be automated. This is also why it's very important to build and grow the community - more assets, more connections, which means easier to use.

[+] hellodanylo|5 years ago|reply
Do users have incentive to document and share the connections, other than helping the community's long term goal?

Absence of such incentive appears to be the reason that open-source software is not always perfectly connectable -- few people have a significant incentive to ensure this design goal.

[+] kstenerud|5 years ago|reply
Software re-use is limited by the primitive structures and algorithms in the language and runtime library. The more fundamental parts (queues, strings for example) that are left to the user to implement, the less likely it is that it will be possible to make compatible components.
[+] rbosinger|5 years ago|reply
I feel like even if we eventually achieve Lego-like reuse on a purely technical level, the business and social layer will ensure that somebody always requires an off-brand Lego block to be shoved in there. It's almost human nature.
[+] daenz|5 years ago|reply
The reason for this is that LEGO has a tightly regulated interface that each piece must adhere to. This interface regulation doesn't exist in software to that degree, nor is it necessarily desirable.
[+] petermcneeley|5 years ago|reply
Well-designed modular software is certainly like Lego. It just takes investment. It is a choice to have organ-transplant software be the default; that choice is non-investment.
[+] shadowgovt|5 years ago|reply
Depends on the software. Or to torture the analogy a bit: you can build anything out of LEGO if you're willing to use a jigsaw, some Gorilla glue, a blowtorch...

The point of frameworks is to provide the standardization of the "shape" of various bits of software so it's a lot more like snapping together LEGO. But even then, LEGO isn't universally snappable; some blocks just don't click to other blocks in the product line. And then, of course, there's the "illegal" block hacks (https://crafty.diply.com/28429/14-illegal-lego-building-hack...) that work in practice but are not at all using the tool the way it's specified. When software reuse is like LEGO, we should expect (a) some things we want to do, we can't really do without jigsaws and glue and (b) sometimes, people will do things that the software technically allows but no sane person would call "desired" or "intended."

In fact, the LEGO-to-framework analogy works pretty well. And yeah, outside the context of a consistent (and, I'd argue, opinionated) framework, you're about as likely to have two pieces of software interoperate as you are to pick two random chunks of matter in the universe, slap them against each other, and have anything useful happen. I just tried it with a brick and this glass of water on my table. Now I have "wet brick in a pile of shattered glass," but I don't think anyone's going to give me a Series A funding round for that.

[+] runningmike|5 years ago|reply
Reuse of my own sloppy code is always easy, but reuse of complex code created by others is often far more difficult, since context is missing. Good reusable software blocks are often the lower-level APIs. The more context-specific software is, the harder reuse gets. It's the pain of generic vs. context-specific, IMHO.
[+] solidist|5 years ago|reply
Fred Brooks had a similar take, in an opinionated way, with the concept of the "lead surgeon". Motivation to re-read the chapter in The Mythical Man-Month.
[+] orenkt|5 years ago|reply
So: every editor has copy/paste and search/replace. Using that, plus following the SOLID principles, is my way to reuse code. And it's as if I were building with LEGO bricks.

I'll even tell you a secret: every piece of software already has all the code you need to work with it - just copy/paste stuff. You only need to know where it is. Every piece of software is its own tutorial on how to make that particular piece of software.

I've done this for years. I call my way of working the copy/paste method.

And by reading this, I think I know why I piss off so many programmers when I tell them what I do - and how fast I am with it.

[+] bpyne|5 years ago|reply
Someone well-known in the IT field wrote about software reuse in the large vs. in the small about 10-15 years ago. The gist was that reuse in the small is a success, i.e. it's fairly simple to write a function that developers reuse within an application. When you try to generalize use beyond a certain context it becomes significantly more complicated to be successful. I think the motivation for the post was issues in object reuse in OO development. I'm trying to find the original post.

John D. Cook's post shines the light once again on the difficulty of writing reusable components.

[+] hikarudo|5 years ago|reply
I would love to have a link to that piece about software reuse in the large vs in the small!