Conway's Law absolutely applies to Linux. The trick is to remember that the communication pattern that Linux is optimized to reflect is "git tree pulls". Over time, things have been factored to minimize the number of cross-tree merge conflicts. That way we can decentralize the development effort, and worry much less about conflicts when Linus has to pull from a hundred-odd git trees (many of which have sub-trees that were merged together by the subsystem maintainers before Linus then pulled them into his tree). That is the primary communication pattern, the one which is critical, and we have absolutely optimized how the code is laid out to minimize communication difficulties --- when those are defined in terms of merge conflicts.
I don't think the author makes a very convincing point here; Linux is frequently cited as a prime example of Conway's Law in practice for a reason.
The reason the driver subsystem is architected as pluggable modules ("drivers") is to support the extremely wide array of organizations that have to build into it.
The reason why Linux is broken down into subsystems is to support the "specialists" who work on only one subsystem at a time.
The reason Linux is a monolithic kernel with a large degree of internal complexity (vs. a microkernel) is that Linus is strong enough to make it happen.
I mean, the logical error is right in the title. The author inverted cause and effect.
It seems absurd that an article claims to disprove a hypothesis linking structure to communication patterns without even the slightest mention of trying to observe those patterns. But they go even further: they claim to have determined the direction of causality between two variables without even having measured one of them!
Actually I'd go even further and state that they are mistaken in their belief that Conway's law is about cause and effect. Conway's law holds regardless of whether the software was structured to match the organisation or the organisation was structured to match the software.
What the article is suggesting is that the Linux architecture wasn't affected by organisational pressures that closed-source systems face.
That is to say that subsystems were defined solely based on technical considerations, which is how it should be if the goal is sound engineering.
Not sure what to make of the ratio between "specialists" and "generalists". A comparison to ratios from other projects would provide some helpful context.
> That is to say that subsystems were defined solely based on technical considerations, which is how it should be if the goal is sound engineering.
I think that's too idealistic. As another sibling poster pointed out, it's more like a democracy... with all the attendant upsides and downsides. (Also, I'm sure there's a lot of interpersonal politics involved, even if it's all over email.)
Don't get me wrong, the Linux kernel works surprisingly well and I rely on it for almost everything (including my livelihood), but if you really look at it, some of the subsystems are shockingly bad.
I think a good example is containers/namespaces, which have a ridiculously bad security record. (See "user namespaces".) Again, I'm sure the people working on these things had the best of intentions, are very competent generally, and were hampered by the "never break userspace" rule. However, if containers/namespaces had actually been designed as a whole, they could have been done a lot better. (See "Zones" on Solaris et al.)
That sounds nice, but realistically a ton of driver code is shoehorned in by sloppy corporate sponsors whose contributors stumble down the narrow, winding line between getting paid and making Linus happy.
> That is to say that subsystems were defined solely based on technical considerations, which is how it should be if the goal is sound engineering.
That never was the goal, and if it was, it hasn't been achieved. Linux is like democracy: it isn't perfect, but it is the best that we've got. Unlike democracy, however, it is fairly easy to get rid of Linux and replace it with something better.
The article makes a good point but I wonder if it is an incomplete explanation. What would happen if Linus Torvalds walked away and there was no single leader to guide (or "dictate", depending on your point of view) its development? Would it begin to fragment and exhibit signs of Conway's Law?
I believe the answer is, yes, it would. While Linus is a stubborn and opinionated leader ("Benevolent Dictator For Life") it is those qualities, coupled with his extremely high standards, that have preserved the coherence of Linux's system architecture all this time.
I wonder if there are any large, successful open source projects that are leaderless and function well without a social hierarchy?
If non-hierarchical social structures are really more effective, such examples should be easy to find, no?
On the other hand, maybe their absence only indicates that online communities simply tend to mirror the social structures of offline communities, or that they're just mostly made up of people who prefer hierarchies.
There is an unsubstantiated ocean between the "Therefore" beginning the last paragraph and the paragraphs before it. If anything, the author's data points lead me to draw the opposite conclusion.
The statistics seem to be obviously incorrect: there is no discounting for the distribution of the number of files that a contributor might author or significantly affect. Since most contributors will have made only a small number of contributions, this introduces a large bias.
The graph that would ultimately support the point of the article would show the difference from a simulated uniform distribution of contributions across authors, and would use a full 0-100% axis for scale, as opposed to the 35-65% range presented in this article.
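For instance, a rough null-model sketch: assign every commit to a uniformly random subsystem and count how many contributors look like "specialists" by chance alone. (All parameters here -- subsystem count, commit counts, the 90% threshold -- are made-up assumptions for illustration, not the article's numbers.)

```python
import random

def simulate_specialist_ratio(n_contributors=1000, n_subsystems=10,
                              commits_per_contributor=20, threshold=0.9,
                              seed=42):
    """Null model: every commit lands in a uniformly random subsystem.
    A contributor counts as a 'specialist' if at least `threshold` of
    their commits fall into a single subsystem."""
    rng = random.Random(seed)
    specialists = 0
    for _ in range(n_contributors):
        counts = [0] * n_subsystems
        for _ in range(commits_per_contributor):
            counts[rng.randrange(n_subsystems)] += 1
        if max(counts) / commits_per_contributor >= threshold:
            specialists += 1
    return specialists / n_contributors
```

Plotting the observed specialist ratio next to this baseline (on a full 0-100% axis) would show whether the article's numbers are actually surprising. Note the baseline is very sensitive to commits-per-contributor: anyone with a single commit is trivially a 100% "specialist".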
> the Degree-of-Authorship (DOA) measure to define the authors of each file in a system
But in source control, author is typically defined as the first contributor to a file, which doesn't always reflect the person who contributed the most content to the file.
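That said, DOA as used in the authorship literature isn't git's "Author:" field. If I remember the model correctly, first authorship is only one term, alongside the developer's later changes and changes by others; the exact coefficients below are quoted from memory, so treat them as an assumption:

```python
from math import log

def degree_of_authorship(first_author: bool, own_changes: int,
                         others_changes: int) -> float:
    """Rough sketch of the Degree-of-Authorship (DOA) model from the
    code-authorship literature: first authorship dominates, the
    developer's own later changes add a little, and changes made by
    others slowly erode authorship. Coefficients from memory."""
    return (3.293
            + 1.098 * (1.0 if first_author else 0.0)
            + 0.164 * own_changes
            - 0.321 * log(1 + others_changes))
```

Under a definition like this, a later contributor with enough accepted changes can out-score the file's creator, so the measure at least partially addresses the "first contributor" objection.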
In my experience, while the statistics that the article quotes are obviously correct, the reasons have very little to do with the architecture, and they very much mimic the way that the "community" works. Linux's architecture has very little to do with why communication (and contributions) are the way they are. In fact, the architecture is largely designed precisely so that it can withstand the sort of organizational pressure that the Linux kernel faces. See, for example, the recent(-ish) rejection of AMD's drivers: they got rejected because they included a HAL, which -- based on previous experience -- is usually a bad idea in an open system, as it tends to depend on highly organization-specific knowledge, and the volume and difficulty of maintenance work makes it difficult for a non-committed community to manage once the main owner drops it for greener pa$ture$.
The very separation that the article draws between "core" and "drivers" is actually highly representative of how the Linux community is structured. Most of the core work (including the driver subsystem's backbone) is done by long-term contributors who actually work on the Linux kernel full time. Most drivers actually come from occasional contributors.
Driver contributions are "specialized" for the same reasons why they're specialized on pretty much any non-hobby operating system, namely:
1. The expertise required to write complex drivers mainly exists within the organization that sells the hardware. Needless to say, these people are largely paid -- by the hardware manufacturers! -- to write drivers, not to contribute to what the article calls core subsystems. There are exceptions ("trivial" devices, such as simple EEPROMs in drivers/misc, are written by people outside the organizations that sold them), but otherwise drivers are mostly one-organization shows. In fact, for some hardware devices, "generalists" don't even have access to the sort of documentation required to write the drivers in the first place. (sauce: wrote Linux drivers for devices that you and I can't get datasheets for. $manufacturer doesn't even bother to talk to you if you aren't Really Big (TM))
2. Furthermore, there really are subsystems in the kernel that are largely a one-company show and are very obvious examples of Conway's law. IIO drivers, for instance, while started by Jonathan Cameron who, IIRC, is really an independent developer, are largely Intel's and Analog Devices' -- to such a degree that, even though they follow the same coding conventions, if you've worked there enough, you can tell who wrote a given snippet. Same goes for most of the graphics drivers. Most of InfiniBand used to be IBM, I think. If you dig down in the drivers subsystems, you'll see even funnier examples (my favourite example is the ChipIdea USB controllers; a few years ago, support for USB slave mode on some Broadcom SoCs broke down because Freescale pretty much took over de facto ownership of the drivers, and some of their changesets worked fine on their ARM cores, but broke on Broadcom's funky MIPS-based cores).
Also, this is very weird to me:
> Adherence to Conway's Lay (sic!) is often mentioned as one of the benefits of microservices architecture.
Back in My Day (TM), adherence to Conway's Law was usually considered a negative trait, summarized by the mantra that, in the absence of proper technical leadership, an organization of N teams tasked with writing a compiler is going to produce an N-pass compiler.
Of course, this is a most negative example, but are we really, seriously considering that adherence to Conway's law is a positive thing today? That it's actually a good idea for the architecture of a software system to reflect the "architecture" of the team that created it, rather than, you know, the architecture that's actually best for what it's meant to do?
Adherence to Conway's Law is regarded as a good thing, but not that way round. We want to build the teams to match the architecture of the software we're building, not vice versa. If you don't cut it that way, you're always going to be swimming upstream.
tytso | 8 years ago
sunir | 8 years ago
Doradus | 8 years ago
contravariant | 8 years ago
smitherfield | 8 years ago
cyphar | 8 years ago
scribu | 8 years ago
lomnakkus | 8 years ago
xyzzy_plugh | 8 years ago
jacquesm | 8 years ago
nwmcsween | 8 years ago
davidst | 8 years ago
pmoriarty | 8 years ago
jacques_chester | 8 years ago
We do this at work. It mostly works, modulo "Distributed Systems Are Hard".
inopinatus | 8 years ago
They are the kind of project that kills the recipient organisation.
lomnakkus | 8 years ago
mpweiher | 8 years ago
ryanmarsh | 8 years ago
superlopuh | 8 years ago
dorfsmay | 8 years ago
cyphar | 8 years ago
notalaser | 8 years ago
regularfry | 8 years ago
RandyRanderson | 8 years ago
JacksCracked | 8 years ago