That's an interesting set of questions, and it feels frustrating that we'll probably never get an answer for most of them.
My most important question regarding software engineering would probably be something like: how well are resources allocated and could we do a better job? Consider how many brilliant and outstanding people go work on some useless software at a big adtech company rather than helping to improve the world. The greatest tragedy of our time.
I think the question is applicable to more than just engineers. Our economy is a paperclip maximizer of sorts - in the sense that its maximization target is far removed from the target of building the best possible world that we can imagine building. And the costs and risks are running away from us. How far removed is it? Seeing how we're "slowly" but surely killing the biosphere I think paperclip maximizer is not a bad analogy.
> Consider how many brilliant and outstanding people go work on some useless software at a big adtech company rather than helping to improve the world.
I've had conversations with young people nearing graduation about their career choices. Often it revolved around compensation levels at various tech companies. I'd ask them if they really wanted to dedicate their life to finding better ways to serve advertisements, or do they want to do something great?
There are many outstanding athletes who spend their whole career on the bench. Your resource allocation concern is a valid one, but it isn't one that is going to be addressed by big companies anytime soon. At some point individuals need to weigh the 'pay vs helping the world' thing.
These all stem from the complete lack of empiricism and scientific method in this discipline. I'm pretty sure we all have opinions on most of that stuff, none of which are backed by any evidence whatsoever. We are basically always going with our gut.
"23. Has anyone ever compared how long it takes to reach a workable level of understanding of a software system with and without UML diagrams or other graphical notations? More generally, is there any correlation between the amount or quality of different kinds of developer-oriented documentation and time-to-understanding, and if so, which kinds of documentation fare best?"
This is such an important question and it's just the tip of the iceberg of a very deep problem that is rotting our software systems. We are absolutely pathetic at dealing with complexity and we actually enjoy complexity. We don't tackle questions such as 23. ANYWHERE near as seriously as we should.
Developers overestimate their mental bandwidth which leads them to pompously build over-complicated tech stacks despite only having archaic tools to mitigate and navigate their complexity.
Companies don't need to hire more devs to deal with their complex software systems, they need better tools to navigate their software systems. But because companies don't truly value their money and devs don't truly value their time, we end up in the situation we are in now.
We should have hundreds of companies investing in initiatives akin to Moldable Development[1]. Instead they play the following bingo:
1) let's just hire more devs and hope to land on a 10xer
2) let's build our own framework
Additionally, we overvalue specialization. By overloading developer brains with complex tech stacks, we encourage a culture of specialized profiles who find solace in trivia. Doing so, we limit cross-pollination and stifle true innovation. This attitude is actively killing-off thousands of valuable ideas.
Every second, there's a coder out there who thinks of something wild, which requires very specific tools from different fields and finds out that the people who built such tools couldn't be bothered making them accessible under a sensible time-budget to people outside of their niche/ivory tower. So the dev either drops the idea or gets sucked up into a niche.
This is tragic, but hey look! We have a new (totally not low-hanging fruit that could be predicted 10 years ago) Generative Model, WOW! "What a time to be alive"!
That's because creating software is all about making human artifacts, not studying the natural world, and that's an inherently subjective task. It's like asking "what's the best set of tools for a blacksmith?" or "what's the best way to brew beer?". Well, it depends very heavily on the blacksmith or the brewery. Of course that doesn't mean all tools are equally good, but all the interesting cases are going to be personal.
There is no scientifically correct answer to what often are not objective questions. You will quickly find out that almost everyone will weigh the questions that the article asks very differently, often for legitimate reasons.
This may be, but it is useful to point out that, in many cases, engineering practices in more tangible scientific areas are also driven by some personal preferences / opinion.
For example, in which situations should you use the Redlich-Kwong equation of state over a simple ideal gas or VdW model? There will always be one model which gets closer to the measured values in a specific condition, but over the entire domain it may be difficult to determine which model is most appropriate.
Even in these fields the decision falls to opinion - some models may be more precise and accurate, but require more processing/development time. Which is just software engineering again.
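To make the trade-off concrete, here is a rough sketch comparing the three models mentioned above for a single gas. It assumes CO2 with approximate critical constants (the values of `T_C` and `P_C` are my assumption, not something from the thread), and it derives the van der Waals and Redlich-Kwong parameters from those constants using the standard textbook relations. The function names are made up for illustration.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

# Approximate critical constants for CO2 (assumed values, for illustration only).
T_C = 304.13     # critical temperature, K
P_C = 7.3773e6   # critical pressure, Pa


def p_ideal(temp, v_molar):
    """Ideal gas: P = RT / Vm."""
    return R * temp / v_molar


def p_van_der_waals(temp, v_molar):
    """van der Waals: P = RT / (Vm - b) - a / Vm^2."""
    a = 27 * R**2 * T_C**2 / (64 * P_C)
    b = R * T_C / (8 * P_C)
    return R * temp / (v_molar - b) - a / v_molar**2


def p_redlich_kwong(temp, v_molar):
    """Redlich-Kwong: P = RT / (Vm - b) - a / (sqrt(T) * Vm * (Vm + b))."""
    a = 0.42748 * R**2 * T_C**2.5 / P_C
    b = 0.08664 * R * T_C / P_C
    return R * temp / (v_molar - b) - a / (math.sqrt(temp) * v_molar * (v_molar + b))


if __name__ == "__main__":
    temp = 300.0  # K
    # Molar volumes in m^3/mol, from fairly dense to very dilute.
    for v_molar in (1e-3, 1e-2, 1.0):
        print(
            f"Vm={v_molar:g}: ideal={p_ideal(temp, v_molar):.4g} Pa, "
            f"VdW={p_van_der_waals(temp, v_molar):.4g} Pa, "
            f"RK={p_redlich_kwong(temp, v_molar):.4g} Pa"
        )
```

At large molar volumes the three predictions converge, so the cheap model is fine; the disagreement only grows as the gas gets denser, which is exactly where someone has to pick a model and justify the extra effort.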
I'd love to see (or do myself if I ever get the chance) some research into some of the hot-button software architecture, tooling and project structure topics that come up a lot here and in every team I've worked on. They always boil down to a religious debate with strong opinions on all sides, and everyone wanting to copy something they've read about how their favourite MAGMA company does it, but it feels like some of these should have empirically right answers:
1. Under what circumstances is a monorepo beneficial as opposed to multiple smaller repos and what are the determining factors? (Team size, codebase size, tooling, software coupling, project velocity, .... etc?)
2. Under what circumstances are different system architectures better, and what are all the different factors that influence this? (eg microservices, 3-tier, one big blob of code, ... etc)
3. Can we empirically measure or determine whether a language, framework, library etc is the right choice for a given situation and how might we do that? Is it possible to formulate rules to inform good decisions here or is it always going to be a matter of judgement?
4. How do different styles of working (Pair programming, scrum, TDD etc) affect team productivity, code quality, developer happiness, project velocity etc? What are the factors that make one preferable over another in a given situation?
5. What's the right team size for a given project?
6. What's the best way to discover and communicate software requirements?
...
I could go on. But in other engineering disciplines, a lot of the analogous things are solved problems rather than being topics for heated debate.
> my impression is that we’ve gone from packaging or building an installer taking 10% of effort to cloud deployment infrastructure being 25-30% of effort, but that’s just one data point
The old bad days of shrink-wrapped software involved installing in environments we could not control. The direct costs of writing an installer and testing (the larger effort) install, upgrade, uninstall, recovering from partial installs, etc. were nothing compared to the costs of dealing with all the weirdness that could exist on a customer's machine.
We're doing things like blue/green deploys and CD - things not remotely possible in that other world. And we're hooking up all kinds of monitoring and observability tooling that . . . even if we could have shipped that to customers on physical media, we'd be unable to harvest data from in those old times.
I think we're spending a lot less engineering time and getting a bigger bang for the effort.
I worry, though, that because software engineering has the shared professional memory of a goldfish that can always swim to an unvisited side of its bowl, we'll somehow try to repeat that pattern we've overcome.
I would offer the view that, for teams flush with money, everything you said is great. But if you're a solo dev or in a small team, deployments and finding the correct balance are actually tricky to manage. Developers created a solution to the old deployment problem by abandoning it and doing something else. That problem still exists, and you'd think it'd just be solved by now so rolling out a desktop app wasn't such a crazy pain.
I worry that software engineering is building a lot of tech stacks that need active maintenance and can't really go into a classical long term support style use.
Steve McConnell's Code Complete published by Microsoft is a classic text in this area, one of the only ones I'm aware of that actually uses academic sources for its claims. However it was published in the 90s and only updated once in 2004, so I wonder if there's a newer version or new book which updates this?
*At what point is it more economical to throw away a module and write a replacement instead of refactoring or extending the module to meet new needs?*
If you have rotation in the team it is a lot easier to get new people to write new code than to learn old ways - especially when the original authors are all gone.
If you have a stable team, as long as the knowledge is there and the module fits the style of the team, I think a rewrite will not be needed unless someone wants to shove some ill-fitting requirements in there, which would be better handled by just making a separate module.
For the first question, putting the documentation into the source code has a major problem: there is no good place to put an overview of the whole system. Block diagrams are difficult to draw with just ASCII, and images, formulas, and more can't be included. Also, many things in software connect to many other things, and if the documentation is in the source code you can likely only describe one side of these connections (or you have to describe both fully on each side of the connection, thus duplicating a lot of the documentation). Comments are immensely useful, but I don't think they work for general purpose documentation of a system. They can work well for libraries where each API needs to be described separately, but even then I wouldn't consider them complete documentation of a library. How do you combine graphics functions to build a rendering pipeline, for example? Do you have to describe how to build pipelines for every single function that might be used in a pipeline?
Some of these could be tested among college students or high-school students. Let students form groups and participate in a tournament. Make the task complex enough that the tournament requires sustained work (e.g., 12 weeks). Then randomly assign coding practices to the participating groups and evaluate:
> Do doctest-style tests (i.e., tests embedded directly in the code being tested) have any impact on long-term usability or maintainability compared to putting tests in separate files?
> Has anyone analyzed videos of coding clubs for children or teens to see if girls are treated differently than boys by instructors and by their peers?
> Has anyone ever studied students from the first year to the final year of their program to see what tools they actually start using when. In particular, when (if ever) do they start to use more advanced features of their IDE (e.g., “rename variable in scope”)?
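For anyone unfamiliar with the first of those, this is roughly what a doctest-style test looks like in Python, using the standard library `doctest` module. The `mean` function is a made-up example, not something from the article; a group assigned the "separate files" practice would put the same checks in a test module instead.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    The examples below are both documentation and tests.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([10])
    10.0
    >>> mean([])
    Traceback (most recent call last):
        ...
    ValueError: mean() requires at least one value
    """
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Runs every docstring example in this module and reports any mismatch.
    doctest.testmod()
```

The tests live right next to the code they exercise, which is the property the question is asking about: whether that proximity helps or hurts maintainability over time.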
This is an awesome list of questions and it's incredibly thoughtful that you documented and shared it with us.
It reminds me of The Hitchhiker's Guide, where they know the answer but need the question. So many answers are out there floating around that we are often trying to force into certain frames for various motivations.
Having questions that get us thinking, recognizing how unclear things are in current state, this feels like such a powerful thing to share.
Apologies for the diatribe, very much appreciated.
This is an amazing list, and really highlights how much we take on trust in this industry: do X for good/better/best outcomes.
Taking a step back I wonder: if there did emerge clear, incontrovertible evidence that doing X led to better outcomes, how many teams would actually adopt it?
And how many would dismiss those findings regardless?
In these modern times I suspect we'd see a far higher rejection rate than 10-years-ago me would've anticipated.
I would love to hear what people do for 25? I’m struggling with this myself - with static site generators you can’t always put files in the same directory as the markdown
[+] [-] TheAceOfHearts|3 years ago|reply
[+] [-] worldsayshi|3 years ago|reply
[+] [-] qsort|3 years ago|reply
Considering how well central planning has worked historically, allow me to approach this issue with a bit of skepticism.
[+] [-] wzwy|3 years ago|reply
My naïveté inclines me to think that there are more brilliant people working on world-improvement projects than the opposite.
[+] [-] WalterBright|3 years ago|reply
[+] [-] mym1990|3 years ago|reply
[+] [-] qsort|3 years ago|reply
[+] [-] Lwepz|3 years ago|reply
[1] https://moldabledevelopment.com/
[+] [-] Barrin92|3 years ago|reply
[+] [-] chaxor|3 years ago|reply
[+] [-] seanhunter|3 years ago|reply
[+] [-] bee_rider|3 years ago|reply
https://icl.utk.edu/magma/
[+] [-] gyulai|3 years ago|reply
What's "MAGMA"?
[+] [-] drewcoo|3 years ago|reply
[+] [-] gonzo41|3 years ago|reply
[+] [-] rwmj|3 years ago|reply
https://www.microsoftpressstore.com/store/code-complete-9780...
[+] [-] spenrose|3 years ago|reply
[+] [-] ozim|3 years ago|reply
[+] [-] rapjr9|3 years ago|reply
[+] [-] nequo|3 years ago|reply
[+] [-] _57jb|3 years ago|reply
[+] [-] brianmcc|3 years ago|reply
[+] [-] pydry|3 years ago|reply
But yes, we do take too much on trust.
[+] [-] lloydatkinson|3 years ago|reply
[+] [-] mym1990|3 years ago|reply
[+] [-] Imnimo|3 years ago|reply
[+] [-] lifeisstillgood|3 years ago|reply
[+] [-] zozbot234|3 years ago|reply
[+] [-] Henrywright23|3 years ago|reply
[deleted]