Unfortunately this is a lesson that often has to be learned the hard way when you have a lead or team that lacks experience/wisdom. If you can survive that educational expense, you end up much better for it but it's painful nonetheless.
Aesthetics are never a legitimate reason to replace a functioning system with something else. I do not care how it offends the eye, the mind, the ear, or any other sensation you have... if it does not have a fundamental functional problem your best option is to maintain or plan an incremental rolling replacement that is tightly scoped at each step.
For large systems that do present functional problems, identify the egregious offenders in that system, abstract the entities/behaviors most responsible, and replace-in-place in a Ship of Theseus fashion. Despite how familiar you believe you are with a monolithic or complex inter-dependent system, you will discover how much you don't actually understand about how it works as a whole if you try and re-implement in its entirety.
I perpetually love the references to Ship of Theseus, and find it particularly applicable to this problem.
You mention fundamental functional problems, and I'd like to add something: sometimes it's not a functional problem, but a changeability problem. The code functions fine, but the process of adding a new feature is incredibly painful.
I've done big-bang partial rewrites of systems before, quite successfully, but I've got a Ship of Theseus rule of my own that I follow: no new features during the rewrite, and no missing features. The first example that comes to mind was a rather complicated front-end application that had become a total spaghetti disaster. It had been written using Backbone, and from my research it fit the Angular model of things quite well.
I took a pound of coffee with me out to the cabin for a weekend, and rewrote the whole frontend. Side-by-side. I started with the first screen, and walked through every code path to populate a queue of screens that were reachable from there. Implement a screen, capture its outbound links, repeat.
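The crawl described above is just a breadth-first traversal of the screen graph. A minimal sketch in Python (the screen names and the link map are invented for illustration, not from the actual app):

```python
from collections import deque

# Hypothetical map of each screen to the screens reachable from it.
LINKS = {
    "login": ["dashboard"],
    "dashboard": ["settings", "reports"],
    "settings": ["dashboard"],
    "reports": ["dashboard"],
}

def screens_to_port(start):
    """Walk every outbound link, queueing screens we haven't seen yet."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        screen = queue.popleft()
        order.append(screen)                  # "implement a screen"
        for target in LINKS.get(screen, []):  # "capture its outbound links"
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

Starting from the first screen, this visits every reachable screen exactly once, which doubles as a checklist for the rewrite: nothing reachable gets missed, and nothing unreachable gets ported.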
Nothing changed, but everything changed. The stylesheet was gnarly too, but I left it 100% untouched. That comes later. By keeping the stylesheet intact, I (somewhat) ensured that the new implementation used identical markup to the old system. The markup was gnarly too, but keeping it identical helped to ensure that nothing got missed.
48-72ish hours later, I emerged a bit scraggly and the rest of the team started going over it. They found 1 or 2 minor things, but within a week or so we did a full cut-over. The best part? Unlike the article, clients had no outward indication that anything had changed. There was no outcry, even though about a quarter of the code in the system had been thrown out.
While I generally agree with this, there is one situation where a scorched-earth rewrite should at least be kicked around as an option.
If you are dealing with a very bad system that has not been maintained and has tons of candidates for those tightly-scoped incremental fixes, but at the same time, you are embedded in a giant, faceless conglomerate sort of company where there is effectively 0% chance that any of those tightly-scoped incremental fixes will ever be greenlighted and everyone in the team knows it.
Then you should consider the scorched earth approach, because it is the only way that the incremental fixes will ever happen. Bureaucrats will always find an excuse why this particular short term time frame is not the right one for the incremental fix that slightly slows productivity and gets them dinged on their bonus. And they will always find a way to pin the blame regarding underinvestment in critical incremental fixes on the development staff.
So sometimes all you can do is deny them that option, even if you know how painful it will be. The aggregated long-run pain will be less, though it won't feel that way for a long time.
I just want to reiterate that I mostly agree with you. It's especially bad to turn your nose up at legacy code that actually has valuable tests, because the tests make the maintenance and incremental fixes so much better. When there are solid tests, you should almost never throw it away.
Nonetheless, sometimes you have to torch it all to deny the bureaucrats the chance to slowly suck the life out of it (and you).
One of the cool things about doing replace-in-place is that you can A/B test the old and new versions. We are doing that on our current Form Builder rewrite at JotForm. We rewrite one piece and make it live for 50% of users. Then we receive daily morning emails about each test. If some metric is in the red, we discuss what might be causing it (or watch FullStory sessions, or talk to users) and improve the new version.
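The 50% split described above is usually done with deterministic bucketing, so a given user always sees the same version for the life of the test. A rough sketch in Python (the hashing scheme is my own illustration, not JotForm's actual implementation):

```python
import hashlib

def bucket(user_id: str, experiment: str, percent_new: int = 50) -> str:
    """Stable assignment: hash user+experiment into a slot from 0 to 99."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    slot = int(digest, 16) % 100
    return "new" if slot < percent_new else "old"
```

Because the hash is stable, a user who lands in the new version stays there until the experiment ends, which keeps the daily metrics comparable, and `percent_new` can be ramped up as the numbers turn green.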
Here is a fresh example: we released the new version of the PayPal Payments Pro/Express integration, and the success rates stayed in the red. The old version was beating the new version; it was 3x better even though almost everything was the same. After some head scratching, we found that the old version had a link to a direct PayPal page where users can get their API credentials, and the new version was missing it. From there, the fix was easy and things turned green.
This is a story that has happened over and over again. When you rewrite software, you lose all those hundreds of tiny things which were added for really good reasons. Don't do it blindly.
I used to believe that every company gets one rewrite, but only because I have seen that most places have the patience and the stamina for a little less than a single rewrite. Even so, I was on the fence about whether rewrites were a good idea at all.
Trouble is, I could never put my finger on why, other than that it never seemed to fix the problem. It was a bit like moving to a new city to start over and finding out you brought all your problems with you.
In the last couple of years I have begun to figure out what I know in a way I can articulate. The people who have the skills and discipline to take advantage of a rewrite don't need a rewrite. It's a short distance from that skill set to being able to break the problem down and fix it piece by piece. They just need permission to fix key bits, one bite at a time, and they've probably already insisted on the space to do it, although they may be clever enough never to have said it out loud. They just do it.
Then you have the people who don't have the skills and discipline to continuously improve their code. What the rewrite buys them is a year of nobody yelling at them about how crappy the code is. They have all the goodwill and the hopes of the organization riding on their magical rewrite. They've reset the Animosity Clock and get a do-over. However, as the time starts to run down, they will make all of the same mistakes, because they could never decompose big problems in the first place, and they lack the patience to stick with their course and not chicken out. They go back to creating messes as they go, if they ever actually stopped to begin with. Some of them wouldn't recognize the mess until after it's done anyway.
In short, people who ask for a rewrite don't deserve a rewrite, and would squander it if they did. It's a stall tactic and an expensive dream. The ones who can handle a rewrite already are. They never stopped rewriting.
Now, I've left out all of the problems that can come from the business side, and in many cases the blame lies squarely with them (and the pattern isn't all that different). In that case it might take the rewrite for everyone to see how inflexible, unreasonable and demanding the business side is, and so Garbage In, Garbage Out rules the day. But the people with the self respect to not put up with it have moved on, or gotten jaded and bitter in the process.
I have successfully done a full rewrite: 60kLOC of VBScript/ASP to 20kLOC of Python/Django. Added a ton of missing features that wouldn't have been possible in the old, clipboard-inheritance style. There was one page on the old site that I had tried to fix up a couple of times and failed at completely. It was ~2500 lines. In the new system it was nothing special: 25 lines in the view and a big but manageable template.
I did this whole rewrite in about 9 months including all the work to do the database migration which took a bunch of dry runs and then the actual switchover which took nearly 12 hours to run.
What's the catch? I had been working at the company for about 7 years at that point as the sole caretaker of the application so I knew the business needs inside and out, I knew where all the problem areas were and I had plenty of ideas about what features we desperately wanted to add but couldn't. I didn't necessarily add all of them up-front but I was able to make good schema decisions to support those later features.
Rewrites aren't impossible by any means, but I suspect my story is more of the exception than the rule.
I'm currently at my first job, working at a budding unicorn company in Silicon Valley. I've been here 1.75 years and am already in the midst of a third rewrite, although it wasn't my decision (it came from higher ups).
The last paragraph really hits home for me; I've 100% gotten very jaded and very, very bitter.
Software is rewritten most often for political reasons, namely for new developers to leave a mark and put a good line on their resume. Everyone would rather read "designed and implemented" on a resume than "maintained and fixed annoying bugs here and there". We know what looks better, managers know whom they'd move to the next round, and so on.
The thing is, even if there is no software that needs to be written, software developers will always find a reason to write software. Because they get paid full time, usually above what other professions get paid, to, well... write software.
I think other reasons such as "it is slow, outdated, not written in <latest-language-fad>, bad UI", are often brought up and used as excuses. But underlying reasons are a lot more personal and subjective.
An architect's first work is apt to be spare and clean. He knows he doesn't know what he's doing, so he does it carefully and with great restraint. As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used "next time." Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system.
This second is the most dangerous system a man ever designs. When he does his third and later ones, his prior experiences will confirm each other as to the general characteristics of such systems, and their differences will identify those parts of his experience that are particular and not generalizable.
And from the OP, frill by frill :-)
I wanted to try out new, shiny technologies like Apache Cassandra, Virtualization, Binary Protocols, Service Oriented Architecture, etc.
To OP's credit, he took a step back and learned from his mistakes.
Also related... Second System Effect may not only apply for your second system ever. It may also bite you designing the second system in a new problem domain you're unfamiliar with. I can admit I've built a few second systems in my time ;-)
>The development officially began in Summer of 2012 and we set end of January, 2013 as the release date. Because the vision was so grand, we needed even more people. I hired consultants and couple of remote developers in India.
Well, that's already a warning sign. If you don't have the talent to do the rewrite, consultants and remote developers in India are only going to make things worse.
> If you don't have the talent to do the rewrite, consultants and remote developers in India are only going to make things worse.
I don't think this is necessarily a given, but I see where you're coming from with the overall perspective.
To get closer to the point of the post: unless there are truly solid and unavoidable blockers, e.g. substantial architectural defects that just can't be resolved without a major rework, a rewrite of any capacity should probably be off the table.
Although I'm sure there are others that I have no insight into, the only "successful" commercial rewrite I can think of would be Windows Vista, and that only made that version of Windows somewhat releasable. Even then, that wasn't a total reboot even of the project; it was a partial rewrite off of a different codebase with as much code salvaged from the original project as possible (the initial Longhorn builds on top of the XP codebase were a total trainwreck, driving the partial reboot on top of Server 2003's codebase instead). They still needed two service packs to make Vista stable, and it took an entire subsequent major release (7) to right the ship, so to speak.
We began in early April and we had more full-time in-house developers than remote resources. The only reason we got remote workers was that it was next to impossible to find good developers.
We initially had a terrible experience with remote workers and it didn't work out at all. But we learned from our mistakes and made it work later on. By the time I left the company, 100% of its development (i.e. maintenance) was done offshore.
"While parts of code were bad, we could have easily fixed them with refactoring if we had taken time to read and understand the source code that was written by other people."
This is the thing. Many people just HATE and REFUSE to read other people's code.
Maybe they just don't teach code reading and reviewing in computer science classes, but they should.
Printing it out on paper and reading it away from the computer is helpful, but a lot of people write code for wide screens that doesn't print out well, or they just don't give a shit and it looks terrible even on the screen. And some people just don't have the stomach for reading through a huge pile of pages, or think that's below their station in life.
But reading code is one of the best most essential ways to learn how to program, how to use libraries, how to write code that fits into existing practices, how to learn cool tricks and techniques you would have never thought of on your own, and how to gain perspective on code you wrote yourself, just as musicians should constantly listen to other people's music as well as their own.
I just don't feel safe using a library I haven't at least skimmed over to get an idea of how it's doing what it promises to do.
There are ways to make reading code less painful and time consuming, like running it under a debugger and reading it while it's running in the context of a stack and environment, looking at live objects, setting breakpoints and tracing through the execution.
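As a sketch of what that running-context view buys you, Python's standard trace hook can record the real call path through unfamiliar code; the two functions below are stand-ins for real application code, not anything from a specific project:

```python
import sys

calls = []

def tracer(frame, event, arg):
    # Record every function entry so we can reconstruct the execution path.
    if event == "call":
        calls.append(frame.f_code.co_name)
    return None  # no per-line tracing needed

def helper(x):
    return x * 2

def entry_point(x):
    return helper(x) + 1

sys.settrace(tracer)
result = entry_point(5)
sys.settrace(None)
```

Afterwards, `calls` lists each function that actually ran, in order: a poor man's version of stepping through under pdb, and often enough to orient yourself in a codebase you've never read.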
Code reading and reviewing is so important, and it should be one of the first ways people learn to program, and something experienced programmers do all the time.
And of course as the old saying goes, code should be written first and foremost for other people (preferably psychopaths who have your home address, photos of your children, and your social security number) to read, and only incidentally for the computer to execute.
>Maybe they just don't teach that in computer science classes, but they should.
Should they? This seems like a classic conflation of computer science with programming. Perhaps the reason this important craft skill is not taught in CS is because it has nothing to do with complexity classes, automata theory, etc.
I spend a lot of time looking at a profiler view of my application in work, it's a great way to see what the most "important" areas of the code are.
You get call stacks for the hottest code paths, can read the code with annotations about how expensive a certain line is, etc. It's a good way to look at code I've found!
Understanding other people's code can be difficult, frustrating, or just plain tedious. But who doesn't have to do it? When software isn't working, reading the code is a normal part of figuring out where things are going wrong, which is a normal part of fixing bugs, etc.
Of course, it's agonizingly ironic to discover that one's own code written long ago may be no easier to decipher. It's a humbling experience when it happens, but nothing teaches the lesson about writing code that mere humans can read more effectively.
I know it makes me think about the variable names I use, the merit of "one-liners", how much white space to leave, and the clarity of explanatory comments, among other things.
It's a form of art to know when to be verbose and when not to be.
Some programmers may be superior "code artists", but every programmer should be able to write well enough.
This is one of the reasons I love working on LibreOffice. I think most people acknowledge that LibreOffice is a pretty decent office suite. But it is just over 30 years old now, it has accreted massive amounts of legacy code and some questionable design decisions, and trying to trace its history in version control can be a nightmare if you go back far enough.
But it would definitely be a mistake to rewrite it from scratch. There are millions of users who rely on it, and it's a massive code base that runs on three major operating systems - four if you include Android.
Instead, the LibreOffice developers are slowly working through refactoring the code base. They are doing this quietly and carefully, bit by bit. Massive changes are being done in a strategic fashion. Here are some things:
* changed the build system from a custom dmake-based system to a GNU make-based one (gbuild), leading to reduced build times and improved development efficiency
* excised the custom C++ containers and iterators in favor of ones from the C++ Standard Template Library
* changing UI descriptor files from the proprietary .src/.hrc format to Glade .ui files
* proper reference counting for widgets in the VCL module
* an OpenGL backend in the VCL module
* removal of mountains of unneeded code
* cleanup of comments, in particular a real push to translate the huge number of German comments to English
* use of Coverity's static analysis tool to locate thousands and thousands of potential bugs, most of which have been resolved
* a huge amount of unit tests to catch regressions quickly
This has allowed LibreOffice to make massive improvements in just about every area. It has also enabled new features like tiled rendering (used for Android and LibreOffice Online), the VCL demo which helps improve VCL performance, filter improvements which have increased Office compatibility and allowed importing older formats, a redesign of dialogue boxes, menus and toolbars which has streamlined the workflow in ways that make sense and are non-disruptive to existing users, and many more improvements which I either don't know about or which have slipped my mind.
But my point is - if LibreOffice had decided to do a rewrite it would have been a failure. It is far better to refactor a code base that is widely used than to rewrite it, for all the reasons the article says.
True story. Similar start. A couple of enterprise customers, kludgy old code. Changeability was the problem. Decision made for total rewrite. Budget no probs, team of 10 allocated. I was the guy who sold the deals. Wanted to assess if we had a disaster on our hands. Asked CTO to explain how team was constructed. Here is how the conversation went:
Me: tell me about the team
CTO: well it is actually all being built by two guys.
Me: That's crazy. We have a 10 person team.
CTO: Actually I lied, it is actually all being built by one guy. Before you get me fired, let me explain. One guy just builds prototypes to keep all those idiotic requests from management at bay.
Me: I knew the guy. Fastest damn developer on the planet
CTO: we just throw his code away. The other guy who we don't let you meet actually writes the code that works. You see nothing for weeks, then it emerges and works beautifully.
Me: what do you do with the other 8 guys?
CTO: give them make work to keep them out of the way.
I rewrote from scratch once, after much deliberation and was successful. In my case, I had some extremely good reasons:
The original codebase was built using good old waterfall, where neither programmers nor testers had ever met a user in their life. So all the things that are good about working software kind of go out the window: there are a lot of abstractions in the code, but they didn't match the users' mental model or help in any way.
My users didn't like the software at all. It might have met their requirements in theory, but not in practice.
The team that had built the application was pretty much gone; nobody who had any responsibility for any design decision was still in the company.
I had actually managed to get access to people who used the software, and to people who supported those users, so relying on them I was able to build a model of the world far better than the original developers ever did.
300K lines of Java, written over three years by 10+ developers were rewritten by a team of two in three months. All the functionality that anyone used was still there, but actually implemented in ways that made sense for user workflows.
I'd not have dared a rewrite had the situation been better though: Working code people want to use trumps everything.
I have always felt that if you need more than 3 developers and 12 months, then the software is too big and the scope needs to be cut down. Better a small program that does one thing well than a monolith that does everything badly; why do we never learn from the UNIX philosophy?
300k lines of code, two developers, three months. That's over 1,600 lines of production code rewritten per developer per day. Sustained the whole time. No holiday. No days off.
And this code was tested to be equivalent to the old code. So, being conservative, call it another 1,600 lines of test code per developer per day. (It's a lot more than that, of course, in most cases.)
So over 3,000 lines of working, tested code per developer per day.
Plus, the team of two were checking in with the users to make sure everything was hunky dory. And the new team was designing a new "model of the world far better than the original developers ever did".
I'd have gone along with that bullshit until the final sentence: Working code people want to use trumps everything. No, no, young ninja, rock star, Jedi: Working maintainable code trumps everything.
This forced them to create object models structured to match the complicated and inconsistent request and response JSON strings that SAP used, instead of creating objects modeled after the business domains. The result, no surprise, was a disaster: the client wasted five months with them, and I had to redo and complete the project, still on time.
I still see many teams make their Java function signatures as convoluted as the example below, and they don't know what hit them.

    interface ToDoService {
        ListToDoResponse listToDo(ListToDoRequest request) throws ListToDoException;
        AddToDoResponse addToDo(AddToDoRequest request) throws AddToDoException;
        // ... and so on ...
    }
That guy really has done a rewrite and then got it back to the point where it is used. That is what I felt when reading the article. Many people just rewrite and then complain that their rewrite never gets used, so they don't draw the important conclusions, and the next time they will try it again. If you really push yourself, your team and the users to use the rewrite and to get it back on par with the original solution, then you have to go through all the disappointments of your rewrite not being better than the original, the real pains you induce in your users, and so many problems you never signed up for solving, because you didn't know that the old, ugly solution actually solved these as well.
The pain of reading, understanding and fixing/improving the old, ugly software is never that bad. And the value is much more immediate, since your users already use and know the old software.
I have incrementally refactored large codebases by committing to a guide of which constructs to replace with what and what the final goal is, and by writing test cases. Once the guide was seen as beneficial by the other developers (or rather, once they were forced to use it), they were surprised how well it worked. All this was behind the scenes from the customer, and updates and bugfixes kept going out. A rewrite may be the answer sometimes, but many times a well-thought-out incremental refactoring guide is an easier path.
I'm facing this at work and wanting to rewrite; updates are too slow when one small change needs 1000 other changes in spaghetti code spread across piles of scripts...
The difference is we can roll our users over as they renew, and within a year after release all new and old customers will have the same software.
Only about 2000 users, so not as big a scale as the story here.
Jeff Meyerson's interview with David Heinemeier Hansson [DHH] covers rewriting software. It discusses the reasons rewrites have a bad reputation and the difference between rewriting because "it's not my code/language/idiom" and declaring "technical bankruptcy" as an extension of the technical debt metaphor.
Agree with DHH or not, he's thought deeply about a lot of software development tropes.
I listen to every DHH interview I can, but his view on a rewrite is different from that of others who do rewrites. There is no "second system" per se in his world. He leaves the old system up for people who still think it works just fine. Instead of messing up the old versions with upgrades, they leave them up, and if a customer wants to move to the new, rewritten version, they can.
What are everyone's thoughts on rewriting in order to simplify your developer overhead with the likes of React?
We are a small shop and have a non-game app for html, ios, and android. The non-public (used by less than 30 people) backend has two management interfaces in, you might want to sit down for this, Adobe Flex. There are 2 full time developers (1 backend/flex/html and 1 ios/html) and 1-3 part time (android/ios) contractors depending on workload/deadlines. Everyone is pretty siloed with mostly 1 additional cross over app.
I have read articles similar to this, and have worked with some greybeards who carry this philosophy of staying the course and righting the ship rather than jumping ship. However, the siren's song of React is luring me in with its philosophy of "learn once, write anywhere"[0]. Instead of siloed developers we'd simply have developers who would feel comfortable in [management interface/html/ios/android].
Sometimes, you've got to dump the old stuff because it is just too painful to make changes to. I've rewritten more than a couple of projects that I inherited from less-than-competent outsourced contractors... Number one, these projects mostly didn't work, and had so many bugs that I'm not sure how they were ever shipped as software that somebody paid for. Number two, it was all the worst kind of WebForms and Visual Basic monstrosity. I'm sure it's possible to write non-terrible code with WebForms that looks good, but I've never seen it, and it was a matter of a couple of weeks to burn it down and build a proper MVC application instead. My only regret there is that I waited too long to rewrite and wasted a few months battling the manure pile of the old version, trying to improve it incrementally.
I want to try a re-write project at least once in my career. I've been thinking about how I might tackle it:
The legacy system components are unlikely to have ideal boundaries (unless you're insanely lucky) due to organic growth. So step 1, identify key boundaries and implement them as libraries / micro services / whatever makes sense.
Step 2, identify the highest value component to refactor - that might not be the part that makes the system go better for the users, it might be tackling the bit generating most support noise so you can free up more dev resource.
Begin the refactoring with a release, the new component begins life as a shim between the component interface and the legacy component, calling out to the legacy component for all features initially but as functionality is migrated to the new component it gradually uses the legacy component less and less.
You can always release at any time and you'll have a system using new code where it's written and falling back to old code where not.
Lather, rinse, repeat.
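The shim step above can be sketched in a few lines of Python; the LegacyBilling class and the feature names are invented for illustration, not from any real system:

```python
class LegacyBilling:
    """Stand-in for the old component, kept alive during the migration."""
    def invoice(self, order):
        return f"legacy-invoice:{order}"

    def refund(self, order):
        return f"legacy-refund:{order}"

class BillingShim:
    """New component: starts by delegating everything to legacy,
    then takes over one feature at a time."""
    MIGRATED = {"invoice"}  # grows with each release as functionality moves over

    def __init__(self, legacy):
        self.legacy = legacy

    def invoice(self, order):
        if "invoice" in self.MIGRATED:
            return f"new-invoice:{order}"   # new implementation
        return self.legacy.invoice(order)   # fall back to legacy

    def refund(self, order):
        if "refund" in self.MIGRATED:
            return f"new-refund:{order}"
        return self.legacy.refund(order)    # not migrated yet

shim = BillingShim(LegacyBilling())
```

Callers only ever see the shim's interface, so you can release at any time: whatever is migrated runs new code, and everything else falls back to the legacy component.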
The key challenge I see is managing state while refactoring the complex behaviors of the original component. It may lead to a private interface which is more granular but only exposed to the new replacement component.
The only cases where I've seen rewrites work are when the software fails to consistently complete the task it was written to do. A lot of software is garbage but gets the job done. When you encounter hot spots its best to slowly refactor until you get the system where it needs to be. Lots of little changes mitigate risk, and it is easier to track down bugs if you make a mistake.
I don't believe in rewrites. But I believe in the evolution of the understanding, what a system is supposed to do and how to do it.
A customer might have a lofty web-scale vision, the first version is built with all scaling and no maintenance in mind. You are then faced with a decision: carry along the visionary nightmare and get almost no real work done, because the system itself is in the way, or rewrite the system with a fresh, minimal approach, focusing on a valuable target.
The point being: Software requirements evolve and sometimes, writing a first version of the software is the spec itself. Just as with specs, you can try to scavenge all good things from the previous version for the next one - and actually getting better over time, despite being in a rewrite mode.
The new Software Development Manager always wants to rewrite the code from scratch.
That's the way it's always been.
I like that the guy who wrote this post had the courage to admit that he was motivated partly by the desire to play with new tech.
reply
[+] [-] fallous|10 years ago|reply
Aesthetics are never a legitimate reason to replace a functioning system with something else. I do not care how it offends the eye, the mind, the ear, or any other sensation you have... if it does not have a fundamental functional problem your best option is to maintain or plan an incremental rolling replacement that is tightly scoped at each step.
For large systems that do present functional problems, identify the egregious offenders in that system, abstract the entities/behaviors most responsible, and replace-in-place in a Ship of Theseus fashion. Despite how familiar you believe you are with a monolithic or complex inter-dependent system, you will discover how much you don't actually understand about how it works as a whole if you try and re-implement in its entirety.
[+] [-] tonyarkles|10 years ago|reply
You mention fundamental functional problems, and I'd like to add something: sometimes it's not a functional problem, but a changeability problem. The code functions fine, but the process of adding a new feature is incredibly painful.
I've done big-bang partial rewrites of systems before, quite successfully, but I've got a Ship of Theseus rule of my own that I follow: no new features during the rewrite, and no missing features. The first example that comes to mind was a rather complicated front-end application that had become a total spaghetti disaster. It had been written using Backbone, and from my research it fit the Angular model of things quite well.
I took a pound of coffee with me out to the cabin for a weekend, and rewrote the whole frontend. Side-by-side. I started with the first screen, and walked through every code path to populate a queue of screens that were reachable from there. Implement a screen, capture its outbound links, repeat.
Nothing changed, but everything changed. The stylesheet was gnarly too, but I left it 100% untouched. That comes later. By keeping the stylesheet intact, I (somewhat) ensured that the new implementation used identical markup to the old system. The markup was gnarly too, but keeping it identical helped to ensure that nothing got missed.
48-72ish hours later, I emerged a bit scraggly and the rest of the team started going over it. They found 1 or 2 minor things, but within a week or so we did a full cut-over. The best part? Unlike the article, clients had no outward indication that anything had changed. There was no outcry, even though about a quarter of the code in the system had been thrown out.
p4wnc6 | 10 years ago
Suppose you are dealing with a very bad system that has not been maintained and has tons of candidates for those tightly scoped incremental fixes, but at the same time you are embedded in a giant, faceless conglomerate sort of company where there is effectively a 0% chance that any of those fixes will ever be greenlighted, and everyone on the team knows it.
In that case you should consider the scorched-earth approach, because it is the only way the incremental fixes will ever happen. Bureaucrats will always find an excuse for why this particular short-term time frame is not the right one for the incremental fix that slightly slows productivity and gets them dinged on their bonus. And they will always find a way to pin the blame for underinvestment in critical incremental fixes on the development staff.
So sometimes all you can do is deny them that option, even if you know how painful it will be. The aggregated long-run pain will be less, though it won't feel that way for a long time.
I just want to reiterate that I mostly agree with you. It's especially bad to turn your nose up at legacy code that actually has valuable tests, because the tests make the maintenance and incremental fixes so much better. When there are solid tests, you should almost never throw it away.
Nonetheless, sometimes you have to torch it all to deny the bureaucrats the chance to slowly suck the life out of it (and you).
aytekin | 10 years ago
Here is a fresh example: we released the new version of our PayPal Payments Pro/Express integration, and the success rates stayed in the red. The old version was beating the new version; it was 3x better even though almost everything was the same. After some head scratching, we found that the old version had a link to a direct PayPal page where users can get their API credentials, and the new version was missing it. From there, the fix was easy and things turned green.
This is a story that has happened over and over again. When you rewrite software, you lose all those hundreds of tiny things which were added for really good reasons. Don't do it blindly.
reitanqild | 10 years ago
ralphael | 10 years ago
hinkley | 10 years ago
I used to believe that every company gets one rewrite, if only because I have seen that most places have the patience and the stamina for a little bit less than a single rewrite; even so, I was on the fence about whether rewrites were a good idea at all.
Trouble is, I could never put my finger on why, other than that it never seemed to fix the problem. It was a bit like moving to a new city to start over and finding out you brought all your problems with you.
In the last couple of years I have begun to figure out how to articulate what I know. The people who have the skills and discipline to take advantage of a rewrite don't need a rewrite. It's a short distance from that skill set to being able to break the problem down and fix it piece by piece. They just need permission to fix key bits, one bite at a time, and they've probably already insisted on the space to do it, though they may be clever enough never to have said so out loud. They just do it.
Then you have the people who don't have the skills and discipline to continuously improve their code. What the rewrite buys them is a year of nobody yelling at them about how crappy the code is. They have all the goodwill and the hopes of the organization riding on their magical rewrite. They've reset the Animosity Clock and get a do-over. However, as the time starts to run down, they will make all of the same mistakes, because they could never decompose big problems in the first place, and they lack the patience to stick with their course and not chicken out. They go back to creating messes as they go, if they ever actually stopped to begin with. Some of them wouldn't recognize the mess until after it's done anyway.
In short, people who ask for a rewrite don't deserve a rewrite, and would squander it if they did. It's a stall tactic and an expensive dream. The ones who can handle a rewrite already are. They never stopped rewriting.
Now, I've left out all of the problems that can come from the business side, and in many cases the blame lies squarely with them (and the pattern isn't all that different). In that case it might take the rewrite for everyone to see how inflexible, unreasonable and demanding the business side is, and so Garbage In, Garbage Out rules the day. But the people with the self respect to not put up with it have moved on, or gotten jaded and bitter in the process.
jrauser | 10 years ago
msandford | 10 years ago
I did this whole rewrite in about 9 months, including all the work for the database migration, which took a bunch of dry runs, and then the actual switchover, which took nearly 12 hours to run.
What's the catch? I had been working at the company for about 7 years at that point as the sole caretaker of the application so I knew the business needs inside and out, I knew where all the problem areas were and I had plenty of ideas about what features we desperately wanted to add but couldn't. I didn't necessarily add all of them up-front but I was able to make good schema decisions to support those later features.
Rewrites aren't impossible by any means, but I suspect my story is more of the exception than the rule.
raverbashing | 10 years ago
A rewrite is more useful when the original project is irrevocably doomed (bad/obsolete framework, needs a new language, etc)
Most of the time you can refactor the original code
phunkystuff | 10 years ago
The last paragraph really hits home for me, and I've 100% gotten very jaded and very, very bitter.
rdtsc | 10 years ago
The only thing is, if there is no software to be written, software developers will always find a reason to write software. Because they get paid full time, usually above what other professions get paid, to, well... write software.
I think other reasons such as "it is slow, outdated, not written in <latest-language-fad>, bad UI", are often brought up and used as excuses. But underlying reasons are a lot more personal and subjective.
ereyes01 | 10 years ago
Excerpt:
An architect's first work is apt to be spare and clean. He knows he doesn't know what he's doing, so he does it carefully and with great restraint. As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used "next time." Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system. This second is the most dangerous system a man ever designs. When he does his third and later ones, his prior experiences will confirm each other as to the general characteristics of such systems, and their differences will identify those parts of his experience that are particular and not generalizable.
And from the OP, frill by frill :-)
I wanted to try out new, shiny technologies like Apache Cassandra, Virtualization, Binary Protocols, Service Oriented Architecture, etc.
To OP's credit, he took a step back and learned from his mistakes.
Also related... Second System Effect may not only apply for your second system ever. It may also bite you designing the second system in a new problem domain you're unfamiliar with. I can admit I've built a few second systems in my time ;-)
Terr_ | 10 years ago
Stuff like "no global variables" or "actually use database transactions" or "support UTF-8"...
coldtea | 10 years ago
Well, that's already a warning sign. If you don't have the talent to do the rewrite, consultants and remote developers in India are only going to make things worse.
eganist | 10 years ago
I don't think this is necessarily a given, but I see where you're coming from with the overall perspective.
To get closer to the point of the post: unless there are truly solid and unavoidable blockers, e.g. substantial architectural defects that just can't be resolved without a major rework, a rewrite of any capacity should probably be off the table.
Although I'm sure there are others that I have no insight into, the only "successful" commercial rewrite I can think of would be Windows Vista, and that only made that version of Windows somewhat releasable. Even then, that wasn't a total reboot even of the project; it was a partial rewrite off of a different codebase with as much code salvaged from the original project as possible (the initial Longhorn builds on top of the XP codebase were a total trainwreck, driving the partial reboot on top of Server 2003's codebase instead). They still needed two service packs to make Vista stable, and it took an entire subsequent major release (7) to right the ship, so to speak.
perseus323 | 10 years ago
We initially had a terrible experience with remote workers, and it didn't work out at all. But we learned from our mistakes and made it work later on. By the time I left the company, 100% of its development (i.e. maintenance) was done offshore.
unknown | 10 years ago
[deleted]
copperx | 10 years ago
The customer simply didn't want to adopt the shiny new software.
DonHopkins | 10 years ago
This is the thing. Many people just HATE and REFUSE to read other people's code.
Maybe they just don't teach code reading and reviewing in computer science classes, but they should.
Printing it out on paper and reading it away from the computer is helpful, but a lot of people write code for wide screens that doesn't print out well, or they just don't give a shit and it looks terrible even on the screen. And some people just don't have the stomach for reading through a huge pile of pages, or think that's below their station in life.
But reading code is one of the best, most essential ways to learn how to program: how to use libraries, how to write code that fits into existing practices, how to learn cool tricks and techniques you would never have thought of on your own, and how to gain perspective on code you wrote yourself, just as musicians should constantly listen to other people's music as well as their own.
I just don't feel safe using a library I haven't at least skimmed over to get an idea of how it's doing what it promises to do.
There are ways to make reading code less painful and time consuming, like running it under a debugger and reading it while it's running in the context of a stack and environment, looking at live objects, setting breakpoints and tracing through the execution.
Code reading and reviewing is so important, and it should be one of the first ways people learn to program, and something experienced programmers do all the time.
And of course as the old saying goes, code should be written first and foremost for other people (preferably psychopaths who have your home address, photos of your children, and your social security number) to read, and only incidentally for the computer to execute.
stevetrewick | 10 years ago
Should they? This seems like a classic conflation of computer science with programming. Perhaps the reason this important craft skill is not taught in CS is because it has nothing to do with complexity classes, automata theory, etc.
Programming != CS
maccard | 10 years ago
You get call stacks for the hottest code paths, can read the code with annotations about how expensive a certain line is, etc. It's a good way to look at code, I've found!
jrapdx3 | 10 years ago
Of course, it's agonizingly ironic to discover that one's own code, written long ago, may be no easier to decipher. It's a humbling experience when it happens, but what could teach the lesson about writing code that mere humans can read more effectively?
I know it makes me think about the variable names I use, the merit of "one-liners", how much white space to leave, and the clarity of explanatory comments, among other things. It's a form of art to know when to be verbose and when not to be.
Some programmers may be superior "code artists", but every programmer should be able to write well enough.
chris_wot | 10 years ago
But it would definitely be a mistake to rewrite it from scratch. There are millions of users who rely on it, and it's a massive code base that runs on three major operating systems - four if you include Android.
Instead, the LibreOffice developers are slowly working through refactoring the code base. They are doing this quietly and carefully, bit by bit. Massive changes are being done in a strategic fashion. Here are some things:
* changed the build system from a custom dmake-based system to the GNU make-based gbuild, reducing build times and improving development efficiency
* the excising of custom C++ containers and iterators in favor of ones from the C++ Standard Template Library
* changing UI descriptor files from the proprietary .src/.hrc format to Glade .ui files
* proper reference counting for widgets in the VCL module
* an OpenGL backend in the VCL module
* removal of mountains of unneeded code
* cleanup of comments, in particular a real push to translate the huge number of German comments to English
* use of Coverity's static analysis tool to locate thousands and thousands of potential bugs, most of which have been resolved
* a huge amount of unit tests to catch regressions quickly
This has allowed LibreOffice to make massive improvements in just about every area. It has also enabled new features like tiled rendering, used for Android and LibreOffice Online; the VCL demo, which helps improve VCL performance; filter improvements, which have increased Office compatibility and allowed importing older formats; a redesign of dialogue boxes, menus and toolbars that has streamlined the workflow in ways that make sense and are non-disruptive to existing users; and many more improvements that have probably slipped my mind.
But my point is - if LibreOffice had decided to do a rewrite it would have been a failure. It is far better to refactor a code base that is widely used than to rewrite it, for all the reasons the article says.
bernardlunn | 10 years ago
Me: tell me about the team
CTO: well it is actually all being built by two guys.
Me: That's crazy. We have a 10 person team.
CTO: Actually I lied, it is actually all being built by one guy. Before you get me fired, let me explain. One guy just builds prototypes to keep all those idiotic requests from management at bay.
Me: I knew the guy. Fastest damn developer on the planet
CTO: we just throw his code away. The other guy who we don't let you meet actually writes the code that works. You see nothing for weeks, then it emerges and works beautifully.
Me: what do you do with the other 8 guys?
CTO: give them make work to keep them out of the way.
The system was a market success for a long time.
nyir | 10 years ago
hibikir | 10 years ago
The original codebase was built using good old waterfall, where neither programmers nor testers had ever met a user in their life. So all the things that are good about working software kind of go out the window: there are a lot of abstractions in the code, but they didn't match the users' mental model or help in any way.
My users didn't like the software at all. It might have met their requirements in theory, but not in practice.
The team that had built the application was pretty much gone, nobody that had any responsibility in any design decision was still in the company.
I had managed to get access to people who actually used the software, and to the people who supported those users; relying on them, I was able to build a model of the world far better than the original developers ever did.
300K lines of Java, written over three years by 10+ developers were rewritten by a team of two in three months. All the functionality that anyone used was still there, but actually implemented in ways that made sense for user workflows.
I'd not have dared a rewrite had the situation been better though: Working code people want to use trumps everything.
danieltillett | 10 years ago
auxbuss | 10 years ago
Two devs, so 120 working days.
300k lines of code. So, 2500 lines of production code rewritten per day. Sustained for three months. No holiday. No days off.
And this code was tested to be equivalent to the old code. So, let's say, being conservative, another 2500 lines of test code per day. (It's a lot more than that, of course, in most cases.)
So 5000 lines of working, tested code per day.
Plus, the team of two were checking in with the users to make sure everything was hunky dory. And the new team was designing a new "model of the world far better than the original developers ever did".
I'd have gone along with that bullshit until the final sentence: Working code people want to use trumps everything. No, no, young ninja, rock star, Jedi: Working maintainable code trumps everything.
pramalin | 10 years ago
The previous team blindly relied on a single method from the Spring framework's RestTemplate for every REST service call to the SAP backend.
(ref: https://docs.spring.io/spring/docs/current/javadoc-api/org/s...) This forced them to create object models structured to match the complicated and inconsistent request and response JSON strings that SAP used, instead of objects modeled after the business domains. The result, unsurprisingly, was a disaster: the client wasted five months with them, and I had to redo and complete the project, still on time.
I still see many teams make their Java function signatures as convoluted as the one below and not know what hit them.
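The original snippet is not in the comment as archived, so here is a hypothetical reconstruction of the kind of signature being criticized, with invented names: the first method mirrors the SAP wire format, leaking the nested JSON structure to every caller, while the second is modeled after the business domain and leaves the wire shape to an adapter:

```java
import java.util.List;
import java.util.Map;

public class SignatureDemo {
    // Wire-shaped: the nested generic type leaks SAP's JSON structure to every caller.
    static Map<String, List<Map<String, Object>>> getOrders(
            Map<String, Map<String, String>> requestEnvelope) {
        return Map.of(); // stub
    }

    // Domain-shaped: callers see business concepts; JSON mapping lives in one adapter.
    record Order(String id, long amountCents) {}

    static List<Order> getOrders(String customerId) {
        return List.of(new Order("A-1", 999)); // stub
    }

    public static void main(String[] args) {
        System.out.println(getOrders("C-42").get(0).id()); // A-1
    }
}
```

Both methods are stubs; the point is only the contrast between the two signatures at the call site.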
mikevm | 10 years ago
erikb | 10 years ago
The pain of reading, understanding, and fixing/improving the old, ugly software is never as bad as it seems. And the value is much more immediate, since your users already use and know the old software.
peterbotond | 10 years ago
cdevs | 10 years ago
The difference is we can roll our users over as they renew, and within a year after release all customers, new and old, will have the same software.
Only about 2000 users, so not as big a scale as the story here.
brudgers | 10 years ago
Agree with DHH or not, he's thought deeply about a lot of software development tropes.
http://softwareengineeringdaily.com/2016/01/13/the-evolution...
dham | 10 years ago
chrisan | 10 years ago
We are a small shop and have a non-game app for HTML, iOS, and Android. The non-public backend (used by fewer than 30 people) has two management interfaces in, you might want to sit down for this, Adobe Flex. There are 2 full-time developers (1 backend/flex/html and 1 ios/html) and 1-3 part-time (android/ios) contractors depending on workload/deadlines. Everyone is pretty siloed, with mostly one additional crossover app.
I have read articles similar to this, and have worked with some greybeards who carry this philosophy of staying the course and righting the ship rather than jumping ship. However, the siren's song of React is luring me in with its philosophy of "learn once, write anywhere"[0]. Instead of siloed developers we'd simply have developers who would feel comfortable in [management interface/html/ios/android].
[0]: https://facebook.github.io/react/blog/2015/03/26/introducing...
douche | 10 years ago
CraigJPerry | 10 years ago
The legacy system components are unlikely to have ideal boundaries (unless you're insanely lucky) due to organic growth. So step 1, identify key boundaries and implement them as libraries / micro services / whatever makes sense.
Step 2, identify the highest-value component to refactor - that might not be the part that makes the system work better for users; it might be tackling the bit generating the most support noise, so you can free up more dev resources.
Begin the refactoring with a release, the new component begins life as a shim between the component interface and the legacy component, calling out to the legacy component for all features initially but as functionality is migrated to the new component it gradually uses the legacy component less and less.
You can always release at any time and you'll have a system using new code where it's written and falling back to old code where not.
Lather, rinse, repeat.
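The steps above amount to the strangler-fig approach. A minimal sketch in Java, with invented interface and class names: the shim implements the extracted component boundary, serving already-migrated operations from new code and delegating the rest to the legacy implementation, so the system is releasable at every step:

```java
// Hypothetical component boundary carved out of the legacy system (step 1).
interface InvoiceService {
    String render(int invoiceId);
    String archive(int invoiceId);
}

// The untouched legacy code, now hidden behind the interface.
class LegacyInvoiceService implements InvoiceService {
    public String render(int id)  { return "legacy-render-" + id; }
    public String archive(int id) { return "legacy-archive-" + id; }
}

// The shim: new code where it exists, fallback to legacy where not.
class InvoiceServiceShim implements InvoiceService {
    private final InvoiceService legacy = new LegacyInvoiceService();

    public String render(int id) {
        return "new-render-" + id;   // migrated: new implementation
    }

    public String archive(int id) {
        return legacy.archive(id);   // not yet migrated: delegate
    }
}

public class ShimDemo {
    public static void main(String[] args) {
        InvoiceService svc = new InvoiceServiceShim();
        System.out.println(svc.render(7));  // new-render-7
        System.out.println(svc.archive(7)); // legacy-archive-7
    }
}
```

As each method is migrated, its delegation line is replaced with new code; when none remain, the legacy class can be deleted.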
Key challenges I see during the refactoring are managing state during the refactoring of complex behaviors of the original component. It may lead to a private interface which is more granular but only exposed to the new replacement component.
Interesting stuff for sure.
pg_bot | 10 years ago
whichdan | 10 years ago
mtrn | 10 years ago
A customer might have a lofty web-scale vision, and the first version is built with all scaling and no maintenance in mind. You are then faced with a decision: carry along the visionary nightmare and get almost no real work done, because the system itself is in the way, or rewrite the system with a fresh, minimal approach, focusing on a valuable target.
The point being: software requirements evolve, and sometimes writing the first version of the software is the spec itself. Just as with specs, you can try to scavenge all the good things from the previous version for the next one - and actually get better over time, despite being in rewrite mode.
hoodoof | 10 years ago