top | item 33541297


throwaw20221107 | 3 years ago

Sure. I'm just talking about binary deploy. Of course you're gonna use flags to flight things in an industry context, depending on the company and its CI/CD practices.

Let's say the binary deploy (before flighting) is fucked somehow. Then what do you do?


ratorx|3 years ago

Well, if rollbacks (which shouldn’t require touching the source tree) are available, then do that. Otherwise you’re right: the code needs to be reverted, and you hope the revert works.

But in the second case, I’d make sure to increase the priority of having known-good rollback versions available (and rollbacks performable). I’d also carefully consider what CI/CD could be added to catch more broken binaries (e.g. canary or staging, if it’s important enough) and what code review practices could have prevented it.
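To make the canary idea a bit more concrete, here's a rough sketch of a gate that compares a canary's error rate against the current baseline before a binary goes wider. Everything in it (`Stats`, `canary_passes`, the thresholds) is made up for illustration; a real pipeline would pull these numbers from its own monitoring and probably look at more than one signal.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Request and error counts over the same time window (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when there was no traffic at all.
        return self.errors / self.requests if self.requests else 0.0


def canary_passes(baseline: Stats, canary: Stats,
                  max_ratio: float = 2.0,
                  min_requests: int = 100,
                  floor: float = 0.001) -> bool:
    """Return True if the canary looks no worse than the baseline.

    Assumed policy (not from any particular tool):
    - refuse to judge a canary that has seen too little traffic;
    - allow the canary error rate to be at most max_ratio times the
      baseline rate, with a small absolute floor so that a baseline
      with zero errors doesn't fail the canary on a single error.
    """
    if canary.requests < min_requests:
        return False
    allowed = max(baseline.error_rate * max_ratio, floor)
    return canary.error_rate <= allowed
```

A canary with too little traffic fails the gate on purpose here: "we don't know yet" shouldn't be treated as "it's fine".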

throwaw20221107|3 years ago

Ok I agree, you roll back to the known working version. The easiest way to do that is revert the whole PR (or data deploy in case of flags, ofc). My point is not "flags vs. no flags". My point is "each PR should generate one commit because that's easy to revert".

Git's commit DAG is a cool feature but shouldn't be in `main`. It's so much easier to work with a linear history, and one where each commit contains all the context required to figure out "could this have broken something".

ratorx|3 years ago

I don’t think reverting the broken code should be on the critical mitigation path at all.

Imagine a scenario where a bug didn’t immediately cause an issue or where your release contains more than 1 new PR. If you suspect the latest version of the binary is broken, your first instinct should be to use a version that isn’t. Figuring out the change and rolling it back should come after the rollback, when you have more time to think.
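As a sketch of what "use a version that isn't broken" can look like: given a release history where some versions are marked known good, pick the most recent known-good one before the suspect version. The `Release` type and `rollback_target` function are hypothetical names for illustration, not any particular deploy tool's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Release:
    version: str
    known_good: bool  # e.g. it ran in production without incident (assumption)


def rollback_target(history: list[Release], current: str) -> Optional[Release]:
    """Pick the most recent known-good release older than `current`.

    `history` is ordered oldest first. Returns None if nothing older
    is marked known good. Raises ValueError if `current` isn't in history.
    """
    versions = [r.version for r in history]
    idx = versions.index(current)
    # Walk backwards from the release just before the current one.
    for release in reversed(history[:idx]):
        if release.known_good:
            return release
    return None
```

Note that nothing here needs to know which PR caused the problem. That's the point: picking the target doesn't depend on the source tree, so the debugging can happen after the rollback.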

Deciding whether to revert a change is a tactical question. Often the issue arises because you tickled an unknown bug in a different part of the code. In that case, it’s a lot easier to fix forward than to revert the code that tickled the bug, fix the bug, and then redo the original change.

ratorx|3 years ago

Linear history is good, and having multiple commits in a PR doesn’t prevent it. The only change is adding n (ideally well crafted) consecutive commits rather than 1.
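Here's a toy model of that, assuming each commit on `main` is tagged with the PR it came from (the `Commit` type and `revert_plan` are made up for illustration). Reverting a PR means reverting its commits newest first, which isn't much harder than reverting one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    pr: int  # PR number the commit landed with (hypothetical metadata)


def revert_plan(main: list[Commit], pr: int) -> list[str]:
    """Return the shas to revert for `pr`, newest first.

    `main` is a linear history ordered oldest first. Reverting newest
    first undoes the commits in the opposite order to how they landed.
    """
    return [c.sha for c in reversed(main) if c.pr == pr]


main = [
    Commit("a1", pr=1),
    Commit("b1", pr=2),
    Commit("b2", pr=2),
    Commit("b3", pr=2),
    Commit("c1", pr=3),
]
plan = revert_plan(main, pr=2)
```

With git itself, `git revert` also accepts a commit range, so the n consecutive commits can be reverted in one invocation.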

throwaw20221107|3 years ago

Ratorx, do we work on the same team at Google? I feel like I've heard you before.

ratorx|3 years ago

Possibly? I’m an SRE. I don’t think my position is too different from the SREs I know :)