Grambles|2 years ago
I can't. There's no value in it. What did we, as a team, do that allowed the incident to happen? Yes, John Smith shouldn't have dropped the tables on production, obviously, but does he really not already know that from the incident response he's (presumably) also dealing with?
If he's truly not aware that was a mistake, there's an underlying transparency issue that goes way beyond telling an individual they screwed up.
pclmulqdq|2 years ago
"Negligent" doesn't just mean "made a mistake." It means something more like "their carelessness led to a mistake."
Hearing "you screwed up" will significantly change that person's behavior. I daresay it will encourage them to make the prod tables very hard to drop. Since they are presumably a smart person, that message combined with the postmortem of the incident will also encourage them to look for and proactively fix similar problems, and generally align the team with good DevOps practices.
It is important in all of this that the right person gets the message. I assume you expect that to not happen, since that is one of the theses of "blameless postmortems."
Grambles|2 years ago
Ideally a blameless post mortem allows the freedom to identify any of the potential fixes that could've stopped this, and empowers anyone who could've dealt with it to deal with future issues. If you blame the manager then that can implicitly absolve everyone else in the chain.
With that said, I would agree that having a primary owner of things does matter. For that reason, sure, making the manager more aware might help in future. I still think it's a bad idea for org culture though, because many managers will respond to "You screwed up" by trying to ensure that future blame finds its way to another target. Instead, I'd prefer approaching the manager with "We could've caught this in [any of the ways we could've caught it]", and if the manager doesn't care at that point they're just fully incompetent.