jroseattle | 2 years ago
Across the teams in my company, I hit nirvana when we review something like a major/impactful outage and the members in the discussion drill down to the essential state and sequence of events that caused said outage. The focus is on actions and outcomes, and the discussion includes both those central or peripheral to the outage and those with zero knowledge prior to the event. It takes a high degree of trust in yourself, your peers, and your organization to get to nirvana.
Go through a few of these in a proper way, and a simple principle tends to emerge: if your mechanism is dependent on the perfection of humans, it will eventually fail. The only real discussion beyond that is basically what to do next: do we need a mechanism that protects against human imperfections? Is the cost of implementing a solution worth what it would be designed to protect? Can we live with the fallibility of humans in this scenario?
Organizations that can achieve this level of discourse have a distinct advantage in execution.