It seems, especially for major corporations, that on-call/pager duty is quickly becoming the norm for software development teams. I do agree that pager duty is a symptom of a fundamental flaw within the system/architecture. I think it would be in a company's best interest to devote time to improving the reliability and stability of their infrastructure, instead of relying on the band-aid approach that pager duty seems to be.
Regarding #8 though, when you are pressured to resolve a complex issue within a short time window, it can absolutely induce a sense of panic in those who do not handle stress well. I believe the remedy for this would be to have two individuals designated as on-call at a time, assuming the team is large enough.
devicenull|11 years ago
I can't see there ever being a time where there is no on-call requirement. You always need someone standing by in case of some terrible disaster that cannot be handled automatically. Better to have this as a formal responsibility that never gets used than to not have it and end up with an extended downtime because you can't contact anyone.
That being said, if you're getting paged continuously during on-call, then there's a bigger problem that needs to be resolved.
_delirium|11 years ago
If it's a really terrible disaster, a once-a-decade kind of thing where everything goes haywire and you need as many staff as possible to get online ASAP, then yes. But aren't we talking more about the kinds of "disasters" that happen once a month or so, and can be handled by a few staff (not waking up the whole team)? To me that sounds more like just staffing for normal operations.
At large engineering companies this is typically handled via literally having someone standing by, i.e. formally on duty, rather than having off-duty employees be on pager duty. There'll be at least a bare-bones staff on the after-hours shift (probably not in all offices, but in some kind of 24/7 operations center), enough of a staff that reasonably foreseeable things can be handled. Of course there are some pros and cons to that from an employee perspective. On the one hand the night shift isn't that pleasant, but on the other hand your responsibilities are at least formally limited to 40 hours/wk; if you're on night shift one week, you don't come in during the day, or carry a pager during the day.
TheSwordsman|11 years ago
We also run persistent systems across the WAN. And, unfortunately, some of these things require the state to be maintained.
You can't just design these systems to be "better". There are often things outside of your control.
Based on your response, you seem to be the type of person causing pain for those with a pager.
Also, I'm sure the company that can make the Internet work every time, all the time, will make a killing.
taco_emoji|11 years ago