As a software engineer doing infrastructure work I often find myself working on operational stuff (mostly chasing weird bugs, some on-call, etc.). In my position I am also expected to release features and do development too, but I feel like it's very difficult to focus because of all the operational issues I am dealing with. How are you guys dealing with that sort of work?
[+] [-] megaman22|8 years ago|reply
I haven't done any significant development work in more than six months, just chasing bugs, doing support, and fussing with email and meetings. It blows; I've got to find a different job.
[+] [-] Jtsummers|8 years ago|reply
If it's one-offs and not consistent misbehavior that the above can deal with, improve testing infrastructure. If you're unable to hit your feature development schedule, point to the problems in the present system and infrastructure.
Ask your boss for clear priorities: do they want a stable system, or more features? If the present system is this unstable, then more features will only exacerbate this. If they say they want both, and give them equal priority, ask for a pay raise and search for new jobs.
[+] [-] kaikai|8 years ago|reply
That said, some teams at my company are experimenting with having a week-long rotation for "bread box" issues. Those include tending issues/PRs in open source repos, handling bugs as they come in, etc. That frees up the rest of the team to work on core feature work.
I like to keep a running list of smaller, non-urgent tasks that would otherwise get neglected. When I have a long-running script or need to take a break from another project, I can refer to the list.
[+] [-] mottomotto|8 years ago|reply
Do some developers on the team need to think about scale? Yes. Should all the developers be on call because perhaps the company decided to roll its own infrastructure and someone has to deal with the occasional server with a full disk? No.
[+] [-] lin_lin|8 years ago|reply
What's the standard pay for being on-call as a matter of interest?
[+] [-] amriksohata|8 years ago|reply
[+] [-] scarface74|8 years ago|reply
A developer shouldn't be the first person called; there should be an operations staff, though they may have to escalate.
On the other hand, any time that a developer is routinely being called in the middle of the night, there is usually either an issue with the software or the infrastructure not being fault tolerant.
[+] [-] bradhe|8 years ago|reply
Good luck with that.
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] flukus|8 years ago|reply
Is the stuff you have to intervene for under your control or external? If you're relying on outside systems that are flaky then you need to make your systems more resilient, things like automatically retrying a few minutes later if some third party service is down and/or being more transactional so you can deal with errors.
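For what it's worth, the "retry a few minutes later" idea can be sketched in a few lines of Python. This is only an illustration, not anyone's production code: the `retry` helper, its parameters, and the choice of which exceptions count as "service is down" are all assumptions made for the example.

```python
import time


def retry(call, attempts=3, delay_seconds=120.0, sleep=time.sleep,
          retry_on=(ConnectionError, TimeoutError)):
    """Run call() and retry it after a pause if it fails.

    Hypothetical helper: `retry_on` lists the errors assumed to mean
    "the third party service is temporarily down". Anything else is
    treated as a real bug and raised immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                # Out of retries: let the caller (or an alert) see the failure.
                raise
            # Wait before the next try; `sleep` is injectable for testing.
            sleep(delay_seconds)
```

You would wrap the flaky call, for example `retry(lambda: fetch_report(), attempts=5)`, where `fetch_report` stands in for whatever talks to the outside system. Retrying is only safe if the wrapped call can be repeated without side effects, which is where the "more transactional" part comes in.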
[+] [-] sqldba|8 years ago|reply
If the problem is that you can’t focus long enough to do non-operations work, what’s the problem with that?
Are you unhappy you’re not coding? If so then ask for a new hire to take over the part you don’t want or start looking for another job.
Are you unhappy that your boss is still pushing you for results and is an utterly clueless idiot who has no idea where your time actually goes?
Fill us in.
[+] [-] watwut|8 years ago|reply
As for infrastructure and first line support, lobbying management for more people continuously is just about the only long term solution.
The other thing is planning and transparency, which helps the above. Keep a plan with realistic estimates and show it to management each time you talk with them. Do your best work, definitely don't slack, etc., but don't cut corners to make something look done when it is not. Instead, move the dates in the plan and send it to management again. The point is to convince them that there is really more overall work than one person can do. (If they get offended over that or treat you badly over that, find a new job.)
[+] [-] Kuraj|8 years ago|reply
My problem is that I have become that someone else.
[+] [-] eitland|8 years ago|reply
(I haven't read this cover to cover, but I have more or less read his and Christina J. Hogan's book cover to cover, I think, and I've also bought a couple of copies of the above book to share.)
Summary of what I've learned and found useful from those and other resources:
Get someone to step in for you half the time. (If only to fill in a ticket or, in a real emergency, call you.)
Manage expectations. (You don't expect hard interrupts except for emergencies.)
Make support requests asynchronous. (Mail, support tickets, not calls. Even when you (or someone else) are available for real-time support, make chat the preferred option.)
[+] [-] holydude|8 years ago|reply
[+] [-] pmontra|8 years ago|reply
Some of those activities are paid, but fixes close to a delivery are not and it's OK. Usually I set up a maintenance contract for quick activities, like small new features or investigating puzzling events (not necessarily bugs.) I have a ticketing system to keep track of those activities. Customers have access to it.
Obviously one has to make clear that maintenance will slow down development.
[+] [-] dozzie|8 years ago|reply
[+] [-] lamansion|8 years ago|reply
[+] [-] aprdm|8 years ago|reply
- Separation between development / staging / production environments.
- Integration tests.
- Service / System Metrics.
- Central logging.
- High availability.
- Alerts.
When you have a solid deployment pipeline things don't usually break. Errors and regressions are caught in the staging part of the deployment pipeline and errors in production can be rolled back automatically (and then you add an integration test for the regression!)
All this devopsy work at my company is done by software engineers with advice from systems engineers. And we do it because neither of the groups wants to get called on the weekends :) it has been working really well. Last year we had 0 calls. Before we had this in place things would break on a weekly basis.
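As a rough illustration of the "rolled back automatically" step, here is a minimal Python sketch. It is not the pipeline described in this comment: `deploy` and `health_check` are hypothetical callables standing in for whatever your tooling actually does (running a playbook, probing a status endpoint, and so on).

```python
def deploy_with_rollback(new_version, current_version, deploy, health_check):
    """Deploy new_version and fall back to current_version if it is unhealthy.

    `deploy(version)` and `health_check()` are assumed hooks supplied by
    the caller. Returns the version that is left running.
    """
    deploy(new_version)
    if health_check():
        return new_version
    # The new release failed its check: put the known-good version back.
    deploy(current_version)
    return current_version
```

The return value tells the pipeline which version ended up live, so a failed release can be flagged and turned into a regression test instead of a phone call.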
You can build all of what I mentioned with OSS like:
- Ansible (deployment)
- Jenkins (ci)
- ELK stack (metrics / logging)
- Zabbix (system metrics)
This system has been serving us, on premises, without much maintenance.
[+] [-] thisisit|8 years ago|reply
So you are into devops but doing more ops than dev? This doesn't sound like a problem until your team's agenda and objective is to deliver more ops work.
[+] [-] bradhe|8 years ago|reply
[+] [-] akulbe|8 years ago|reply