I'm reminded of a time that an intern took down us-east-1 on AWS by modifying a configuration file they shouldn't have had access to. Amazon (somehow) did the correct thing and didn't fire them -- instead, they used the experience to fix the security hole. It was a file they shouldn't have had access to in the first place.
If the intern "had no experience with the AI lab", is it the right thing to do to fire them, instead of admitting that there is a security/access fault internally? Can other employees (intentionally or unintentionally) cause that same amount of "damage"?
grogenaut|1 year ago
Twirrim|1 year ago
It didn't override safeguards, but they sure wanted you to think that something unusual was done as part of the incident. What they executed was a standard operational command. The problem was, the components that that command interacted with had been creaking at the edges for years by that point. It was literally a case of "when", and not "if". All that happened was the command tipped it over the edge in combination with everything else happening as part of normal operational state.
Engineering leadership had repeatedly raised the risk with people further up the chain, and no one was willing to put headcount toward actually mitigating the problem. If blame was to be applied anywhere, it wasn't on the engineer following the run book that gave them a standard operational command to execute with standard values. They did exactly what they were supposed to.
Some credit where it's due: my understanding, from folks I knew still in that space, is that S3 leadership started turning things around after that incident and started taking these risks and operational state seriously.
tgavVs|1 year ago
Interesting that my experience has been the exact opposite.
Whenever I’ve participated in COE discussions (incident analysis), questions have been focused on highlighting who made the mistake or who didn’t take the right precautions.
notyourwork|1 year ago
evanextreme|1 year ago
DrillShopper|1 year ago
bawolff|1 year ago
You're not firing the person because they broke stuff, you are firing them because they tried to break stuff. If the attempt was a failure and caused no harm, you would still fire them. It's not about the damage they caused; it's that they wanted to cause damage.
xnavra50|1 year ago
[deleted]
ozim|1 year ago
I do not see any mention of other legal action, and the article is shallow.
It might’ve been that someone in the command chain called it “malicious” to cover up his own mistakes. I think that is the parent poster's point in writing out the Amazon story.
donavanm|1 year ago
I've never heard of an individual being terminated or meaningfully punished for making an earnest mistake, regardless of impact. I do know of people who were rapid term’d for malicious, or similar, actions like sharing internal information or (attempting to) subvert security controls.
On the whole I did see Amazon “do the right thing” around improving process and tools; people are a fallible _part_ of a system, accountability requires authority, incremental improvements today over a hypothetical tomorrow.
zmgsabst|1 year ago
And that wasn’t even a mistake the SDEs made — they were punished for the economists being reckless and subsequently bullied out of the company, despite the SDEs trying to raise the alarm the whole time.
godelski|1 year ago
But my understanding of this case is that the actions do not look like simple, easy-to-make mistakes. As I understand it, the claim was that the intern was modifying the weights of checkpoints for other people's training results in an effort to make their own work look better. Mucking about in a checkpoint is not a very common thing to do, so it should make someone suspicious in the first place. On top of this, it appears he was exploiting weaknesses and injecting code to mess with people's optimizers, and doing things that have no reasonable explanation.
So as far as I can tell, not only was he touching files he shouldn't have been touching (and yes, shouldn't have had access to), he was taking steps to bypass the blocks that were in place and was messing with them in ways that are very difficult to explain away with "I thought this might be a good idea." (Things that explicitly look like a bad idea.) If that is in fact what happened, I think it is not a reach to claim intentional sabotage. Because if it wasn't, then the actions represent such a level of incompetence that he is a huge liability to anyone within reach.
[0] https://www.cia.gov/static/5c875f3ec660e092cf893f60b4a288df/...
kleton|1 year ago
dudus|1 year ago
Did the employee have the intent to cause damage? If so just fire him/her.
danpalmer|1 year ago
EE84M3i|1 year ago
rafram|1 year ago
unknown|1 year ago
[deleted]
Aurornis|1 year ago
This wasn’t an accident, though. The intern had malicious intent and was intentionally trying to undermine other people’s work.
This isn’t a case where blameless post-mortems apply. When someone is deliberately sabotaging other people’s work, they must be evicted from the company.
raihansaputra|1 year ago
noobermin|1 year ago
throw3828455|1 year ago
I wonder if we applied this culture talk to Western companies how funny it would sound.
The reason Facebook is firing so many people is because individualism "is far more important for them than 'teaching lessons' to anyone, particularly employees who are probably considered expendable."
unknown|1 year ago
[deleted]