ksmith14 | 7 years ago

The Google SREs mention this in their book: the Chubby lock service had such high uptime that people started neglecting to make their own services resilient to Chubby failures: https://landing.google.com/sre/book/chapters/service-level-o...

robax|7 years ago

+1 for this book. As a junior DevOps engineer, I've found it super helpful.

philsnow|7 years ago

The book is structured so that it's pretty easy to jump around and pick which parts you want to read or skip, so reading it isn't a very large commitment.

AdamM12|7 years ago

Mine just came in the mail today. Pretty stoked.

mav3rick|7 years ago

Still, that's bad design on the clients' part. E.g., just because malloc "never" fails doesn't mean it can't fail :) so you'd better check for the error.
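To make the malloc point concrete, here is a minimal sketch of checking the return value instead of assuming success. The helper name dup_string is hypothetical and only for illustration; the caller is expected to handle a NULL result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from any library): copy a string onto the heap.
   Returns NULL if malloc fails, so the caller must check the result. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;                /* propagate the failure, don't dereference */
    }
    memcpy(copy, src, len);
    return copy;
}
```

A caller would then write something like `char *s = dup_string(name); if (s == NULL) { /* degrade or report */ }`, which is the same habit as handling a failure of a dependency that "never" goes down.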

Filligree|7 years ago

Doesn't matter. Engineering around human failure is part of the profession.

smcameron|7 years ago

Failure of malloc() might be a bad example to pick, because on Linux most distros overcommit by default, so malloc generally won't fail. Instead, malloc succeeds in allocating the address space just fine, but the RAM only gets allocated on first use. So even though malloc gave you a supposedly valid pointer rather than NULL, actually using that pointer can still crash your program (typically by getting it killed by the OOM killer) if the memory isn't really there.
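If you want to see which behavior a given Linux box is configured for, a rough sketch is to read /proc/sys/vm/overcommit_memory. The function names below are made up for illustration, and the code assumes the usual kernel meanings of the values 0, 1 and 2; on a non-Linux system the file won't exist and the reader returns -1.

```c
#include <stdio.h>

/* Hypothetical helper: returns the value in /proc/sys/vm/overcommit_memory
   (0, 1 or 2 on Linux), or -1 if the file cannot be read or parsed. */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;
    if (f == NULL) {
        return -1;
    }
    if (fscanf(f, "%d", &mode) != 1) {
        mode = -1;
    }
    fclose(f);
    return mode;
}

/* Hypothetical helper: human-readable summary of the mode value. */
const char *describe_overcommit_mode(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit: most allocations succeed";
    case 1:  return "always overcommit: malloc rarely returns NULL";
    case 2:  return "strict accounting: malloc can return NULL";
    default: return "unknown or not available";
    }
}
```

Either way, a NULL check only covers the cases where the kernel refuses the allocation up front; it does not protect against being killed later when the pages are first touched.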