top | item 39804552

throw5323446 | 1 year ago

> Instead of messing around trying to repair it, simply kill the machine, or take it out of the pool. Get a new one.

"4:10pm the new machine still has the same performance issue"

FridgeSeal|1 year ago

Sure, but more often than not, esp. in cloud scenarios, you just get a machine that is having a bad day, and it’s quicker to eject it, let the rest of the infra pick up the slack, and debug from there. Additionally, if you’ve axed a machine and got the same issue, you know it’s not a machine issue, so go look at your networking layer or whatever configs you’re using to boot your machines from…

tjoff|1 year ago

> esp in cloud scenarios

... so the nice thing about the cloud is that you can work around cloud-specific issues?

jandrese|1 year ago

4:20pm Turns out it was DNS

Propelloni|1 year ago

That made me laugh. Thank you. Of course, it is not DNS. DNS has become the new cabling. DNS is not especially complicated, but neither is cabling. Yet during the dot-com era and the years after, cabling caused so many of the problems that we got used to checking the cabling first. But it only took a few more years to realize that it is not always cabling; actually, failures are normally distributed.

Is it wrong to check DNS first? No, but please realize that DNS misconfiguration is not more common than other SNAFUs.

SerCe|1 year ago

That's actually amazing, a reproducible problem is a 90% solved problem!