top | item 35071835

(no title)

shock-value | 3 years ago

This is convenient behavior up until you actually have an incident that coincides with theirs, in which case it becomes catastrophic because you had no idea that outside vigilance was required on account of their ingestion downtime. Not sure why you would laud this. Is it possible to opt out?

palijer|3 years ago

In your scenario you would have no logs, etc., until the Datadog (DD) incident was resolved.

Opting out would just mean all your missing data alerts fire every time Datadog has an incident and you would then check, see that everything is missing, and then identify the cause as the Datadog incident.

It's much better to have them handle it and auto-mute the impacted monitors than to communicate to my customers every time about false alerts saying all our services are down.

shock-value|3 years ago

> Opting out would just mean all your missing data alerts fire every time Datadog has an incident and you would then check, see that everything is missing, and then identify the cause as the Datadog incident.

You are missing the last step, which is that, knowing alerts are down, you can actively monitor using other tools/reporting for the duration of their incident.

And why would you have no logs? Even assuming you ingest logs through Datadog (they monitor much more than just logs, and not everyone uses all facets of their offering), you would presumably have some way to access them more directly (even tailing output directly if necessary).

And lastly, why would you communicate to your customers without any idea of the scope or cause of the issue? It would likely be clear very quickly that Datadog was having issues when you see that all your metrics are suddenly discontinued without other ill effect.

mmelgard|3 years ago

IIRC you can also just set up a monitor to alert if there is no data on a given metric.
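
For illustration only, the idea behind a "no data" check can be sketched in a few lines of Python, independent of any vendor. This is not Datadog's API: the function name, the metric names, and the timestamps below are all hypothetical, and it assumes you can get the time of the newest datapoint for each metric from somewhere.

```python
from datetime import datetime, timedelta, timezone


def stale_metrics(last_seen, now, max_age):
    """Return the sorted names of metrics whose newest datapoint is older than max_age."""
    return sorted(name for name, ts in last_seen.items() if now - ts > max_age)


# Hypothetical sample data: an arbitrary "current" time and the newest datapoint per metric.
now = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "api.requests": now - timedelta(minutes=1),
    "db.connections": now - timedelta(minutes=12),
}

# Flag any metric that has been silent for more than 10 minutes.
print(stale_metrics(last_seen, now, timedelta(minutes=10)))  # → ['db.connections']
```

If every metric comes back stale at once, that points to an ingestion problem rather than an outage of your own services, which is the distinction being argued about upthread.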