vinayan3 | 3 years ago

Thanks for responding and providing details.

One follow-up: there are instances where Datadog reports an outage but Metrist says it's green.

Is that because the functional tests are still working but some other part of Datadog was reported as down?

lngarner|3 years ago

In most cases, a vendor like Datadog will keep reporting its service as down even after it's pretty much up and running again, just to make sure it doesn't announce a recovery too soon. Our tests can see that the service is working before the vendor is ready to say so. What a vendor reports generally isn't a real-time reflection of what's happening in its software.

Updating the status page is like issuing a press release about someone important recovering from an illness, and we're like the medical equipment that monitors that person's health. The press office needs time to craft a message once it knows the person is healthy, and it waits a while before reporting so it doesn't announce a recovery right before a relapse. Medical equipment just measures health, so it can show the recovery well before the press release does.

In other cases, the difference comes down to coverage. Metrist mostly monitors essential functions right now, and in the demo we monitor them from our point of view. A minor part that we don't monitor could be down while the major parts we do monitor are up, so a status page may report a certain part of the service as down that we simply don't monitor.

Further, users experience outages differently, and the demo reflects our own experience with the software, so other users could be experiencing an outage while we aren't. That's why it's important for Metrist users to set up personalized monitoring, so they know exactly how an outage is affecting them.
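
To make those cases concrete, here is a rough Python sketch of how a status page and a functional check can disagree. It is only an illustration, not how Metrist works internally: `Observation`, `explain`, the component names, and the returned labels are all hypothetical. The "failing for us" branch is an extra assumption that mirrors the point about users experiencing outages differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    component: str                # hypothetical component name
    vendor_reports_down: bool     # what the vendor's status page currently says
    check_passed: Optional[bool]  # our functional check; None means not monitored


def explain(obs: Observation) -> str:
    """Return a likely reason for what we see on one component."""
    if obs.check_passed is None:
        # We run no check here, so the status page is the only signal.
        return "not monitored"
    if obs.vendor_reports_down and obs.check_passed:
        # Vendor is still cautious, or the outage does not reach our vantage point.
        return "status page lagging, or outage not visible from our vantage point"
    if not obs.vendor_reports_down and not obs.check_passed:
        # Assumed reverse case: we see a failure the vendor has not posted.
        return "failing for us, not reported by vendor"
    return "agree"


if __name__ == "__main__":
    for obs in (
        Observation("api", vendor_reports_down=True, check_passed=True),
        Observation("minor-feature", vendor_reports_down=True, check_passed=None),
    ):
        print(obs.component, "->", explain(obs))
```

The second observation is the coverage case: the status page shows a part as down, but nothing on our side checks it, so our view stays green.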