
Show HN: Errorship, use datadog as an error tracker

47 points| RabbitmqGuy | 6 years ago |errorship.com | reply

30 comments

[+] kcmastrpc|6 years ago|reply
Why not use distributed tracing? You get the context of the entire request in addition to the exception.
[+] inetknght|6 years ago|reply
Do you have examples of how that might work with other languages or tools?
[+] MastrChefRocks|6 years ago|reply
How does this compare to Datadog's APM[0]? APM looks like it has far more features and is already integrated with Datadog.

[0] https://www.datadoghq.com/apm/

[+] sa46|6 years ago|reply
This looks like an admittedly clever hack to avoid the fantastically expensive APM and still get some useful exception logging. I really wanted to use Datadog’s APM but $31 per host was too much. The downsides of logging to the event stream, as Errorship does, are that you can’t associate exceptions with requests and that aggregation is limited to tags. I think event tags have a low-ish cardinality limit similar to metrics.
[+] karmakaze|6 years ago|reply
For Java I'd written a Logback exception sender[0] that creates metrics tagged by level, class, exception class, and cause classes. I'll say that it is a quick and convenient way to see changes in system error characteristics. I had used a similar setup for Graphite, but it didn't work as well without tags.

Since DataDog charges by number of metrics and each tag combination counts as one, actual exception messages should only be included in tags if they don't have instance-varying text.

[0] https://github.com/karmakaze/logback-metrics-datadog
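
The tagging advice above can be sketched in Python. This is only an illustration of the idea, not code from logback-metrics-datadog or errorship: the function name `exception_tags` and the tag keys are made up for the example. It tags by level, exception class, and cause classes, and deliberately leaves the exception message out because messages often contain instance-varying text.

```python
def exception_tags(exc, level="error"):
    """Build low-cardinality "key:value" tags from an exception.

    Hypothetical helper: only class names go into tags, never the
    message, since messages often embed IDs, paths, or timestamps.
    """
    tags = ["level:" + level, "exception_class:" + type(exc).__name__]

    # Walk the explicit cause chain (raise ... from ...), guarding
    # against cycles by remembering which exceptions were visited.
    seen = {id(exc)}
    cause = exc.__cause__
    while cause is not None and id(cause) not in seen:
        seen.add(id(cause))
        tags.append("cause_class:" + type(cause).__name__)
        cause = cause.__cause__
    return tags
```

With this shape, two failures that differ only in a user ID inside the message map to the same tag combination, so they count as one metric rather than two.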

[+] RabbitmqGuy|6 years ago|reply
Hey HN!

errorship is a Python library that sends exceptions/errors generated by your application to your datadog account.

Are you tired of looking at metrics in datadog and then switching over to another website to track your application's exceptions? errorship exists to solve that context-switching problem, among others. It's a bit like sentry, bugsnag, rollbar, etc., except implemented in your own datadog account.

I'm happy to receive any feedback or just chat about it.

[+] jere|6 years ago|reply
Incredibly dumb question, but doesn't datadog monitor errors out of the box?
[+] sputnikus|6 years ago|reply
IMO only if you are using the Logs functionality. You can log exceptions/errors, and it should be pretty easy to set up monitoring around these events.
[+] sa46|6 years ago|reply
I’m going to make the classic engineer mistake and assume that it’s almost as easy to implement this myself. My understanding is that errorship does something like:

- register a global exception handler, probably tweaked to hook into framework-specific exception mechanisms.

- make the exceptions “pretty”

- send the exceptions to the datadog event log.

The benefits of rolling my own are that I avoid a soft dependency on errorship and that I can tweak exception aggregation and reporting. Is the primary defense that errorship only costs $10 per month or is there additional complexity I’ve missed?
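
The three steps listed above could be sketched roughly as follows. This is a guess at the general shape, not errorship's actual implementation. `install_exception_reporter` and the `send_event` callable are hypothetical names; `send_event` stands in for whatever posts to the Datadog event stream.

```python
import sys
import traceback


def install_exception_reporter(send_event, tags=None):
    """Install a sys.excepthook that reports uncaught exceptions.

    send_event is a hypothetical callable taking (title, text, tags).
    """
    previous_hook = sys.excepthook
    base_tags = list(tags or [])

    def hook(exc_type, exc_value, exc_tb):
        # Step 2: make the exception readable (title plus full traceback).
        title = "{}: {}".format(exc_type.__name__, exc_value)
        text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
        try:
            # Step 3: hand the formatted exception to the sender.
            send_event(title, text, base_tags + ["exception_class:" + exc_type.__name__])
        except Exception:
            # Reporting must never mask the original error.
            pass
        # Keep the default behaviour (printing the traceback).
        previous_hook(exc_type, exc_value, exc_tb)

    # Step 1: register the global handler.
    sys.excepthook = hook
    return hook
```

In practice `send_event` might wrap something like `api.Event.create` from the official `datadog` Python package, and web frameworks usually need their own hooks because they catch exceptions before `sys.excepthook` ever sees them.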

[+] oefrha|6 years ago|reply
Can you resolve exceptions like in Sentry? Having to look at the list of all exceptions and figure out which ones are of interest wouldn’t be ideal. (In the demo I did notice the level/priorities filters, but nothing resembling an unresolved filter.)

Also, nitpick: it’s 2020, maybe update the front page screenshot to Python 3? My immediate reaction seeing that py27 screenshot: is this even maintained?

[+] RabbitmqGuy|6 years ago|reply
> Having to look at the list of all exceptions and figure out which ones are of interest wouldn’t be ideal.

You can filter for exceptions by tags. You can also use full text search to filter for exceptions, i.e. all the functionality provided by the datadog event stream[1] is available for your errors.

> Also, nitpick: it’s 2020, maybe update the front page screenshot to Python 3: is this even maintained?

errorship is compatible with both python2 and python3. Yes, it is maintained. We have a test suite that is run in CI under both python2 and python3. Some of our customers during early trials still had some python2 applications that they had yet to port over.

1. https://docs.datadoghq.com/events/#event-stream

[+] reinkaos|6 years ago|reply
Doesn't sending the exception as tags increase metrics cardinality, slowing down other queries?
[+] ohnoesjmr|6 years ago|reply
Benefits over sentry?
[+] RabbitmqGuy|6 years ago|reply
The main benefit is that you do not have to context switch from datadog metrics/logs to go look at your exceptions in sentry.

With errorship, all these are made available in one place; in your datadog account.

You also do not need to maintain two services. If you are already using datadog (maybe their APM and their logging and metrics service) then you might as well use them for error tracking instead of maintaining an additional account with sentry.

However, you do not have to give up sentry to use errorship. errorship will work just fine if you choose to continue with sentry.

[+] AtroxDev|6 years ago|reply
The library calls back home to validate the key (URL: https://errorship.com/api/?errorshipLicensekey={errorship_li...). If it fails it raises an Exception. I don't think that this is the correct way to do this.

If your server is down, my application would crash too. Just cut the license validation out of the library. If I wanted to use the library without a license I could do so anyway.

edit: as noted by the author below, this is not the case :). If the server is not available, it won't raise an exception. I did miss that part somehow.

[+] citruspi|6 years ago|reply
> If it fails it raises an Exception... If your server is down, my application would crash too

So first off, let me just say that I completely agree. If this were the case, that'd be fucking atrocious and would definitely be a blocker for using it.

But I'm curious how you came to that conclusion.

It took me < 2 minutes of looking at the source code[0] to determine that your claim was incorrect. Not only does it appear to gracefully handle the server being unavailable, the developer literally commented that code explaining that they wanted to ensure users could continue uninterrupted if the errorship server is unavailable.

> We give people the benefit of doubt.

> We only consider people to be not authorized if the backend comes back with an authoritative answer to that effect.

> Else, any errors or any other outcome; we assume authorization is there and also assume they belong to the highest pricing plan: Enterprise

> # failure of errorship should not lead to people been unable to ship exceptions

And it took even less time than that to run a new Python Docker container, install the library, run the sample code, and validate my assumptions[1] (the first attempt fails because the key is invalid, I disabled Internet access for the second attempt and it succeeded).

So I'm legitimately curious - did I miss something? Is there another failure case I didn't catch or test for? Or did you just make an assumption and not bother to verify it? And if it's the latter, why? What was the point? Like, to be frank, if this was a news piece I could understand the (possibly inaccurate) commentary. But why take the time and energy to write your comment and tear down someone's personal project with seemingly inaccurate claims?

(To be clear, no affiliation with errorship, I'm not even a DataDog user. Just a random dev browsing HN).

[0] https://gitlab.com/errorship/errorship_python/-/blob/master/...

[1] https://gist.github.com/citruspi/16d359ac2dafef6fc876e2dd101...

[+] RabbitmqGuy|6 years ago|reply
Hi.

> If your server is down, my application would crash too.

The errorship library is written in such a way that it fails open. If our servers are down (or there is any other failure), it does not affect your application and your application continues to work normally.
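
As a minimal sketch of what "fails open" means here: the check below rejects only when the backend gives an authoritative "no" and treats every other outcome as authorized. It is an illustration under assumptions, not the library's code; `is_authorized` and the `check_license` callable are hypothetical names.

```python
def is_authorized(check_license):
    """Fail-open license check.

    check_license is a hypothetical callable that contacts the backend
    and returns True or False, or raises if the backend is unreachable.
    """
    try:
        answer = check_license()
    except Exception:
        # Network errors, timeouts, bad responses: assume authorized.
        return True
    # Only an explicit False counts as an authoritative rejection.
    return answer is not False
```

The trade-off is that an outage of the licensing backend can never stop exceptions from being shipped, at the cost of letting unlicensed use through while the backend is down.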