In the past few years, there has been a surge in the number of log management solutions: Loggly, LogDNA, Scalyr, Sumo Logic. Which one do you or your company use?
Papertrail. Super friendly and insightful support.
So let me elaborate. Mostly what you get from support is "We are fixing the problem", but in our case they were specific: "We have problems with the Heroku logspout connection; 'heroku logs' should still work." Another time we went a bit over our limit, so they upped our plan for free for a short period so we could figure out what the problem was. Alerts are also what we use the most (no limits, no delays), which I cannot say for the other providers.
Having tried both self-hosted and cloud logging solutions, Papertrail is one of the services we subscribe to for which I feel a pang of gratitude when the monthly billing notification arrives.
Some of the things I liked about Papertrail:
- Super easy to set up (and to automate). Just a dozen lines in rsyslog.conf
- Also ships with a tiny client written in Go if you want to tail specific files
- Sensible pricing. Charges per GB, which, for us at least, correlated with how much business we were doing. Now we'd gladly pay more for the same service
- Great dashboard/UI. Having started with Loggly, we were used to a slow, unresponsive, unintuitive dashboard, which only grew in complexity as Loggly grew. Papertrail, by contrast, is fast, simple, and makes sense (at least to us). It's quite surprising how simple it is while still performing its job very well. Although we don't use the live log tailing much, the log grouping and notification system is very intuitive and easy to set up.
- Easy S3 archival. Took about 10 minutes to set up
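For reference, the rsyslog side of that setup really is tiny; a minimal sketch, where the host and port are placeholders for your account's endpoint:

```
# /etc/rsyslog.conf - forward all syslog messages to a hosted endpoint.
# logsN.papertrailapp.com:12345 is a placeholder; use your own host/port.
*.*    @logsN.papertrailapp.com:12345
```

A single `@` forwards over UDP; `@@` would use TCP instead.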
Using Papertrail as well, but I really miss my terminal and awk, grep, and less here. Or even regex search. I know I can download the archives, and I do, but that puts all the output in one file. Saving different topics to different files just makes sense, IMHO...
Still getting used to the ways of the cloud, I suppose...
The biggest downside to Papertrail is the price. You can log the important stuff for a reasonable price, but if you want to debug-log everything from a fairly big site, you are looking at a Papertrail bill 10x your hosting bill.
It's not a bad idea to send info and above to Papertrail, and ship debug logs directly off to S3 or similar, so you can dig in when something weird happens.
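One way to sketch that split, using Python's stdlib logging (the function name and file path are illustrative, and the syslog endpoint is a placeholder for your provider's host/port):

```python
import logging
import logging.handlers

def build_logger(name="app"):
    """INFO+ goes to a hosted syslog drain; full DEBUG output goes to a
    local file that a separate job can archive to S3."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)

    # INFO and above to the hosted service (replace localhost:514 with
    # your provider's syslog endpoint).
    syslog = logging.handlers.SysLogHandler(address=("localhost", 514))
    syslog.setLevel(logging.INFO)
    logger.addHandler(syslog)

    # Everything, including DEBUG, to a local file for cheap archival.
    archive = logging.FileHandler("app-debug.log", delay=True)
    archive.setLevel(logging.DEBUG)
    logger.addHandler(archive)
    return logger
```

Each handler filters by its own level, so one logger call fans out to the right destinations.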
We built our own. All events are published to Tank (https://github.com/phaistos-networks/TANK), and we have a bunch of consumers that consume from various Tank topics. They process the data and either publish to other Tank topics (to be consumed by other services) or update state on various services.
- For data exploration, we use memSQL. We keep the last day's worth of data there (we DELETE rows to keep the memory footprint down), and because most of the time it's about understanding something that happened recently, that's almost always sufficient. Each row contains the event's representation as JSON, plus a few more columns for faster lookups. memSQL's JSON support is great (we used MySQL for that, but it was too slow), so we can take advantage of joins, aggregations, windowing, etc.
- For data visualisation, we use ELK (but it’s pretty slow), a tool our ops folks built (“otinanai”: https://github.com/phaistos-networks/otinanai) and we have a few smaller systems that generate graphs and reports.
- For alerts and tickets, our ops folks built another tool that monitors all those events, filters them, and executes domain-specific logic that deals with outliers, notification routing, and more.
This solves most of our needs, but we plan to improve the setup further by monitoring even more resources and introducing more tools (Tank consumers) to get more out of our data.
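The exploration table described above might look something like this (a sketch; the column names are made up, and JSON_EXTRACT_STRING is memSQL's JSON accessor):

```sql
-- One row per event: raw JSON plus a few promoted columns for fast lookup.
CREATE TABLE events (
    created_at DATETIME NOT NULL,
    event_type VARCHAR(64) NOT NULL,   -- promoted for indexed filtering
    doc JSON NOT NULL,                 -- full event payload
    KEY (event_type, created_at)
);

-- Example exploration query: recent events per type and user,
-- digging into the JSON payload.
SELECT event_type,
       JSON_EXTRACT_STRING(doc, 'user_id') AS user_id,
       COUNT(*) AS n
FROM events
WHERE created_at > NOW() - INTERVAL 1 DAY
GROUP BY 1, 2
ORDER BY n DESC;
```

Promoting a couple of hot fields into real columns keeps lookups fast while the JSON column preserves the full event.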
Great tools, congrats. The name 'otinanai', though, is rather dodgy (for those who know what it means), although I can see it stems from [...] designed to graph anything.
We're actually building (and using) a log alternative called OverOps (https://www.overops.com): a native JVM agent that adds links to each log warning/error/exception, leading to the actual variable state and code that caused them, across the entire call stack. Disclaimer: I work there; happy to answer any questions.
I'm really excited about Takipi (I guess OverOps now), but the per-JVM pricing kills it for us, especially as we look at moving to microservices. Any plans for alternative pricing, such as per-GB or per-exception?
Which is a PITA for multi-line grep, variable field-based searching, and multi-system cross-comparison... but sure, if it works for you, then great.
The one I really wanted to use/like was http://scalyr.com. However even after their redesign, I still can't use their query language. With LogEntries, it's pretty natural.
I was used to Google Cloud logs (they come for free with App Engine). Now I'm working with an AWS-based system with an ELK stack. Its UI is horrible, finding the right log entries is hell, and it often breaks so somebody has to fix it. I hope we can move to a cloud log provider soon.
Logmatic.io was not mentioned, but we are better known in Europe so far. Disclaimer: I work there.
We invest a lot in analytics, parsing/enrichment, and a fast, friendly UX. We try to be a premium solution at the same reasonable price as others, and our users tend to say great things about us (e.g. http://www.capterra.com/log-management-software/). Happy to answer if you have any questions. :)
We are using Logmatic.io in my team (we switched from Logentries). Our stack is based on Mesos/Docker with plenty of microservices. Sending logs and building analytics are very easy, and the clickable dashboards are just amazing.
Logentries. Not sure if I'd say I'm satisfied, but I haven't found anything better.
Pros:
* Decent Java logging integration (some services treat things line-by-line, this is a deal breaker for things like multi-line Java exceptions)
* Reasonably priced
* Alerts are kinda nice
* Hosted
Cons:
* Sometimes the UI maxes out my Chrome CPU
* Live mode is not stable at all
* The UI is clunky, to say the least. It's not always clear what the context of a search is, and the autocomplete is obnoxious. I heard they have a new UI coming out sometime; who knows when.
LogDNA: powerful, easy to get started and still improving. Using in parallel with Papertrail and instead of Logentries (which we had horrific problems with earlier in the year).
We had an awful time with Logentries: "live" mode never working, a bizarre search facility, overcharging, terrible UX. We've been with LogDNA for about two months and are quite happy with it.
We use the E and K but not the L: instead we wrote our own daemon that rsyslog sends log lines to, and it bulk-inserts them into ES. We use Kibana and Grafana for visualization. We index approximately 20k log lines per second (at peak) without breaking a sweat (whereas Logstash would choke fairly often). A little over half a billion log lines a day, retained for a week, costs us around $800/mo on GCE (for storage and compute).
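A sketch of the bulk-insert side of such a daemon (the index name and field layout are made up; this just builds the NDJSON body that Elasticsearch's _bulk endpoint expects):

```python
import json

def bulk_body(index, lines):
    """Build an Elasticsearch _bulk request body (NDJSON) from raw log lines.

    Each document is preceded by an action line, and the whole body must
    end with a trailing newline, per the _bulk API.
    """
    parts = []
    for line in lines:
        parts.append(json.dumps({"index": {"_index": index}}))
        parts.append(json.dumps({"message": line.rstrip("\n")}))
    return "\n".join(parts) + "\n"

# The daemon would POST this to the cluster's /_bulk endpoint with
# Content-Type: application/x-ndjson, batching a few thousand lines at a time.
```

Batching like this, rather than indexing line by line, is what lets a small daemon keep up with tens of thousands of lines per second.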
Whatever solution you use to store your logs, I would suggest generating them as events. This will help you reconcile two important aspects that have been separated for too long, for no real reason: logging and analytics. It may require a little more effort, but I believe it's worth it.
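Concretely, that can mean emitting every log line as a self-describing event that both a log viewer and an analytics pipeline can consume; a minimal sketch (the field names are just a suggestion):

```python
import json
import time

def emit_event(event_type, **fields):
    """Serialize a log record as a structured event: the same payload can
    be tailed by humans, indexed by a log store, and aggregated for
    analytics."""
    event = {
        "ts": time.time(),   # when it happened
        "type": event_type,  # a fixed, enumerable name, e.g. "order_placed"
        **fields,            # free-form event properties
    }
    print(json.dumps(event, sort_keys=True))  # stdout; a shipper forwards it
    return event
```

Because the type names are fixed and the payload is structured, the same stream works for debugging ("show me order_placed for user X") and for counting and charting.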
Interesting! This is something I have often thought about. My own experience of log aggregation is limited to the ELK stack and Loggly (for a brief time), where the setup worked fine but the workflow didn't. We just stopped browsing logs after a while. Although a giant centralised system for logs sounds incredibly convenient, making sense of them becomes a huge problem, and then it's just easier to ignore the log system.
I am sure the solutions discussed here have features to overcome this (filters/alerts), but IMHO we'd be better off collecting fewer things: limited app events that have fixed formatting and are easier to make use of in debugging and monitoring.
Previously we used Logentries and Papertrail, but they became expensive as our log volumes grew, and flexibility was missing.
Now we use self-hosted ELK (Elasticsearch, Logstash, and Kibana), and I'm not itching to go back to any of the hosted services. It's not as good as something like Papertrail for tailing log streams live (although that isn't very useful at larger scale), and the Kibana UI does take a bit of getting used to.
We use https://github.com/gliderlabs/logspout to forward all our Docker logs to Papertrail... it's like watching your Node.js services running in your terminal. Seamless experience.
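Following logspout's README, the forwarding boils down to running one container per host (the endpoint below is a placeholder for your Papertrail host and port):

```
docker run --name=logspout \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tls://logsN.papertrailapp.com:12345
```

It reads every container's stdout/stderr from the Docker socket and ships it to the syslog URL you pass it.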
I used Sumo Logic at one point.
I was able to query months of data; it was very powerful.
You can achieve additional optimization by processing logs as they are ingested. A very well-thought-out product.
I have also used ELK, but I have to say that Sumo Logic felt like a superior product by far.
Good work Papertrail, if you are reading this.
Thank you, Papertrail.
Using Graylog at my current job and it's working well so far.
Sumo Logic, Graylog, Loggly, PaperTrail, Logentries, Stackify: http://blog.takipi.com/how-to-choose-the-right-log-managemen...
ELK vs Splunk: http://blog.takipi.com/splunk-vs-elk-the-log-management-tool...
Hosted ELK tools: http://blog.takipi.com/hosted-elasticsearch-the-future-of-yo...
Our analysis frontend is plain old SSH, bash, grep and less.
I've expanded on this idea of logging as events here [1]
[1] - https://github.com/acionescu/event-bus#why
https://getseq.net