wsargent's comments

wsargent | 3 years ago | on: Discussion: structured, leveled logging

> I'm surprised this is up for debate.

I looked into logging in protobuf when I was seeing if there was a better binary encoding for ring-buffer logging, along the same lines as nanolog:

https://tersesystems.com/blog/2020/11/26/queryable-logging-w...

What I found was that it's typically not the binary encoding vs string encoding that makes a difference. The biggest factors are "is there a predefined schema", "is there a precompiler that will generate code for this schema", and "what is the complexity of the output format". With that in mind, if you are dealing with chaotic semi-structured data, JSON is pretty good, and actually faster than some binary encodings:

https://github.com/eishay/jvm-serializers/wiki/Newer-Results...

wsargent | 3 years ago | on: Discussion: structured, leveled logging

I see it the other way around: tracing is essentially logging with hierarchy. If you can keep the context of a "parent" span around, then you can log all of the entries out and build up the entire trace from spans, although you can get pretty confused if you don't write them out in the correct order.

However, if you don't have context, then the log entry is the atomic unit -- metrics are log entries with numbers that can be aggregated, spans are log entries with a trace id and a parent span, and "events" in Honeycomb terminology are logs with attributes covering an entire business operation / HTTP request.
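That idea can be sketched in a few lines. This is a hypothetical illustration (the `LogEntry` class and its factory methods are made up for this comment, not from any real library): metrics, spans, and wide events are all just log entries with particular attributes attached.

```java
import java.util.Map;

// Hypothetical sketch: the log entry as the atomic unit of observability.
public final class LogEntry {
    public final String message;
    public final Map<String, Object> attributes;

    public LogEntry(String message, Map<String, Object> attributes) {
        this.message = message;
        this.attributes = attributes;
    }

    // A metric is a log entry with a number that can be aggregated.
    public static LogEntry metric(String name, double value) {
        return new LogEntry(name, Map.of("metric", name, "value", value));
    }

    // A span is a log entry with a trace id and a parent span id.
    public static LogEntry span(String name, String traceId, String parentSpanId) {
        return new LogEntry(name,
            Map.of("trace_id", traceId, "parent_span_id", parentSpanId));
    }

    // A Honeycomb-style "event" is a log entry whose attributes cover
    // an entire business operation or HTTP request.
    public static LogEntry event(String operation, Map<String, Object> attrs) {
        return new LogEntry(operation, attrs);
    }
}
```

Given this framing, a metrics pipeline is just an aggregation over entries with a `value` attribute, and a trace is just a join over entries sharing a `trace_id`.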

wsargent | 3 years ago | on: Discussion: structured, leveled logging

MDC/NDC only works reliably when you don't have asynchronous code (Akka/Vert.x/CompletionStage) running through your application. As soon as you start using multiple threads, it becomes significantly more complex to carry MDC context over from one thread to the next.
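To make the failure mode concrete: SLF4J's MDC is backed by a `ThreadLocal`, so a plain `ThreadLocal` stands in for it in this sketch (no logging framework required). The value set on the calling thread is invisible on the pool thread unless you capture and re-set it by hand:

```java
import java.util.concurrent.CompletableFuture;

// Sketch of why MDC breaks across threads, using a plain ThreadLocal
// as a stand-in for the ThreadLocal-backed MDC.
public final class MdcDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    public static String lostContext() throws Exception {
        CONTEXT.set("request-42");
        // supplyAsync runs on a pool thread, where the ThreadLocal is empty.
        return CompletableFuture
            .supplyAsync(() -> String.valueOf(CONTEXT.get()))
            .get();
    }

    public static String carriedContext() throws Exception {
        CONTEXT.set("request-42");
        // The usual workaround: capture on the calling thread, then
        // set and clear the value around the task on the worker thread.
        final String captured = CONTEXT.get();
        return CompletableFuture.supplyAsync(() -> {
            CONTEXT.set(captured);
            try {
                return CONTEXT.get();
            } finally {
                CONTEXT.remove();
            }
        }).get();
    }
}
```

The capture/restore dance has to wrap every task submission, which is exactly the bookkeeping that makes MDC painful once async code is involved.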

wsargent | 5 years ago | on: Queryable Logging with Blacklite

I've published a logging appender called Blacklite that writes logging events to SQLite databases. It has good throughput and low latency, and comes with archivers that can delete old entries, roll over databases, and compress log entries using dictionary compression, which looks for common elements across many small messages and extracts them into a shared dictionary.

So that's the sales pitch. Now let's do the fun question – how and why did it get here?

I started off this blog post by writing out the requirements for a forensic logger as if I had total knowledge of what the goal was. But that's not what happened. The real story is messier: I discovered the requirements piecemeal, with lots of backtracking over several months. I think it's more interesting and human to tell it that way.

wsargent | 5 years ago | on: Good Logging

> The second problem is to figure out if logging is affecting performance.

Spoiler -- it totally can. Kirk Pepperdine talks about logging as a memory allocation bottleneck [1]:

> So I gave my usual logging rant at a workshop that I gave. I was in the Netherlands about a year or so ago. And that night, they went and stripped all the logging out of their transactional monitoring framework that they’re using, which wasn’t getting them the performance they wanted, which is why I was there giving the workshop in the first place. And when they stripped out all their logging, the throughput jumped by a factor of four. And they’re going like, “Oh, so now our performance problem not only went away, we’ve been tuning this thing so long that actually, when they got rid of the real problem, it went much faster than they needed, like twice as fast as what they needed to go.” – The Trouble with Memory

Unfortunately it can be kind of hard to track the memory allocation rate over time, and it's typically not the sort of thing you're focused on. I put together an audio cue that plays the memory allocation rate as a sine wave [2], so I can hear when it's getting to be a problem.

[1] https://www.infoq.com/presentations/jvm-60-memory/

[2] https://tersesystems.com/blog/2020/07/19/listening-to-jvm-me...

wsargent | 5 years ago | on: Good Logging

Agree with everything you've said, but MDC does have one big drawback: it's only convenient when you're working within a single thread. If you're writing asynchronous code (Scala, Akka, Vert.x etc) then MDC doesn't help you very much. [1]

I generally find it easier to have an explicit "context" object and pass that around in an operation, and then log once that context has been built up. If you're lucky, the logging framework will do that for you.
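Here's a hedged sketch of what I mean, in Java (the `LogContext` class is invented for illustration; it isn't from any particular framework). Each `child` call copies the parent's fields into a new immutable context, so contexts can be passed freely across threads and logged once at the end:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "build up a context, log once" pattern.
public final class LogContext {
    private final Map<String, Object> fields;

    private LogContext(Map<String, Object> fields) {
        this.fields = fields;
    }

    public static LogContext root() {
        return new LogContext(new LinkedHashMap<>());
    }

    // Returns a new context with the extra field; the parent is untouched,
    // so there is no thread-local state to lose across async boundaries.
    public LogContext child(String key, Object value) {
        Map<String, Object> copy = new LinkedHashMap<>(fields);
        copy.put(key, value);
        return new LogContext(copy);
    }

    // Emit a single line once the context is fully built up.
    public String log(String message) {
        return message + " " + fields;
    }
}
```

Because the context is just an immutable value, it survives thread hops that would silently drop MDC state.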

Bunyan did this the Right Way from the get-go, using `log.child`. [2]

[1] https://tersesystems.github.io/blindsight/usage/context.html...

[2] https://github.com/trentm/node-bunyan#logchild
