top | item 46617745


binarylogic | 1 month ago

I spent a decade in observability. Built Vector, spent three years at Datadog. This is what I think is broken with observability and why.


otterley | 1 month ago

And how are you solving the problem? The article does not say.

> I'm answering the question your observability vendor won't

There was no question answered here at all. It's basically a teaser designed to attract attention and stir debate. Respectfully, it's marketing, not problem solving. At least, not yet.

binarylogic | 1 month ago

The question is answered in the post: ~40% on average, sometimes higher. That's a real number from real customer data.

But I'm an engineer at heart. I wanted this post to shed light on a real problem I've seen over a decade in this space, one that causes a lot of pain, rather than be a product walkthrough. But the solution is very much real. There's deep, hard engineering going on: building semantic understanding of telemetry, classifying waste into verifiable categories, and processing it at the edge. It's not simple, and I hope that comes through in the docs.

The docs get concrete if you want to dig in: https://docs.usetero.com/introduction/how-tero-works

yorwba | 1 month ago

I'm curious about the deep details, but the link 404s.

binarylogic | 1 month ago

My apologies, I fixed the link. So much for restructuring the docs the night before posting this.

You can read more here: https://docs.usetero.com/data-quality/overview

To loosely describe our approach: it's intentionally transparent. We start with obvious categories (health checks, debug logs, redundant attributes) that you can inspect and verify. No black box.

But underneath, Tero builds a semantic understanding of your data. Each category represents a progression in reasoning, from "this is obviously waste" to "this doesn't help anyone debug anything." You start simple, verify everything, and go deeper at your own pace.