dimitar | 1 year ago

It is as complicated as you want or need it to be. You can avoid any magic and stick to a subset that is easy to reason about and brings the most value in your context.

For our team, it is very simple:

* we use a library to send traces, and traces only[0]. They bring the most value for observing applications and can contain all the data the other signal types (logs and metrics) can contain. Basically hash-maps vs strings and floats.

* we use manual instrumentation as opposed to automatic - we are deliberate in what we observe and have a great understanding of what emits the spans. We have naming conventions that match our code organization.

* we use two different backends - an affordable 3rd party service and an all-in-one Jaeger install (just run 1 executable or docker container) that doesn't save the spans to disk, for local development. The second is mostly for the peace of mind of team members, so they know they are not going to flood the third party service.

[0] We have a previous setup to monitor infrastructure, and in our case we don't see a lot of value in ingesting all the infrastructure logs and metrics. I think it is early days for OTEL metrics and logs, but the vendors don't tell you this.
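
To make the "hash-maps vs strings and floats" point concrete, here is a rough sketch of what a manually instrumented span can carry. It uses only the Python standard library and is not the OpenTelemetry API; `span`, `FINISHED_SPANS`, `charge_card`, and the `billing.charge_card` name are all made up for illustration, and a real setup would export to Jaeger or a vendor instead of a list.

```python
import time
import uuid
from contextlib import contextmanager

# Hypothetical in-memory stand-in for a tracing backend.
FINISHED_SPANS = []


@contextmanager
def span(name, **attributes):
    """Record one manually instrumented unit of work as a plain dict."""
    record = {
        "span_id": uuid.uuid4().hex[:16],
        "name": name,                    # name follows the code organization
        "attributes": dict(attributes),  # arbitrary key/value pairs
        "status": "ok",
    }
    start = time.perf_counter()
    try:
        # Hand the attribute map to the caller so it can add data mid-flight.
        yield record["attributes"]
    except Exception as exc:
        record["status"] = "error"
        # The kind of text a log line would otherwise carry.
        record["attributes"]["error.message"] = str(exc)
        raise
    finally:
        # The kind of number a metric would otherwise carry.
        record["duration_ms"] = (time.perf_counter() - start) * 1000.0
        FINISHED_SPANS.append(record)


def charge_card(order_id, amount_cents):
    # Hypothetical business function, named "module.function" in the span.
    with span("billing.charge_card", order_id=order_id) as attrs:
        attrs["amount_cents"] = amount_cents
        attrs["retries"] = 0
        return True


charge_card("A-1001", 2599)
```

One record holds the timing a metric would report, the message a log line would report, and any structured context around them, which is the argument for sending traces only.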

madeofpalk|1 year ago

It's as complicated as you want, but it's not as easy as I want. The floor is pretty high.

I'm still looking for an endpoint just to send simple one-off metrics to from parts of the infrastructure that aren't scrapable.

hinkley|1 year ago

I did not find that manual instrumentation made things simpler. You’re trading a learning curve that now starts way before you can demonstrate results for a clearer understanding of the performance penalties of using this Rube Goldberg machine.

Otel may be okay for a green field project but turning this thing on in a production service that already had telemetry felt like replacing a tire on a moving vehicle.

dmoy|1 year ago

I've not used otel for anything not greenfield, but I just wanted to say

> felt like replacing a tire on a moving vehicle.

Some people do this as a joke / dare. I mean literally replacing a car tire on a moving vehicle.

You Saudi drift up onto one side, and have people climb out of the side in the air, and then swap the tire while the car is driving on two wheels.

It's pretty insane stuff: https://youtu.be/Str7m8xV7W8?si=KkjBh6OvFoD0HGoh

buzzdenver|1 year ago

Mind sharing what that affordable 3rd party service is?

mikestorrent|1 year ago

Very sane advice. Most folks will already have something for metrics and logs and unless there's ROI on changing it out, why bother?

Groxx|1 year ago

>You can avoid any magic and stick to a subset...

... if (and only if) all the libraries you use also stick to that subset, yea. That is overwhelmingly not true in my experience. And the article shows a nice concrete example of why.

For green-field projects which use nothing but otel and no non-otel frameworks, yea. I can believe it's nice. But I definitely do not live in that world yet.