A Practitioner's Guide to Wide Events

[+] thom|1 year ago|reply

I’m quite looking forward to a future where we’ve finally accepted that all this stuff is just part of the domain and shouldn’t be treated like an ugly stepchild, and we’ve merged OLTP and OLAP with great performance for both, and the wolf also shall dwell with the lamb, and we’ll all get lots of work done.

[+] coolguy4|1 year ago|reply

Wide events are good, but watch out they don't become "god events". The event that every service needs to ingest, and, therefore, if there's new data that a service needs then we just add it onto the god event, because, conveniently, it's already being ingested. Before too long, the query that generates the wide event is getting so complex it's setting the db on fire. Like anything, there are trade offs; practical limits to how wide an event should reasonably become.

[+] physicles|1 year ago|reply

Maybe I’m missing something, but this doesn’t seem like what the article is talking about at all. These events are just telemetry — they’re downstream from everything, and no service is ingesting them or relying on them for actual operational data.

[+] marmaduke|1 year ago|reply

i wonder if there are any semi automated approaches to finding outliers or “things worth investigating” in these traces, or is it just eyeballs all the way down?

[+] valyala|1 year ago|reply

This is possible by semi-automatic detection of anomalies over time for some preset of fields used for grouping the events (aka dimensions) and another preset of fields used in stats calculations (aka metrics). In general case this is hard to resolve taks, since it is impossible to check for anomalies across all the possible combinations of dimensions and metrics for wide events with hundreds of fields.

This is also complicated by the possibility to apply various filters for the events before and after ststs' calculations.

[+] arccy|1 year ago|reply

honeycomb "bubble up"

[+] tomjen3|1 year ago|reply

That seems a good usecase for AI: Its trivial to have it suggest some queries and test if they give interesting results.

[+] valyala|1 year ago|reply

Wide events is a great concept for observability space! This a superset of structured logs and traces. Wide events is basically structured logs, where every log entry contains hundreds of fields with various properties of the log entry. This allows slicing and dicing the collected events by arbitrary subsets of thier fields. This opens an infinite possibilities to obtain useful analytics from the collected events.

Wide events can be stored in traditional databases. But this approach has a few drawbacks:

- Every wide event can have different sets of fields. Such fields cannot be mapped to the classical relational table columns, since the full set of potential fields, which can be seen in wide events, isn't known beforehand.

- The number of fields in wide events is usually quite big - from tens to a few hundreds. If we are going to store them in a traditional relational table, this table will end up with hundreds of columns. Such tables aren't processed efficiently by traditional databases.

- Typical queries over wide events usually refer only a few fields out of hundreds of available fields. Traditional databases usually store every row in a table as a contiguous chunk of data with all the values for all the fields of the row (aka row-based storage). Such a scheme is very inefficient when the query needs to process only a few fields out of hundreds of available fields, since the database needs to read all the hundreds fields per each row and then extract the needed few fields.

It is much better to use analytical databases such as ClickHouse for storing and processing of big volumes of wide events. Such databases usually store values per every field in contiguous data chunks (aka column-oriented storage). This allows reading and processing only the needed few fields mentioned in the query, while skipping the rest of hundreds fields. This also allows efficiently compressing field values, which reduces storage space usage and improves performance for queries limited by disk read speed.

Analytical databases don't resolve the first issue mentioned above, since they usually need creating a table with the pre-defined columns before storing wide events into it. This means that you cannot store wide events with arbitrary sets of fields, which can be unknown before creating the table.

I'm working on a specialized open-source database for wide events, which resolves all the issues mentioned above. It doesn't need creating any table schemas before starting ingesting wide events with arbitrary sets of fields (e.g. it is schemaless). It automatically creates the needed columns for all the fields it sees during data ingestion. It uses column-oriented storage, so it provides query performance comparable to analytical databases. The name of this database is VictoriaLogs. Strange name for the database specialized for efficient processing of wide events :) This is because initially it was designed for storing logs - both plaintext and structured. Later it has been appeared that it's architecture ideally fits wide events. Check it out - https://docs.victoriametrics.com/victorialogs/

[+] bonobocop|1 year ago|reply

Thoughts on stuff like ClickHouse with JSON column support? Less upfront knowledge of columns needed.

[+] oulipo|1 year ago|reply

How is that a "superset" ? From what I gather, it's... just a "JSON-formatted log"? They just decide to put as much data in it as they can and decide that it should be called a "wide event", but it makes no sense... it's just a regular JSON-formatted log, with all the data inside, nothing new?

[+] patrulek|1 year ago|reply

Tldr; just use slog package (structured logs) to log everything and then visualize.

[+] valyala|1 year ago|reply

This works only for Go language, which provides slog package ( https://go.dev/blog/slog ). What about other programming languages?

[+] zahlman|1 year ago|reply

Practitioner of what? What is a "wide event"? In what context is this concept relevant? It took several sentences before I was even confident that this is something to do with programming.

[+] Etheryte|1 year ago|reply

They link to three separate articles right at the start that cover all of this. Not every article needs to start from first principles. You wouldn't expect an article about a new Postgres version to start with what databases are and why someone would need them.

[+] cookie_monsta|1 year ago|reply

I felt like I got the gist after the first two:

> Adopting Wide Event-style instrumentation has been one of the highest-leverage changes I’ve made in my engineering career. The feedback loop on all my changes tightened and debugging systems became so much easier.

[+] djhope99|1 year ago|reply

It’s about observability and strongly related to Honeycombs o11y 2.0 vision.

[+] esafak|1 year ago|reply

"[An Observability] Practitioner's Guide to Wide Events"

That's how I would have titled it.

[+] s1mplicissimus|1 year ago|reply

you just read an advertisement article and some people don't like you pointing that out. hence the downvotes i assume

56 comments