
Gently Down the Stream – A gentle introduction to Apache Kafka (2021)

88 points| warrenm | 2 years ago |gentlydownthe.stream

14 comments

[+] dang|2 years ago|reply
Discussed at the time:

I wrote a children's book / illustrated guide to Apache Kafka - https://news.ycombinator.com/item?id=27541339 - June 2021 (226 comments)

(Reposts are fine after a year or so and links to past threads are just to satisfy extra-curious readers!)

[+] warrenm|2 years ago|reply
thanks - I didn't see it when I searched :)
[+] ramraj07|2 years ago|reply
I’ve found this HN comment to be the single best explanation of the purpose of Kafka for engineers (like me) who never truly appreciated why you’d want it in the first place: https://news.ycombinator.com/item?id=35160555
[+] seer|2 years ago|reply
Wow, that’s such an amazing comment. It explains very well what kafka does and why it’s valuable from a technical perspective, but I wanted to add why it’s valuable from a “corporate political” perspective. This is how I usually “sell” kafka to new recruits.

Imagine a big corporation - lots and lots of various teams all doing their own thing - business processes, auth, customer admin etc.

When all those micro or macro services want to talk to each other, you definitely don’t want them to all dip their fingers into one large shared database - you want some contractual guarantees between them - graphql / rest with openapi, that kind of thing. And it all works up to a certain scale.

But now imagine this intricate web of network requests … and then one service fails… This can cause a cascade that’s really hard to track.

Or … you have one very reliable service being used by various clients, but suddenly there is one more client that sends 10 times more requests than all the other clients combined - happens all the time in large corps.

And sure, there are ways to mitigate both through technical or policy means, but what kafka offers is a single, ready-made solution for all of it.

If the data flowing through a corp does so through kafka, you end up with very strong contractual guarantees about the shape of the data, for both consumers and producers. Scaling your services becomes mostly a solved problem. Producers of data don’t care too much who or how many consumers they have, so handling 10x or 100x scale changes becomes, if not trivial, at least fairly easy.
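
To make the "producers don't care who the consumers are" point concrete, here is a toy in-memory sketch. It is not Kafka's API: `ToyTopic`, `produce`, and `poll` are made-up names, and the only idea it borrows is an append-only log where each consumer group keeps its own read offset.

```python
from collections import defaultdict


class ToyTopic:
    """Hypothetical stand-in for a topic: an append-only log plus per-group offsets."""

    def __init__(self):
        self.log = []                    # messages in arrival order, never removed
        self.offsets = defaultdict(int)  # next unread position for each consumer group

    def produce(self, message):
        """Append a message and return its position; the producer knows nothing about readers."""
        self.log.append(message)
        return len(self.log) - 1

    def poll(self, group, max_messages=10):
        """Return the next unread batch for this group and advance only that group's offset."""
        start = self.offsets[group]
        batch = self.log[start:start + max_messages]
        self.offsets[group] = start + len(batch)
        return batch


topic = ToyTopic()
for i in range(5):
    topic.produce({"order_id": i})

# Two independent consumer groups read the same log at their own pace.
billing = topic.poll("billing")
analytics_first = topic.poll("analytics", max_messages=2)
analytics_rest = topic.poll("analytics")
```

Adding a third group later needs no change on the producer side, and a slow reader only falls behind on its own offset instead of pushing load back onto the producer.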

What that means in practice is you trade every team being experts in the scaling and deployment of their services for needing one really good team to shepherd your kafka cluster, and the rest don’t need to deal with it too much.

But hosting kafka at scale is quite tricky - I haven’t done it myself, but I knew the team that handled it at my previous org, and those were top-notch guys who still struggled with various things.

Anyway, kafka kinda allows for micro services to scale at big orgs, and that to me is just amazing.

[+] rad_gruchalski|2 years ago|reply
Wow, thank you! Happy to hear it was helpful.
[+] candiddevmike|2 years ago|reply
> Gentle introduction

> Starts talking about scaling via partitions

I think this book could've been shortened, or could have gone into more detail about message creation and retrieval.

[+] SQueeeeeL|2 years ago|reply
You could counter that the whole point of Kafka is scalability. There are many far simpler solutions for message passing when you're not at scale.
[+] AugustoCAS|2 years ago|reply
That's a cute site.

The author forgot to say that teenage otters set `acks=0` when producing messages :D.
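
For anyone who missed the joke: `acks` is the producer setting that controls how many broker acknowledgments a send waits for. A rough sketch of the three usual values, written as plain dicts (the dict names and the `describe` helper are made up for illustration, not part of any client library):

```python
# Producer settings sketched as plain dicts; only the "acks" key mirrors a real Kafka setting.
fire_and_forget = {"acks": 0}      # don't wait for any broker acknowledgment
leader_only = {"acks": 1}          # wait for the partition leader to write the record
all_replicas = {"acks": "all"}     # wait for all in-sync replicas to acknowledge


def describe(config):
    """Hypothetical helper: summarize the durability trade-off of an acks value."""
    meanings = {
        0: "fastest, but messages can be lost silently",
        1: "lost only if the leader fails before replicating",
        "all": "slowest, strongest delivery guarantee",
    }
    return meanings[config["acks"]]


summary = describe(fire_and_forget)
```

So `acks=0` is the "send it and hope" setting, hence the teenage otters.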

[+] xhkkffbf|2 years ago|reply
I still have nightmares about Kafka. I pulled out my hair for two weeks. That product was very appropriately named.