item 39355998

Show HN: Kubetail – Web-based real-time log viewer for Kubernetes

70 points | andres | 2 years ago | github.com

Hi Everyone!

Kubetail is a new project I've been working on. It's a private, real-time log viewer for Kubernetes clusters. You deploy it inside your cluster and access it via a web browser, like the Kubernetes Dashboard.

Using kubetail, you can view logs in real-time from multiple workload containers simultaneously. For example, you can view all the logs from the Pod containers running in a Deployment, and the UI will update automatically as pods come into and out of existence. Kubetail uses your in-cluster Kubernetes API, so your logs are always in your possession and it's private by default.
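The deploy-and-browse flow described above might look roughly like this; the manifest filename, service name, and port below are assumptions for illustration, so check the project's README for the real install commands:

```shell
# Deploy Kubetail into the cluster (manifest path is an assumption;
# see the repo's README for the actual install instructions)
kubectl apply -f kubetail.yaml

# Forward the dashboard service to your machine (service name and
# port are assumptions), then open http://localhost:8080 in a browser
kubectl port-forward svc/kubetail 8080:80
```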

Currently you can filter logs based on node properties such as availability zone, CPU architecture, or node ID, and we have a lot more features planned.

Here's a live demo: https://www.kubetail.com/demo

Check it out and let me know what you think!

Andres

33 comments

[+] akhenakh|2 years ago|reply
There is an existing project named kubetail, which is quite popular (3.2K stars): https://github.com/johanhaleby/kubetail
[+] andres|2 years ago|reply
That project is a bash script to tail Kubernetes logs from multiple pods at the same time. The name collision is a bummer. I found out about it a while after I bought the domain names (kubetail.com, kubetail.dev, etc.), got the social handles, and invested a lot of time into coding and branding. If this project is successful and the naming is confusing for users, we'll figure out a solution; if not, then it's moot.
[+] swozey|2 years ago|reply
This is really nice. I usually use stern for logs ON a cluster, but to aggregate all of the logs we usually use something like fluentd into Elasticsearch. Compared to something like this, that's super complex for what we usually use it for, but it does of course let you search everything.
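For reference, the stern workflow mentioned above is a one-liner per query; the namespace, pod-name regex, and label values below are made up for illustration:

```shell
# Tail logs from every pod whose name matches the regex "web",
# across all their containers, within a time window
stern web --namespace demo --since 15m

# Or select pods by label instead of name
stern --selector app=web --namespace demo
```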
[+] andres|2 years ago|reply
Thanks! Kubetail is super lightweight and requires zero maintenance so hopefully you can use it for some of your real-time logging needs. Let me know if you have any feature requests.
[+] mass_and_energy|2 years ago|reply
Ah yes, the ol' EFK stack. I almost miss carefully tuning my shard balancing to get the most outta my cluster.
[+] remram|2 years ago|reply
I don't know, at that point I think I'd rather use an actual log aggregation system like Loki or Kibana. That way I can search, including on pods that are now gone.

The niche between "easy to get but single-container and no search" on one side and "install with helm but search all containers including historical with full-text and metrics" on the other... seems like a tiny niche to me.

edit: oh you need to install Kubetail cluster-wide too. At least no DaemonSet I guess.

[+] 392|2 years ago|reply
I have tried unsuccessfully to get those set up on my k8s twice. The default configs don't seem sufficient for even a small shop's scale, so I think I must be holding them wrong. But I'm ready to try this, because it's all I need if it works.
[+] piterrro|2 years ago|reply
Congrats on the launch, nice project! I recently launched https://logdy.dev (OSS: https://github.com/logdyhq/logdy-core), which attempts to address the problem in a wider space: any kind of process stdout -> web UI. You can run it with k8s (kubectl logs -f). I'm actually writing a blog post about it as we speak and will definitely mention kubetail as well. Of course, your project addresses the problem more specifically; I just thought to mention Logdy in case somebody is looking for a Swiss-army-knife solution for all kinds of logs.
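The kubectl-to-Logdy pipeline the comment alludes to can be sketched like this; the deployment name is made up and the exact Logdy invocation may differ, so treat this as a rough shape rather than Logdy's documented usage:

```shell
# Stream a Deployment's logs into Logdy's web UI via stdin
# (deployment name is an assumption; check the Logdy docs
# for the exact flags and default port)
kubectl logs -f deployment/web | logdy
```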
[+] andres|2 years ago|reply
Looks interesting! It's like tail -f on steroids.
[+] smock|2 years ago|reply
How is privacy enforced? Are you planning on maintaining this:

https://github.com/kubetail-org/kubetail

as an open source repo?

[+] andres|2 years ago|reply
Kubetail runs in your cluster and uses the cluster’s own Kubernetes API to retrieve logs so the data never leaves your possession. Yes! Kubetail will always be open source. I think it’s important for users to know exactly what code is running inside their clusters especially if they’re granting it access to their private data.
[+] flashgordon|2 years ago|reply
Wow, what a small world. I've been looking to build a tool EXACTLY like this (even after seeing Johan's kubetail project) but kept thinking it was too obvious, that somebody must have already built it or the k8s ecosystem already had something like this and I was just too noob to find it! Only diff: my frontend was going to be htmx :). Kudos.

I suppose it is never too late :)

[+] nodesocket|2 years ago|reply
Awesome project. I run a Kubernetes cluster on my homelab on 4x Raspberry Pi 4 Bs. Gonna set this up tonight.

I believe there is no persistence, or does it cache in local storage or anything on the client? Would be awesome to have that option for client-side storage for perhaps 24 hours.

[+] andres|2 years ago|reply
Awesome! I'm excited to hear how it goes. If you have any feedback please send me an email ([email protected]).

Currently, there's no persistence. I'll think about how to enable client-side.

[+] hobofan|2 years ago|reply
Demo worked fine until I added the kubetail-demo pod as a source, which crashed my browser tab. Copying the same URL into a new tab loaded the page, but it stuck at "Loading logs" until it also crashed the tab.
[+] andres|2 years ago|reply
Sorry about that! It's fixed now. Currently, kubetail displays the entire log history by default which can cause the frontend to hang if there's a lot of data. In the case of the kubetail-demo pod, there were a lot of messages from a user continuously retrying to make a websocket connection and the quantity of those messages crashed your tab. I've disabled logging for the app instance so viewing those pod logs won't crash your browser again. Upgrading the kubetail frontend so it can handle more data is next up on the to-do list.
[+] distracteddev90|2 years ago|reply
Does this work well when set up to use my local machine and my personal credentials?
[+] andres|2 years ago|reply
Can you give me more details? Do you want to run kubetail locally but pull logs from your cluster remotely? Yes, this is possible.
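A hedged sketch of how the "local client, remote cluster" setup usually works: the client authenticates with whatever context your personal kubeconfig points at. The context name below is made up; the second command is a standard kubectl way to check that your credentials can read pod logs at all:

```shell
# Point your local tooling at the remote cluster using your
# personal kubeconfig credentials (context name is an assumption)
kubectl config use-context my-remote-cluster

# Verify those credentials can read pod logs before wiring up a UI
kubectl auth can-i get pods --subresource=log
```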
[+] cryptonector|2 years ago|reply
I've written a [proprietary, though I have permission to open source it] `tailfhttpd`, which is a tiny, trivial HTTP/1.1 server that supports only `HEAD`s and `GET`s of regular files, but with a twist:

  - it supports `ETag`s, with ETags derived from
    a file's st_dev, st_ino, and inode generation
    number
  - it supports setting *some* response headers via
    xattrs (e.g., ETag, Content-Type, Vary,
    Cache-Control, etc.)
  - it supports conditional requests (i.e.,
    `If-Match:`, `If-None-Match:`,
    `If-Modified-Since:`)
  - it supports `Range:` requests
  - for `Range: bytes=${offset}-` `GET`s
    the response does not finish (i.e.,
    final chunk is not sent) until one of
     - the file is unlinked
     - the file is renamed
     - the server is terminated
    using inotify to find out about file
    unlinks/renames

It does this using `epoll`, `inotify`, and `sendfile()`, with multiple fully-evented, async-I/O-using processes, each process being single-threaded. It is written in C in continuation passing style (CPS) with tiny continuations, so its memory footprint per client is also tiny. As a result it is blazingly fast, though it needs to be fronted with a reverse proxy for HTTPS (e.g., Nginx, Envoy, ...), sadly, but maybe I could teach it to use kssl.

I use it for tailing logs remotely, naturally, and as a poor man's Kafka. Between regular file byte offsets, ETags, and conditional requests one can build a reliable event publication system with this `tailfhttpd`. For example, an event stream can name the next instance ({local-part, ETag}) and then be renamed out of the way to end in-progress `GET`s, and clients can resume from the new file.
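From the client side, the resume-from-offset flow above can be sketched with plain curl; the URL, byte offset, and ETag value are made up, and `-N` just disables curl's output buffering so the hanging GET streams as chunks arrive:

```shell
# Hanging GET: request everything from byte 4096 onward and keep the
# connection open; the server withholds the final chunk until the
# file is unlinked/renamed or the server exits (URL/offset made up)
curl -N -H 'Range: bytes=4096-' https://logs.example/app.log

# Conditional request: fetch only if the file was rotated, i.e. the
# ETag (derived from st_dev/st_ino/generation) no longer matches
curl -N -H 'If-None-Match: "dev1-ino42-gen7"' https://logs.example/app.log
```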

With a few changes it could "tail" (watch) directories, and even allow `POST`ing events (which could be done by writing to a pipe the reader of which routes events to files that get served by `tailfhttpd`).

Because `tailfhttpd` just serves files, and because of the ETag thing, conditional requests, and xattrs, it's very easy to build more complex systems on top of it -- even shell scripts will suffice.

This chunked-encoding, "hanging-GET" thing is so unreasonably effective and cheap that I'm surprised how few systems support it.

I've visions of rewriting it in Rust and supporting H2 and especially H3/QUIC to reduce the per-client load (think of TCP TCBs and buffers) even more, and using io_uring instead of epoll for even better performance.

Oh, and this approach is fully standards-compliant. It's just a chunked-encoding, indefinite-end ("hanging") GET with all the relevant (but optional) behaviors (ETags, conditional requests, range requests, even the right end of the byte-range being left unspecified is within spec!).