top | item 36273917

The curse of scalable technology

62 points | luu | 2 years ago | lukeplant.me.uk | reply

22 comments

[+] sseagull|2 years ago|reply
Slightly ranty: Ugh, I've been feeling this recently in a few areas. I think a lot of it comes down to imprecise terminology and the assumptions that get made from it.

1.) We have a very large, powerful server we use for webapps/databases. I wanted a simple way to "orchestrate" containers on it (from a friendly web interface or something). I've come to the conclusion that such a solution does not exist - it's either the CLI (docker/docker compose) or running Kubernetes (which I tried, and got working, but it's too complicated for me right now, even on one server. Maybe someday).

2.) I want to aggregate various logs from the apps running on the server, and be able to visualize them in something like Grafana. Some currently log to their own postgres database, others to a file. The answer to this is to install half a dozen (or more) services, each with its own config language, quirks, and 200-page manual, and hook it all together. The good news is I can use a "simple" config since I have less than 100GB of logs a day (wtf? I have more like 50MB a day).

There seems to be a completely missing middle class of software/devops/sysadmin information. It's either toy programs or "web-scale" 1000 node clusters.

But coming back to the article, it's really frustrating to try to talk to others or find answers. "I want to aggregate logs" causes people to think my logging needs are bigger than they really are. Same with "container orchestration". And then I get told I'm doing it wrong (when, believe me, under my current constraints it's the best we can do). I guess overall I wish people would respect my current constraints.

[+] sh34r|2 years ago|reply
If you're locked into containers as the solution, the answer to your problem is docker-compose. But verify that assumption first. Why containers and not VMs? VMs solved this particular problem years before anyone had ever heard of Docker. You don't need containers if you're just splitting up a single-server monolith. I don't know if I'd call them "friendly," but basically every VMM has a GUI. Docker's the easiest way to solve the runs-on-my-machine problem, but Vagrant isn't much harder.
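If docker-compose is the pick, the single-server setup the parent describes fits in one small file. A hypothetical docker-compose.yml for a webapp plus its database (the image name, ports, and password here are made up for illustration):

```yaml
services:
  webapp:
    image: mycorp/webapp:latest   # hypothetical app image
    ports:
      - "8080:8080"
    depends_on:
      - db
    restart: unless-stopped
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example  # use a proper secret in practice
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: unless-stopped

volumes:
  pgdata:
```

`docker compose up -d` brings both up, and the whole "orchestration" story is one version-controlled file.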

k8s is a non-starter for this use case. So incredibly overcomplicated, and what did you think you were getting for that complexity? It's for large-scale deployments. You have one server. Resume driven development is getting out of control...

If you don't have a dedicated devops team and you're not using a managed k8s service like AKS, just don't. Please don't. Stop spreading the madness to these beautifully simple environments. KISS: keep it simple, stupid. Running k8s because the Docker CLI is too hard is like learning to fly an F-16 because riding a bike is too hard.

[+] Jedd|2 years ago|reply
Going into engineer-minded solution-focus mode and ignoring the meta here - Hashicorp Nomad is a lightweight wrapper around Docker, works just fine on a single server, GUI or CLI at your discretion, and its automation options scale nicely.

Logging - look at Loki, also by Grafana Labs. It will run in a container, and Nomad's docker driver can be configured to send logs to a Loki endpoint (podman has some sharper edges, but probably less so in a single-host setup).

EDIT: I haven't updated this in a while, but https://github.com/jedd/nomad-recipes/blob/master/loki.nomad gives a taste for a very basic Loki job under Nomad. In terms of prereqs, you'd need docker & nomad running, and have a persistent volume (to local disk) configured in Nomad.
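For readers who don't want to click through, a rough sketch of the shape such a Nomad job takes (the image tag and port are illustrative, and this is far more stripped down than the linked recipe - persistent volume config omitted):

```hcl
job "loki" {
  datacenters = ["dc1"]

  group "loki" {
    network {
      port "http" {
        static = 3100   # Loki's conventional HTTP port
      }
    }

    task "loki" {
      driver = "docker"

      config {
        image = "grafana/loki:2.9.2"   # illustrative tag
        ports = ["http"]
      }
    }
  }
}
```

Other jobs on the same host can then point their log shipping at the Loki HTTP endpoint on port 3100.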

[+] ricardobeat|2 years ago|reply
Did you consider MRSK[1], k3s[2], or dokku[3]? They are all significantly simpler to operate than Kubernetes, curious to hear your take.

On logs, I agree and have looked for the same: a simple way to aggregate logs on one machine - heck, it could even be running SQLite - and query them via a web UI. Doesn't seem to exist at this scale.
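The SQLite version of this really is tiny. A toy sketch (the table schema and the "timestamp level message" line format are assumptions; real apps log in all sorts of formats):

```python
import sqlite3

def open_store(path=":memory:"):
    # One table for everything; any SQL browser or small web UI can sit on top.
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS logs (
        ts TEXT, app TEXT, level TEXT, message TEXT)""")
    return db

def ingest(db, app, lines):
    # Assumes "TIMESTAMP LEVEL message" lines; a per-app parser would go here.
    rows = []
    for line in lines:
        ts, level, message = line.split(" ", 2)
        rows.append((ts, app, level, message))
    db.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", rows)
    db.commit()

db = open_store()
ingest(db, "webapp", ["2023-05-01T10:00:00 ERROR db timeout",
                      "2023-05-01T10:00:05 INFO retry ok"])
errors = db.execute(
    "SELECT app, message FROM logs WHERE level = 'ERROR'").fetchall()
print(errors)  # [('webapp', 'db timeout')]
```

At 50MB a day, SQLite would handle years of logs before this approach showed any strain.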

[1] https://github.com/mrsked/mrsk [2] https://k3s.io/ [3] https://dokku.com/

[+] bsnnkv|2 years ago|reply
I'm a DevOps/SRE guy by trade; I spend my days wrangling supermassive Kubernetes clusters etc. I've come to the conclusion that the "middle class" is especially well covered by NixOS. Generally, for any situation where you're running a bunch of stuff on one big ol' server, NixOS is going to be the least stressful and most productive way to go once you become familiar with the ecosystem.
[+] lgas|2 years ago|reply
> I wanted a simple way to "orchestrate" containers on it (from a friendly web interface or something). I've come to the conclusion that such a solution does not exist - it's either CLI (docker/docker compose) or run kubernetes

Have you tried https://www.portainer.io/ ?

[+] code_biologist|2 years ago|reply
> I wanted a simple way to "orchestrate" containers on it

I recently started to use Portainer for this. It seems pretty serviceable.

[+] spiffytech|2 years ago|reply
My favorite observation on this topic is that annual industry surveys (Stack Overflow survey, State of JS/CSS) consistently report around a quarter to a third of respondents work in companies of 20 people or fewer.

Companies that size I've worked at often had 2–5 technical staff total. That's where I've spent most of my career.

In contrast, some people feel a project with 25 developers is best described as having "only" 25. And I know that's not the top of the scale.

It's easy to imagine how many choices I'd make differently if I had too many people to reach organic consensus, or could count on a half-dozen team members leaving and getting replaced every year, or couldn't count on a certain baseline of skill or technical taste.

This stuff doesn't really come up when we talk online. We just hand each other assertions without context - "K8s is overkill, you should use systemd on bare metal", or "your app will collapse if you use the DB as a queue", etc.

I think the way we discuss our choices is worse for not clarifying our assumptions about the environment.

[+] perrygeo|2 years ago|reply
There seems to be this notion that we can make technical choices in a vacuum, that there are inherent qualities which universally make X better than Y. The job is then to search out the expert assessment on the matter. Without consideration of the software dimensions mentioned here, you're basically giving up on engineering.

It's a knowledge gap problem, which is precisely why it's comforting to seek out and lean on the consensus opinion. The alternative takes work: filling that knowledge gap with data from empirical observation and logical analysis. Well-designed experiments? Well-specified requirements? Design? Nobody has time for that /s. It's much easier to say "k8s bad 'cause I read it on HN"

[+] jiggawatts|2 years ago|reply
> endless debates where we talk past each other

Ah yes.

My favourite is when I look at a system that is slow as molasses even for one user, and the predictable refrain is: “we can scale up!”

Adding more lanes to a road with a speed limit of 5 doesn’t fix the problem of each car going slowly.

[+] hinkley|2 years ago|reply
Half of programming is making excuses so we don’t have to take a hard look at the other half.

Let’s go on an adventure instead of having a sober conversation about what should be table stakes skills.

[+] AdieuToLogic|2 years ago|reply
Much of this article's concerns can be addressed by a handful of tenets:

- spend time to understand the problem being solved.

- research applicable technologies for the problem which align closest with what the team knows.

- keep an open mind for other applicable technologies, but do not incorporate them without cause.

- accept that there are innumerable alternate ways to solve any given problem, but the time to solve is finite.

- do not emotionally attach oneself to the current solution so that other approaches can be objectively considered and employed when beneficial.

- alter technology used when benefit is identified and risk to success is minimized.

[+] tbrownaw|2 years ago|reply
The more different things a tool can be used for, the less context can be assumed from knowing that that tool is being used.
[+] FpUser|2 years ago|reply
Scalability does not really exist in the general case as soon as there are both reads and writes to the same data. There are various tricks that exploit particular properties of a particular business case - mostly physical or temporal sharding of data updates, which sacrifices accessibility.
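A toy illustration of the temporal-sharding trick being described (shard names and keys are made up): writes for a given day land only in that day's shard, so writers never contend, but reading a key back now means fanning out across every shard instead of one O(1) lookup.

```python
import datetime

# One dict per day stands in for one physical shard per day.
shards = {}

def write(day: datetime.date, key, value):
    # Writes touch exactly one shard, so they never contend across days.
    shards.setdefault(day.isoformat(), {})[key] = value

def read_all(key):
    # Reads must fan out across every shard: O(number of shards).
    return {day: data[key] for day, data in shards.items() if key in data}

write(datetime.date(2023, 5, 1), "clicks", 10)
write(datetime.date(2023, 5, 2), "clicks", 7)
print(read_all("clicks"))  # {'2023-05-01': 10, '2023-05-02': 7}
```

That fan-out (or the stale cache you put in front of it) is the accessibility cost being traded away.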
[+] snowman647|2 years ago|reply
This is mostly the trick - today you start a pet project in Python, and in a year you need to scale it to 1M clients. No one knows what the right technology is because requirements change too quickly.