top | item 36564909

(no title)

danappelxx | 2 years ago

IMO if you’re concerned about performance and yet are deploying databases this way — mmap should not even be on the radar.

discuss

order

charcircuit|2 years ago

How would containers even hurt performance? How does the database no longer having the ability to see other processes on the machine somehow make it slower?

crabbone|2 years ago

There are many "holes" in these containers.

1. fsync. You cannot "divide" it between containers. Whoever does it, stalls I/O for everyone else.

2. Context switches. Unless you do a lot of configurations outside of container runtime, you cannot ensure exclusive access to the number of CPU cores you need.

3. Networking has the same problem. You would either have to dedicate a whole NIC or SRI-OV-style virtual NIC to your database server. Otherwise just the amount of chatter that goes on through the control plane of something like Kubernetes will be a noticeable disadvantage. Again, containers don't help here, they only get in the way as to get that kind of exclusive network access you need more configuration on the host, and, possible an CNI to deal with it.

4. kubelet is not optimized to get out of your way. It needs a lot of resources and may spike, hindering or outright stalling database process.

5. Kubernetes sucks at managing memory-intensive processes. It doesn't work (well or at all) with swap (which, again, cannot be properly divided between containers). It doesn't integrate well with OOM killer (it cannot replace it, so any configurations you make inside Kubernetes are kind of irrelevant, because system's OOM killer will do how it pleases, ignoring Kubernetes).

---

Bottom line... Kubernetes is lame from infrastructure perspective. It's written for Web developers. To make things appear simpler for them, while sacrificing a lot of resources and hiding a lot of actual complexity... which is impossible to hide, and which, in an even of failure will come to bite you. You don't want that kind of program near your database.

danappelxx|2 years ago

I’ll assume the worst case:

- lots of containers running on a single host

- containers are each isolated in a VM (aka virtualized)

- workloads are not homogenous and change often (your neighbor today may not be your neighbor tomorrow)

I believe these are fair assumptions if you’re running on generic infrastructure with kubernetes.

In this setup, my concerns are pretty much noisy neighbors + throttling. You may get latency spikes out of nowhere and the cause could be any of:

- your neighbor is hogging IO (disk or network)

- your database spawned too many threads and got throttled by CFS

- CFS scheduled your DBs threads on a different CPU and you lost your cache lines

In short, the DB does not have stable, predictable performance, which are exactly the characteristics you want it to have. If you ran the DB on a dedicated host you avoid this whole suite of issues.

You can alleviate most of this if you make sure the DB’s container gets the entire host’s resources and doesn’t have neighbors.