> Each operating system essentially gets a fixed allocation of RAM, something like 32-48 GB. This can lead to quite a lot of wasted resources, VM #3 may really need 64 GB instead of 48 GB at a point where VM #4 may have 24 GB to spare.
I don't know what virtualisation the author has in mind, but static RAM allocation isn't a requirement for VMs. VirtIO's memory balloon can be used to dynamically grow and shrink available memory. It's not exactly something you can do willy-nilly, but when a VM has a large amount of free RAM (like 24 GB) you can definitely use ballooning to temporarily reassign memory capacity.
Since the author is talking about using Debian, they could opt for Proxmox to handle ballooning for them. Without Proxmox, scripts similar to this (https://github.com/berkerogluu/auto-ballooning-kvm/blob/mast...) could also be used to control memory distribution, though I imagine a server like this needs something a little more sophisticated.
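To make the ballooning idea concrete, here is a minimal sketch of the kind of policy such a script implements. The function name, thresholds, and VM figures are made up for illustration; actual resizing would go through libvirt, e.g. `virsh setmem <domain> <KiB> --live`, rather than this pure calculation:

```python
def balloon_targets(vms, reserve_mb=4096, step_mb=1024):
    """Naive balloon policy sketch (hypothetical helper, not the linked script).

    vms: dict of name -> (allocated_mb, free_mb) as reported by the guest.
    Returns dict of name -> new allocation in MB: VMs with lots of slack
    shrink by one step, VMs under memory pressure grow by one step.
    """
    targets = {}
    for name, (alloc, free) in vms.items():
        if free > reserve_mb + step_mb:
            # plenty of free RAM: reclaim one step for other guests
            targets[name] = alloc - step_mb
        elif free < reserve_mb // 2:
            # running low: hand one step back
            targets[name] = alloc + step_mb
        else:
            # within the comfort band: leave it alone
            targets[name] = alloc
    return targets


# e.g. a starved VM #3 grows while VM #4 with 24 GB spare shrinks:
print(balloon_targets({"vm3": (49152, 512), "vm4": (49152, 24576)}))
```

A real controller would also clamp the targets so the sum stays below host capacity, and rate-limit changes, since deflating a balloon too aggressively can push a guest into its OOM killer.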
Since the author is running dual processors, they will need to account for NUMA domains. That is, half of the RAM is assigned to each CPU, and performance will suffer if a thread on CPU 0 is accessing memory belonging to CPU 1. numactl[0] can be used to bind a process to a specific NUMA node on bare metal, but a hypervisor can also set CPU/memory affinity for a given VM without fiddling about at the process level.
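Under libvirt, for example, that per-VM affinity can be expressed directly in the domain XML; roughly like this, where the node and CPU numbers are illustrative:

```xml
<!-- Illustrative fragment: pin this VM's vCPUs to the first socket's
     cores and force its memory onto NUMA node 0, so guest threads
     never take the cross-socket penalty. -->
<domain type='kvm'>
  <vcpu placement='static' cpuset='0-15'>16</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
</domain>
```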
Interesting. Would this still allow the virtualized machines to manage their own page tables as though they were native? I can't say I understand how it's implemented, but I'm concerned it might leave the host system with the same page-fault thrashing that the choice to use virtualization was intended to avoid.
> The software is based around memory mapped storage, and Linux’ page fault handler can only put up with so many page faults at any given time.
> A potential way around this is virtualization, to run multiple operating systems on the same machine.
What’s the actual issue? Linux can have trouble servicing many page faults in parallel in a single process (technically mm) due to lock contention. Multiple processes should reduce this contention. Do you have more details?
It's a fairly hypothetical problem; it may not even be a problem in practice. It's mostly the intuition that having 8 applications all thrashing wildly and competing for the same page table may not be the most performant way of allocating resources.
What, you mean you're not using Kubernetes, Helm, Redis, Memcached, RabbitMQ, Terraform, three different Apache projects, cloud-managed Postgres, S3, and 10 microservices?
How unprofessional! It's like you're building something efficient that isn't going to give massive amounts of money to cloud providers.
@marginalia_nu: Definitely not saying this should be a top priority to fix, but I tried to look at the source out of interest, and the Git repo link in “Feel free to poke about in the source code or contribute to the development” on the search page is currently 404ing.
Hmm, should be fixed now. I used to run my own git forge, but moved over to github, leaving the git.marginalia.nu domain with a redirect that was supposed to point to the github repository, but apparently it didn't quite work for that link for some reason.
> I’ll also apologize if this post is a bit chaotic.
Loved this, didn’t find it chaotic at all.
Not sure if I missed it, but how are you planning on moving data from the old to the new storage? Do you have any concerns with corruption at that stage (validation)?
It's already moved over. The data can be reduced to a fairly compressed form where it's about 1.6 TB in total, so it's easy to just tarball and check with an md5sum that it's the same on both ends.
As described in the post, a lot of the data is also heavily redundant so even if something goes wrong in one or a few places, the missing parts can be reconstructed from the rest.
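The tarball-and-checksum step can be sketched as follows; the filename is hypothetical, and this is just the scripted equivalent of running `md5sum` on both ends and comparing:

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Stream a file through MD5 in 1 MB chunks, so a multi-TB
    tarball never has to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Computing `md5_of("index-backup.tar")` on the old and new server and comparing the two digests catches any corruption introduced during the copy (MD5 is fine here since the concern is accidental damage, not an attacker).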
New server is roundabout $20,000 if you were to pay for the free CPU upgrade. Old server was about $5,000.
It's really hard to say how much faster it is going to be, but it's definitely much faster than the old server. I wasn't really having performance problems before either, though. The main obstacle was just dealing with insane volumes of data with limited RAM and disk.
[0] https://linux.die.net/man/8/numactl
Do you know roughly how many pages you have in your index?