
Moving Marginalia to a new server

129 points | marginalia_nu | 2 years ago | marginalia.nu

39 comments

[+] jeroenhd | 2 years ago
> Each operating system essentially gets a fixed allocation of RAM, something like 32-48 GB. This can lead to quite a lot of wasted resources, VM #3 may really need 64 GB instead of 48 GB at a point where VM #4 may have 24 GB to spare.

I don't know what virtualisation the author has in mind, but static RAM allocation isn't a requirement for VMs. VirtIO's memory balloon can be used to dynamically grow and shrink available memory. It's not exactly something you can do willy-nilly, but when a VM has a large amount of free RAM (like 24GB) you can definitely use ballooning to temporarily reassign memory capacity.

Since the author is talking about using Debian, they could opt for Proxmox to handle ballooning for them. Without Proxmox, scripts similar to this (https://github.com/berkerogluu/auto-ballooning-kvm/blob/mast...) could also be used to control memory distribution, though I imagine a server like this needs something a little more sophisticated.
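As a rough illustration, a single balloon adjustment with libvirt's virsh looks something like this (the VM name and target size are hypothetical, and the target must stay within the VM's configured maximum memory):

```shell
# Hypothetical sketch: resize a guest's virtio balloon with virsh.
# "index-vm3" and the 64 GB target are made-up values for illustration.
VM=index-vm3
TARGET_GB=64
TARGET_KIB=$((TARGET_GB * 1024 * 1024))  # virsh setmem takes KiB by default

if command -v virsh >/dev/null 2>&1; then
    virsh setmem "$VM" "$TARGET_KIB" --live  # adjust the running guest
    virsh dommemstat "$VM"                   # confirm the new balloon size
else
    echo "virsh not available; would run: virsh setmem $VM $TARGET_KIB --live"
fi
```

A management layer like Proxmox does essentially this automatically, adjusting targets based on guest pressure rather than on a fixed schedule.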

[+] snerbles | 2 years ago
Since the author is running dual processors, they will need to account for NUMA domains. That is, half of the RAM is assigned to each CPU, and performance will suffer if a thread on CPU 0 is accessing memory belonging to CPU 1. numactl[0] can be used to bind a process to a specific NUMA node on bare metal, but a hypervisor can also set CPU/memory affinity for a given VM without fiddling about at the process level.

https://linux.die.net/man/8/numactl
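For a bare-metal sketch of that binding (the service binary name here is made up), the invocation is a one-liner:

```shell
# Pin a process to NUMA node 0 so its threads and allocations stay
# local to one socket. "./index-service" is a hypothetical binary.
NODE=0
BIND_CMD="numactl --cpunodebind=$NODE --membind=$NODE ./index-service"

# Inspect the topology first (node count, per-node memory, distances).
command -v numactl >/dev/null 2>&1 && numactl --hardware

# In practice you would now exec the bound process:
# $BIND_CMD
echo "$BIND_CMD"
```

With `--membind`, allocations fail rather than spill to the remote node, which is usually what you want when the point is predictable memory latency.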

[+] marginalia_nu | 2 years ago
Interesting. Would this still allow the virtualized machines to manage their own page tables as though native? I can't say I understand how it's implemented, but I'm concerned it might leave the host system with the same page-fault thrashing that the choice to use virtualization was intended to avoid.
[+] amluto | 2 years ago
> The software is based around memory mapped storage, and Linux’ page fault handler can only put up with so many page faults at any given time.

> A potential way around this is virtualization, to run multiple operating systems on the same machine.

What’s the actual issue? Linux can have trouble servicing many page faults in parallel in a single process (technically mm) due to lock contention. Multiple processes should reduce this contention.

Do you have more details?

[+] marginalia_nu | 2 years ago
It's a fairly hypothetical problem; it may not even be a problem at all, beyond the intuition that having eight applications all thrashing wildly and competing for the same page table may not be the most performant way of allocating resources.
[+] akdkfe223 | 2 years ago
Thank you for your hard work on this site! Probably my favorite find on hacker news of the 2020s so far.
[+] zxwrt | 2 years ago
Don't forget to consider adding IPv6 support, since the site currently lacks it.
[+] marginalia_nu | 2 years ago
Yeah, it should just be a matter of adding an AAAA record once the migration is done, since I've now got a public IPv6 range.
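For illustration, the record in question is a one-liner in the zone file (the address below is a placeholder from the IPv6 documentation range 2001:db8::/32, not the server's real one):

```
; hypothetical zone entry -- 2001:db8::1 is a documentation address
marginalia.nu.    3600    IN    AAAA    2001:db8::1
```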
[+] api | 2 years ago
What, you mean you're not using Kubernetes, Helm, Redis, Memcached, RabbitMQ, Terraform, three different Apache projects, cloud-managed Postgres, S3, and 10 microservices?

How unprofessional! It's like you're building something efficient that isn't going to give massive amounts of money to cloud providers.

[+] marginalia_nu | 2 years ago
Haha, indeed. I'm a regular medieval LARPer I am.
[+] jrmg | 2 years ago
@marginalia_nu: Definitely not saying this should be a top priority to fix, but I tried to look at the source out of interest, and the Git repo link in “Feel free to poke about in the source code or contribute to the development” on the search page is currently 404ing.
[+] marginalia_nu | 2 years ago
Hmm, should be fixed now. I used to run my own git forge but moved over to GitHub, leaving the git.marginalia.nu domain with a redirect that was supposed to point to the GitHub repository. Apparently it didn't quite work for that link for some reason.
[+] smarx007 | 2 years ago
Nice, congrats! What are the specs of the old server?
[+] marginalia_nu | 2 years ago
It was a Ryzen 3900X with 128 GB RAM and a mix of NAS drives and a few enterprise SSDs, so this is a bit of a step up :D
[+] vinnymac | 2 years ago
> I’ll also apologize if this post is a bit chaotic.

Loved this, didn’t find it chaotic at all.

Not sure if I missed it, but how are you planning on moving data from the old to the new storage? Do you have any concerns with corruption at that stage (validation)?

[+] marginalia_nu | 2 years ago
It's already moved over. The data can be reduced to a fairly compressed form where it's about 1.6 TB in total, so it's easy to just tarball and check with an md5sum that it's the same on both ends.

As described in the post, a lot of the data is also heavily redundant so even if something goes wrong in one or a few places, the missing parts can be reconstructed from the rest.
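The tarball-and-checksum step can be sketched like this (the paths are hypothetical, and a local cp stands in for the actual scp/ssh hop to the new server):

```shell
#!/bin/bash
# Sketch of verify-after-transfer; the /tmp paths and the cp "transfer"
# are stand-ins for the real data paths and an scp/ssh copy.
set -e
mkdir -p /tmp/demo-src
echo "index data" > /tmp/demo-src/part0

# Old server: pack the data and record its checksum.
tar -C /tmp/demo-src -cf /tmp/index.tar .
md5sum /tmp/index.tar | awk '{print $1}' > /tmp/local.md5

# Transfer (in practice: scp /tmp/index.tar newserver:/srv/).
cp /tmp/index.tar /tmp/index-arrived.tar

# New server: checksum what actually arrived and compare.
md5sum /tmp/index-arrived.tar | awk '{print $1}' > /tmp/remote.md5
diff /tmp/local.md5 /tmp/remote.md5 && echo "checksums match"
```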

[+] srhtftw | 2 years ago
The article left me wondering about the cost of the new and old servers and the difference in their price/performance.
[+] marginalia_nu | 2 years ago
New server is roundabout $20,000 if you were to pay for the free CPU upgrade. Old server was about $5,000.

It's really hard to say how much faster it is going to be, but it's definitely much faster than the old server. I wasn't really having performance problems before either, though. The main obstacle was just dealing with insane volumes of data with limited RAM and disk.

[+] daoudc | 2 years ago
This is so cool, the new machine is a beast!

Do you know roughly how many pages you have in your index?

[+] marginalia_nu | 2 years ago
I'm at about 164 million docs now. So hopefully this will take me into the billions :D
[+] perfmode | 2 years ago
How many TB of data? 88 TB?
[+] marginalia_nu | 2 years ago
70 TB usable space once partitions are in place and hard-drive manufacturer math is accounted for.
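"Manufacturer math" is the decimal-vs-binary gap: drives are sold in decimal terabytes (10^12 bytes) while operating systems count in binary tebibytes (2^40 bytes), so if the raw capacity is the 88 TB asked about above, that is already only about 80 TiB before partitioning and filesystem overhead. A quick check:

```shell
# Drive-label decimal TB vs what the OS reports in binary TiB.
RAW_TB=88
RAW_BYTES=$((RAW_TB * 1000 * 1000 * 1000 * 1000))  # 10^12 bytes per TB
TIB=$((RAW_BYTES / 1024 / 1024 / 1024 / 1024))     # 2^40 bytes per TiB
echo "$RAW_TB TB on the label is about $TIB TiB to the OS"
```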