man8alexd | 22 days ago
It is fashionable to disable swap nowadays because everyone has been bitten by a swap thrashing event. Read other comments.
> A memory leak does not lead to thrashing. By definition if you have a leak the memory isn't used, so it goes to swap and stays there.
You assume that leaked memory is inactive and goes to swap. This is not true. Chrome, Gnome, whatever modern Linux desktop apps leak a lot, and it stays in RSS, pushing everything else into swap.
> if the leak continues the swap eventually fills up, and then the OOM killer comes out to play
You assume that the OOM killer comes out to play in time. The larger the swap, the longer it takes for the OOM killer to trigger, if ever, because the kernel OOM-killer is unreliable, so we have a collection of other tools like earlyoom, Facebook oomd and systemd-oomd.
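Tools like earlyoom and oomd work by polling memory pressure from userspace and killing early, before the kernel stalls in reclaim. One polling iteration can be sketched like this (the threshold and the behaviour are illustrative, not any tool's actual defaults):

```shell
#!/bin/sh
# One iteration of an earlyoom-style watchdog: check MemAvailable and
# react before the kernel OOM killer would. Threshold is illustrative.
THRESHOLD_KB=$((200 * 1024))   # act below ~200 MiB available

avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
if [ "$avail_kb" -lt "$THRESHOLD_KB" ]; then
    echo "low memory: ${avail_kb} kB available" >&2
    # a real tool would now pick the process with the highest
    # /proc/<pid>/oom_score and SIGTERM/SIGKILL it
fi
echo "MemAvailable: ${avail_kb} kB"
```

The real tools loop forever, and the serious ones (oomd, systemd-oomd) watch PSI pressure metrics rather than a raw free-memory number.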
> I've logged into systems that were thrashing
It means that the system wasn't out of memory yet. When it is unresponsive, you won't be able to enter commands into an already open shell. See other comments here for examples.
> The OOM killer on the other hand leaves the system in some undefined state. Some things are dead. Maybe you got lucky and it was just Chrome that was killed, but maybe your sound, bluetooth, or DNS daemons have gone AWOL and things just behave weirdly.
This is not true. By default, the kernel OOM-killer selects the single largest process in the system (measured by its RSS+swap). By default, systemd, ssh and other socket-activated systemd units are protected from OOM.
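You can see exactly what the kernel would pick by reading the badness scores it exposes under /proc. A minimal sketch (standard Linux procfs paths; the systemctl line assumes a systemd distro and a unit name that varies between distros):

```shell
#!/bin/sh
# List the top OOM-killer candidates: /proc/<pid>/oom_score is the
# kernel's badness heuristic (driven by RSS+swap, shifted by oom_score_adj).
for pid in /proc/[0-9]*; do
    score=$(cat "$pid/oom_score" 2>/dev/null) || continue  # process may have exited
    comm=$(cat "$pid/comm" 2>/dev/null) || continue
    echo "$score $comm"
done | sort -rn | head -5

# Protected daemons carry a negative oom_score_adj (-1000 disables
# selection entirely; sshd's master process sets this for itself on Linux).
# For a systemd unit: systemctl show <unit> -p OOMScoreAdjust
```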
rstuart4133 | 21 days ago
If they disable swap, they will get hit by the OOM killer. You seem to prefer that over slowing down; I guess that's a personal preference. However, I think it is misleading to say people are being bitten by a swap thrashing event. The "event" was them running out of RAM. Unpleasant things will happen as a consequence, and blaming thrashing or the OOM killer for those unpleasant things is misleading.
> You assume that leaked memory is inactive and goes to swap. This is not true.
At best, you can say "it's not always true". It's definitely gone to swap in every case I've come across.
> It means that the system wasn't out of memory yet.
Of course it wasn't out of memory. It had lots of swap. That's the whole point of providing that swap - so you can rescue it!
> When it is unresponsive, you won't be able to enter commands into an already open shell.
Again, that's just plain wrong. I have entered commands into a system that is thrashing. It must work eventually if thrashing is the only thing going on, because when the system thrashes the CPU utilization doesn't go to 0. The CPU is just waiting for disk I/O after all, and disk I/O is happening at a furious pace. There's also a finite amount of pending disk I/O. Provided no new work is arriving (time for a cup of coffee?), it will get done, and the thrashing will end.
If the system does die, other things have happened. Most likely the OOM killer if they follow your advice, but network timeouts killing ssh and networked shares are also a thing. If you are using Windows or macOS, the swap file can grow to fill most of the free disk space, so you end up with a double whammy.
Which brings me to another observation. In desktop OSes, the default is to provide swap, and lots of it. In Windows swap will grow to 3 times RAM. This is pretty universal - even Debian will give you twice RAM for small systems. The people who decided on that design choice aren't following some folklore they read in some internet echo chamber. They've used real data: they've observed that when swapping starts, systems slow down, giving the user some advance warning, and that when thrashing starts, systems can recover rather than die, which gives the user an opportunity to save work. It is the right design tradeoff IMO.
> By default, the kernel OOM-killer selects one single largest (measured by its RSS+swap) process in the system.
Yes, it does. And if it is a single large process hogging memory you are in luck - the OOM killer will likely do the right thing. But Chrome (and now Firefox) is not a single large process. Worse if the out of memory is caused by say someone creating zillions of logins, they are so small they are the last thing the OOM killer chooses. Shells, daemons, all sorts of critical things go first. The "largest" process first is just a heuristic, one which can be and in my case has been wrong. Badly wrong.
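When the "largest process" heuristic is wrong for your workload, you can bias it yourself through /proc/<pid>/oom_score_adj, which ranges from -1000 (never kill) to +1000 (kill first). A sketch, noting that raising the value needs no privilege while lowering it requires root/CAP_SYS_RESOURCE:

```shell
#!/bin/sh
# Make an expendable process (here: the current shell, purely as a demo)
# the preferred OOM victim. Raising oom_score_adj is unprivileged;
# lowering it below its current value needs root.
echo 300 > /proc/$$/oom_score_adj
cat /proc/$$/oom_score_adj   # -> 300
```

The same knob is what systemd's OOMScoreAdjust= setting writes, so critical daemons can be marked negative and sacrificial workers positive instead of trusting the heuristic.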
man8alexd | 19 days ago
An unresponsive system is not a slowdown. You keep ignoring that.
>> You assume that leaked memory is inactive and goes to swap. This is not true.
> At best, you can say "it's not always true".
You skipped the sentence of mine that specified the scope in which it's not true, and now you pretend that I was making a categorical, generalized statement. This is a silly attempt at a strawman.
>> It means that the system wasn't out of memory yet.
> Of course it wasn't out of memory. It had lots of swap. That's the whole point of providing that swap - so you can rescue it!
Swap is not RAM. When the free RAM is below the low watermark, the kernel switches to direct reclaim and blocks tasks that require free memory pages. Blocking of tasks happens regardless of swap. If you are able to log in and fork a new process, the system is not below the low watermark.
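Both the watermarks and the direct-reclaim stalls described above are visible from userspace. A sketch, assuming a standard Linux /proc (the exact field layout of /proc/zoneinfo varies a little across kernel versions):

```shell
#!/bin/sh
# Per-zone watermarks: compare pages free against min/low/high.
# Below "low" the kernel wakes kswapd; below "min" allocating tasks
# are forced into direct reclaim and block.
awk '/^Node/      {zone=$0}
     /pages free/ {free=$3}
     /^ +min /    {min=$2}
     /^ +low /    {low=$2}
     /^ +high /   {high=$2
                   printf "%s: free=%s min=%s low=%s high=%s\n",
                          zone, free, min, low, high}' /proc/zoneinfo

# Direct-reclaim stalls accumulate in the allocstall counters:
grep allocstall /proc/vmstat
```

If the allocstall counters are climbing, tasks are blocking in reclaim, regardless of how much swap is still free.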
>> When it is unresponsive, you won't be able to enter commands into an already open shell.
> Again that's just plain wrong.
You are in denial.
> Provided no new work is arriving (time for a cup of coffee?) it will get done, and the thrashing will end.
This is false. A system can stay unresponsive much longer than a cup of coffee. There is no guarantee that the thrashing will end in a reasonable time.
> even Debian will give you twice RAM for small systems.
> The people who decided on that design choice aren't following some folklore they read in some internet echo chamber.
That 2x RAM rule is exactly that - old folklore. You can find it in SunOS/AIX/etc manuals and Usenet FAQs from the 80s and early 90s, before Linux existed.
> They've used real data.
You're hallucinating like an LLM. No one did any research or measurements to justify that 2x rule in Linux.