top | item 46909217

man8alexd | 24 days ago

> There is no penalty for giving a system too much swap (apart from disk space)

There is a huge penalty for having too much swap - swap thrashing. When the active working set exceeds physical memory, performance degrades so much that the system becomes unresponsive instead of triggering OOM.

> Monitor it occasionally, particularly if your system slows down.

Swap doesn't slow down the system. Either it improves performance by freeing unused memory, or it leaves the system completely unresponsive when you run out of memory. Gradual performance degradation never happens.

> give your system so much swap you are sure it exceeds the size of stuff that's running but not used. 4Gb is probably fine for a desktop.

Don't do this. Unless hibernation is used, you only need a few hundred megabytes of free swap space.
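For what it's worth, whether a machine is anywhere near that few-hundred-megabyte figure is easy to check from /proc/meminfo. A minimal sketch (SwapTotal/SwapFree are the real Linux field names; the sample text below is fabricated for illustration):

```python
# Check how much swap is actually in use by parsing /proc/meminfo.
# On a real box: swap_used_mb(open("/proc/meminfo").read())

def swap_used_mb(meminfo_text):
    """Return swap in use, in MiB, from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, value = line.split(":")
        fields[name] = int(value.split()[0])   # values are reported in kB
    return (fields["SwapTotal"] - fields["SwapFree"]) / 1024

# Fabricated sample: ~4 GiB of swap configured, ~200 MiB in use.
sample = "MemTotal: 16303228 kB\nSwapTotal: 4194300 kB\nSwapFree: 3980284 kB"
print(f"{swap_used_mb(sample):.0f} MiB of swap in use")
```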

rstuart4133 | 23 days ago

> There is a huge penalty for having too much swap - swap thrashing.

Thrashing is the penalty for using too much swap. I was saying there is no penalty for having a lot of swap available, but unused.

Although thrashing is not something you want happening, if your system is thrashing with swap, the alternative without having it available is the OOM killer laying waste to the system. Of those two choices I prefer the system running slowly.

> Gradual performance degradation never happens.

Where on earth did you get that from? It's wrong most of the time. The subject was very well researched in the late 1960s and 1970s. If load ramps up gradually you get a gradual slowdown until the working set is badly exceeded, then performance falls off a cliff. This is a modern example, but there are lots of papers from that era showing the same gradual response followed by falling off a cliff: https://yeet.cx/r/ayNHrp5oL0. A seminal paper on the subject: https://dl.acm.org/doi/pdf/10.1145/362342.362356

The underlying driver for that behaviour is the disk system being overwhelmed. Say you have 100 web workers that spend a fair chunk of their time waiting for networked database requests. If they all fit in memory the response is as fast as it can be. Once swapping starts, latency increases gradually as more and more workers are swapped in and out while they wait for clients and the database. Eventually the increasing swapping hits the disk's IOPS limit, active memory is swapped out, and performance crashes.
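That gradual-rise-then-cliff shape can be sketched with a toy queueing model. All the numbers here are illustrative assumptions, not measurements: each worker that no longer fits in RAM is assumed to add a fixed amount of swap I/O demand, and page-in latency is modelled M/M/1-style, blowing up as demand approaches the disk's IOPS capacity:

```python
# Toy model of "gradual slowdown, then cliff" as swap demand approaches
# the disk's IOPS limit. DISK_IOPS and IOPS_PER_SWAPPED_WORKER are
# made-up illustrative figures, not real hardware numbers.
DISK_IOPS = 10_000             # assumed random-IOPS capacity of the disk
IOPS_PER_SWAPPED_WORKER = 400  # assumed swap traffic per excess worker

def page_in_latency_ms(excess_workers):
    """Mean page-in latency (ms) for a given number of swapped workers."""
    demand = excess_workers * IOPS_PER_SWAPPED_WORKER
    if demand >= DISK_IOPS:
        return float("inf")      # disk saturated: thrashing
    # M/M/1 mean response time: 1 / (service_rate - arrival_rate)
    return 1000.0 / (DISK_IOPS - demand)

for n in (0, 5, 15, 20, 24, 25):
    print(f"{n:2d} swapped workers -> {page_in_latency_ms(n):.2f} ms per page-in")
```

Latency roughly doubles between 0 and 20 swapped workers (the gradual part), then shoots up and saturates over the last few (the cliff).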

The only reason I can think of that the gradual slowdown is not obvious to you is that modern SSDs are so fast the initial degradation isn't noticeable to a desktop user.

> Don't do this. Unless hibernation is used, you only need a few hundred megabytes of free swap space.

As you seem to recognise, having lots of swap on hand and unused, even if it's terabytes, does not affect performance. The question then becomes: what would you prefer to happen in those rare times when swap usage exceeds the optimal few hundred megabytes? Your options are to get your desktop app randomly killed by the OOM killer and perhaps lose your work, or to have the system slow to a crawl so you can take corrective action, like closing the offending app. When that happens, it seems popular to blame the swap system for the slowdown, when in fact they temporarily exceeded the capacity of their computer.

man8alexd | 22 days ago

> Thrashing is the penalty for using too much swap. I was saying there is no penalty for having a lot of swap available, but unused.

Unless you overprovision memory on the machine or have carefully set cgroup limits for all workloads, sooner or later something will leak memory, your large unused swap will get used, and you end up in swap thrashing.

> the OOM killer laying waste to the system. Out of those two choices I prefer the system running slowly.

In a swap thrashing event, the system isn't just running slowly, it is totally unresponsive, with an unknown chance of recovery. The majority of people prefer the OOM killer to an unresponsive system. That's why we got the OOM killer in the first place.

> If load ramps up gradually you get a gradual slowdown until the working set is badly exceeded, then it falls off a cliff.

The random-access latency difference between RAM and SSD is about 10^3. When the active working set spills out into swap, even a small excess causes dramatic performance degradation. Assuming uniformly random access, simple math gives that a 0.1% excess causes a 2x degradation, 1% a 10x degradation, and 10% a 100x degradation.
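That arithmetic can be checked with a minimal effective-access-time sketch, assuming uniformly random access and the 10^3 RAM-to-SSD latency ratio above:

```python
# Effective memory access cost when a fraction `p` of the working set
# spills into swap. Assumes uniformly random accesses and an SSD that
# is ~1000x slower than RAM for random access (the ratio cited above).
RAM_LATENCY = 1.0   # normalised cost of a RAM access
SSD_RATIO = 1000    # SSD random access assumed ~10^3 slower than RAM

def slowdown(p):
    """Average access cost relative to all-in-RAM, for swap fraction p."""
    return (1 - p) * RAM_LATENCY + p * RAM_LATENCY * SSD_RATIO

for p in (0.001, 0.01, 0.1):
    print(f"{p:.1%} in swap -> {slowdown(p):.0f}x slower")
```

Note the increase is actually linear in the swapped fraction, but with a slope of ~1000, which is why a tiny excess already hurts (1% in swap comes out to roughly 11x, close to the 10x figure above).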

> A seminal paper on the subject: https://dl.acm.org/doi/pdf/10.1145/362342.362356

This paper discusses measuring stable working sets and says nothing about performance degradation when your working set increases.

> https://yeet.cx/r/ayNHrp5oL0.

WTF is this graph supposed to demonstrate? Some workload went from 0% to 100% of swap utilization in 30 seconds and got OOM-killed. This is not going to happen with a large swap.

> Once swapping starts latency increases gradually as more and more workers are swapped in and out while they wait for clients and the database

In practice, you never see constant or gradually increasing swap I/O in such systems. You either see zero swap I/O with occasional spikes due to incoming traffic or total I/O saturation from swap thrashing.
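One way to tell those two regimes apart is to watch the kernel's cumulative swap counters. A sketch that computes swap I/O rates from two /proc/vmstat snapshots (`pswpin`/`pswpout` are the real counter names; the snapshot strings below are fabricated for illustration):

```python
# Compute swap-in/swap-out rates from two snapshots of /proc/vmstat.
# Steady zeros mean swap is idle; sustained large deltas mean the disk
# is busy servicing swap, i.e. thrashing territory.

def swap_counters(vmstat_text):
    """Extract cumulative (pswpin, pswpout) from /proc/vmstat-style text."""
    counters = dict(line.split() for line in vmstat_text.splitlines() if line)
    return int(counters["pswpin"]), int(counters["pswpout"])

def swap_rate(before, after, interval_s):
    """Pages swapped in/out per second between two snapshots."""
    (in0, out0), (in1, out1) = swap_counters(before), swap_counters(after)
    return (in1 - in0) / interval_s, (out1 - out0) / interval_s

# Fabricated snapshots taken 10 s apart: no page-ins, 20 pages/s out.
t0 = "nr_free_pages 81234\npswpin 1000\npswpout 5000"
t1 = "nr_free_pages 80001\npswpin 1000\npswpout 5200"
print(swap_rate(t0, t1, 10))
```

On a live box you would read `/proc/vmstat` twice with a sleep in between (or just run `vmstat 1` and watch the si/so columns).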

> Your options are get your desktop app randomly killed by the OOM killer and perhaps lose your work, or the system slows to a crawl and you take corrective action like closing the offending app.

You seem to be unaware that swap thrashing events are frequently unrecoverable, especially with a large swap. It is better to have a typical culprit like Chrome OOM-killed than to press the reset button and risk filesystem corruption.