It should also be mentioned that the Linux load average is a complex beast [1]. However, a general rule of thumb that works for most environments is:
You always want the load average to be less than the total number of CPU cores. If higher, you're likely experiencing a lot of waits and context switching.
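For anyone who wants to script that rule of thumb, here is a minimal sketch. The function names are made up for illustration, and it only encodes the simple "load versus core count" comparison from the comment above, not the subtleties discussed in [1].

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return a load average normalised by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return load_avg / cores


def at_or_above_core_count(load_avg: float, cores: int) -> bool:
    """True when the load average is not below the core count."""
    return load_per_core(load_avg, cores) >= 1.0


def check_this_host() -> str:
    """Summarise the 1-minute load of the local machine (Unix-like systems only)."""
    one_min, _five_min, _fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    verdict = "at or above" if at_or_above_core_count(one_min, cores) else "below"
    return f"1-minute load {one_min:.2f} is {verdict} the {cores} available cores"
```

Calling `check_this_host()` on a Linux box gives a one-line verdict; the two helper functions take plain numbers, so they can also be fed values collected from elsewhere.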
If you like this post, I would recommend “BPF Performance Tools” and “Systems Performance: Enterprise and the Cloud” by Brendan Gregg.
I have pulled off a few miracles using these tools (identifying kernel bottlenecks or profiling programs using eBPF), and it has been well worth the investment to read through the books.
I literally worked miracles at my last job with the first book, and that got me my current job, where I used it again to do some impressive work proving which libraries had what performance... Seriously valuable stuff.
It is excellent and contains most things you could need. The downside is that it isn't yet a standard tool, so you need to get it installed across your fleet.
The iostat command has always been important for observing HDD/SSD latency numbers.
SSDs especially are treated like magic storage devices with infinite IOPS at Planck-scale latency.
That is, until you discover that SSDs that can do 10 GB/s don't do nearly so well (not even close) when you access them from a single thread with random I/O at a queue depth of 1.
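A rough back-of-the-envelope sketch of why queue depth 1 hurts so much: with only one request in flight, each I/O has to complete before the next one starts, so IOPS cannot exceed 1 / latency. The 100 µs latency and 4 KiB block size used below are assumed round numbers for illustration, not measurements from any particular drive.

```python
from typing import Tuple


def qd1_throughput(latency_us: float, block_bytes: int) -> Tuple[float, float]:
    """Return (IOPS, MB/s) for strictly serial I/O at queue depth 1.

    One outstanding request at a time means the per-request latency
    sets a hard ceiling on the number of requests per second.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    iops = 1_000_000 / latency_us                # requests per second
    mb_per_s = iops * block_bytes / 1_000_000    # decimal megabytes per second
    return iops, mb_per_s


# Assumed figures: 100 microseconds per random read, 4 KiB per read.
iops, mb_per_s = qd1_throughput(latency_us=100, block_bytes=4096)
print(f"{iops:.0f} IOPS, {mb_per_s:.2f} MB/s")
```

Under those assumptions the ceiling works out to 10,000 IOPS and roughly 41 MB/s, a long way from a 10 GB/s headline number that relies on large sequential transfers or many requests in flight.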
That's where you start down the eBPF rabbit hole with bcc/biolatency and other block device histogram tools. Further, the cache hit rate and block size behavior of the SSD/NVMe drive can really affect things if, say, your autonomous vehicle logging service uses MCAP with a chunk size much smaller than a drive block... Ask me how I know.
After this article was written, `free -m` on many systems started to have an "available" column that shows the sum of reclaimable and free memory. It's nicer than the "-/+" section shown in this old article.
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3915        2116        1288          41         769        1799
Swap:           974           0         974
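If you want that "available" number in a script, here is a small sketch that pulls it out of `free -m` text. The function name is made up for illustration, and it assumes the newer column layout shown above (a header row containing "available" and a "Mem:" row); older `free` versions without that column would need different handling.

```python
def parse_free_available(free_output: str) -> int:
    """Return the 'available' value from the Mem: row of `free` output."""
    rows = [line.split() for line in free_output.splitlines() if line.strip()]
    header = next(row for row in rows if "available" in row)
    mem_row = next(row for row in rows if row[0] == "Mem:")
    # The header has no label above the "Mem:" column, so data is offset by one.
    return int(mem_row[header.index("available") + 1])


SAMPLE = """\
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3915        2116        1288          41         769        1799
Swap:           974           0         974
"""

print(parse_free_available(SAMPLE))
```

Note that in this sample free + buff/cache is 1288 + 769 = 2057 MiB, which is more than the 1799 MiB reported as available, so the column is not a plain sum of the two.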
janvdberg|7 months ago
There is no shorter command to show uptime, load averages (1/5/15 minutes), and logged-in users. Essential for quick system health checks!
mmh0000|7 months ago
[1] https://www.brendangregg.com/blog/2017-08-08/linux-load-aver...
chasil|7 months ago
https://nicolargo.github.io/glances/
I have also hacked basic top to add database login details to server processes.
Propelloni|7 months ago
__turbobrew__|7 months ago
yankcrime|7 months ago
wcunning|7 months ago
sour-taste|7 months ago
benreesman|7 months ago
louwrentius|7 months ago
wcunning|7 months ago
mortar|7 months ago
Previous discussions: https://news.ycombinator.com/item?id=10654681 https://news.ycombinator.com/item?id=10652076
microtonal|7 months ago
tomhow|7 months ago
Linux Performance Analysis in 60,000 Milliseconds - https://news.ycombinator.com/item?id=10652076 - Nov 2015 (11 comments)
Linux Performance Analysis - https://news.ycombinator.com/item?id=10654681 - Dec 2015 (82 comments)
Linux Performance Analysis in 60k Milliseconds (2015) [pdf] - https://news.ycombinator.com/item?id=44070741 - May 2025 (1 comment)
5pl1n73r|7 months ago
whalesalad|7 months ago
fduran|7 months ago
CodeCompost|7 months ago
Wait a minute. I thought Netflix famously ran FreeBSD.
craftkiller|7 months ago
drewg123|7 months ago
ImPostingOnHN|7 months ago
rkachowski|7 months ago
wcunning|7 months ago
emmelaich|7 months ago
mmh0000|7 months ago
Today, you'd want something like:
Prometheus + Node Exporter [1]
[1] https://github.com/prometheus/node_exporter
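As a rough illustration of what that setup exposes: Node Exporter publishes the load averages as the gauges `node_load1`, `node_load5` and `node_load15` in the Prometheus text format. The sketch below parses those three out of a metrics payload. The function name is hypothetical and the sample payload is hand-written with invented values, so treat it as a sketch rather than real scrape output.

```python
from typing import Dict


def parse_load_metrics(exposition_text: str) -> Dict[str, float]:
    """Extract node_load1/5/15 from Prometheus text-format metrics."""
    wanted = {"node_load1", "node_load5", "node_load15"}
    found: Dict[str, float] = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] in wanted:
            found[parts[0]] = float(parts[1])
    return found


# Hand-written sample with invented values, for illustration only.
SAMPLE_METRICS = """\
# TYPE node_load1 gauge
node_load1 0.42
# TYPE node_load15 gauge
node_load15 0.30
# TYPE node_load5 gauge
node_load5 0.35
"""

print(parse_load_metrics(SAMPLE_METRICS))
```

In practice Prometheus does the scraping and storage for you; a parser like this is only handy for a quick ad hoc check of a single host's metrics page.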
appleaday1|7 months ago
AnyTimeTraveler|7 months ago