Linux server monitoring tools

248 points | adionditsak | 12 years ago | aarvik.dk | reply

Four tools to get an overview and interact with what is going on with your Linux machine.

78 comments

[+] profquail|12 years ago|reply
These tools aren't limited to Linux; if you're running FreeBSD:

* http://www.freshports.org/sysutils/htop/

* http://www.freshports.org/sysutils/py-glances/

* http://www.freshports.org/sysutils/apachetop/

Some others mentioned in the comments:

* http://www.freshports.org/net-mgmt/iftop/

* http://www.freshports.org/net-mgmt/bwm-ng/

* http://www.freshports.org/sysutils/xsysstats/

* http://www.freshports.org/sysutils/atop/

If you're running OS X / FreeBSD / Solaris, there are many useful DTrace scripts for system monitoring and profiling:

http://www.brendangregg.com/dtrace.html

[+] tbrock|12 years ago|reply
I recently had to test compilation of a project for work on FreeBSD and feel like my eyes were opened.

I went into the task thinking "the BSDs are OLD" yet there must be something great about this operating system. It's beloved in certain communities and used at popular successful startups such as Netflix. There must be something I've been missing that those who cherish it have come to understand.

What I found was the perfect blend of modern and old school. It's not old feeling at all!

It had all of the newest GNU software (and otherwise) that I've come to know and love on other operating systems but also a sense of stability about the core that you don't get with Linux. There was a sense of underlying structure that had actually been planned out instead of discovered over time and it made the whole process of learning about how it worked a pleasure.

In addition, the FreeBSD manual was actually helpful and gave me a sense of completeness rather than "the text in this wiki is just scratching the surface of a complex wrapper for x that used to be y".

It's simple yet powerful, up to date but solid and I'd highly recommend FreeBSD as a result of the experience.

[+] zorlem|12 years ago|reply
In addition to (or even instead of) htop, I strongly recommend atop [0]. This tool has been an invaluable help to me during a lot of diagnostic sessions.

It can collect a detailed memory usage profile of processes, and when combined with some smart scripting it offers nice leak detection [1]. Very useful when you run out of memory and want to find which daemon has used all of it.

[0] http://www.atoptool.nl/

[1] http://www.atoptool.nl/download/case_leakage.pdf

edit: formatting

[+] cyphax|12 years ago|reply
An ncurses disk usage analyzer: http://dev.yorhel.nl/ncdu I find this an extremely handy alternative to du. It's somewhat similar to TreeView on Windows.
[+] rschmitty|12 years ago|reply
That's great, thanks! I've been doing the du -sh * tree crawl for far too long
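
For comparison, the manual du -sh * crawl can be roughly approximated in a few lines of Python. This is only a sketch: the names dir_size and largest_children are made up here, it sums apparent file sizes rather than allocated disk blocks (so the numbers will differ somewhat from du), and it skips symlinks.

```python
import os

def dir_size(path):
    """Sum apparent sizes of regular files below path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total

def largest_children(path):
    """Return (size_in_bytes, name) for each entry directly under path, largest first."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_symlink():
            continue
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size(entry.path), entry.name))
        elif entry.is_file(follow_symlinks=False):
            sizes.append((entry.stat(follow_symlinks=False).st_size, entry.name))
    return sorted(sizes, reverse=True)

# Example use: for size, name in largest_children("."): print(size, name)
```

Calling largest_children on a directory gives the "where did my space go" view one level at a time, which is the part ncdu makes interactive.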
[+] bearbin|12 years ago|reply
I also find this very useful :) It shows you where your space has got to in a simple and easy to use manner.
[+] purephase|12 years ago|reply
This is awesome. Thanks for sharing it.
[+] Erwin|12 years ago|reply
I'd like to recommend this book to get a current overview of monitoring tools: http://www.amazon.com/Systems-Performance-Enterprise-Brendan...

While most sysadm books are years out of date, this one covers all the hot recent stuff like DTrace and its equivalents on Linux, pidstat etc. Solid coverage and the author (from Joyent) knows his stuff. Available on Safari too.

[+] Cieplak|12 years ago|reply
If you have multiple nodes, I recommend New Relic. It's a bit pricey, but if an issue arises in your stack, New Relic can help you immediately pinpoint where and what the issue is.

P.S. I can view New Relic on my phone, so if I get a PagerDuty alert, I can still see what's up if I'm at the beach.

[+] bowlofpetunias|12 years ago|reply
New Relic is great for application monitoring, but the systems monitoring is kind of meh.

And I've been getting way too many false positives on the systems alerts.

If you only want systems monitoring and not deep application performance insight, New Relic is way too pricey and not really that good.

[+] edwinnathaniel|12 years ago|reply
If you have multiple nodes, including a load balancer and other SOA services, and want to link them all to view a request as one transaction (e.g. a request comes in through a load balancer, gets processed by app-server-1, which in turn calls service-2, which queries DB-1 or memcache), Appneta Traceview can connect them all.

Or you can monitor your web-app using Appview Web too for synthetic monitoring. Plus you can monitor your network as well using Pathview.

[+] dredmorbius|12 years ago|reply
I'll also support New Relic.

If you're just starting out, the free tier is pretty good.

What the monitoring tools lack in specificity (and depending on your stack, they may provide varying levels of awesome -- server monitoring is weak but improving), they make up for with a massive win on zero-configuration installation.

Just sign up, instrument, and start monitoring.

If you find bits lacking, there are almost always local tools you can use to supplement.

[+] adionditsak|12 years ago|reply
Yes, New Relic is pretty awesome. There is a lot of information you can monitor there, with a very easy installation. You can get a free account there as well, just to test it. I just made one, and I really like the simple UI: http://i.imgur.com/oGGfTrp.png
[+] stiff|12 years ago|reply
Is there a tool that would allow collecting historical data on memory and CPU usage patterns of individual processes? In troubleshooting you frequently face a situation where some process is "exploding" in memory and/or CPU usage, and either you are not there at the moment to run htop or you might not even be able to easily log in to the server to do checks.
[+] Erwin|12 years ago|reply
"atop" can do it to some degree. When you run atop interactively, it uses the process accounting facility to find out not only what is running at the moment it takes a snapshot of the system, but also which processes started and exited since the last refresh interval.

Installing atop will also (depending on your distro etc.) set it up to snapshot the system state every 600 seconds. If you run "atop -r" you can review that historical data from today or an older day, and switch between the 10-minute snapshots with t and T.

Personally I like "sar" for a quick text-only overview (sysstat package). Once enabled, you have a snapshot every 10 minutes of a huge number of performance metrics (e.g. sar -r for memory, sar -b for disk). Of course, it's even better if you use something to collect them centrally (I signed up for DataDog, which takes very little effort to integrate compared to rolling your own stuff).

[+] snewman|12 years ago|reply
Scalyr [1] can do this. (Disclaimer: I am the founder of Scalyr, and it's a commercial product.) We aim to be a one-stop-shopping monitoring tool: collect everything you might want to collect, and let you analyze it in any way you want. To your question, we can collect CPU, memory, I/O, and other stats for specified processes [2], and give you graphs, rolled-up dashboards, and alerts on that data.

We're always looking for feedback, and we're happy to give out discounted or free accounts to startups. Drop me a line -- steve@[company domain] -- if you're interested.

[1]: https://www.scalyr.com

[2]: https://www.scalyr.com/appDashboard

[+] kyrra|12 years ago|reply
Real simple way: Cron job to dump "top" to a file. It will tell you all processes and their memory/CPU usage every x minutes. Once you need data on a specific process, you can just grep its pid.
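
A rough sketch of that approach in Python, in case it helps. It assumes the procps version of top (-b for batch mode, -n 1 for a single iteration); the log path, the "###" timestamp marker, and the function names are invented for illustration.

```python
import subprocess
import time

LOG_PATH = "/var/log/top-snapshots.log"  # hypothetical location

def take_snapshot(log_path=LOG_PATH):
    """Append one batch-mode top dump, prefixed with a timestamp line."""
    out = subprocess.run(
        ["top", "-b", "-n", "1"],
        capture_output=True, text=True, check=True,
    ).stdout
    with open(log_path, "a") as log:
        log.write(f"### {time.strftime('%Y-%m-%d %H:%M:%S')}\n{out}\n")

def lines_for_pid(log_text, pid):
    """Return (timestamp, top line) pairs for one PID, like grepping the dump."""
    hits, stamp = [], None
    for line in log_text.splitlines():
        if line.startswith("### "):
            stamp = line[4:]  # remember which snapshot we are in
        elif line.split()[:1] == [str(pid)]:
            hits.append((stamp, line.strip()))  # PID is the first column
    return hits
```

A cron entry that runs a small script calling take_snapshot() every few minutes would build up the history; lines_for_pid then shows how one process's memory and CPU columns changed between snapshots.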
[+] lelandbatey|12 years ago|reply
My personal list of tools that I use all the time:

    iftop
    nload
    htop
    goaccess
    ...I'll add more as I remember them, currently gotta sleep.
[+] talloaktrees|12 years ago|reply
I am a contributor to glances, very pleased to see it mentioned. Glances can also run as a server, which then allows glances clients to connect, or even the Android app Android Glances.
[+] ch215|12 years ago|reply
I don't think PowerTop's been mentioned. Maybe because it's more useful for a laptop than a server. Once calibrated, it can output an HTML report on a machine's consumption, which includes a handy list of tunable power saving options. It was written by Intel so it may, or may not, work that well on other processors. https://01.org/powertop
[+] dredmorbius|12 years ago|reply
Enable sysstat and install one of the sysstat graphing tools (or roll your own with your preferred scripting tool).

Noting when things go titsup.com can be particularly useful.

[+] neals|12 years ago|reply
Anybody here know of an ApacheTop-like tool for nginx?
[+] jlgaddis|12 years ago|reply
The default value for "log_format" is "combined" -- identical to the Apache "combined" log format -- so apachetop can read nginx log files without any needed changes.
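
To illustrate why that works: a line in the combined format can be parsed the same way whether Apache or nginx wrote it. The regex and function names below are just a sketch of the idea (not apachetop's actual parser), and it only counts requests per URL.

```python
import re
from collections import Counter

# Combined format: host ident user [time] "request" status bytes "referer" "agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def top_urls(lines, n=5):
    """Count requests per URL, roughly the kind of summary an apachetop-style view shows."""
    hits = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue  # skip lines in some other format
        parts = fields["request"].split()
        if len(parts) >= 2:
            hits[parts[1]] += 1  # request is "METHOD URL PROTOCOL"
    return hits.most_common(n)
```

Feeding it the lines of an access log (from either server, as long as the format is left at combined) gives the most requested URLs.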
[+] mgz|12 years ago|reply
apachetop works fine with nginx logs too.
[+] laichzeit0|12 years ago|reply
Strange definition of a "monitoring" tool. It essentially requires a human being to run the tool, look at the graph, and make a decision on the data. There isn't even a baseline to compare the data to. This isn't really monitoring; it's equivalent to typing df, looking at how many bytes on disk are being used, and doing this every 5 minutes.
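
For contrast, a minimal sketch of the difference: the same df-style number, but compared against a threshold by a script instead of by a person. The 90% limit and the function names are arbitrary assumptions, and a real setup would send an alert rather than just return a string.

```python
import shutil

def usage_fraction(used, total):
    """Fraction of capacity in use; 0.0 when the total is zero or unknown."""
    return used / total if total else 0.0

def check_disk(path="/", limit=0.90):
    """Return a warning string if path is fuller than limit, else None."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    frac = usage_fraction(usage.used, usage.total)
    if frac > limit:
        return f"{path} is {frac:.0%} full (limit {limit:.0%})"
    return None
```

Run from cron every 5 minutes, check_disk replaces the human in the loop for this one metric; a baseline would mean comparing against recorded history instead of a fixed limit.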
[+] leephillips|12 years ago|reply
I could never get apachetop to work on Debian or Ubuntu. It displays the data all right, but doesn't respond to most of my keypresses, and it segfaults. Maybe related to its not having been updated since 2005 or so? It's too bad, because it promises to do exactly what I want.