Linux server monitoring tools

[+] profquail|12 years ago|reply

These tools aren't limited to Linux; if you're running FreeBSD:

* http://www.freshports.org/sysutils/htop/

* http://www.freshports.org/sysutils/py-glances/

* http://www.freshports.org/sysutils/apachetop/

Some others mentioned in the comments:

* http://www.freshports.org/net-mgmt/iftop/

* http://www.freshports.org/net-mgmt/bwm-ng/

* http://www.freshports.org/sysutils/xsysstats/

* http://www.freshports.org/sysutils/atop/

If you're running OS X / FreeBSD / Solaris, there are many useful DTrace scripts for system monitoring and profiling:

http://www.brendangregg.com/dtrace.html

[+] tbrock|12 years ago|reply

I recently had to test compilation of a project for work on FreeBSD and feel like my eyes were opened.

I went into the task thinking "the BSDs are OLD" yet there must be something great about this operating system. It's beloved in certain communities and used at popular successful startups such as Netflix. There must be something I've been missing that those who cherish it have come to understand.

What I found was the perfect blend of modern and old school. It's not old feeling at all!

It had all of the newest GNU software (and otherwise) that I've come to know and love on other operating systems but also a sense of stability about the core that you don't get with Linux. There was a sense of underlying structure that had actually been planned out instead of discovered over time and it made the whole process of learning about how it worked a pleasure.

In addition, the FreeBSD manual was actually helpful and gave me a sense of completeness rather than "the text in this wiki is just scratching the surface of a complex wrapper for x that used to be y".

It's simple yet powerful, up to date but solid and I'd highly recommend FreeBSD as a result of the experience.

[+] zorlem|12 years ago|reply

In addition to (or even instead of) htop, I strongly recommend atop [0]. This tool has been invaluable to me during a lot of diagnostic sessions.

It can collect a detailed memory usage profile of processes, and when combined with some smart scripting it offers nice leak detection functionality [1]. Very useful when you run out of memory and want to find which daemon has used all of it.

[0] http://www.atoptool.nl/

[1] http://www.atoptool.nl/download/case_leakage.pdf

edit: formatting

[+] cyphax|12 years ago|reply

An ncurses disk usage analyzer: http://dev.yorhel.nl/ncdu I find this an extremely handy alternative to du; it's somewhat similar to TreeView on Windows.

[+] rschmitty|12 years ago|reply

That's great, thanks! I've been doing the du -sh * tree crawl for far too long
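For anyone who wants the same per-entry overview without installing anything, here is a minimal Python sketch of the "du -sh *" idea. The function names are made up for illustration, and it sums apparent file sizes rather than allocated disk blocks, so the numbers can differ from what du or ncdu report.

```python
import os

def dir_size(path):
    """Total apparent size in bytes of files under path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.lstat(full).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total

def largest_entries(path="."):
    """Return (size_in_bytes, name) pairs for each entry in path, largest first."""
    sizes = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            size = dir_size(entry.path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)
```

Calling largest_entries("/var") and printing the first few pairs gives roughly the view you would otherwise get by crawling the tree by hand.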

[+] bearbin|12 years ago|reply

I also find this very useful :) It shows you where your space has got to in a simple and easy to use manner.

[+] purephase|12 years ago|reply

This is awesome. Thanks for sharing it.

[+] louwrentius|12 years ago|reply

Very neat, thanks for the tip.

[+] Erwin|12 years ago|reply

I'd like to recommend this book to get a current overview of monitoring tools: http://www.amazon.com/Systems-Performance-Enterprise-Brendan...

While most sysadmin books are years out of date, this one covers all the hot recent stuff like DTrace and its equivalents on Linux, pidstat etc. Solid coverage and the author (from Joyent) knows his stuff. Available on Safari too.

[+] Cieplak|12 years ago|reply

If you have multiple nodes, I recommend New Relic. It's a bit pricey, but if an issue arises in your stack, New Relic can help you immediately pinpoint where and what the issue is.

P.S. I can view New Relic on my phone, so if I get a PagerDuty alert, I can still see what's up if I'm at the beach.

[+] bowlofpetunias|12 years ago|reply

New Relic is great for application monitoring, but the systems monitoring is kind of meh.

And I've been getting way too many false positives on the systems alerts.

If you only want systems monitoring and not deep application performance insight, New Relic is way too pricey and not really that good.

[+] edwinnathaniel|12 years ago|reply

If you have multiple nodes, including a load balancer and other SOA services, and want to link them all to view a request as one transaction, Appneta Traceview can connect them all (e.g. a request comes in through a load balancer, gets processed by app-server-1, which in turn calls service-2, which queries DB-1 or memcache).

You can also monitor your web app using Appview Web for synthetic monitoring, and your network using Pathview.

[+] dredmorbius|12 years ago|reply

I'll also support New Relic.

If you're just starting out, the free tier is pretty good.

What the monitoring tools lack in specificity (depending on your stack, they may provide varying levels of awesome; server monitoring is weak but improving), they make up for with a massive win on zero-configuration installation.

Just sign up, instrument, and start monitoring.

If you find bits lacking, there are almost always local tools you can use to supplement.

[+] adionditsak|12 years ago|reply

Yes, New Relic is pretty awesome. There is a lot of information you can monitor there, with a very easy installation. You can get a free account there as well, just to test it. I just made one, and I really like the simple UI: http://i.imgur.com/oGGfTrp.png

[+] stiff|12 years ago|reply

Is there a tool that would allow collecting historical data on memory and CPU usage patterns of individual processes? In troubleshooting you are frequently dealing with a situation where some process is "exploding" in memory and/or CPU usage and either you are not there at the moment to run htop or you might not even be able to easily log in to the server to do checks.

[+] Erwin|12 years ago|reply

"atop" can do it to some degree. When you run atop interactively, it uses the process accounting facility to find out not only what is running at the moment it takes a snapshot of the system, but also which processes started and exited since the last refresh interval.

Installing atop will also (depending on your distro etc.) set it up to snapshot the system state every 600 seconds. If you run "atop -r" you can review the recorded data from today or an older day, and switch between the 10-minute snapshots with t and T.

Personally I like "sar" for a quick text-only overview (sysstat package). Once enabled, you have a 10-minute snapshot of a huge amount of performance metrics (e.g. sar -r for memory, sar -b for disk). Of course, it's even better if you use something to collect them centrally (I signed up for DataDog, which takes very little effort to integrate compared to rolling your own stuff).

[+] snewman|12 years ago|reply

Scalyr [1] can do this. (Disclaimer: I am the founder of Scalyr, and it's a commercial product.) We aim to be a one-stop-shopping monitoring tool: collect everything you might want to collect, and let you analyze it in any way you want. To your question, we can collect CPU, memory, I/O, and other stats for specified processes [2], and give you graphs, rolled-up dashboards, and alerts on that data.

We're always looking for feedback, and we're happy to give out discounted or free accounts to startups. Drop me a line -- steve@[company domain] -- if you're interested.

[1]: https://www.scalyr.com

[2]: https://www.scalyr.com/appDashboard

[+] blueblob|12 years ago|reply

An answer somewhat stolen from stackoverflow says "ps -o rss $(pgrep executablename)", but I guess that assumes that you only have one process running. Maybe it would be easier to put it in a script and use "ps -o rss $!"

http://stackoverflow.com/questions/7278326/how-to-monitor-a-...
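To cover the case where pgrep matches more than one process, the one-liner can be wrapped in a small script. This is only a sketch: the function names are hypothetical, and it assumes a ps that accepts "-o pid= -o rss= -p <list>" and a pgrep that supports "-x" for exact name matches (true of procps on Linux, as far as I know). RSS values are in KiB.

```python
import subprocess

def parse_ps_rss(ps_output):
    """Parse lines of 'PID RSS' (RSS in KiB) into a {pid: rss_kib} dict."""
    result = {}
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        pid, rss = fields
        result[int(pid)] = int(rss)
    return result

def rss_by_name(name):
    """Return {pid: rss_kib} for every process whose name is exactly `name`."""
    pids = subprocess.run(
        ["pgrep", "-x", name], capture_output=True, text=True
    ).stdout.split()
    if not pids:
        return {}
    out = subprocess.run(
        ["ps", "-o", "pid=", "-o", "rss=", "-p", ",".join(pids)],
        capture_output=True, text=True,
    ).stdout
    return parse_ps_rss(out)
```

Summing the values of rss_by_name("executablename") gives the total for all matching processes, with the caveat that shared pages are counted once per process.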

[+] jlgaddis|12 years ago|reply

  $ man sar

[+] kyrra|12 years ago|reply

Real simple way: Cron job to dump "top" to a file. It will tell you all processes and their memory/CPU usage every x minutes. Once you need data on a specific process, you can just grep its pid.
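A rough sketch of that approach in Python, assuming the procps version of top (which supports batch mode via "-b -n 1"). The log path, the marker format, and the function names are all made up for illustration.

```python
import subprocess
from datetime import datetime

LOG_PATH = "/var/tmp/top-history.log"  # hypothetical location, pick your own

def make_record(top_output, now):
    """Prefix one top snapshot with a timestamp marker line."""
    return f"=== {now.isoformat(timespec='seconds')} ===\n{top_output.rstrip()}\n"

def snapshot(log_path=LOG_PATH):
    """Run top once in batch mode and append its output to the log."""
    out = subprocess.run(
        ["top", "-b", "-n", "1"], capture_output=True, text=True
    ).stdout
    with open(log_path, "a") as log:
        log.write(make_record(out, datetime.now()))

def history_for_pid(log_text, pid):
    """Return (timestamp, line) pairs for snapshot lines whose first field is pid."""
    rows, stamp = [], None
    for line in log_text.splitlines():
        if line.startswith("=== ") and line.endswith(" ==="):
            stamp = line[4:-4]  # strip the marker decoration
        elif line.split()[:1] == [str(pid)]:
            rows.append((stamp, line.strip()))
    return rows
```

A cron entry would call snapshot() every x minutes, and history_for_pid() plays the role of the grep, with the added benefit of attaching a timestamp to each match.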

[+] shocks|12 years ago|reply

Another good one is dstat[1]

1: http://dag.wiee.rs/home-made/dstat/

[+] ithinkso|12 years ago|reply

This makes me curiouser and curiouser every day - are you guys typing these citation brackets by hand or is there something I'm missing?

[+] afaqurk|12 years ago|reply

This is a bit off topic but I created a tool a little while back for my specific (very basic) needs: https://github.com/afaqurk/linux-dash

Demo here: http://afaq.dreamhosters.com/linux-dash/

Easily extensible if anyone wants to use it.

[+] Ecio78|12 years ago|reply

Really nice! P.S. you have a typo in the title: dashboad instead of dashboard :)

[+] anton000|12 years ago|reply

looks awesome... will try this out

[+] lelandbatey|12 years ago|reply

My personal list of tools that I use all the time:

    iftop
    nload
    htop
    goaccess
    ...I'll add more as I remember them, currently gotta sleep.

[+] SunboX|12 years ago|reply

If you want to monitor all log files for security purposes, I can recommend OSSEC: http://www.ossec.net/

And OSSEC Web User Interface (ossec wui): https://scottlinux.com/wp-content/gallery/site/ossec_web.png

[+] kbeaty|12 years ago|reply

The sysstat suite [1] is quite handy for single nodes. Includes utilities to monitor system performance and activity over time.

[1]: http://sebastien.godard.pagesperso-orange.fr/documentation.h...

[+] nwh|12 years ago|reply

iftop is also a very good one; it gives nice graphs of network utilisation sorted by host or port: http://www.ex-parrot.com/pdw/iftop/

[+] midas007|12 years ago|reply

Check out bwm-ng

[+] talloaktrees|12 years ago|reply

I am a contributor to glances, very pleased to see it mentioned. Glances can also run as a server, which can then allow glances clients to connect, or even the android app Android Glances.

[+] ch215|12 years ago|reply

I don't think PowerTop's been mentioned. Maybe because it's more useful for a laptop than a server. Once calibrated, it can output an HTML report on a machine's consumption, which includes a handy list of tunable power saving options. It was written by Intel so it may, or may not, work that well on other processors. https://01.org/powertop

[+] dredmorbius|12 years ago|reply

Enable sysstat and install one of the sysstat graphing tools (or roll your own with your preferred scripting tool).

Noting when things go titsup.com can be particularly useful.

[+] neals|12 years ago|reply

Anybody here know of an ApacheTop-like tool for nginx?

[+] adionditsak|12 years ago|reply

https://github.com/ClockworkNet/wtop might be an option

[+] jlgaddis|12 years ago|reply

The default value for "log_format" is "combined" -- identical to the Apache "combined" log format -- so apachetop can read nginx log files without any needed changes.
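Because the layout is the same, anything that parses the combined format works on both servers' logs. As an illustration, here is a small Python sketch that counts the most requested paths; the regex and function names are my own, and it only handles well-formed lines.

```python
import re
from collections import Counter

# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

def top_paths(lines, n=5):
    """Return the n most requested paths as (path, count) pairs."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # not a combined-format line
        parts = record["request"].split()
        if len(parts) >= 2:
            counts[parts[1]] += 1  # request is "METHOD PATH PROTOCOL"
    return counts.most_common(n)
```

Feeding it the lines of an access log (the path depends on your setup) gives a crude, non-interactive version of what apachetop shows.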

[+] inurl|12 years ago|reply

GoAccess is awesome: free, open source, and console-based. It can also output HTML, JSON, or CSV reports.

http://goaccess.prosoftcorp.com/

[+] mgz|12 years ago|reply

apachetop works fine with nginx logs too.

[+] laichzeit0|12 years ago|reply

Strange definition of a "monitoring" tool. It essentially requires a human being to run the tool, look at the graph, and make a decision on the data. There isn't even a baseline to compare the data to. This isn't really monitoring, it's equivalent to typing df and looking at how many bytes on disk are being used, and doing this every 5 minutes.
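To make the distinction concrete, the smallest step from "looking at df" towards monitoring is to let a script compare the number against a threshold and only speak up when it is crossed. A minimal sketch, with a made-up 80% threshold and hypothetical function names:

```python
import shutil

def evaluate(path, used, total, warn_at):
    """Return a warning string if usage is at or above warn_at percent, else None."""
    if total == 0:
        return None  # nothing sensible to report
    pct = 100.0 * used / total
    if pct >= warn_at:
        return f"WARNING: {path} is {pct:.1f}% full (threshold {warn_at:.0f}%)"
    return None

def check_disk(path="/", warn_at=80.0):
    """Check the filesystem holding path against the threshold."""
    usage = shutil.disk_usage(path)
    return evaluate(path, usage.used, usage.total, warn_at)
```

Run from cron and wired to mail or a pager, check_disk() only produces output when something needs attention. A real baseline would compare against historical values rather than a fixed number.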

[+] leephillips|12 years ago|reply

I could never get apachetop to work on Debian or Ubuntu. It displays the data all right, but doesn't respond to most of my keypresses, and it segfaults. Maybe related to its not having been updated since 2005 or so? It's too bad, because it promises to do exactly what I want.

[+] adionditsak|12 years ago|reply

I think all the great suggestions everyone here has contributed to this post are amazing. Much appreciated, thank you :-) I will add them to the post later, as a list with a URL to each website.

[+] adionditsak|12 years ago|reply

https://news.ycombinator.com/item?id=7180300 - The follow up post is made :-)

78 comments