Network Monitoring with Nagios


Network Monitoring

There are a lot of commercial products that compete in the same arena as Nagios. In my experience as an enterprise software developer, I have seen few instances of Nagios deployed, but my experience also tells me that most enterprises are either unaware of open source solutions or afraid of implementing them.

From my limited perspective, the Nagios feature set contains just about everything you could want. Configuration is a little bit cumbersome, especially for an initial setup, but that goes for just about every commercial tool out there as well.

How Nagios Works

I am by no means a Nagios expert, but going by the documentation and my own recent experience with getting a server set up, a few basic things are involved in monitoring a network resource.

First off, you define basic information about what you are monitoring. You separately configure the host you are monitoring (be it a router, a server, or some other network device), the services you want to monitor on it (http, ldap, etc.), and potentially the chain of resources that must be working before you can conclude that a particular service itself is unavailable. For instance, in order to check a series of websites for uptime, the router that leads to them must also be up, and the firewall that they all share must also be working. When configured correctly, Nagios is intelligent enough to narrow down the source of the problem to a degree. You can also define automated actions to be taken and notifications to be sent when a given network resource reaches either a warning or a critical state.
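To make the dependency idea concrete, here is a minimal Python sketch of how a monitor might tell a host that is really down from one that is merely unreachable because something in front of it failed. This is my own simplified illustration, not Nagios code: the `Host` class, the `host_state` function, and the host names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host record; not a Nagios data structure."""
    name: str
    up: bool  # result of this host's own check
    parents: list = field(default_factory=list)  # devices between the monitor and this host


def host_state(host):
    """Return 'UP', 'DOWN', or 'UNREACHABLE' for a host."""
    if host.up:
        return "UP"
    # No parents means nothing sits in front of the host, so the failure is its own.
    if not host.parents:
        return "DOWN"
    # If at least one parent is up, the path works and the host itself is at fault.
    if any(host_state(parent) == "UP" for parent in host.parents):
        return "DOWN"
    # Every parent failed, so we cannot tell whether the host itself is healthy.
    return "UNREACHABLE"


# The router has failed, so everything behind it also fails its check.
router = Host("router", up=False)
firewall = Host("firewall", up=False, parents=[router])
web1 = Host("web1", up=False, parents=[firewall])
web2 = Host("web2", up=False, parents=[firewall])

for h in (router, firewall, web1, web2):
    print(h.name, host_state(h))
```

Only the router is reported as down here. The firewall and both web servers come out as unreachable, which points you at the one device that actually needs attention.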

You also define timing intervals for monitoring. There is a balancing act here - you don't want to flood your network resources with so much traffic that the monitoring tool itself is causing problems, but you also don't want to go for extended periods of time between health checks. Most of the out-of-the-box monitoring plugins exhibit very low resource utilization, but it is worth keeping an eye on how much load your checks add up to.
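A bit of back-of-the-envelope arithmetic helps with that trade-off. The Python sketch below estimates how many checks per minute a set of intervals produces. The service groups, counts, and intervals are made-up numbers for illustration, not recommendations.

```python
def checks_per_minute(num_services, interval_minutes):
    """Average number of checks fired per minute for one group of services."""
    return num_services / interval_minutes


# Hypothetical groups: (label, number of services, check interval in minutes)
groups = [
    ("web", 40, 5),
    ("ldap", 4, 1),
    ("disk", 100, 15),
]

total = 0.0
for label, count, interval in groups:
    rate = checks_per_minute(count, interval)
    total += rate
    # A failure can go unnoticed for up to one full interval.
    print(f"{label}: {rate:.2f} checks/min, worst-case detection delay {interval} min")

print(f"total: {total:.2f} checks/min")
```

Shortening an interval lowers the worst-case detection delay for that group but raises the total check rate, so it pays to tighten only the checks that really need it.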

Nagios then takes your configuration files and determines a schedule for when to monitor the resources. In the web interface, you not only have access to health information, but you can also see when the last health check ran and when the next one is scheduled. Depending on configuration and permissions, you can also re-schedule the health check of a resource that you are interested in.
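I have not looked at how Nagios builds its schedule internally, but the general idea of turning intervals into a timeline of checks can be sketched with a priority queue. Everything in this Python example, including the `build_schedule` function and the resource names, is hypothetical.

```python
import heapq


def build_schedule(intervals, horizon):
    """Return a list of (time, name) for every check due before `horizon` seconds.

    `intervals` maps a resource name to its check interval in seconds.
    """
    queue = []
    # Stagger the first checks by one second each so they do not all fire at once.
    for offset, name in enumerate(sorted(intervals)):
        heapq.heappush(queue, (offset, name))

    events = []
    while queue and queue[0][0] < horizon:
        when, name = heapq.heappop(queue)
        events.append((when, name))
        # Re-queue the resource for its next check.
        heapq.heappush(queue, (when + intervals[name], name))
    return events


schedule = build_schedule({"web": 60, "ldap": 30}, horizon=100)
for when, name in schedule:
    print(f"t={when:>3}s  check {name}")
```

The entry at the front of the queue is always the next scheduled check, which is the kind of information the web interface shows for each resource.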

The monitoring plugins generally report back a result as a number, and you generally define ranges of that number as healthy, warning, or critical.
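By convention, a Nagios plugin signals its verdict through its exit status: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The Python sketch below shows how a measured value might be mapped onto those states. The `status_for` and `report` functions, the thresholds, and the "load" label are my own hypothetical choices, and the sketch assumes that higher values are worse.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def status_for(value, warn, crit):
    """Map a measurement to a plugin status, treating higher values as worse."""
    if value is None:
        return UNKNOWN  # the measurement could not be taken
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def report(label, value, warn, crit):
    """Return (exit_status, one-line message) in the style of a plugin."""
    status = status_for(value, warn, crit)
    return status, f"{LABELS[status]} - {label} is {value}"


# Hypothetical thresholds: warn at 4.0, critical at 8.0
for sample in (1.5, 4.2, 9.0, None):
    status, message = report("load", sample, warn=4.0, crit=8.0)
    print(status, message)
```

A real plugin would print the message and then exit with the status, for example via `sys.exit(status)`, so that Nagios can pick it up.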

Once configured, you simply go into the Nagios web interface to see how things are going. The tactical overview page gives you a simple indication of whether or not any problems exist in your monitored resource pool. Additionally, there is a series of other pages that give you information about your configuration and the details of a particular host, host group, service, or service group. There are also two status maps (one 3D and one 2D) that give you a visual overview of your network resource status along the defined network path.

More information about Nagios

There are a number of how-tos out there for getting started with Nagios. I'm going to refer you to those resources rather than repeat them and potentially give out bad information.

Once you are up and running, it pays to read through the documentation and to page through the plugins available at http://nagiosplugins.org/. There is also a community site which contains a blog and a wiki that are worth perusing.


