Aug 2010

Replacing Ping with Nmap for Nagios

Sometimes a system administrator needs to get around a few rules that are in place for good (or not so good) reasons. One example is a network where ICMP is turned off, entirely or in part. With ICMP off, it can be difficult to configure tools like Nagios for simple up/down checks. This text looks at getting around the no-ICMP problem and builds a script to handle it for Nagios.

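First, the base syntax. Nmap is the tool of choice for a quick ping-style check. Its -sP ping scan does not depend on ICMP echo alone; it also tries TCP probes, which is why it can succeed where a plain ping cannot: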

	nmap -sP 192.168.1.6

Next the output to parse:

	Starting Nmap 5.30BETA1 ( http://nmap.org ) at 2010-06-28 14:13 EDT
	Nmap scan report for argos (192.168.1.6)
	Host is up (0.000073s latency).
	Nmap done: 1 IP address (1 host up) scanned in 0.00 seconds

Or the reverse:

	nmap -sP 192.168.1.200

	Starting Nmap 5.30BETA1 ( http://nmap.org ) at 2010-06-28 14:15 EDT
	Note: Host seems down. If it is really up, but blocking our ping probes, try -Pn
	Nmap done: 1 IP address (0 hosts up) scanned in 3.01 seconds

Either output could be parsed. For the purposes of this text, a scan that comes back as down triggers an alert; otherwise the script returns 0 with an up message.
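
As a rough sketch of that decision, the two outcomes can be tried against saved scan output before any live scan is involved. The function name scan_status is invented for this example; 2 is the exit code Nagios plugins use for CRITICAL.

```shell
#!/bin/sh
# scan_status is a hypothetical helper name, not part of nmap or Nagios.
# It reads "nmap -sP" output on stdin and prints the exit code to use.
scan_status()
{
	case "$(cat)" in
		*"Host seems down"*) echo 2 ;;	# down: alert (CRITICAL)
		*)                   echo 0 ;;	# anything else: treated as up
	esac
}
```

Feeding it the sample output shown earlier, for example with printf piped into scan_status, is enough to confirm the matching phrase before wiring anything into Nagios.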

Now it is time for the script. First, in the simplest form possible:

	#!/bin/sh
	/usr/local/bin/nmap -sP $1 | grep "Host seems down" 

	if [ "$?" -eq 0 ]; then
		echo "NMAP PING: CRITICAL"
		exit 2
	fi

	echo "NMAP PING: OK"
	exit 0

That script is a little boring, and it does not account for unknown conditions. First, a quick rewrite to deal with unknowns. This requires storing the scan results somewhere and then examining them; a temporary file will be used:

	#!/bin/sh
	NMAP="/usr/local/bin/nmap -sP"
	TMP=/var/tmp/nmap_ping.$$
	CHECK="Nmap Ping"

	$NMAP $1 > $TMP 

	grep "Host seems down" $TMP
	if [ "$?" -eq 0 ]; then
		rm -f $TMP
		echo "$CHECK: CRITICAL"
		exit 2
	fi

	grep "Host is up" $TMP
	if [ "$?" -eq 0 ]; then
		rm -f $TMP
		echo "$CHECK: Ok"
		exit 0
	fi

	rm -f $TMP
	echo "$CHECK: UNKNOWN"
	exit 3

Now the script appears to take everything into account, but two problems remain. First, the command that removes the temporary file is repeated. Second, what if nmap never runs? In theory this should produce an unknown, but it is better to handle it up front than to wait until the end of the script. It does not make sense to create a routine just to do a remove, so a full results routine that takes the exit code and the message as arguments is better:

	results_exit()
	{
		retval=$1
		msg=$2

		rm -f $TMP

		echo "$CHECK ${msg}"
		exit $retval
	}

Note that the above routine can deal with nmap not executing properly as well:

    #!/bin/sh
    NMAP="/usr/local/bin/nmap -sP"
    TMP=/var/tmp/nmap_ping.$$
    CHECK="Nmap Ping"

    results_exit()
    {
        retval=$1
        msg=$2

        rm -f $TMP

        echo "$CHECK ${msg}"
        exit $retval
    }


    $NMAP $1 > $TMP ||
        results_exit 255 "Could not execute $NMAP"

    grep "Host seems down" $TMP
    if [ "$?" -eq 0 ]; then
        results_exit 2 "CRITICAL"
    fi

    grep "Host is up" $TMP
    if [ "$?" -eq 0 ]; then
        results_exit 0 "Ok"
    fi

    results_exit 3 "Unknown"

Now, hopefully, the Nagios ping replacement is almost ready. For clarity, the if blocks are replaced with one-line tests to clean up the look and feel of the script:

    #!/bin/sh
    NMAP="/usr/local/bin/nmap -sP"
    TMP=/var/tmp/nmap_ping.$$
    CHECK="Nmap Ping"

    results_exit()
    {
        rm -f $TMP
        echo "$CHECK: ${2}"
        exit $1
    }

    $NMAP $1 > $TMP || results_exit 255 "Could not execute $NMAP"

    grep "Host seems down" $TMP
    [ $? -eq 0 ] && results_exit 2 "CRITICAL"

    grep "Host is up" $TMP
    [ $? -eq 0 ] && results_exit 0 "Ok"

    results_exit 3 "Unknown"
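
One possible refinement, offered as a sketch rather than a replacement: quote the host argument, use grep -q so that only the status line reaches Nagios, and report a missing argument as unknown. The names NMAP_BIN and check_host are made up for this illustration, and the check is wrapped in a function that returns its status so it can be tried without exiting the shell. It also returns 3 instead of 255 when nmap fails, on the assumption that staying within the plugin exit codes 0 through 3 is preferable.

```shell
#!/bin/sh
# Sketch only: NMAP_BIN and check_host are illustrative names.
NMAP_BIN=${NMAP_BIN:-/usr/local/bin/nmap}
CHECK="Nmap Ping"

check_host()
{
	# No host argument: report UNKNOWN instead of scanning nothing.
	[ -n "${1:-}" ] || { echo "$CHECK: Unknown (no host given)"; return 3; }

	tmp="${TMPDIR:-/var/tmp}/nmap_ping.$$"
	if ! "$NMAP_BIN" -sP "$1" > "$tmp" 2>/dev/null; then
		rm -f "$tmp"
		echo "$CHECK: Could not execute $NMAP_BIN"
		return 3
	fi

	# grep -q prints nothing, so the scan text never reaches Nagios.
	if grep -q "Host seems down" "$tmp"; then
		msg="CRITICAL"; status=2
	elif grep -q "Host is up" "$tmp"; then
		msg="Ok"; status=0
	else
		msg="Unknown"; status=3
	fi

	rm -f "$tmp"
	echo "$CHECK: $msg"
	return $status
}

# A plugin file built on this would finish with:
#   check_host "$1"
#   exit $?
```

Keeping the logic in a function is purely a convenience for testing; the flat layout of the script above works just as well once the quoting and grep -q changes are applied.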

With the script done, it is time to replace the Nagios ping command definition with the new one. For this example the script (saved as nmap_ping) is in /usr/local/nagios/local/ and the definition file is /usr/local/nagios/etc/objects/commands.cfg. The default command definition should look something like:

	# 'check-host-alive' command definition
	define command{
	        command_name    check-host-alive
	        command_line    $USER1$/check_ping -H $HOSTADDRESS$ \
	                             -w 3000.0,80% -c 5000.0,100% -p 5
	        }

Now make a copy, comment out the old one, and replace it with the nmap_ping script:

	# 'check-host-alive' command definition
	define command{
	        command_name    check-host-alive
	        command_line    /usr/local/nagios/local/nmap_ping $HOSTADDRESS$
	        }

	# 'check-host-alive' command definition
	#define command{
	#        command_name    check-host-alive
	#        command_line    $USER1$/check_ping -H \
	#                          $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
	#        }

Of course, there is still another improvement to be made. Assuming that at this particular site /usr/local/nagios/local will hold all of the local scripts, a $USERxx$ macro can stand in for the path instead of typing it in full each time. The $USERxx$ macros are defined in the resource.cfg file (usually /usr/local/nagios/etc/resource.cfg). In the version of Nagios used for this text, up to 32 numbered $USERxx$ macros can be defined. For this example number 6 will be used; it just needs to be appended to the resource.cfg file:

	# Set $USER6$ to our local script directory
	$USER6$=/usr/local/nagios/local

Next modify the commands.cfg file to reflect the easier notation:

	# 'check-host-alive' command definition
	define command{
	        command_name    check-host-alive
	        command_line    $USER6$/nmap_ping $HOSTADDRESS$
	        }
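
Before relying on the new definition, it can help to run the script by hand and look at the exit code it hands back. The helper below is a small sketch for that purpose; plugin_state is an invented name, not a Nagios tool, and it simply maps the exit code of whatever command it is given to the matching plugin state.

```shell
#!/bin/sh
# plugin_state is a hypothetical helper, not something shipped with Nagios.
# It runs the given command line and names the state its exit code stands for.
plugin_state()
{
	"$@"
	case $? in
		0) echo "state: OK" ;;
		1) echo "state: WARNING" ;;
		2) echo "state: CRITICAL" ;;
		3) echo "state: UNKNOWN" ;;
		*) echo "state: out of range" ;;
	esac
}

# Example, using the script location from this text:
#   plugin_state /usr/local/nagios/local/nmap_ping 192.168.1.6
```

Running it as the account that Nagios itself runs under gives the most realistic result, since file permissions and scan privileges can differ from those of an interactive login.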

The Catch

Under some circumstances the nmap call within the script has to run as root through sudo. This is due to the network configuration. The trick to getting it to work is either to grant NOPASSWD sudo access to the user that executes the script or to script the password entry using Expect or a similar tool. Please be careful if you plan on doing so!
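
A minimal sketch of the NOPASSWD route follows. The account name nagios and the exact sudoers rule are assumptions for illustration; the rule belongs in sudoers (edited with visudo), and only the NMAP variable in the script changes.

```shell
#!/bin/sh
# Assumed sudoers entry for the account that runs the check (the user
# name "nagios" is a guess; adjust to the local setup):
#
#   nagios ALL=(root) NOPASSWD: /usr/local/bin/nmap -sP *
#
# "sudo -n" fails at once instead of prompting when a password would be
# required, so a broken rule shows up as a failed check, not a hang.
if [ "$(id -u)" -eq 0 ]; then
	NMAP="/usr/local/bin/nmap -sP"
else
	NMAP="sudo -n /usr/local/bin/nmap -sP"
fi
```

Note that the trailing wildcard in the assumed rule also admits extra arguments after the address, so a tighter rule may be worth the effort on a sensitive host.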

Nagios in and of itself is an outstanding monitoring solution. Sometimes, due to circumstances beyond one's control, even parts of Nagios have to be circumvented to reach a goal. Additionally, Nmap's excellent capabilities make it not only a great network investigation tool but also an outstanding monitoring aid.