Nagios Thermometer

Nagios is an open source IT infrastructure monitoring system. It is mature, widely used, well supported, flexible, and extensible, and is perhaps the industry standard.

Nagios at Home

I use Nagios on my LAN at home to monitor:

  • Remote virtual machines that host my
    • personal blogs
    • business sites
  • Status of RAID arrays
    • Linux
    • Microsoft Windows Server 2008 R2
  • Disk space availability on my multimedia server
  • Asterisk PBX status for phone(s)
  • Temperature at home
    • indoors
    • outdoors

Here is my main ‘rack’ at home.

[Image: home server rack, cropped, with components numbered]

  1. NerdDoro
  2. ePhidgety – Phidgets, MySQL C++
  3. Server 2008R2
  4. Fedora fc.16 – LAMP stack
  5. CentOS Asterisk (retiring PBX)
  6. Wireless phone base station
  7. Grandstream HT502 ATA
  8. Monitor/Keyboard 4 way switch
  9. Raspberry Pi (new Asterisk PBX)
  10. UPS
  11. Future CNC project, which will likely take the extra slot in the monitor switch. Note the NEMA stepper motors.

This status screen shows the services Nagios monitors, including temperature readings from two separate sensors. The line highlighted in yellow, which indicates a warning, and the line above it are the temperature readings. The box with the diagonal red line to the left of each one indicates that alerts are not sent for that service.

[Image: home Nagios service status screen]

Nagios sends me email notifications like this one, from when my Asterisk PBX, pisterisk, went down:

Subject: ** PROBLEM Host Alert: pisterisk is DOWN **


***** Nagios *****

Notification Type: PROBLEM
Host: pisterisk
State: DOWN
Address: 192.168.1.220
Info: CRITICAL - Host Unreachable (192.168.1.220)

Date/Time: Mon Aug 19 11:01:12 MST 2013

And this email informed me that the problem was resolved and pisterisk was back up:

Subject: ** RECOVERY Host Alert: pisterisk is UP **


***** Nagios *****

Notification Type: RECOVERY
Host: pisterisk
State: UP
Address: 192.168.1.220
Info: PING OK - Packet loss = 0%, RTA = 0.68 ms

Date/Time: Mon Aug 19 11:03:02 MST 2013

Nagios Thermometer

This article assumes you have a working Nagios system and some familiarity with Nagios configuration. It explains how I monitor temperature with Nagios using a hardware sensor compatible with my HomeAmation Windows 8 and Windows Phone 8 projects.

Previously I’ve published a couple of projects that can deliver XML suitable for this Nagios plugin. A Netduino temperature sensor can be found on GitHub as HomeAmationNetDuino. The source for my Parallax Propeller project, NerdDoro, is also available on GitHub as nerdDoro; it produces similar XML in Propeller .spin.

Given an XML file in this format, my Python Nagios plugin can be configured with thresholds for the normal, warning, and critical (alert) states.
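
The plugin only depends on one detail of that XML: an element named CurrentTemperature0 that holds the reading. As a rough sketch of the extraction step, assuming a hypothetical root element called Temperatures and a made-up value (neither comes from my sensor projects), something like this should run under Python 2.7 or 3:

```python
from xml.dom.minidom import parseString

# Hypothetical sample document: only the CurrentTemperature0 element name
# comes from the plugin; the root element and the value are made up.
SAMPLE_XML = """<Temperatures>
  <CurrentTemperature0>72.5</CurrentTemperature0>
</Temperatures>"""


def read_temperature(xml_text, tag_name="CurrentTemperature0"):
    """Return the text of the first tag_name element as a float."""
    dom = parseString(xml_text)
    elements = dom.getElementsByTagName(tag_name)
    if not elements:
        raise ValueError("no <%s> element found" % tag_name)
    # firstChild is the text node that holds the reading
    return float(elements[0].firstChild.data.strip())
```

Calling read_temperature(SAMPLE_XML) gives the reading as a float, and a document without the element raises ValueError, which a plugin could report as UNKNOWN.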

This Python script produces the temperature status that Nagios takes as input. It is also available as a gist on GitHub: check_temperature_wo.

#!/usr/bin/env python
'''
Created on Jan 21, 2012

@author: jeffa aka @jhalbrecht
'''

# jha 8/17/2013
# Prepare code for publishing.

# jha 1/21/2012
# http://www.ibm.com/developerworks/aix/library/au-nagios/#iratings
# http://www.gefoo.org/generalfoo/?p=201

import sys
import getopt
import urllib2
from xml.dom.minidom import parseString

nagios_codes = {'OK': 0, 'WARNING': 1, 'CRITICAL': 2, 'UNKNOWN': 3, 'DEPENDENT': 4}

def nagios_return(code, response):
    # Prints the response message and exits the script with
    # one of the defined exit codes.
    print code + ': ' + response
    sys.exit(nagios_codes[code])

def usage():
    print """Usage: check_temperature [-h|--help] [-w|--warning level] [-c|--critical level]
    Warning level defaults to 85.0
    Critical level defaults to 95.0"""
    sys.exit(3)

def main():

    try:
        # -h takes no argument; long options are passed as a list of names
        options, args = getopt.getopt(sys.argv[1:],
            "hw:c:",
            ["help", "warning=", "critical="])
    except getopt.GetoptError:
        usage()
        sys.exit(3)

    Warning = 85.0 ; Critical = 95.0

    for name, value in options:

        if name in ("-h", "--help"):
            usage()

        if name in ("-w", "--warning"):
            try:
                Warning = 0.0 + float(value)
            except Exception:
                nagios_return('UNKNOWN','Unable to convert to floating point value')

        if name in ("-c", "--critical"):
            try:
                Critical = 0.0 + float(value)
            except Exception:
                nagios_return('UNKNOWN','Unable to convert to floating point value')

    # download the file
    try:
        # jha 8/17/2013 Align with HomeAmation expectations
        # MODIFY HERE
        # with url to your temperature XML FQDN or ip address
        file = urllib2.urlopen("http://x.y.z.h/PutYourUrlHere")
    except Exception:
        nagios_return('UNKNOWN','urlopen failed')

    # read the response body into a string
    data = file.read()
    # close the connection because we don't need it anymore
    file.close()
    # parse the XML you downloaded
    dom = parseString(data)
    # retrieve the first XML element (<tag>data</tag>) named CurrentTemperature0
    xmlTag = dom.getElementsByTagName('CurrentTemperature0')[0].toxml()
    # strip off the tags (<tag>data</tag>  --->  data)
    xmlData = xmlTag.replace('<CurrentTemperature0>','').replace('</CurrentTemperature0>','')

    # Checking for excessive heat. Could enhance to check for excessive cold
    if float(xmlData) >= Critical:
        nagios_return('CRITICAL','Temperature: %.2f  fahrenheit' % float(xmlData))

    elif float(xmlData) >= Warning:
#        nagios_return('WARNING','Temperature: ' + xmlData + ' fahrenheit')
# bingled a nice format string
        nagios_return('WARNING','Temperature: %.2f  fahrenheit' % float(xmlData))

    else:
        nagios_return('OK','Temperature: %.2f  fahrenheit' % float(xmlData))

if __name__ == '__main__':
    main()
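
The threshold logic in the script can also be pulled out into a small function that is easy to test without a sensor. This is only a sketch, not part of the published gist: the names classify and format_status, and the optional cold_warning and cold_critical lower bounds (the "excessive cold" enhancement mentioned in the script's comment), are my own hypothetical additions. It should run under Python 2.7 or 3.

```python
# Standard Nagios plugin exit codes
NAGIOS_CODES = {'OK': 0, 'WARNING': 1, 'CRITICAL': 2, 'UNKNOWN': 3}


def classify(temperature, warning=85.0, critical=95.0,
             cold_warning=None, cold_critical=None):
    """Return the Nagios state name for a temperature reading.

    cold_warning and cold_critical are hypothetical optional lower
    bounds; leave them as None to check for heat only.
    """
    if temperature >= critical:
        return 'CRITICAL'
    if cold_critical is not None and temperature <= cold_critical:
        return 'CRITICAL'
    if temperature >= warning:
        return 'WARNING'
    if cold_warning is not None and temperature <= cold_warning:
        return 'WARNING'
    return 'OK'


def format_status(temperature, **thresholds):
    """Build the (exit code, status line) pair a plugin would report."""
    state = classify(temperature, **thresholds)
    message = '%s: Temperature: %.2f fahrenheit' % (state, temperature)
    return NAGIOS_CODES[state], message
```

Critical conditions are checked before warning ones, so a reading that crosses both thresholds is always reported at the more severe level.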

Define the Nagios command

define command{
        command_name    check_temperature_wo        
        command_line    $USER1$/check_temperature_wo -w $ARG1$ -c $ARG2$        
}

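And the Nagios service. Note the warning and critical thresholds of 79 and 85, which are passed to the command as $ARG1$ and $ARG2$. The service is defined on my ‘localhost’ host because that is where my Nagios plugins are located.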

define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Temperature_wo
        check_command                   check_temperature_wo!79!85
        notifications_enabled           0
        }
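
To see how the two definitions fit together, it can help to trace how check_temperature_wo!79!85 turns into a command line. The sketch below is a simplified illustration, not how Nagios itself is implemented: the function name expand_check_command is hypothetical, it handles only the $USER1$ and $ARGn$ macros, and the /usr/local/nagios/libexec path in the usage note is an assumed value for $USER1$ that may differ on your system.

```python
def expand_check_command(check_command, command_line, user1):
    """Expand $USER1$ and $ARGn$ macros for a service's check_command.

    A simplified illustration only; Nagios supports many more macros.
    """
    # 'name!arg1!arg2' -> command name followed by its arguments
    parts = check_command.split('!')
    command_name, args = parts[0], parts[1:]
    expanded = command_line.replace('$USER1$', user1)
    # $ARG1$ is the first value after the command name, and so on
    for index, value in enumerate(args, 1):
        expanded = expanded.replace('$ARG%d$' % index, value)
    return command_name, expanded
```

With the definitions above and the assumed plugin directory, expand_check_command('check_temperature_wo!79!85', '$USER1$/check_temperature_wo -w $ARG1$ -c $ARG2$', '/usr/local/nagios/libexec') yields the command name check_temperature_wo and a command line ending in "-w 79 -c 85".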


That’s it. Plug them into your Nagios system, and remember to change the URL in the script to point at your own XML temperature source.

Enjoy!
