tech.gate.io blog

Nagios on FreeBSD

Problem:

My company needs to monitor servers, services, switches and UPS units. The goal of this task is to set up a monitoring system that is able to check the devices and services and send alarms to several people. It should distinguish between critical and non-critical services and devices.

Preface:

Everything you do here happens at your own risk!
I'm using FreeBSD 7.2 for this task, to be more precise a jailed instance of it. So you should be able to install FreeBSD and update it. Please update your ports tree before starting, to be sure that you get the newest version of Nagios. I'll describe how to update your system in another post.

Tip: Back up, because you will break something!

 

Solution:

As you can see in the title, I decided to use Nagios. You can find a lot of resources at:
http://www.nagios.org/
http://www.monitoringexchange.org/
http://nagios.manubulon.com/

Installing

Okay, let's start the actual work! We need something to display the output of our work, and the standard Apache will do that for us.

>cd /usr/ports/www/apache22 && make install clean

Just use the default settings for the Apache server; you don't need to change the port options for this task.

Now we need Nagios:

>cd /usr/ports/net-mgmt/nagios && make install clean

Enable the embedded Perl option and hit OK. None of the X11 options are needed, as long as you don't use X11. When the installer asks you which options should be compiled for PHP, you have to check Apache, or the mod_php module won't be compiled.

For the Nagios plugins, enable everything; only a few megabytes of disk space are needed. FreeBSD will fetch them from SourceForge.

Okay, now wait a little while (how long depends on the power of your CPU cores). The installer will then ask whether you want to create a group "nagios". Answer Yes. After that you'll be asked to create a user called "nagios". Answer Yes. A few moments later the Nagios installer is finished and gives you some advice, which we'll follow now.

Configuration

First of all, we'll edit Apache's httpd.conf. This is needed so that the Nagios GUI can be displayed (properly).

>vi /usr/local/etc/apache22/httpd.conf

Check that the PHP module is loaded:

LoadModule php5_module        libexec/apache22/libphp5.so

 
To enable CGI, delete the # in front of the following line. You can also add the .pl extension if you want to run Perl scripts:

AddHandler cgi-script .cgi .pl

Now search for the section and add:

ScriptAlias /nagios/cgi-bin/ /usr/local/www/nagios/cgi-bin/
Alias /nagios/ /usr/local/www/nagios/

As I don't cover security here, we'll make Nagios visible to everyone. If you want to restrict access, please read the Apache manual: http://httpd.apache.org/docs/2.0/howto/auth.html
With that in mind, we add the lines for the static Nagios page:

<Directory /usr/local/www/nagios>
     Order deny,allow
     Allow from all
     php_flag engine on
     php_admin_value open_basedir /usr/local/www/nagios/:/var/spool/nagios/
</Directory>

and for the CGI-Application:

<Directory /usr/local/www/nagios/cgi-bin>
     Options +ExecCGI
</Directory>

Okay now it should be possible to start the web server:

>apachectl start

If you are working in a jail, ps -ax doesn't work properly, so just type http://[IP of your server]/nagios/ into the address bar of your browser. The rest should be up to Apache, and you should see something like this:

Image

 

If the web server doesn't start in the jail, you may have forgotten to load the kernel module "accf_http". You can check whether it is loaded using

>kldstat | grep accf

You should see something like:
5 1 0xc6c22000 2000 accf_http.ko

Kernel modules can't be loaded inside a jail; you have to do that on the jail host:

>kldload accf_http
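
The check and the load can be combined so that kldload only runs when the module is really missing. This is a minimal sketch for the jail host; the function names module_listed and ensure_accf_http are made up for this example.

```shell
# Succeeds if the given module name appears in kldstat-style output on stdin.
module_listed() {
    grep -q "$1\.ko"
}

# Hypothetical helper for the jail host: load accf_http only if it is missing.
ensure_accf_http() {
    if kldstat | module_listed accf_http; then
        echo "accf_http already loaded"
    else
        kldload accf_http
    fi
}
```

Call ensure_accf_http as root on the jail host. To have the module loaded at every boot, the usual FreeBSD way is a line accf_http_load="YES" in /boot/loader.conf on the host.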

Congratulations, you have installed Nagios, but you can't monitor anything yet. So far only the static website is installed; now we have to get the Nagios service up.

 

Nagios service

Preparation:

So I'll try to give you a crash course in using Nagios. But before starting, please make sure you know what SNMP is, how to use it, and how to run snmpwalk.

You should find your config files under /usr/local/etc/nagios
Here are nagios.cfg-sample, cgi.cfg-sample and resource.cfg-sample. Copy these files to another location as a backup, in my case the sample folder, and then rename the originals to nagios.cfg, cgi.cfg and so on.

>mkdir sample
>cp *.cfg-sample sample/
>mv nagios.cfg-sample nagios.cfg

or use my rename script. Now you should have 3 files and 2 folders: sample and objects.

Let's go to the objects folder and do exactly the same:

>cd objects
>mkdir sample
>cp *.cfg-sample sample/
>mv commands.cfg-sample commands.cfg

and so on...
In the end you should have a fileset like this:
commands.cfg
contacts.cfg
localhost.cfg
printer.cfg
sample
switch.cfg
templates.cfg
timeperiods.cfg
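
The copy-and-rename steps above can also be done in a small loop. This is only a sketch of the idea, not the rename script mentioned earlier; the function name activate_samples is made up for this example.

```shell
# Back up every *.cfg-sample file in a directory into its "sample" folder,
# then rename the original to *.cfg.
activate_samples() {
    dir=$1
    mkdir -p "$dir/sample" || return 1
    for f in "$dir"/*.cfg-sample; do
        # Skip the unexpanded pattern when no sample files exist.
        [ -e "$f" ] || continue
        cp "$f" "$dir/sample/" || return 1
        # ${f%-sample} strips the trailing "-sample" from the file name.
        mv "$f" "${f%-sample}" || return 1
    done
}

# Usage (paths from this post):
#   activate_samples /usr/local/etc/nagios
#   activate_samples /usr/local/etc/nagios/objects
```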

Actual work

The time of mindless copy and paste is now over; you have to start thinking.
The whole thing with Nagios is understanding inheritance. The Nagios team did a lot for you, so let's have a look. To be able to view all hosts and services, edit cgi.cfg and set the parameter

use_authentication=0

from 1 to 0, or you'll get an error message. (But as the comments in this file say, this is a bad idea for a production system.)
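
If you prefer not to edit the file by hand, the switch can be scripted. This is a minimal sketch, assuming cgi.cfg contains a line that starts with use_authentication=; the function name set_use_authentication is made up for this example.

```shell
# Set use_authentication in a cgi.cfg-style file to the given value (0 or 1).
set_use_authentication() {
    cfg=$1
    val=$2
    tmp="$cfg.tmp.$$"
    # Write the edited copy first, then overwrite the original in place
    # so that its owner and permissions stay untouched.
    sed "s/^use_authentication=.*/use_authentication=$val/" "$cfg" > "$tmp" &&
        cat "$tmp" > "$cfg" &&
        rm -f "$tmp"
}

# Usage (path from this post):
#   set_use_authentication /usr/local/etc/nagios/cgi.cfg 0
```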

 
You can find the Nagios configuration under /usr/local/etc/nagios/nagios.cfg
Here you just need to define which object types should be used. For example:

# You can specify individual object config files as shown below:
cfg_file=/usr/local/etc/nagios/objects/commands.cfg
cfg_file=/usr/local/etc/nagios/objects/contacts.cfg
cfg_file=/usr/local/etc/nagios/objects/timeperiods.cfg
cfg_file=/usr/local/etc/nagios/objects/templates.cfg

# Definitions for monitoring the local (FreeBSD) host
cfg_file=/usr/local/etc/nagios/objects/localhost.cfg

The behavior of Nagios is defined in these files. We'll add some object files of our own a little later to monitor a Windows server. But for now, use this file set. Let's see what these files are for.

commands.cfg
The commands used by Nagios are defined here, for example how to check hosts and how to send mails.

templates.cfg
Inheritance starts here. This file is very important, because the "skeletons" of the things to monitor are defined here.

contacts.cfg
If a host switches to a warning or critical state, somebody has to be contacted. These contacts are defined here.

printer.cfg
Ready-to-use definitions to monitor printers.

switch.cfg
Ready-to-use definitions to monitor switches.

timeperiods.cfg
Defines when a staff member should be alerted.

Let's monitor the localhost!

If you've done everything as I told you,

>/usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg

will show something like:

Nagios Core 3.2.0
Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2009
License: GPL

Website: http://www.nagios.org
Nagios 3.2.0 starting... (PID=66618)
Local time is Wed Jan 27 12:05:58 UTC 2010

and you should be glad!
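
Before starting Nagios it is worth letting it check the configuration. Nagios has a -v option that verifies the config files and a -d option that starts it as a daemon. The wrapper below is a minimal sketch; the function name start_nagios_checked and the NAGIOS_BIN variable are made up for this example.

```shell
# Verify the configuration first and start the daemon only if the check passes.
# NAGIOS_BIN can be overridden; the default is the path used in this post.
start_nagios_checked() {
    bin=${NAGIOS_BIN:-/usr/local/bin/nagios}
    cfg=$1
    if "$bin" -v "$cfg"; then
        "$bin" -d "$cfg"
    else
        echo "configuration check failed, not starting Nagios" >&2
        return 1
    fi
}

# Usage:
#   start_nagios_checked /usr/local/etc/nagios/nagios.cfg
```

If the check fails, Nagios prints the offending file and line, which is much easier to read than a daemon that silently refuses to start.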

Have a look at your web server and click on Host Groups:
Image

Let's monitor a printer!

Start with the easy stuff.

>vi /usr/local/etc/nagios/nagios.cfg

and delete the hash mark in front of

cfg_file=/usr/local/etc/nagios/objects/printer.cfg

Now you have told Nagios to read printer.cfg on startup.

>vi /usr/local/etc/nagios/objects/templates.cfg

Here you find the section about a generic host:

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

and some lines below, the printer definition:

define host{
        name                    generic-printer ; The name of this host template
        use                     generic-host    ; Inherit default values from the generic-host template
        check_period            24x7            ; By default, printers are monitored round the clock
        check_interval          5               ; Actively check the printer every 5 minutes
        retry_interval          1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts      10              ; Check each printer 10 times (max)
        check_command           check-host-alive        ; Default command to check if printers are "alive"
        notification_period     workhours               ; Printers are only used during the workday
        notification_interval   30              ; Resend notifications every 30 minutes
        notification_options    d,r             ; Only send notifications for specific host states
        contact_groups          admins          ; Notifications get sent to the admins by default
        register                0               ; DONT REGISTER THIS - ITS JUST A TEMPLATE
        }

As you can see, generic-printer inherits from generic-host. So if you make changes in generic-host, the generic-printer skeleton will change too! You can override an attribute just by setting it one level deeper. So if you add
retain_status_information 0
to generic-printer, it will override the 1 inherited from generic-host, and so on.

I want to monitor an HP 3600n LaserJet, so I'll:

>vi /usr/local/etc/nagios/objects/printer.cfg

and add a new host:

define host{
        use             	generic-printer  
        host_name       	NikosPrinter
        alias           	HP3600n @ ITroom
        address         	100.100.100.101
        hostgroups      	network-printers
        notes_url       	http://100.100.100.66/wiki/index.php/Drucker
        action_url      	http://100.100.100.185
        }

The host gets its default settings from generic-printer, which gets its default settings from generic-host.

use: the template the settings are inherited from
host_name: how the host is named in Nagios
alias: more information shown in the web interface
address: IP or FQDN (but prefer the IP)
hostgroups: used to group hosts, if you have more printers
notes_url: a link to our internal wiki, where you can find more information
action_url: a link to the web interface of the printer

 
Okay, the host is defined, now the services:

define service{
        use                     generic-service
        host_name               NikosPrinter
        service_description     Printer Status
        check_command           check_hpjd!-C public
        normal_check_interval   10      ; Check the service every 10 minutes under normal conditions
        retry_check_interval    1       ; Re-check the service every minute until its final/hard state is determined
        notification_interval   0
        }

service_description: name of the service in the web interface
normal_check_interval: check the service every 10 minutes under normal conditions
retry_check_interval: re-check the service every minute until its final/hard state is determined
notification_interval: how often you receive a mail. I don't want to get spammed by printers, so I think one mail is enough (0 means the notification is sent only once)

define service{
        use                     generic-service
        hostgroup_name          network-printers
        service_description     PING
        check_command           check_ping!3000.0,80%!5000.0,100%
        normal_check_interval   10
        retry_check_interval    1
        notification_interval   0
        }

Okay, the same as above, but here I use a host group to ping instead of a host name.

Let's have a look at the commands:
check_command check_hpjd!-C public

Arguments are separated by “!”
How to use the standard macros can be found in the Nagios documentation, so I won't go any further into this.

and from commands.cfg:

define command{
        command_name    check_hpjd
        command_line    $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
        }
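
To see how the two pieces fit together, it helps to expand the macros by hand. The sketch below only imitates what Nagios does with $USER1$, $HOSTADDRESS$, $ARG1$ and $ARG2$; it is not Nagios code. The function name expand_check is made up, and the plugin directory /usr/local/libexec/nagios is an assumption (the real value of $USER1$ is set in resource.cfg).

```shell
# Expand the macros of a Nagios command_line (illustration only).
# usage: expand_check COMMAND_LINE CHECK_COMMAND HOST_ADDRESS USER1_DIR
expand_check() {
    line=$1
    check=$2
    addr=$3
    user1=$4
    # The text before the first "!" is the command name, the rest are arguments.
    case $check in
        *'!'*) args=${check#*'!'} ;;
        *)     args= ;;
    esac
    arg1=${args%%'!'*}
    case $args in
        *'!'*) rest=${args#*'!'}; arg2=${rest%%'!'*} ;;
        *)     arg2= ;;
    esac
    # Note: values containing "|", "&" or "\" would need escaping for sed.
    printf '%s\n' "$line" | sed \
        -e 's|\$USER1\$|'"$user1"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$addr"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g' \
        -e 's|\$ARG2\$|'"$arg2"'|g'
}

# The printer check from this post, with the assumed plugin directory:
expand_check '$USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$' \
    'check_hpjd!-C public' 100.100.100.101 /usr/local/libexec/nagios
```

With these values the last call prints /usr/local/libexec/nagios/check_hpjd -H 100.100.100.101 -C public, which is the command you can also run by hand to test the plugin.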

 
Restart Nagios:

>ps -ax | grep nagios
>kill [pid]
>/usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg

This should result in:

Image
