- Nagios to RRD Installation Guide
-
1) n2rrd installation/configuration
- 1.1) Extracting the source
- 1.2) Edit dist-n2rrd.conf file
- 1.3) Edit /etc/n2rrd/templates/maps/dist-rra_plugin_maps file
- 1.4) Edit /etc/n2rrd/templates/maps/dist-rgb.txt
- 1.5) Edit install.sh
- 1.6) move distribution example files under /etc/n2rrd/templates
- 1.7) Edit /etc/nagios/checkcommands.cfg
- 1.8) Edit /etc/nagios/nagios.cfg to reflect the following variables
- 1.9) Example 1, configure service "check_icmp"
- 1.10) Example 2, map service check
- 1.11) Modify non standard performance data
- 1.12) check nagios configuration
- 1.13) reload nagios
- 1.14) check logfile for progress
- 1.15) Example log file lines with debug mode enabled
- 1.16) check if RRAs are generated in the right place
- 1.17) template search order
-
2) rrd2graph installation/configuration
- 2.1) edit rrd2graph.cgi and change following variables
- 2.2) cp rrd2graph.cgi to your cgi-bin directory
- 2.3) Edit n2rrd.conf and change the following values appropriately
- 2.4) Create graph templates
- 2.5) Edit nagios configuration file serviceextinfo.cfg
- 2.6) Reload Nagios
- 2.7) if DYN_RRA_CREATE is enabled
- 3) Other hints
- 4) comments to
Nagios to RRD Installation Guide
1) n2rrd installation/configuration
1.1) Extracting the source
VER = n2rrd version
- cd /tmp
- tar zxvf n2rrd-VER.tar.gz
- cd n2rrd-VER
1.2) Edit dist-n2rrd.conf file
- and move dist-n2rrd.conf to n2rrd.conf (normally /etc/n2rrd/n2rrd.conf)
- check demo-server n2rrd.conf n2rrd140
1.3) Edit /etc/n2rrd/templates/maps/dist-rra_plugin_maps file
- and move dist-rra_plugin_maps to rra_plugin_maps
- define your DST (Data Source Type)
#File format
# plugin_name=variable_name1:DST,variable_name2:DST
check_ping=rta:GAUGE,pl:GAUGE
check_netstat=active:DERIVE,passive:DERIVE,failed:DERIVE,resets:DERIVE,established:GAUGE
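To illustrate the map format above, here is a small shell sketch that looks up the DST of one plugin variable. The function name lookup_dst is hypothetical and not part of n2rrd; n2rrd parses this file itself, and its exact parsing rules (whitespace, comments) are assumed here.

```sh
#!/bin/sh
# lookup_dst MAP_FILE PLUGIN VARIABLE
# Prints the DST recorded for VARIABLE of PLUGIN; returns non-zero if none.
# Hypothetical helper for illustration only.
lookup_dst() {
    awk -v plugin="$2" -v var="$3" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0 || substr($0, 1, eq - 1) != plugin) next
            n = split(substr($0, eq + 1), pairs, ",")
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, ":")         # kv[1]=variable, kv[2]=DST
                if (kv[1] == var) { print kv[2]; found = 1; exit }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Usage, assuming the example lines above are in the file: lookup_dst /etc/n2rrd/templates/maps/rra_plugin_maps check_ping rta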
1.4) Edit /etc/n2rrd/templates/maps/dist-rgb.txt
- and move dist-rgb.txt to rgb.txt
- format of lines in this file is
#COLOR_NAME=HEX_VALUE
#e.g. DarkBlue=#00008B
- define rgb.txt file location in /etc/n2rrd/n2rrd.conf
DYN_RGB_COLORS_MAPS = "templates/maps/rgb.txt"
1.5) Edit install.sh
- change the variables to suit your environment
- run ./install.sh
1.6) move distribution example files under /etc/n2rrd/templates
- the following command renames the files to their original names (without the dist- prefix).
for fdist in dist-*; do fnew=$(echo "$fdist" | sed 's/^dist-//'); mv "$fdist" "$fnew"; done
- modify it if necessary
NOTE: Do this only if it is your first installation
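Since the rename is meant for a first installation only, a more cautious variant can skip any file whose target name already exists. This is only a sketch; the function name rename_dist_files is hypothetical.

```sh
#!/bin/sh
# rename_dist_files DIR
# Renames every dist-* file in DIR to the same name without the prefix,
# but leaves a file alone when the target name already exists.
# Runs in a subshell so the caller's working directory is unchanged.
rename_dist_files() (
    cd "$1" || exit 1
    for fdist in dist-*; do
        [ -e "$fdist" ] || continue          # glob matched nothing
        fnew=${fdist#dist-}
        if [ -e "$fnew" ]; then
            echo "skipping $fdist: $fnew already exists" >&2
        else
            mv "$fdist" "$fnew"
        fi
    done
)
```

Usage: rename_dist_files /etc/n2rrd/templates/maps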
1.7) Edit /etc/nagios/checkcommands.cfg
NOTE: location of status.* file is defined in nagios.cfg as variable status_file=/var/log/nagios/status.dat
- add/update following, depending on the changes you made to variables in install.sh
Host Performance processing command
define command{
    command_name    process-host-perfdata
    command_line    /usr/local/bin/n2rrd.pl -d -D "HOST" -N "/var/log/nagios/status.dat" -C '$HOSTCHECKCOMMAND$' -c /etc/n2rrd/n2rrd.conf -T $LASTHOSTCHECK$ -H $HOSTNAME$ -s "check_icmp" -o "$HOSTPERFDATA$"
    }
Service Performance processing command
define command{
    command_name    process-service-perfdata
    command_line    /usr/local/bin/n2rrd.pl -d -N "/var/log/nagios/status.dat" -C '$SERVICECHECKCOMMAND$' -c /etc/n2rrd/n2rrd.conf -T $LASTSERVICECHECK$ -H $HOSTNAME$ -s "$SERVICEDESC$" -o "$SERVICEPERFDATA$"
    }
Once everything works, you can disable debug mode (option -d) to keep the log file from growing very large.
- If you also want to collect plugin execution time and latency, enable options -e and -l
define command{
    command_name    process-service-perfdata
    command_line    /usr/local/bin/n2rrd.pl -d -N "/var/log/nagios/status.dat" -C '$SERVICECHECKCOMMAND$' -c /etc/n2rrd/n2rrd.conf -e $SERVICEEXECUTIONTIME$ -l $SERVICELATENCY$ -T $LASTSERVICECHECK$ -H $HOSTNAME$ -s "$SERVICEDESC$" -o "$SERVICEPERFDATA$"
    }
1.8) Edit /etc/nagios/nagios.cfg to reflect the following variables
- process_performance_data=1
- host_perfdata_command=process-host-perfdata
- service_perfdata_command=process-service-perfdata
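A quick way to confirm these three settings are active is to grep for them. The sketch below assumes each setting sits on its own line exactly as written above (no surrounding spaces); the function name check_perfdata_settings is hypothetical.

```sh
#!/bin/sh
# check_perfdata_settings NAGIOS_CFG
# Reports whether the three settings from step 1.8 are present, uncommented,
# in the given file. Returns non-zero if any of them is missing.
check_perfdata_settings() {
    cfg=$1
    rc=0
    for setting in \
        'process_performance_data=1' \
        'host_perfdata_command=process-host-perfdata' \
        'service_perfdata_command=process-service-perfdata'
    do
        # -x: the whole line must match, -F: fixed string, -q: quiet
        if grep -qxF "$setting" "$cfg"; then
            echo "ok:      $setting"
        else
            echo "missing: $setting"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage: check_perfdata_settings /etc/nagios/nagios.cfg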
1.9) Example 1, configure service "check_icmp"
- create a template /etc/n2rrd/templates/rra/icmp.t
-s 300 # 5minutes
DS:rta:GAUGE:600:0:U
DS:pl:GAUGE:600:0:U
RRA:AVERAGE:0.5:1:1440 #day
RRA:AVERAGE:0.5:30:336 #week
RRA:AVERAGE:0.5:120:360 #month
RRA:AVERAGE:0.5:1440:365 #year
RRA:MAX:0.5:1:1440 #day
RRA:MAX:0.5:30:336 #week
RRA:MAX:0.5:120:360 #month
RRA:MAX:0.5:1440:365 #year
RRA:MIN:0.5:1:1440 #day
RRA:MIN:0.5:30:336 #week
RRA:MIN:0.5:120:360 #month
RRA:MIN:0.5:1440:365 #year
- now define service check_icmp
define service{
    use                     generic-service     ; Name of service template to use
    host_name               www.example.com     ; change this to appropriate server name
    service_description     check_icmp          ; STRING can be anything, here important is '_icmp'
    check_command           check_icmp
    }
NOTE: I assume that generic-service template is defined or you are using the default one
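When sizing the RRA lines in a template such as icmp.t, it helps to work out how much history each archive keeps. In rrdtool, an RRA:CF:xff:steps:rows archive stores rows entries, each consolidating steps primary data points of step seconds. A minimal sketch of that arithmetic (the function names are hypothetical):

```sh
#!/bin/sh
# rra_span_seconds STEP STEPS_PER_ROW ROWS
# Each row covers STEP * STEPS_PER_ROW seconds and the archive keeps ROWS rows.
rra_span_seconds() {
    echo $(( $1 * $2 * $3 ))
}

# rra_span_days STEP STEPS_PER_ROW ROWS
# The same span in whole days (86400 seconds per day).
rra_span_days() {
    echo $(( $1 * $2 * $3 / 86400 ))
}
```

With the 300-second step from icmp.t, rra_span_days 300 1 1440 prints 5 and rra_span_days 300 30 336 prints 35, so each archive keeps more history than the period named in its comment.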
1.10) Example 2, map service check
The Nagios service description "Physical memory" is mapped to the name "mem" for n2rrd
- edit service maps file "/etc/n2rrd/templates/maps/service_name_maps" and add following line:
Physical memory: mem
- create a template "/etc/n2rrd/templates/rra/mem.t"
-s 300 # 5minutes
DS:used:GAUGE:600:0:U
DS:free:GAUGE:600:0:U
RRA:AVERAGE:0.5:1:1440 #day
RRA:AVERAGE:0.5:30:336 #week
RRA:AVERAGE:0.5:120:360 #month
RRA:AVERAGE:0.5:1440:365 #year
RRA:MAX:0.5:1:1440 #day
RRA:MAX:0.5:30:336 #week
RRA:MAX:0.5:120:360 #month
RRA:MAX:0.5:1440:365 #year
RRA:MIN:0.5:1:1440 #day
RRA:MIN:0.5:30:336 #week
RRA:MIN:0.5:120:360 #month
RRA:MIN:0.5:1440:365 #year
- now define service for "Physical memory"
define service{
    use                     generic-service     ; Name of service template to use
    host_name               localhost
    service_description     Physical memory     ; maps to template '/etc/n2rrd/templates/rra/mem.t'
    check_command           check_mem!3000!1000 ; with warning and critical limits
    }
NOTE: if you need one template to serve several names,
e.g. eth for eth0, eth1, eth2, hme0 etc.,
then just symlink each of those names to eth
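The symlinks from the NOTE can be created in one go. This sketch assumes the links live next to the shared template in the rra directory and carry the same .t suffix; the function name link_template is hypothetical.

```sh
#!/bin/sh
# link_template RRA_DIR TEMPLATE NAME...
# Creates NAME.t as a symlink to TEMPLATE.t inside RRA_DIR for every NAME,
# leaving alone any NAME.t that already exists.
# Runs in a subshell so the caller's working directory is unchanged.
link_template() (
    cd "$1" || exit 1
    template=$2
    shift 2
    [ -f "$template.t" ] || { echo "no such template: $template.t" >&2; exit 1; }
    for name in "$@"; do
        [ -e "$name.t" ] || ln -s "$template.t" "$name.t"
    done
)
```

Usage: link_template /etc/n2rrd/templates/rra eth eth0 eth1 eth2 hme0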
1.11) Modify non standard performance data
- if a plugin returns non-standard performance data, you can process it yourself and return the values in the following string format
ds_name=ds_value [ds_name=ds_value] ..
- example Perl code for service "Physical memory"
my $tmp_pdata = "";
#
# the following Environment variable is passed by Nagios, see nagios Doc. for more info
if ( $ENV{NAGIOS_SERVICEPERFDATA} ) {
    $tmp_pdata = $ENV{NAGIOS_SERVICEPERFDATA};
}
...
# process $tmp_pdata, to create string
# used=4096 free=1024
#
...
return $tmp_pdata;
- n2rrd first looks for external code in "/etc/n2rrd/templates/code",
e.g. /etc/n2rrd/templates/code/mem.pl
If this Perl file exists, n2rrd does not parse the performance data string itself; instead it expects the external Perl code to return a string in the format described above.
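To show what the target string looks like, the sketch below reduces standard Nagios performance data to the "ds_name=ds_value" form. It is written in shell only to illustrate the transformation; the external code n2rrd loads is Perl, as described above. The function name perfdata_to_pairs is hypothetical, and quoted labels containing spaces are not handled.

```sh
#!/bin/sh
# perfdata_to_pairs "PERFDATA"
# Prints "ds_name=ds_value ds_name=ds_value ..." by dropping the
# ;warn;crit;min;max fields and any unit suffix (ms, %, KB, ...).
# Runs in a subshell so "set -f" does not leak to the caller.
perfdata_to_pairs() (
    set -f                                   # no filename expansion while splitting
    out=""
    for item in $1; do                       # unquoted on purpose: split on whitespace
        case $item in
            *=*) ;;
            *) continue ;;                   # ignore tokens without name=value
        esac
        name=${item%%=*}
        value=${item#*=}
        value=${value%%;*}                   # keep only the value field
        value=$(printf '%s' "$value" | sed 's/[^0-9.+-]*$//')   # strip the unit
        out="$out${out:+ }$name=$value"
    done
    printf '%s\n' "$out"
)
```

Usage: perfdata_to_pairs 'rta=0.5ms;100;200 pl=0%;20;60'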
1.12) check nagios configuration
- nagios -v /etc/nagios/nagios.cfg
1.13) reload nagios
/etc/init.d/nagios reload
OR
kill -HUP `cat /var/run/nagios.pid`
1.14) check logfile for progress
- if necessary fix errors.
1.15) Example log file lines with debug mode enabled
- system load and DS name rewrite
Host = localhost, Service name = Current Load_load, Check result = load1=0.000;5.000;10.000;0; load5=0.000;4.000;6.000;0; load15=0.000;3.000;4.000;0;
Filtered ds_names: load_1min:load_5min:load_15min, ds_values: 0.000:0.000:0.000
- Physical memory check, with service name mapping
Host = localhost, Service name = Physical memory, Check result = used= free=51780
Searching map in file "/etc/nagios/templates/service_name_maps" for service "Physical memory"
Filtered ds_names: used:free, ds_values: :51780
1.16) check if RRAs are generated in the right place
- ls -l /var/log/nagios/rra (you may have chosen another location)
1.17) template search order
if exists file
TEMPLATES_DIR/rra/HOSTNAME_SERVICE_NAME.t
# use it
else if exists
TEMPLATES_DIR/rra/SERVICE_NAME.t
# use it
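The search order above can be sketched as a shell function. This only mirrors the two steps listed; what n2rrd does when neither file exists is not covered here, so the sketch simply returns non-zero. The function name find_rra_template is hypothetical.

```sh
#!/bin/sh
# find_rra_template TEMPLATES_DIR HOSTNAME SERVICE_NAME
# Prints the first template that exists, trying the host-specific file
# before the generic one. Returns non-zero if neither exists.
find_rra_template() {
    for candidate in "$1/rra/${2}_${3}.t" "$1/rra/${3}.t"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Usage: find_rra_template /etc/n2rrd/templates localhost mem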
2) rrd2graph installation/configuration
2.1) edit rrd2graph.cgi and change following variables
my $conf_file = "/etc/n2rrd/n2rrd.conf";
my $debug = 0;
2.2) cp rrd2graph.cgi to your cgi-bin directory
- cp rrd2graph.cgi /srv/www/vhosts/www.example.com/cgi-bin
- chmod 755 /srv/www/vhosts/www.example.com/cgi-bin/rrd2graph.cgi
2.3) Edit n2rrd.conf and change the following values appropriately
DOCUMENT_ROOT = /srv/www/vhosts/www.example.com/html
CACHE_DIR = rrd_images_cache
# Thus generated graphs will be stored in directory DOCUMENT_ROOT/CACHE_DIR
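Because the graphs end up in DOCUMENT_ROOT/CACHE_DIR, that directory has to exist and be writable by the user the CGI runs as. A small sketch for creating and checking it (the function name ensure_cache_dir is hypothetical; run it as the web server user to get a meaningful result):

```sh
#!/bin/sh
# ensure_cache_dir DOCUMENT_ROOT CACHE_DIR
# Creates DOCUMENT_ROOT/CACHE_DIR if needed, checks that the current user
# can write to it, and prints the resulting path.
ensure_cache_dir() {
    cache_path="$1/$2"
    mkdir -p "$cache_path" || return 1
    if [ ! -w "$cache_path" ]; then
        echo "not writable: $cache_path" >&2
        return 1
    fi
    printf '%s\n' "$cache_path"
}
```

Usage: ensure_cache_dir /srv/www/vhosts/www.example.com/html rrd_images_cache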
2.4) Create graph templates
- create graph template "/etc/n2rrd/templates/graph/mem.t" for "Physical memory"
--imgformat=PNG
--lazy
--title="$HOSTNAME$ - Memory Usage"
--base=1024
--height=200
--width=500
--alt-autoscale-max
--lower-limit=0
--vertical-label=GBytes
--slope-mode
DEF:a="$RRD_FILENAME$":used:AVERAGE
DEF:b="$RRD_FILENAME$":free:AVERAGE
CDEF:cdefa=a,1024,*
CDEF:cdefb=b,1024,*
AREA:cdefa#FF3932:"Used"
AREA:cdefb#35962B:"Free\n":STACK
- create graph template "/etc/n2rrd/templates/graph/icmp.t" for "icmp"
#
# $HOSTNAME$ will be replaced with hostname being checked
# $RRD_FILENAME$ will be replaced with real rrd filename
# nothing is stopping you from adding values from other rrd files, but then you have
# to explicitly give the file names
#
# Title
-t "$HOSTNAME$ - ICMP RTA"
# Vertical label
-v "Time in ms"
#
# Height and Width
--height="120"
--width="440"
--slope-mode
#
# Define canvas and frame colors
-c "BACK#00000F"
-c "SHADEA#"
-c "SHADEB#"
-c "FONT#F7F7F7"
-c "CANVAS#2E2E2E"
-c "GRID#7F7F7F"
-c "MGRID#B8B8B8"
-c "FRAME#2E2E2E"
-c "ARROW#FFFFFF"
#
# define at least one DEF
"DEF:icmp_rta=$RRD_FILENAME$:rta:AVERAGE"
"DEF:icmp_pl=$RRD_FILENAME$:pl:AVERAGE"
"CDEF:icmp_pl_neg=icmp_pl,-1,*"
"GPRINT:icmp_rta:LAST:Current\: %5.2lf ms"
"GPRINT:icmp_rta:MIN:Min\: %5.2lf ms"
"GPRINT:icmp_rta:MAX:Max\: %5.2lf ms"
"GPRINT:icmp_rta:AVERAGE:Avg\: %5.2lf ms\n"
"GPRINT:icmp_pl:LAST:Current\: %5.2lf ms"
"GPRINT:icmp_pl:MIN:Min\: %5.2lf ms"
"GPRINT:icmp_pl:MAX:Max\: %5.2lf ms"
"GPRINT:icmp_pl:AVERAGE:Avg\: %5.2lf ms\n"
"COMMENT:\n"
"COMMENT:$CDATE"
#
# Define CDEF with grading colors, order is top down
#
"CDEF:g_color2=icmp_rta,0.98,*"
"AREA:g_color2#00FF00:Round Trip Average Time"
"CDEF:g_color10=icmp_rta,0.90,*"
"AREA:g_color10#00FF00"
"CDEF:g_color15=icmp_rta,0.85,*"
"AREA:g_color15#00F200"
"CDEF:g_color20=icmp_rta,0.80,*"
"AREA:g_color20#00E500"
"CDEF:g_color25=icmp_rta,0.75,*"
"AREA:g_color25#00D900"
"CDEF:g_color30=icmp_rta,0.70,*"
"AREA:g_color30#00CC00"
"CDEF:g_color35=icmp_rta,0.65,*"
"AREA:g_color35#00BF00"
"CDEF:g_color40=icmp_rta,0.60,*"
"AREA:g_color40#00B200"
"CDEF:g_color45=icmp_rta,0.55,*"
"AREA:g_color45#00A600"
"CDEF:g_color50=icmp_rta,0.50,*"
"AREA:g_color50#"
"CDEF:g_color55=icmp_rta,0.45,*"
"AREA:g_color55#008C00"
"CDEF:g_color60=icmp_rta,0.40,*"
"AREA:g_color60#007F00"
"CDEF:g_color65=icmp_rta,0.35,*"
"AREA:g_color65#"
"CDEF:g_color70=icmp_rta,0.30,*"
"AREA:g_color70#"
"CDEF:g_color75=icmp_rta,0.25,*"
"AREA:g_color75#"
"CDEF:g_color80=icmp_rta,0.20,*"
"AREA:g_color80#004C00"
"CDEF:g_color85=icmp_rta,0.15,*"
"AREA:g_color85#"
#
# Negated packet loss
"CDEF:g_pl_color2=icmp_pl_neg,0.98,*"
"AREA:g_pl_color2#FF0000:Percent Packet Loss"
"CDEF:g_pl_color10=icmp_pl_neg,0.90,*"
"AREA:g_pl_color10#FF0000"
"CDEF:g_pl_color15=icmp_pl_neg,0.85,*"
"AREA:g_pl_color15#F20000"
"CDEF:g_pl_color20=icmp_pl_neg,0.80,*"
"AREA:g_pl_color20#E50000"
"CDEF:g_pl_color25=icmp_pl_neg,0.75,*"
"AREA:g_pl_color25#D90000"
"CDEF:g_pl_color30=icmp_pl_neg,0.70,*"
"AREA:g_pl_color30#CC0000"
"CDEF:g_pl_color35=icmp_pl_neg,0.65,*"
"AREA:g_pl_color35#BF0000"
"CDEF:g_pl_color40=icmp_pl_neg,0.60,*"
"AREA:g_pl_color40#B20000"
"CDEF:g_pl_color45=icmp_pl_neg,0.55,*"
"AREA:g_pl_color45#A60000"
"CDEF:g_pl_color50=icmp_pl_neg,0.50,*"
"AREA:g_pl_color50#"
"CDEF:g_pl_color55=icmp_pl_neg,0.45,*"
"AREA:g_pl_color55#8C0000"
"CDEF:g_pl_color60=icmp_pl_neg,0.40,*"
"AREA:g_pl_color60#7F0000"
"CDEF:g_pl_color65=icmp_pl_neg,0.35,*"
"AREA:g_pl_color65#"
"CDEF:g_pl_color70=icmp_pl_neg,0.30,*"
"AREA:g_pl_color70#"
"CDEF:g_pl_color75=icmp_pl_neg,0.25,*"
"AREA:g_pl_color75#"
"CDEF:g_pl_color80=icmp_pl_neg,0.20,*"
"AREA:g_pl_color80#4C0000"
"CDEF:g_pl_color85=icmp_pl_neg,0.15,*"
"AREA:g_pl_color85#"
An example output:
http://n2rrd.diglinks.com/images/demo.png
2.5) Edit nagios configuration file serviceextinfo.cfg
- check_icmp
define serviceextinfo{
    host_name               www.example.com
    service_description     check_icmp
    notes_url               http://YOUR_WEBSERVER_NAME/cgi-bin/rrd2graph.cgi?hostname=$HOSTNAME$&service=$SERVICEDESC$
    }
- Physical memory
define serviceextinfo{
    host_name               localhost
    service_description     Physical memory
    notes_url               http://YOUR_WEBSERVER_NAME/cgi-bin/rrd2graph.cgi?hostname=$HOSTNAME$&service=$SERVICEDESC$
    }
- NOTE (3.x users)
In Nagios 3.x, notes_url and action_url are attributes of the host and service definitions themselves,
which means you don't have to maintain another configuration file.
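To test a graph URL by hand before wiring it into Nagios, you can build it the same way notes_url does. The sketch below follows the URL layout shown above and encodes only spaces (as in "Physical memory"); other reserved characters are left untouched. The function name graph_url is hypothetical.

```sh
#!/bin/sh
# graph_url WEBSERVER HOSTNAME SERVICE_DESCRIPTION
# Prints the rrd2graph.cgi URL for one service, encoding spaces as %20.
graph_url() {
    service=$(printf '%s' "$3" | sed 's/ /%20/g')
    printf 'http://%s/cgi-bin/rrd2graph.cgi?hostname=%s&service=%s\n' "$1" "$2" "$service"
}
```

Usage: graph_url YOUR_WEBSERVER_NAME localhost 'Physical memory', then open the printed URL in a browser.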
2.6) Reload Nagios
- Icons now appear next to the service descriptions; click one to see the graph
2.7) if DYN_RRA_CREATE is enabled
- dynamically created RRA templates are kept under "*/templates/rra/dyn"
- dynamically created GRAPH templates are kept under "*/templates/graph/dyn"
3) Other hints
- In Nagios 3.x you can disable EPN through configuration enable_embedded_perl=<0/1>
3.1) TIPS
- starting with 1.4.0, if you enable option DYN_RRD_CREATE, you can skip creating RRA/GRAPH templates; once you see that all performance data is being collected, you can decide whether you want to create custom RRA/GRAPH templates.
- I use two different generic templates for services; this way you can avoid maintaining a separate file for notes_url.
# without performance data
define service{
    name                            generic-service-no-perf ; no performance data gathered or required
    active_checks_enabled           1   ; Active service checks are enabled
    passive_checks_enabled          1   ; Passive service checks are enabled/accepted
    obsess_over_service             1   ; We should obsess over this service (if necessary)
    check_freshness                 0   ; Default is to NOT check service 'freshness'
    notifications_enabled           1   ; Service notifications are enabled
    event_handler_enabled           1   ; Service event handler is enabled
    flap_detection_enabled          1   ; Flap detection is enabled
    failure_prediction_enabled      1   ; Failure prediction is enabled
    process_perf_data               1   ; Process performance data
    retain_status_information       1   ; Retain status information across program restarts
    retain_nonstatus_information    1   ; Retain non-status information across program restarts
    register                        0   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    }
#
# with performance data
define service{
    name                            generic-service ; If performance data is gathered
    active_checks_enabled           1   ; Active service checks are enabled
    passive_checks_enabled          1   ; Passive service checks are enabled/accepted
    obsess_over_service             1   ; We should obsess over this service (if necessary)
    check_freshness                 0   ; Default is to NOT check service 'freshness'
    notifications_enabled           1   ; Service notifications are enabled
    event_handler_enabled           1   ; Service event handler is enabled
    flap_detection_enabled          1   ; Flap detection is enabled
    failure_prediction_enabled      1   ; Failure prediction is enabled
    process_perf_data               1   ; Process performance data
    retain_status_information       1   ; Retain status information across program restarts
    retain_nonstatus_information    1   ; Retain non-status information across program restarts
    notes_url                       /perl/rrd2graph.cgi?hostname=$HOSTNAME$&service=$SERVICEDESC$
    register                        0   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    }
3.2) Known problems/issues
- check whether n2rrd.pl and rrd2graph.cgi compile without complaints
  - perl -cw PATH/n2rrd.pl
  - perl -cw PATH/rrd2graph.cgi
- check for file permissions
  - n2rrd.pl runs with nagios user permissions:
    - can read/write RRA templates directory (normally under /etc/n2rrd/templates/rra)
    - can read the Nagios status file (status.dat)
    - can write to n2rrd.log
  - rrd2graph.cgi runs with webserver user permissions:
    - check that it can write to CACHE_DIR
    - can read the Nagios status file (status.dat)
    - can write to n2rrd.log
    - can read/write templates/graph directory (normally under /etc/n2rrd/templates/graph)
- If you are not seeing the Nagios environment variables, Nagios may have been compiled with EPN (Embedded Perl)
  - Disable EPN; see the Nagios docs for details on EPN
  - In Nagios 3.x, a comment # nagios: -epn in your script should disable EPN.
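The file permission checks listed above can be scripted. This is a minimal sketch, to be run as the user in question (the nagios user for n2rrd.pl, the web server user for rrd2graph.cgi); the function name check_access is hypothetical, and the paths in the usage line are the defaults used in this guide.

```sh
#!/bin/sh
# check_access MODE PATH...
# MODE is r (readable) or rw (readable and writable). Prints one line per
# path and returns non-zero if any check fails for the current user.
check_access() {
    mode=$1
    shift
    rc=0
    for target in "$@"; do
        if [ ! -r "$target" ]; then
            echo "FAIL (not readable): $target"; rc=1
        elif [ "$mode" = rw ] && [ ! -w "$target" ]; then
            echo "FAIL (not writable): $target"; rc=1
        else
            echo "ok: $target"
        fi
    done
    return "$rc"
}
```

Usage: check_access rw /etc/n2rrd/templates/rra and check_access r /var/log/nagios/status.dat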
