I installed a new Nagios server last week, running Nagios Core 4.0.6 on Ubuntu Server 12.04. I was having little luck getting anything to work from the online documentation, because a lot of it was out of date, used different syntax, or just didn't work the way I expected. I looked for install guides online and found one written just a few weeks earlier that covers the exact same versions I was trying to get working, so I followed it:
http://wellsie.net/p/248/
Everything mostly works, but I noticed that the memory alerts were not coming through (I have been testing Site24x7 with our ServiceDesk system for test monitoring). Site24x7 was sending lots of alerts for servers with high physical memory usage, but Nagios sent nothing. I then noticed that Nagios was monitoring RAM with physical and virtual memory lumped together. After figuring out that I needed to switch from CheckNT to NRPE for memory monitoring, I got it all set up. However, every server I currently have set up in Nagios shows "Critical" for memory, even though the numbers say otherwise:
http://imgur.com/Mulp0p4
If I go to the command-line and run ./check_nrpe -H 192.168.0.72 -p 5666 -c CheckMEM -a MaxWarn=80% MaxCrit=90% ShowAll type=physical, this is what I get:
root@VERSA-NETMON02:/usr/local/nagios/libexec# ./check_nrpe -H 192.168.0.72 -p 5666 -c CheckMEM -a MaxWarn=80% MaxCrit=90% ShowAll type=physical
OK: physical memory: 1.27G|'physical memory %'=31%;80;90 'physical memory'=1.26G;3;3;0;4

In my commands.cfg:
# 'check_mem' command definition
define command{
    command_name check_mem
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckMEM -a MaxWarn=$ARG1$ MaxCrit=$ARG2$ ShowAll type=physical
}

define service{
    use generic-service
    host_name versa-dc01
    service_description Memory Usage
    check_command check_mem!80!90
}
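
To compare what Nagios would run for this service against the manual command, here is a rough Python sketch of the macro expansion. It assumes the standard Nagios behaviour of splitting check_command on "!" into $ARG1$, $ARG2$, and so on. The expand() helper is hypothetical (it is not Nagios code), and the host address and $USER1$ path are assumptions taken from my manual test rather than from the versa-dc01 host definition.

```python
# Hypothetical illustration of Nagios macro substitution; not Nagios code.

# command_line copied from the check_mem definition in commands.cfg.
COMMAND_LINE = (
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c CheckMEM "
    "-a MaxWarn=$ARG1$ MaxCrit=$ARG2$ ShowAll type=physical"
)


def expand(check_command, command_line, host_address, user1):
    """Fill $USER1$, $HOSTADDRESS$ and $ARGn$ the way the service check would."""
    # The part before the first "!" is the command name; the rest are arguments.
    _name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for index, arg in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", arg)
    return expanded


# Assumed values: address from the manual test, path from the shell prompt.
from_service = expand(
    "check_mem!80!90",
    COMMAND_LINE,
    host_address="192.168.0.72",
    user1="/usr/local/nagios/libexec",
)
print(from_service)
```

Under those assumptions the expanded line ends in "MaxWarn=80 MaxCrit=90 ShowAll type=physical", with no percent signs, whereas my manual run used MaxWarn=80% and MaxCrit=90%. That difference may be worth checking.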