
service status is not accurate

Posted: Mon Jul 28, 2014 3:18 pm
by nos09
Hi, I just installed Nagios.

I am trying to monitor a web server, but I don't think Nagios is showing the right information about it.

I copied the localhost.cfg that came with the default installation and changed only the address and host name.

On the web server that I am trying to monitor, I installed:
  • openssl nagios-nrpe-server nagios-plugins nagios-plugins-basic nagios-plugins-standard
I checked that nagios-nrpe-server is running.

But when I checked the load on the server it was 1.03, 1.10, 0.80, while Nagios is showing me 0.10, 0.04, 0.05, which is the same as localhost (the default).

I am really new to Nagios. Can somebody help me out? Here is my webserver.cfg:

Code:

# A simple configuration file for monitoring the local host
# This can serve as an example for configuring other servers;
# Custom services specific to this host are added here, but services
# defined in nagios2-common_services.cfg may also apply.
#

define host{
        use                     generic-host            ; Name of host template to use
        host_name               dsp.response
        alias                   dsp response
        address                 192.168.10.8

        }

# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       dsp.response
        service_description             Disk Space
        check_command                   check_all_disks!20%!10%
        }



# Define a service to check the number of currently logged in
# users on the local machine.  Warning if > 20 users, critical
# if > 50 users.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       dsp.response
        service_description             Current Users
        check_command                   check_users!20!50
        }


# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       dsp.response
        service_description             Total Processes
        check_command                   check_procs!250!400
        }

# Define a service to check the load on the local machine.

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       dsp.response
        service_description             Current Load
        check_command                   check_load!5.0!4.0!3.0!10.0!6.0!4.0
        }


Re: service status is not accurate

Posted: Mon Jul 28, 2014 3:24 pm
by sreinhardt
Without knowing your command definitions I can only guess, but based on the configs you show and the steps you have performed, the checks in place are likely configured to check localhost only. Most of the commands that localhost uses do not need or use the host address; they only check the local system anyway. If you have not seen them, here are a few helpful docs:

http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf
http://assets.nagios.com/downloads/nagi ... utions.pdf

Most likely all you need to do is define some commands on the Nagios server side that use check_nrpe to check the remote system, then make sure the remote system is set up properly. Both of these docs should get you on the right path!
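
As a rough sketch of what such a server-side definition could look like (not taken from the poster's setup): the command name check_nrpe_remote is a made-up example name, $USER1$ is assumed to point at the plugin directory, and check_load is assumed to be a command that the remote nrpe.cfg defines. The commented-out block shows the typical shape of a local-only check for comparison.

```cfg
# For comparison only, do not add: a typical local check has no $HOSTADDRESS$,
# so it always measures the Nagios server itself.
#
# define command{
#         command_name    check_load
#         command_line    $USER1$/check_load --warning=$ARG1$,$ARG2$,$ARG3$ --critical=$ARG4$,$ARG5$,$ARG6$
#         }

# Hypothetical command name; asks the NRPE agent on the monitored host
# to run the command named in $ARG1$ and report the result back.
define command{
        command_name    check_nrpe_remote
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

# Intended to replace the existing "Current Load" service for this host,
# not to sit next to it.
define service{
        use                             generic-service
        host_name                       dsp.response
        service_description             Current Load
        check_command                   check_nrpe_remote!check_load
        }
```

With a definition along these lines the thresholds come from the remote nrpe.cfg rather than from the arguments in the service definition.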

Re: service status is not accurate

Posted: Mon Jul 28, 2014 3:24 pm
by tmcdonald
It is showing the same load because you are running the same plugins with the same arguments (check_load, in this case), and those run on the local system. You cannot just ask a server what its load is, because that would be a huge security issue. Instead you have to install an agent (program, daemon, service, etc.) on that server, ask the agent to run check_load, and then get the response from that agent. NRPE (Nagios Remote Plugin Executor) is the agent you would want to install in this case:

http://www.tecmint.com/how-to-add-linux ... ng-server/
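
To illustrate the agent side, a command entry in nrpe.cfg on the monitored server might look roughly like the following. The file path and plugin path are assumptions based on the Debian/Ubuntu packages mentioned in this thread, and the thresholds are placeholder values to adjust.

```cfg
# /etc/nagios/nrpe.cfg on the monitored server (path assumed)
# The name in brackets is what the Nagios server passes to check_nrpe with -c.
# Thresholds are illustrative: warning and critical for the 1, 5 and 15 minute load.
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
```

The agent runs the plugin on the monitored server itself, so the load values it returns are that server's and not the Nagios server's.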

Re: service status is not accurate

Posted: Mon Jul 28, 2014 11:40 pm
by nos09
Thank you guys. I'll follow the docs you posted. God, I thought it was easy... lol
Anyway, as they say, if you don't want to fall you won't ever run. It was a silly mistake (apparently I completely ignored 'everything!'). But I guess I am back on track.
Once I am done with the setup I'll report back! Thanks a lot.

Re: service status is not accurate

Posted: Tue Jul 29, 2014 4:31 am
by nos09
Hi, I followed this guide: https://www.digitalocean.com/community/ ... untu-12-10
and I am getting 100% ping loss.

Also, from the Nagios server I am getting:

Code:

root@ip-10-5-0-193:/etc/nagios3/conf.d# /usr/lib/nagios/plugins/check_nrpe -4 -H 10.0.6.194
connect to address 10.0.6.194 port 5666: Connection refused
connect to host 10.0.6.194  port 5666: Connection refused
root@ip-10-5-0-193:/etc/nagios3/conf.d# /usr/lib/nagios/plugins/check_nrpe -4 -H localhost
connect to address 127.0.0.1 port 5666: Connection refused
connect to host localhost port 5666: Connection refused
root@ip-10-5-0-193:/etc/nagios3/conf.d# /usr/lib/nagios/plugins/check_nrpe  -H localhost
connect to address 127.0.0.1 port 5666: Connection refused
connect to host localhost port 5666: Connection refused

I just can't wrap my head around this...

Re: service status is not accurate

Posted: Tue Jul 29, 2014 3:25 pm
by lmiltchev
Is your Nagios server allowed to connect to your client (the remote box)? You will need to add the Nagios server's IP address to the "allowed_hosts=" line in nrpe.cfg and restart NRPE. Also check your firewall.

Code:

iptables -L -n
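
For reference, the allowed_hosts change might look something like this in nrpe.cfg on the remote box. The file path is an assumption for the Debian/Ubuntu package, and 10.5.0.193 is only a guess at the Nagios server's address based on the shell prompt shown earlier in the thread; substitute the real IP.

```cfg
# /etc/nagios/nrpe.cfg on the remote box (path assumed)
# Comma-separated list of addresses allowed to talk to the NRPE daemon.
# 10.5.0.193 is an assumed address for the Nagios server.
allowed_hosts=127.0.0.1,10.5.0.193
```

The change only takes effect after the nagios-nrpe-server service has been restarted.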