wrong notification or alarms

DOkuwa
Posts: 114
Joined: Tue Aug 15, 2017 3:54 pm

wrong notification or alarms

Post by DOkuwa »

Hello
I do have some alarms or notifications which are false
For example, this notification was raised while the device was not down:

** Notification **

Notification Type: PROBLEM

Service: Network-Ports
Host: uschi-eqn1-cs02
Address: 10.255.43.1
State: WARNING

Date/Time: 2017-11-09

Additional Info: No response from remote host 10.255.43.1
acefreakz
Posts: 9
Joined: Mon Dec 26, 2016 6:20 am

Re: wrong notification or alarms

Post by acefreakz »

Can you show us the service definition for Network-Ports? For now, I am guessing that the port is closed on the remote host.
DOkuwa wrote: "I do have some alarms or notifications which are false"
Are you getting this consistently, or only sometimes?
kyang

Re: wrong notification or alarms

Post by kyang »

Thanks @acefreakz,

@DOkuwa, a service definition for Network-Ports would help us. Is it only this service that is experiencing this problem?

Also, can you explain how it is false? Is the service actually UP?

Re: wrong notification or alarms

Post by DOkuwa »

Here is an example:

Code: Select all

[1510823136] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.245.65.1 (alarm timeout)
[1510823318] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;WARNING;HARD;3;No response from remote host '10.245.65.1'
[1510823500] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.245.65.1 (alarm timeout)
[1510823858] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;OK;HARD;3;All ports OK.
[1510824157] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;WARNING;HARD;3;No response from remote host '10.245.65.1'
[1510824351] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.245.65.1 (alarm timeout)
[1510824703] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;OK;HARD;3;All ports OK

But we can ping the device without any issues:

 ping 10.245.65.1
PING 10.245.65.1 (10.245.65.1) 56(84) bytes of data.
64 bytes from 10.245.65.1: icmp_seq=1 ttl=62 time=1.41 ms
64 bytes from 10.245.65.1: icmp_seq=2 ttl=62 time=3.88 ms
64 bytes from 10.245.65.1: icmp_seq=3 ttl=62 time=2.49 ms
64 bytes from 10.245.65.1: icmp_seq=4 ttl=62 time=3.69 ms
^C
--- 10.245.65.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002
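
The bracketed numbers in the nagios.log excerpt above are Unix epoch seconds, which makes it hard to see how quickly the service flips between states. The following is a minimal sketch, not part of the original posts, that decodes a few of those SERVICE ALERT lines and reports the gap between consecutive alerts. The names `parse_alert` and `gaps_in_seconds` are hypothetical helpers introduced for illustration, and the times are shown in UTC as an assumption about what is most convenient.

```python
import re
from datetime import datetime, timezone

# "[epoch] SERVICE ALERT: host;service;state;state_type;attempt;plugin output"
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);([^;]+);(\d+);(.*)$"
)


def parse_alert(line):
    """Return a dict for one SERVICE ALERT line, or None if it does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    epoch, host, service, state, state_type, attempt, output = match.groups()
    return {
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


def gaps_in_seconds(alerts):
    """Seconds elapsed between each alert and the one before it."""
    return [
        int((cur["time"] - prev["time"]).total_seconds())
        for prev, cur in zip(alerts, alerts[1:])
    ]


# First four lines of the log excerpt posted above.
LOG = """\
[1510823136] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.245.65.1 (alarm timeout)
[1510823318] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;WARNING;HARD;3;No response from remote host '10.245.65.1'
[1510823500] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.245.65.1 (alarm timeout)
[1510823858] SERVICE ALERT: ukldn-thca-cs02;Network-Ports;OK;HARD;3;All ports OK.
"""

alerts = [a for a in map(parse_alert, LOG.splitlines()) if a is not None]
for alert, gap in zip(alerts[1:], gaps_in_seconds(alerts)):
    print(f'{alert["time"]:%Y-%m-%d %H:%M:%S} UTC  {alert["state"]:<8} {gap}s after previous alert')
```

A service that alternates between "alarm timeout", "No response" and OK every few minutes while ping stays clean usually points at the SNMP query itself (timeouts, community, target address) rather than at the device being down, though the log alone cannot confirm which.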




Re: wrong notification or alarms

Post by DOkuwa »

I also ran an snmpwalk, and it worked:

Code: Select all

 /usr/bin/snmpwalk  -v 2c -c xxxxx ukldn-thca-cs02 1.3.6.1.4.1.6527.3.1.2.2.4.2.1.39
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35684352 = INTEGER: 5
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35717120 = INTEGER: 5
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35749888 = INTEGER: 5
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35782656 = INTEGER: 5
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35815424 = INTEGER: 5
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.39.1.35848192 = INTEGER

Re: wrong notification or alarms

Post by DOkuwa »

Another host has the same issue:

Code: Select all

 more nagios.log | grep uschi-eqn1-cs02
[1510790400] CURRENT HOST STATE: uschi-eqn1-cs02;UP;HARD;1;OK - 10.255.43.1: rta 8.845ms, lost 0%
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Cpm;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Fans;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Lsp;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.255.43.1 (alarm timeout)
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Network-Ports;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Psu;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Temperature;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;ping;OK;HARD;1;OK - 10.255.43.1: rta 16.862ms, lost 0%
usr/local/nagios/var# /usr/bin/snmpwalk -v 2c -c xxxxx uschi-eqn1-cs02 1.3.6.1.4.1.6527.3.1.2.2.4.2.1.38
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37781504 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37814272 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37847040 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37879808 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37912576 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37945344 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.37978112 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.38010880 = INTEGER: 3
SNMPv2-SMI::enterprises.6527.3.1.2.2.4.2.1.38.1.38043648 = INTEGER: 3
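
One way to read the CURRENT SERVICE STATE lines above is to list which services are not OK and which address the plugins actually report, so that it can be compared with the target used in the manual snmpwalk (which was run against the host name, not an IP). This is a rough sketch under the assumption that the plugin output contains the queried IPv4 address, as it does in the lines quoted here; `states_by_service` is a hypothetical helper, not a Nagios tool.

```python
import re

STATE_MARKER = "CURRENT SERVICE STATE: "
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def states_by_service(log_text):
    """Map service name -> (state, plugin output) from CURRENT SERVICE STATE lines."""
    states = {}
    for line in log_text.splitlines():
        _, found, rest = line.partition(STATE_MARKER)
        if not found:
            continue
        # Fields: host;service;state;state_type;attempt;plugin output
        parts = rest.split(";", 5)
        if len(parts) != 6:
            continue
        _host, service, state, _state_type, _attempt, output = parts
        states[service] = (state, output)
    return states


# A subset of the lines posted above.
LOG = """\
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Cpm;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Lsp;UNKNOWN;HARD;3;ERROR: Unable to get status from 10.255.43.1 (alarm timeout)
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;Network-Ports;WARNING;HARD;3;No response from remote host '10.255.43.1'
[1510790400] CURRENT SERVICE STATE: uschi-eqn1-cs02;ping;OK;HARD;1;OK - 10.255.43.1: rta 16.862ms, lost 0%
"""

states = states_by_service(LOG)
failing = sorted(name for name, (state, _) in states.items() if state != "OK")
addresses = {ip for _, output in states.values() for ip in IPV4_RE.findall(output)}

print("non-OK services:", failing)
print("addresses reported by the plugins:", sorted(addresses))
```

If the address printed here differs from the one the host name resolves to on the command line, the manual snmpwalk and the Nagios checks are not testing the same target.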

npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: wrong notification or alarms

Post by npolovenko »

@DOkuwa, was this check working OK in the past, or did you just configure it? What manual did you use to set it up? We also need to see the command definition and service definition. Usually you can find those in this folder: /usr/local/nagios/etc/objects
If you're having trouble finding them, you can download the whole etc folder (/usr/local/nagios/etc/) using FileZilla or WinSCP and upload it here.

Re: wrong notification or alarms

Post by DOkuwa »

I could not find anything wrong.
Only the IP address was wrong. I have added the new IP address, but the issue is still the same. I have checked hosts.cfg and services.cfg and found no issues.

Code: Select all

define host{
        host_name                               uschi-eqn1-cs02
        use                             core-host
        alias                           uschi-eqn1-cs02
        address                         10.255.50.1
        _SNMPCOMMUNITY                  10.255.50.1
        hostgroups                      Alcatel_Equipment, Core_Equipmen

Code: Select all

 define service{
st_name                               uschi-eqn1-cs02
        use                             core-host
        alias                           uschi-eqn1-cs02
        address                         10.255.50.1
        _SNMPCOMMUNITY                  10.255.50.1
        hostgroups                      Alcatel_Equipment, Core_Equipmen
kyang

Re: wrong notification or alarms

Post by kyang »

@DOkuwa,

The value you are using for "_SNMPCOMMUNITY" doesn't look right. Does that actually work?
Could you run this command and show us the output?

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Along with that, could you show us the objects.cache file?

It is usually located here:

Code: Select all

/usr/local/nagios/var/objects.cache

If it is not there, you can locate it with a find command: find / -name objects.cache
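
For context on the "_SNMPCOMMUNITY" question: in Nagios, a directive that starts with an underscore is a custom object variable, and a host variable named _SNMPCOMMUNITY is exposed to commands as the macro $_HOSTSNMPCOMMUNITY$. The sketch below is only an illustration of that naming rule plus a simple sanity check that flags custom variables whose value parses as an IP address. Whether the check commands in this setup actually reference that macro is unknown, since the command definitions were not posted, and the value may simply have been redacted by hand. `custom_macro`, `looks_like_ip` and `parse_directives` are hypothetical helper names, and the host block is abbreviated from the post with a closing brace added.

```python
import ipaddress


def custom_macro(object_type, var_name):
    """Build the macro name for a custom variable, e.g. $_HOSTSNMPCOMMUNITY$."""
    if not var_name.startswith("_"):
        raise ValueError("custom variable names must start with an underscore")
    return f"$_{object_type.upper()}{var_name[1:].upper()}$"


def looks_like_ip(value):
    """True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


def parse_directives(definition):
    """Parse 'name value' lines from the body of a define block into a dict."""
    directives = {}
    for line in definition.splitlines():
        line = line.strip()
        if not line or line.startswith(("define", "}", "#", ";")):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            directives[parts[0]] = parts[1].strip()
    return directives


# Abbreviated from the host definition posted above.
HOST_DEF = """\
define host{
        host_name                               uschi-eqn1-cs02
        use                             core-host
        address                         10.255.50.1
        _SNMPCOMMUNITY                  10.255.50.1
}
"""

directives = parse_directives(HOST_DEF)
suspicious = [
    name
    for name, value in directives.items()
    if name.startswith("_") and looks_like_ip(value)
]
for name in suspicious:
    print(f"{name} = {directives[name]} looks like an address; it is used as {custom_macro('host', name)}")
```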

Re: wrong notification or alarms

Post by DOkuwa »

Code: Select all


/usr/local/nagios/libexec# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/usr/local/nagios/etc/hosts.cfg'...
Processing object config file '/usr/local/nagios/etc/services.cfg'...
Processing object config file '/usr/local/nagios/etc/misccommands.cfg'...
Processing object config file '/usr/local/nagios/etc/checkcommands.cfg'...
Processing object config file '/usr/local/nagios/etc/contactgroups.cfg'...
Processing object config file '/usr/local/nagios/etc/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/hostgroups.cfg'...
Processing object config file '/usr/local/nagios/etc/servicegroups.cfg'...
Processing object config file '/usr/local/nagios/etc/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/escalations.cfg'...
Processing object config file '/usr/local/nagios/etc/dependencies.cfg'...
Processing object config file '/usr/local/nagios/etc/hostextinfo.cfg'...
Processing object config file '/usr/local/nagios/etc/serviceextinfo.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_commands.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_contact.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_contactgroup.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_dependencies.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_escalations.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_host.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_hostgroup.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_services.cfg'...
Processing object config file '/usr/local/nagios/etc/meta_timeperiod.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
Warning: Service 'HTTP-Tim' on host 'doctool.exponential-e.net' has no default contacts or contactgroups defined!
Warning: Service 'HTTP-Tim' on host 'tm01-dev.exponential-e.net' has no default contacts or contactgroups defined!
        Checked 2358 services.
Checking hosts...
Warning: Host 'doctool.exponential-e.net' has no default contacts or contactgroups defined!
Warning: Host 'tm01-dev.exponential-e.net' has no default contacts or contactgroups defined!
        Checked 434 hosts.
Checking host groups...
        Checked 11 host groups.
Checking service groups...
        Checked 4 service groups.
Checking contacts...
        Checked 36 contacts.
Checking contact groups...
        Checked 7 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 71 commands.
Checking time periods...
        Checked 6 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 4
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

SNMP is running on the device:
root@ukldn-ixhs-nms01:/usr/local/nagios/libexec# ./check_alcatel_psu.pl  -C xxxx -H 10.255.50.1
[PSU 1 deviceStateOk] [PSU 2 deviceNotEquipped] [PSU 3 deviceNotEquipped] [PSU 4




The objects.cache file is attached.
Attachments
objects.cache.txt