Passive checks out of sync with log

Post by invade »

Hi.

We have various systems sending passive checks to Nagios via Gearman.

While investigating a problem I noticed a strange set of events.

I'm not sure whether this is a problem or "working as expected", but I don't understand what's happening, so I wondered if someone could explain.

Below is a series of checks being sent from a client, including a timestamp of when they were run:

Code: Select all

[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 0 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:26:19 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 1 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:26:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 2 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:27:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 3 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:28:09 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 4 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:29:09 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=2 --message="Message 5 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:29:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 0 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:30:20 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 1 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:31:10 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 2 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:31:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 3 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:32:10 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 4 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:32:50 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=2 --message="Message 5 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:33:50 BST

Below are the corresponding entries in the Nagios log:

Code: Select all

[Wed Oct  2 15:26:41 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 0 - 02-10-2019 @ 15:26:19 BST
[Wed Oct  2 15:27:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 1 - 02-10-2019 @ 15:26:40 BST
[Wed Oct  2 15:28:12 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 2 - 02-10-2019 @ 15:27:40 BST
[Wed Oct  2 15:29:12 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 3 - 02-10-2019 @ 15:28:09 BST
[Wed Oct  2 15:29:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 4 - 02-10-2019 @ 15:29:09 BST
[Wed Oct  2 15:29:42 2019] SERVICE NOTIFICATION: client;client;Test;CRITICAL;notifyservice-client;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct  2 15:29:42 2019] SERVICE ALERT: client;Test;CRITICAL;HARD;1;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct  2 15:30:22 2019] PASSIVE SERVICE CHECK: client;Test;2;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct  2 15:31:11 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 0 - 02-10-2019 @ 15:30:20 BST
[Wed Oct  2 15:31:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 1 - 02-10-2019 @ 15:31:10 BST
[Wed Oct  2 15:32:11 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 2 - 02-10-2019 @ 15:31:40 BST
[Wed Oct  2 15:32:51 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 3 - 02-10-2019 @ 15:32:10 BST
[Wed Oct  2 15:33:52 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 4 - 02-10-2019 @ 15:32:50 BST
[Wed Oct  2 15:33:52 2019] SERVICE NOTIFICATION: client;client;Test;CRITICAL;notifyservice-client;Message 5 - 02-10-2019 @ 15:33:50 BST
[Wed Oct  2 15:33:52 2019] SERVICE ALERT: client;Test;CRITICAL;HARD;1;Message 5 - 02-10-2019 @ 15:33:50 BST

What I noticed is that the first check, run at 15:26:19, is not logged by Nagios until the next check is run at 15:26:40. That check in turn is not logged until the check at 15:27:40 is run, and so on.

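This continues until I send a check at 15:29:40 with a different return code. That prompts the previous check (run at 15:29:09) to be logged, followed immediately by the alert and notification for the new result. The PASSIVE SERVICE CHECK entry for the 15:29:40 result itself does not appear until 15:30:22, just after the next check is sent.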
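
To put numbers on the delay, here is a rough sketch (not part of the original test) that reads log lines in the human-readable form shown above and prints how long each result waited before it was logged. The function name passive_lag is made up for illustration, and it assumes both timestamps fall on the same day and in the same time zone.

```shell
#!/bin/sh
# Sketch only: passive_lag is a hypothetical helper, not a Nagios tool.
# Assumes log lines look like the excerpt above, i.e.
#   [Wed Oct  2 15:26:41 2019] PASSIVE SERVICE CHECK: ...;Message 0 - 02-10-2019 @ 15:26:19 BST
# with both timestamps on the same day and in the same time zone.
passive_lag() {
    awk '
    /PASSIVE SERVICE CHECK:/ {
        # The 4th whitespace-separated field is the log time (HH:MM:SS).
        split($4, l, ":")
        logged = l[1] * 3600 + l[2] * 60 + l[3]

        # The client-side time is the field right after the "@" marker.
        sent_at = ""
        for (i = 1; i < NF; i++)
            if ($i == "@") sent_at = $(i + 1)
        if (sent_at == "") next

        split(sent_at, c, ":")
        sent = c[1] * 3600 + c[2] * 60 + c[3]
        printf "sent=%s logged=%s lag=%ds\n", sent_at, $4, logged - sent
    }' "$@"
}
```

Feeding the first two log lines above through passive_lag reports lags of 22 and 62 seconds, which matches the gaps between consecutive send_gearman runs rather than any fixed processing delay.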

Hopefully I've explained the situation clearly.

Software versions in use are the following packages for CentOS:
mod_gearman : 3.1.0
gearmand : 0.33-7
nagios : 4.4.3

If any additional information is required, just ask.

Thanks in advance.

Post by eloyd »

Nagios doesn't always log the result of every status check, but it does log every result that differs from the previous one. You might want to look into state stalking.
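
For reference, state stalking is turned on per service with the stalking_options directive. This is a minimal sketch, assuming the passive service Test on host client from the logs above. The generic-service template is an assumption (it is the name used in the sample configuration), and the rest of the definition is left out.

```
# Sketch only: merge into the existing service definition for "Test".
define service {
    use                     generic-service   # assumed template name
    host_name               client
    service_description     Test
    # Stalk OK, WARNING, UNKNOWN and CRITICAL states, so a result is
    # logged whenever the plugin output changes, even if the state does not.
    stalking_options        o,w,u,c
}
```

Because every test message above embeds a fresh timestamp, the output changes on every check, so stalking should be an easy way to see whether each result gets its own log entry as it arrives.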

Post by scottwilkerson »

Thanks @eloyd

Post by invade »

Many thanks for the explanation and suggestion.

Post by benjaminsmith »

invade wrote:Many thanks for the explanation and suggestion.
You're welcome. May we close this thread, or do you have any other questions?

Post by invade »

Please close. Thank you.

Post by scottwilkerson »

invade wrote:Please close. Thank you.
Great!

Locking