We have various systems sending passive checks to Nagios via Gearman.
While investigating a problem I noticed a strange set of events.
I'm not sure whether this is a problem or "working as expected", but I don't understand what's happening, so I wondered if someone could explain.
Below is a series of checks being sent from a client, including a timestamp of when they were run:
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 0 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:26:19 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 1 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:26:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 2 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:27:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 3 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:28:09 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 4 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:29:09 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=2 --message="Message 5 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:29:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 0 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:30:20 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 1 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:31:10 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 2 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:31:40 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 3 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:32:10 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=0 --message="Message 4 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:32:50 BST
[root@client ~]# date "+%d-%m-%Y @ %T %Z" ; /usr/bin/send_gearman --server=gearman --encryption=yes --key=${KEY} --host=${HOSTNAME} --service=Test --returncode=2 --message="Message 5 - $(date "+%d-%m-%Y @ %T %Z")"
02-10-2019 @ 15:33:50 BST
The corresponding entries in the Nagios log:
[Wed Oct 2 15:26:41 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 0 - 02-10-2019 @ 15:26:19 BST
[Wed Oct 2 15:27:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 1 - 02-10-2019 @ 15:26:40 BST
[Wed Oct 2 15:28:12 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 2 - 02-10-2019 @ 15:27:40 BST
[Wed Oct 2 15:29:12 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 3 - 02-10-2019 @ 15:28:09 BST
[Wed Oct 2 15:29:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 4 - 02-10-2019 @ 15:29:09 BST
[Wed Oct 2 15:29:42 2019] SERVICE NOTIFICATION: client;client;Test;CRITICAL;notifyservice-client;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct 2 15:29:42 2019] SERVICE ALERT: client;Test;CRITICAL;HARD;1;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct 2 15:30:22 2019] PASSIVE SERVICE CHECK: client;Test;2;Message 5 - 02-10-2019 @ 15:29:40 BST
[Wed Oct 2 15:31:11 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 0 - 02-10-2019 @ 15:30:20 BST
[Wed Oct 2 15:31:42 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 1 - 02-10-2019 @ 15:31:10 BST
[Wed Oct 2 15:32:11 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 2 - 02-10-2019 @ 15:31:40 BST
[Wed Oct 2 15:32:51 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 3 - 02-10-2019 @ 15:32:10 BST
[Wed Oct 2 15:33:52 2019] PASSIVE SERVICE CHECK: client;Test;0;Message 4 - 02-10-2019 @ 15:32:50 BST
[Wed Oct 2 15:33:52 2019] SERVICE NOTIFICATION: client;client;Test;CRITICAL;notifyservice-client;Message 5 - 02-10-2019 @ 15:33:50 BST
[Wed Oct 2 15:33:52 2019] SERVICE ALERT: client;Test;CRITICAL;HARD;1;Message 5 - 02-10-2019 @ 15:33:50 BST
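Each check is only logged about two seconds after the following check is sent, so every result appears one check late. This continues until I send a check at 15:29:40 with a different return code. That prompts the previous check, run at 15:29:09, to be logged, followed immediately by the notification and alert for the new result. The PASSIVE SERVICE CHECK entry for the 15:29:40 check itself does not appear until 15:30:22, after the next check has been sent.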
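To put numbers on the delay, here is a rough sketch that prints the gap between the time embedded in each message and the time Nagios logged it. The function name `check_lag` is made up for illustration. It assumes the log lines look exactly like the excerpt above (human-readable timestamp in the fourth field, send time as the second-to-last field) and that both times fall on the same day in the same time zone.

```shell
#!/bin/sh
# check_lag: hypothetical helper, not part of Nagios or mod_gearman.
# Reads log lines on stdin and, for each PASSIVE SERVICE CHECK entry,
# prints the send time, the log time, and the difference in seconds.
# Assumption: the line format matches the excerpt in this post.
check_lag() {
    awk '
    /PASSIVE SERVICE CHECK:/ {
        # Field 4 is the time Nagios wrote the entry, e.g. 15:26:41
        split($4, l, ":")
        # The second-to-last field is the time the client sent it, e.g. 15:26:19
        split($(NF-1), s, ":")
        logged = l[1] * 3600 + l[2] * 60 + l[3]
        sent   = s[1] * 3600 + s[2] * 60 + s[3]
        printf "sent %s logged %s lag %ds\n", $(NF-1), $4, logged - sent
    }'
}
```

Feeding the excerpt above through it (for example `check_lag < excerpt.log`, where the file name is just a placeholder) gives lags of 22, 62 and 32 seconds for the first three checks, which matches the interval until the next check was sent plus a couple of seconds.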
Hopefully I've explained the situation clearly.
Software versions in use are the following packages for CentOS:
mod_gearman : 3.1.0
gearmand : 0.33-7
nagios : 4.4.3
If any additional information is required, just ask.
Thanks in advance.