Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 12:19 pm
by Jklre
I haven't found a way to create an alert when a heartbeat message has not been received. We have several processes that send out these heartbeats, and we want to know if they stop responding. Is this possible? Thanks in advance.
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 12:26 pm
by jolson
Can you please elaborate a little bit more? Does this machine send a heartbeat in the form of a 'log' to the Log Server - or does it heartbeat to a different component and send a log to NLS that explains whether the heartbeat was successful or not? Just need a little bit more info here. Thanks!
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 12:54 pm
by Jklre
jolson wrote: Can you please elaborate a little bit more? Does this machine send a heartbeat in the form of a 'log' to the Log Server - or does it heartbeat to a different component and send a log to NLS that explains whether the heartbeat was successful or not? Just need a little bit more info here. Thanks!
This just sends a syslog message, for example "<14>10001 gpgd_02.exe: gpgd_02.exe v1.8 Heartbeat", every 5 minutes. If the message does not get sent, then the process is dead / hung / not running. We have a few other ones that send off a message every 15 - 30 minutes or so, but it's all through syslog.
So ideally we would want to run a query for messages received in the past 5 minutes, and if the count is 0, throw an alert.
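To make the idea concrete, here is a minimal Python sketch of the "count heartbeats in the last N minutes, alert on zero" logic. It is not an NLS feature and does not query NLS; it assumes the received messages are already available as (timestamp, raw line) pairs. The names `HEARTBEAT_RE`, `count_heartbeats`, and `check_heartbeat` are hypothetical, and the regular expression is only fitted to the single example message above. The 0 and 2 return codes follow the usual Nagios plugin convention for OK and CRITICAL.

```python
import re
from datetime import datetime, timedelta

# Assumption: heartbeats look like the example message from this thread,
# "<14>10001 gpgd_02.exe: gpgd_02.exe v1.8 Heartbeat".
HEARTBEAT_RE = re.compile(r"^<\d+>\d+ (?P<proc>\S+): .*Heartbeat\s*$")

OK, CRITICAL = 0, 2  # Nagios plugin exit code convention


def count_heartbeats(messages, process, now, window):
    """Count heartbeat lines from `process` received within `window` before `now`.

    `messages` is an iterable of (received_at, raw_line) tuples.
    """
    cutoff = now - window
    count = 0
    for received_at, line in messages:
        match = HEARTBEAT_RE.match(line)
        if match and match.group("proc") == process and cutoff <= received_at <= now:
            count += 1
    return count


def check_heartbeat(messages, process, now, window):
    """Return (status, text); CRITICAL when no heartbeat arrived in the window."""
    n = count_heartbeats(messages, process, now, window)
    if n == 0:
        return CRITICAL, f"CRITICAL: no heartbeat from {process} in the last {window}"
    return OK, f"OK: {n} heartbeat(s) from {process} in the last {window}"


if __name__ == "__main__":
    now = datetime.now()
    sample = [
        (now - timedelta(minutes=3), "<14>10001 gpgd_02.exe: gpgd_02.exe v1.8 Heartbeat"),
    ]
    status, text = check_heartbeat(sample, "gpgd_02.exe", now, timedelta(minutes=5))
    print(status, text)
```

In practice the window would probably need to be a bit longer than the heartbeat interval (say 6 or 7 minutes for a 5-minute heartbeat), so that normal delivery jitter does not trigger a false alert. The processes that report every 15 - 30 minutes would each get their own window.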
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 1:11 pm
by Jklre
There's some functionality similar to what I'm talking about in mkeventd:
http://mathias-kettner.com/checkmk_mkev ... nting.html
What I'm talking about is in the "Expect regular messages" section.
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 2:46 pm
by jdalrymple
It sounds like MK is doing the same thing there as freshness in Nagios.
http://nagios.sourceforge.net/docs/3_0/freshness.html
The question is whether Nagios is already receiving "passive checks" or whether you want to monitor some third-party message queue. It sounds like the latter. There are numerous plugins available on the Exchange to monitor logfiles, although I don't know of any that interact with a syslogd to monitor freshness that way.
http://exchange.nagios.org/directory/Plugins/Log-Files
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 3:44 pm
by Jklre
Yeah, I looked into other Nagios plugins and there are none for syslogd. Having another third-party plugin look at the same syslog stream isn't a very practical solution.
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 4:14 pm
by jdalrymple
Sorry, I wasn't paying attention to which forum I was in; I thought I was answering a Nagios question, not an NLS one. My apologies for sending mixed-up info.
Upon further research, you're not the first person to ask for this. We have an internal feature request for this that is outstanding. I'll +1 it to our devs, but as of now we don't have a simple feature like what you're looking for.
It's not the best answer, but it's all I've got. Can I lock this thread?
Re: Alerting for heartbeat messages
Posted: Thu Mar 19, 2015 4:52 pm
by Jklre
jdalrymple wrote: Sorry I wasn't paying attention to what forum I was in, I thought I was answering a Nagios question, not NLS. My apologies for sending mixed up info.
Upon further research, you're not the first person to ask for this. We have an internal feature request for this that is outstanding. I'll +1 it to our devs, but as of now we don't have a simple feature like what you're looking for.
It's not the best answer, but it's all I've got. Can I lock this thread?
Yes please +1 +1 +1

I have a few other questions, but I'll start a different thread for them. Thank you.