Email on server reboot

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
stebbo
Posts: 75
Joined: Sat Aug 04, 2012 9:13 pm

Email on server reboot

Post by stebbo »

Hi All,

I would like to receive an email when a server reboots unexpectedly. I've read the thread http://support.nagios.com/forum/viewtop ... 16&t=24936 but didn't understand the solution that pacmag implemented, so I'm hoping for some guidance on that or for other ideas.

I am mostly monitoring Windows servers, and I already monitor the uptime value on them. I was hoping it would be relatively easy to either
a) receive a notification if the returned uptime is lower than a previous return value, or
b) receive a notification if the returned uptime is lower than x minutes.

Often a reboot (due to Windows updates or an administrator trying the perennial "a reboot will fix it") completes in far less time than the 5-minute check interval, and so happens without my knowledge.
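Options (a) and (b) can be sketched in a few lines of Python. This is only an illustration of the two comparisons, not a Nagios plugin; the function names and the sample uptime values are hypothetical.

```python
from typing import Optional


def rebooted_since_last_check(previous_uptime: Optional[float],
                              current_uptime: float) -> bool:
    """Option (a): uptime going backwards means the counter was reset."""
    if previous_uptime is None:
        return False  # first sample, nothing to compare against
    return current_uptime < previous_uptime


def recently_booted(current_uptime: float, threshold_seconds: float) -> bool:
    """Option (b): uptime is below a fixed threshold."""
    return current_uptime < threshold_seconds


# Hypothetical uptime samples in seconds, one per 5-minute check.
samples = [86400.0, 86700.0, 120.0, 420.0]
previous = None
flags = []
for uptime in samples:
    flags.append(rebooted_since_last_check(previous, uptime))
    previous = uptime
print(flags)
```

Option (a) needs the previous value to be stored somewhere between checks, while option (b) is stateless, which is why a plain threshold is usually the easier one to set up.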

Cheers,
Chris.
Last edited by stebbo on Mon May 05, 2014 12:47 am, edited 1 time in total.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Email on server reboot

Post by sreinhardt »

Do you have a preferred plugin or preferred way of monitoring (agent, WMI, SNMP) those systems? There definitely are ways of doing this, but if you are already monitoring in one way or another it's probably best to work with that existing setup, rather than send you down a different path.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
stebbo
Posts: 75
Joined: Sat Aug 04, 2012 9:13 pm

Re: Email on server reboot

Post by stebbo »

Hi Spenser,

I am using NSClient++ on these Windows servers, generally version 0.4.1, and using check_nt as the command to retrieve the uptime data.

I only do this because that is what the server wizard sets up. If there's an easier or better way, I'm happy to set it up differently.

Cheers,
Chris.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email on server reboot

Post by Box293 »

Hi Chris,
You can do what you're describing relatively easily. However, using the check_nt command for uptime is not going to work because it doesn't apply warning or critical thresholds to the uptime value.

However, all is not lost. Instead you can use the check_nrpe command to query your Windows servers for the System Up Time performance counter (still using NSClient++) and then trigger alerts based on those thresholds.

From your Nagios host command line:

Code: Select all

check_nrpe -H <windows_server> -c CheckCounter -a "Counter=\System\System Up Time" ShowAll MinCrit=600
Which should respond with:

Code: Select all

OK: \System\System Up Time: 8921.99|'System Up Time'=8921.994197;0;600;
The number it returns is in seconds, so my system has been up for about 148 minutes.

The MinCrit value of 600 equals 10 minutes, so when this check executes and the uptime is less than 10 minutes, it will trigger a critical status. Once the uptime is past 10 minutes the service will return to an OK state. Keep MinCrit larger than your check interval (5 minutes in your case) so that at least one check lands inside the window after a reboot.

So when the service enters a critical state, whoever is a contact for that service will receive an alert (hence an email).
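To make the numbers concrete, here is a rough Python sketch that reads the sample output above and applies the same MinCrit logic. It assumes a single perfdata entry in the usual label=value;warn;crit form and a strict less-than comparison for MinCrit; the helper names are made up for illustration and are not part of check_nrpe or NSClient++.

```python
from typing import Optional


def parse_perfdata(plugin_output: str) -> dict:
    """Parse a single 'label=value;warn;crit;min;max' perfdata entry."""
    _, _, perf = plugin_output.partition("|")
    label, _, fields = perf.strip().rpartition("=")
    parts = fields.split(";")
    parts += [""] * (5 - len(parts))  # pad missing trailing fields

    def to_float(text: str) -> Optional[float]:
        return float(text) if text else None

    return {
        "label": label.strip("'"),
        "value": float(parts[0]),
        "warn": to_float(parts[1]),
        "crit": to_float(parts[2]),
    }


def uptime_status(uptime_seconds: float, min_crit: float) -> str:
    """Assumed behaviour: CRITICAL while uptime is below MinCrit."""
    return "CRITICAL" if uptime_seconds < min_crit else "OK"


output = r"OK: \System\System Up Time: 8921.99|'System Up Time'=8921.994197;0;600;"
perf = parse_perfdata(output)
print(perf["label"], int(perf["value"] // 60), "minutes")
print(uptime_status(perf["value"], perf["crit"]))  # sample from the post
print(uptime_status(300.0, perf["crit"]))          # 5 minutes after a reboot
```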

FYI, when you set up your services for uptime, the performance counter will require double backslashes:

Code: Select all

Counter=\\System\\System Up Time
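If you generate service definitions from a script, the doubling is a simple string replacement. A minimal Python sketch, where the variable names are just for illustration:

```python
# Counter path as typed on the command line.
counter = r"\System\System Up Time"

# Double every backslash for use in the service definition.
escaped = counter.replace("\\", "\\\\")

print("Counter=" + escaped)
```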
Let me know how you go.

Troy
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: Email on server reboot

Post by WillemDH »

Useful info Troy! Gonna try this too some day. :)
Nagios XI 5.8.1
https://outsideit.net
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Email on server reboot

Post by sreinhardt »

Agreed, good post Troy! I'll have to make a note to add warning and critical values to that portion of check_nt, unless we already did that with plugins 2.0. I forget. :)
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Email on server reboot

Post by Box293 »

Cheers :D

I'm a big fan of performance counters ...
stebbo
Posts: 75
Joined: Sat Aug 04, 2012 9:13 pm

Re: Email on server reboot

Post by stebbo »

Hi Troy,

Thanks for the response. I tried the command you posted above from the command line and initially received an error:

Code: Select all

Request contained arguments (not currently allowed, check the allow arguments option).
I searched the forum and found some advice to add a line to the ini file:

Code: Select all

allow arguments=1
which fixed that problem.

Then I got an illegal metacharacter error, so I added:

Code: Select all

allow_nasty_meta_chars=1
and all appears well. I shall add these services to this host and see how it goes over time.

Is there any downside to allowing the arguments / nasty characters?

Thanks again,
Chris.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Email on server reboot

Post by tmcdonald »

Arguments on their own are fine, but allowing metacharacters can be a security risk. We block those characters to keep people from chaining commands and potentially compromising a system. For example, you could have check_nrpe call something like:

check_disk -w 20% -c 30% && cat /etc/passwd

which is why we disallow the ampersand.
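The kind of filtering involved can be sketched in Python as below. The character set here is an illustrative assumption, not the exact list used by NRPE or NSClient++, and the function name is hypothetical. If the backslash is in the agent's blocked set, that would also explain why the counter path triggered the metacharacter error.

```python
# Illustrative set only; the real list is defined by the agent.
SUSPICIOUS_CHARS = set("|`&><'\"\\[]{};")


def find_metacharacters(argument: str) -> list:
    """Return the distinct suspicious characters found in an argument."""
    return sorted({ch for ch in argument if ch in SUSPICIOUS_CHARS})


print(find_metacharacters("-w 20% -c 30% && cat /etc/passwd"))
print(find_metacharacters(r"Counter=\System\System Up Time"))
print(find_metacharacters("MinCrit=600"))
```

An agent that rejects any argument where this list is non-empty stops the chained command, at the cost of also rejecting legitimate arguments such as counter paths.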
Former Nagios employee
stebbo
Posts: 75
Joined: Sat Aug 04, 2012 9:13 pm

Re: Email on server reboot

Post by stebbo »

Thanks for all the help, everyone. This seems to be working quite well. I especially like that I can graph the uptime. The graphs don't look particularly nice, but they keep a history of the reboots, which is a bonus.