Reboot monitoring for Linux servers

informatica
Posts: 99
Joined: Thu Jan 28, 2021 9:55 pm

Reboot monitoring for Linux servers

Post by informatica »

Hi Team,

We would like to monitor reboots of our Linux servers.
For example, with the standard Nagios settings (check every 5 min, retry 1 min, max attempts 5), we only find out that a server has rebooted about 5 minutes afterwards.

But we need an alert whenever a server is rebooted or restarted.
Is there a service configuration that alerts us when a server has been rebooted, without changing the host check interval?
Please let us know if such a plugin exists for Linux servers.

We are using NRPE as the agent on the Linux servers.
jdunitz
Posts: 235
Joined: Wed Feb 05, 2020 2:50 pm

Re: Reboot monitoring for Linux servers

Post by jdunitz »

You could do something like this as a plugin script:

Code: Select all

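#!/bin/bash
# Reboot alert: CRITICAL while the server has been up for less than 60 seconds.

# Whole seconds since boot (first field of /proc/uptime, fraction dropped).
# Not named SECONDS, which bash treats as a special variable (its own timer).
uptime_seconds=$(cut -d. -f1 /proc/uptime)

if [ "$uptime_seconds" -lt 60 ]; then
    result="CRITICAL"
    exitstatus=2
else
    result="OK"
    exitstatus=0
fi

echo "$result - uptime is $uptime_seconds"
exit $exitstatus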

You could modify or add on to this to make it behave how you want, if you needed it to be fancier.
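
As a rough sketch of one such modification: the alert window can be passed in as a parameter, so it can be set wider than the check interval (with a 5-minute interval, a 60-second window can fall entirely between two checks). The helper name `check_recent_reboot` is hypothetical, not a stock Nagios plugin, and the 600-second value in the usage note is only an assumed example.

```shell
#!/bin/bash
# Sketch only: reboot alert with an adjustable window.
# check_recent_reboot is a hypothetical helper name, not a stock Nagios plugin.

# Usage: check_recent_reboot UPTIME_SECONDS WINDOW_SECONDS
# Prints a Nagios-style status line and returns the plugin status
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
check_recent_reboot() {
    # Both arguments must be plain non-negative integers.
    for value in "$1" "$2"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "UNKNOWN - expected uptime and window as whole seconds"
                return 3
                ;;
        esac
    done

    # Uptime shorter than the window means the server rebooted recently.
    if [ "$1" -lt "$2" ]; then
        echo "CRITICAL - rebooted ${1}s ago (window ${2}s)"
        return 2
    fi

    echo "OK - uptime is ${1}s"
    return 0
}
```

To run it as the plugin, the script could end with something like `check_recent_reboot "$(cut -d. -f1 /proc/uptime)" "${1:-600}"; exit $?`, which takes the window from the first argument and falls back to the assumed 600 seconds.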

--Jeffrey
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
informatica

Re: Reboot monitoring for Linux servers

Post by informatica »

Hi Team,

I don't know how this is supposed to work. I ran the script and tested it by restarting the server, but it is not working as expected.

[nagios@in-root]$ /usr/local/nagios/libexec/check_nrpe -H inl77 -c check_uptime_minute
OK - uptime is 822
You have new mail in /var/spool/mail/root

The server was restarted and its uptime is 6 min. The script you posted tests -lt 60 (seconds), yet no alert was generated.
[root@test libexec]# uptime
04:40:28 up 6 min, 1 user, load average: 0.02, 0.10, 0.05
jdunitz

Re: Reboot monitoring for Linux servers

Post by jdunitz »

Sorry, a couple lines got cut off when I pasted the script. Here's the full one:

Code: Select all

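#!/bin/bash
# Reboot alert: CRITICAL while the server has been up for less than 60 seconds.

# Whole seconds since boot (first field of /proc/uptime, fraction dropped).
# Not named SECONDS, which bash treats as a special variable (its own timer).
uptime_seconds=$(cut -d. -f1 /proc/uptime)

if [ "$uptime_seconds" -lt 60 ]; then
    result="CRITICAL"
    exitstatus=2
else
    result="OK"
    exitstatus=0
fi

echo "$result"
exit $exitstatus

And you can see how it works: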
After rebooting the other machine, as soon as it came back up and became reachable, I ran the check and got critical:

Code: Select all

[root@jpd-nagiosxi-one libexec]#  ./check_nrpe -H 192.168.1.10  -2  -c check_uptime_minute
CRITICAL

Then I waited another minute or so, and reran it, and:

Code: Select all

[root@jpd-nagiosxi-one libexec]#  ./check_nrpe -H 192.168.1.10  -2  -c check_uptime_minute
OK
[root@jpd-nagiosxi-one libexec]#
There you go!
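
If the check interval has to stay at 5 minutes, another option is to drop the uptime window and instead compare a boot identifier with the one recorded by the previous check. The following is only a sketch under assumptions: `reboot_status` is a made-up helper name, the state file path is up to you and must be writable by the user NRPE runs as, and it relies on `/proc/sys/kernel/random/boot_id`, which the Linux kernel regenerates on every boot.

```shell
#!/bin/bash
# Sketch only: alert once on the first check after a reboot.
# reboot_status is a hypothetical helper name, not a stock Nagios plugin.

# Usage: reboot_status CURRENT_BOOT_ID STATE_FILE
# Prints a Nagios-style status line and returns the plugin status
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
reboot_status() {
    current_id="$1"
    state_file="$2"
    previous_id=""

    # Read the boot ID recorded by the previous check, if there was one.
    if [ -r "$state_file" ]; then
        previous_id=$(cat "$state_file")
    fi

    # Record the current boot ID for the next check.
    if ! printf '%s\n' "$current_id" > "$state_file"; then
        echo "UNKNOWN - cannot write $state_file"
        return 3
    fi

    if [ -z "$previous_id" ]; then
        echo "OK - first run, boot ID recorded"
        return 0
    fi

    # A different boot ID means the server booted since the last check.
    if [ "$previous_id" != "$current_id" ]; then
        echo "CRITICAL - server was rebooted since the last check"
        return 2
    fi

    echo "OK - no reboot since the last check"
    return 0
}
```

As a plugin, the script could end with something like `reboot_status "$(cat /proc/sys/kernel/random/boot_id)" /var/tmp/check_reboot.state; exit $?` (the state file path here is an assumed example). Because the result goes back to OK on the very next check, the service would probably need max check attempts set to 1 so that the single CRITICAL result triggers a notification.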

--Jeffrey