Interesting Situation...
-
rkymtnhigh
- Posts: 95
- Joined: Tue May 12, 2015 11:53 am
Interesting Situation...
Here is what is happening:
We have two Nagios servers, both external to our network, that run Perl scripts to check that our service is operational.
One is hosted with AWS, another with Digital Ocean.
From time to time, one install will show that the service is DOWN, but the other will remain UP.
Both locations exhibit this symptom, at seemingly random times.
We obviously need this service to let us know if we go DOWN, but we don't want to be falsely alerted (or woken up) if it is an issue isolated to ONE hosting provider.
We have reached out to both providers, but neither has been helpful in identifying any issues on their end.
So for now, we have one install muted (the most recently problematic install) and only get alerts from the "longest-running service uptime" Nagios server.
However, that server eventually has issues too, and then we switch. This is not a great solution going forward, as we still get alerted when the issue appears to be isolated to ONE hosting provider.
My question is, does anyone know of a way to "link" the two Nagios installs to make them aware of each other, and possibly ONLY alert when BOTH service checks are down?
Thank you so much, this has been quite the headache for our Operations team!
RMH
Re: Interesting Situation...
You should be able to use an agent (NRPE) on both of the XI machines (one on DO, one on AWS). From there, you'll want to set up one or both machines to run check_nrpe.
Using check_nrpe you should be able to run a remote check from each machine, and then use BPI to group these checks. Here's an example of the checks that would run on both machines using check_http; let's call your LAN nagios.com. This is only an outline, and you'll still need to set up a check_http command with NRPE:
AWS:
check_http -H nagios.com
check_nrpe -H DO -c check_http
DO:
check_http -H nagios.com
check_nrpe -H AWS -c check_http
Now, using the BPI wizard you should be able to accomplish this. You can have both machines notify only if both services are down.
Will this work for you?
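For illustration only, the "notify only if both are down" rule can also be sketched as a small shell helper. This is not BPI, just a plain wrapper showing the same AND logic on Nagios plugin exit codes (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN). The function name combine_states is hypothetical, and returning WARNING when the two sides disagree is an assumption about what you would want.

```shell
#!/bin/sh
# Hypothetical helper: combine the exit codes of two checks (local and remote)
# so that the result is CRITICAL only when BOTH vantage points are CRITICAL.
# Nagios plugin exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
combine_states() {
    local_rc=$1
    remote_rc=$2
    if [ "$local_rc" -eq 2 ] && [ "$remote_rc" -eq 2 ]; then
        echo 2    # both sides see the service down: page someone
    elif [ "$local_rc" -eq 0 ] && [ "$remote_rc" -eq 0 ]; then
        echo 0    # both sides see the service up
    else
        echo 1    # the sides disagree (or one is WARNING/UNKNOWN): warn only
    fi
}

# Possible usage inside a wrapper plugin (not run here; host names are placeholders):
#   /usr/local/nagios/libexec/check_http -H nagios.com;           local_rc=$?
#   /usr/local/nagios/libexec/check_nrpe -H DO -c check_http;     remote_rc=$?
#   exit "$(combine_states "$local_rc" "$remote_rc")"
```

With notifications enabled only for CRITICAL on the wrapper service, a failure seen from a single provider would show up as a WARNING instead of paging anyone.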
Former Nagios Employee
-
rkymtnhigh
- Posts: 95
- Joined: Tue May 12, 2015 11:53 am
Re: Interesting Situation...
Thank you! That does sound like it should work.
Just to clarify a couple things, you are suggesting installing the nagios client / NRPE on both CentOS Nagios boxes?
Something like this: http://ithelpblog.com/os/linux/redhat/c ... -6-3-rhel/
Then set a check on DO to check AWS's service availability? (And vice versa)
Then set up the Nagios boxes as hosts on the other Nagios boxes? Let's say our app check is check_app. Add a custom "check_remote_app" command that uses check_nrpe and runs check_app on the remote host?
Once that is working, then use BPI wizard to somehow link the two? I don't have any experience using that tool, but can probably figure it out.
Thank you,
RMH
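As a rough sketch of the setup described in the post above, the two pieces might look like this. check_app and check_remote_app are the names from the post; the plugin path under /usr/local/nagios/libexec is an assumption, so adjust it to wherever check_app actually lives.

```
# On the remote box, in nrpe.cfg: expose the local app check to NRPE.
# (Assumed path; check_app is the example name from the post.)
command[check_app]=/usr/local/nagios/libexec/check_app

# On the monitoring box: a command that runs check_app on the remote host via NRPE.
define command {
    command_name    check_remote_app
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_app
}
```

A service on each Nagios box would then use check_remote_app against the other box's host definition, alongside its own local check_app service.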
-
rkymtnhigh
- Posts: 95
- Joined: Tue May 12, 2015 11:53 am
Re: Interesting Situation...
Well, I didn't get too far. When trying to install NRPE and the client I get this:
Code:
--> Processing Conflict: nagiosxi-deps-5.2.3-1.noarch conflicts nagios-nrpe
--> Processing Conflict: nagiosxi-deps-5.2.3-1.noarch conflicts nrpe
Re: Interesting Situation...
You got it. Do you perhaps have NRPE installed already?
What command were you running to produce the conflict?
Former Nagios Employee
-
rkymtnhigh
- Posts: 95
- Joined: Tue May 12, 2015 11:53 am
Re: Interesting Situation...
It says NRPE is available, but not installed. Something along those lines.
I am running the following:
Code:
yum install nagios-nrpe nagios-devel
When I try to start the nrpe service, it gives me "unrecognized service".
Thank you.
Re: Interesting Situation...
Give this document a look for information on installing NRPE. https://assets.nagios.com/downloads/nag ... ios-XI.pdf
Were you following a certain set of instructions that said to install those packages?
EDIT: ^ just saw that was in the link you posted. It may be because XI is running on the current machine.
EDIT2: Can you post the result of a yum repolist?
Former Nagios Employee
Re: Interesting Situation...
Most probably NRPE is running under xinetd on these machines. Do you have the following file on either of these boxes - "/etc/xinetd.d/nrpe"? What is the output of the following commands?
Code:
service xinetd restart
netstat -an | grep 5666
/usr/local/nagios/libexec/check_nrpe -H localhost
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Interesting Situation...
@lmiltchev is right - NRPE is installed on the Nagios XI machine by default (I didn't realize this). If 'localhost' does not work, try 127.0.0.1.
You'll need to modify the /etc/xinetd.d/nrpe file, specifically the only_from line, so that each server accepts NRPE connections from the other.
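As a sketch of that change, the only_from line takes a space-separated list of allowed addresses. The address 203.0.113.10 below is a placeholder for the other Nagios server's public IP, not a real value from this thread.

```
# /etc/xinetd.d/nrpe (only the relevant line shown)
# 127.0.0.1 keeps local checks working; 203.0.113.10 stands in for the peer server.
only_from       = 127.0.0.1 203.0.113.10
```

Make the matching change on the other server with the addresses swapped, then restart xinetd on both (service xinetd restart) so the new list takes effect.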
Former Nagios Employee
-
rkymtnhigh
- Posts: 95
- Joined: Tue May 12, 2015 11:53 am
Re: Interesting Situation...
Looks like it is installed under xinetd.
I see that port listening, NRPE v2.14
Now I just need to set up my checks and get everything configured. Will report back.
Thank you!