Not getting alerts when Wan link goes down in Network device
-
shinuvarghesea
- Posts: 7
- Joined: Fri Nov 17, 2017 3:35 pm
Not getting alerts when Wan link goes down in Network device
While configuring, I enabled Nagios to monitor services on the network devices.
Nagios auto-discovers the ports on every firewall, including the WAN ports.
I have two internet links attached to the firewall, one primary and one backup, each connected to a WAN port.
Using Nagios I can see the bandwidth utilization of those WAN ports, but when a port or link goes down (say the primary internet link on the wan1 port), it does not raise any alert, nor does it show the wan1 service as down. It always shows green "OK", even when the firewall shows the link as down.
How does Nagios monitoring of network device services work? What can I do to get alerts when either WAN port goes down?
Re: Not getting alerts when Wan link goes down in Network de
What is the check that you are using to monitor the status of your port? How often are you checking it? Are you using the check_ifoperstatus plugin?
Here's an example of two checks - one OK, and one critical:
Code: Select all
/usr/local/nagios/libexec/check_ifoperstatus -H 192.168.x.x -C community -k 1 -v 2 -p 161
OK: Interface 1 (index 1) is up.
/usr/local/nagios/libexec/check_ifoperstatus -H 192.168.x.x -C community -k 10 -v 2 -p 161
CRITICAL: Interface 10 (index 10) is down.
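Nagios decides the service state from the plugin's exit code, following the standard plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The sketch below is a small helper, not part of Nagios itself; the function names `state_name` and `run_check` are made up for illustration. It runs any plugin command line by hand and prints the state Nagios would record for it.

```shell
# Map a plugin exit code to the state name Nagios uses for it.
# 0=OK, 1=WARNING, 2=CRITICAL; anything else is treated as UNKNOWN here.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run a plugin command line and report its state and output.
# Example (host, community and index are placeholders, as in the post):
#   run_check /usr/local/nagios/libexec/check_ifoperstatus -H 192.168.x.x -C community -k 1
run_check() {
    local output rc
    output=$("$@" 2>&1)   # capture plugin output
    rc=$?                 # exit code of the plugin itself
    printf '%s -> %s\n' "$(state_name "$rc")" "$output"
    return "$rc"
}
```

If a hand-run check reports OK while the link is known to be down, the interface index passed with -k is a likely place to look first.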
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
shinuvarghesea
- Posts: 7
- Joined: Fri Nov 17, 2017 3:35 pm
Re: Not getting alerts when Wan link goes down in Network de
Hi,
I am not very well versed in Nagios commands. I configured it using the GUI, and in the Core Config Manager (CCM) I can see the following check command, which is active for monitoring and is the same as the one you mentioned.
$USER1$/check_ifoperstatus -H $HOSTADDRESS$ -C $ARG1$ -k $ARG2$ $ARG3$
After executing "Run Check Command" I see the output below:
/usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C abcd -k efghij12345 12
where abcdefghij12345 is the value of my community string.
I have set it to run every 5 minutes.
Re: Not getting alerts when Wan link goes down in Network de
Instead of this:
Code: Select all
/usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C abcd -k efghij12345 12
try running:
Code: Select all
/usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C abcdefghij12345 -k 12
It seems like your community string was "split" by the "-k" flag. I don't know why this happened... It's possible that you modified your service in the CCM, and accidentally messed it up.
-
shinuvarghesea
- Posts: 7
- Joined: Fri Nov 17, 2017 3:35 pm
Re: Not getting alerts when Wan link goes down in Network de
Hi there,
I did not make any changes in the CCM; the -k flag is there by default for all devices configured using the GUI.
Even if I configure a new device, I get that flag.
I have already tried running the command you suggested, with the community string no longer split by the -k flag, but it gives an error:
Code: Select all
/usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C abcdefghij12345 -k 12
efghij12345: event not found
Note that the string in the error line starts with the part that comes after -k in the default configuration.
I should add that the letter "e" in this sample string is actually the symbol "!" in the original string I am using. I am not sure whether the "!" has anything to do with the string being split or with the error.
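"event not found" is the message an interactive bash session prints when history expansion is triggered by "!", so the error above most likely comes from the shell rather than from the plugin. A minimal sketch of a workaround for manual testing, assuming that is the cause: single-quote the community string so the shell passes it through untouched. The value `abcd!fghij12345` is a stand-in, not the real string, and `show_check` is a made-up helper that only prints the command instead of contacting a device.

```shell
# Stand-in community string containing "!" (not the poster's real value).
# Single quotes stop the shell from interpreting "!" or anything else in it.
community='abcd!fghij12345'

# Print the plugin command line that would be run, with the community
# string passed as a single argument.
show_check() {
    echo /usr/local/nagios/libexec/check_ifoperstatus -H "$1" -C "$2" -k "$3"
}

show_check 10.x.x.x "$community" 12
# prints: /usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C abcd!fghij12345 -k 12
```

In an interactive bash session, `set +H` turns history expansion off altogether. Either way this only affects commands typed by hand; it does not change how Nagios itself splits the arguments.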
Re: Not getting alerts when Wan link goes down in Network de
The "!" could definitely cause issues, as it is used as a delimiter in Nagios. You could try placing your community string in the resource.cfg file and using a user macro. Read more on the topic here:
https://assets.nagios.com/downloads/nag ... Macros.pdf
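To illustrate the delimiter problem: in a service's check_command, the command name and its arguments are separated by "!", and the pieces become $ARG1$, $ARG2$, and so on. The sketch below is a simplified model of that splitting, not Nagios code; it ignores any escaping rules, and the command name `my_ifoperstatus_check` and the community values are made up. It shows how a community string containing "!" ends up spread across two arguments, which matches the broken command line seen earlier in the thread.

```shell
# Simplified model: split a check_command value on "!" and print the
# resulting $ARGn$ values. The first field (the command name) is dropped.
split_check_command() {
    local old_ifs="$IFS" i=1 arg
    IFS='!'
    set -f                  # disable globbing while splitting
    set -- $1               # unquoted on purpose: split on "!"
    set +f
    IFS="$old_ifs"
    [ $# -gt 0 ] && shift   # drop the command name
    for arg in "$@"; do
        echo "ARG$i=$arg"
        i=$((i + 1))
    done
}

# Community string with "!" in it: it is cut in two.
split_check_command 'my_ifoperstatus_check!abcd!fghij12345!12'
# ARG1=abcd
# ARG2=fghij12345
# ARG3=12

# Community string without "!": the arguments line up as intended.
split_check_command 'my_ifoperstatus_check!abcdXfghij12345!12'
# ARG1=abcdXfghij12345
# ARG2=12
```

Referencing a user macro in the argument field keeps the literal "!" out of the check_command line, which is the point of the resource.cfg suggestion.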
-
shinuvarghesea
- Posts: 7
- Joined: Fri Nov 17, 2017 3:35 pm
Re: Not getting alerts when Wan link goes down in Network de
Hi,
So I have created a user macro, $USER9$.
Now I am getting the error below:
WARNING: SNMP error: No response from remote host '10.x.x.x'
I verified that the value of the user macro is correct.
I tried using $USER9$ to configure a random device (through the GUI) instead of entering my complete string, and it was able to discover everything properly.
The SNMP string is correct. What could this error be?
Re: Not getting alerts when Wan link goes down in Network de
The user macro should work in the GUI. It will probably not resolve when you run the check from the command line. When you set it up in the GUI, and force an immediate check, does it work then?
Can you go to the CCM > Services > <your service>, and show us a screenshot of the page?
Also, go to Home > Service Status > <your service> > Force an immediate check (under Quick Actions), and show a screenshot of this page.
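Since $USER9$ is not resolved by the shell, one way to test from the command line is to read the value out of the resource file and pass it to the plugin yourself. This is only a sketch: `lookup_user_macro` is a made-up helper, and the path /usr/local/nagios/etc/resource.cfg is an assumption about where the file lives on this system.

```shell
# Print the value of a $USERn$ macro from a resource file.
# Lines in the file are expected to look like:  $USER9$=value
#   $1 = macro number, $2 = path to the resource file
lookup_user_macro() {
    sed -n 's/^[[:space:]]*\$USER'"$1"'\$=//p' "$2" | head -n 1
}

# Example usage (path and host are assumptions or placeholders):
#   community=$(lookup_user_macro 9 /usr/local/nagios/etc/resource.cfg)
#   /usr/local/nagios/libexec/check_ifoperstatus -H 10.x.x.x -C "$community" -k 12
```

Keeping "$community" in double quotes passes the value to the plugin as a single argument, whatever characters it contains.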
-
shinuvarghesea
- Posts: 7
- Joined: Fri Nov 17, 2017 3:35 pm
Re: Not getting alerts when Wan link goes down in Network de
Hi,
Below are the screenshots.
I am getting the warning in the GUI as well, and the string seems to be fine in the CCM.
Re: Not getting alerts when Wan link goes down in Network de
Edit your service in the CCM and wrap the $USER9$ macro in the $ARG1$ field in double quotes. Save and apply the configuration, then schedule a forced immediate check again. Does it work now?