I would like some help with Nagios XI.
We frequently receive a notification that says "CRITICAL - 10.20.20.58: rta nan, lost 100%".
When we investigated the server in question, it appeared to be up the whole time: it is pingable (no packet loss) and we can SSH to the box.
We checked the server's uptime and there have been no reboots or shutdowns.
I also checked the switch port that this LAN interface is connected to, and it shows no downtime either.
Please note that no system updates have been applied to this server and it is not connected to the internet.
Has anyone encountered this in their environment, and if so, what was your solution?
Thank you very much.
NagiosXI keeps sending notification "CRITICAL - x.x.x.x: rt
-
support.lta
- Posts: 10
- Joined: Mon Jul 03, 2017 9:28 am
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: NagiosXI keeps sending notification "CRITICAL - x.x.x.x:
@support.lta, Is the host check OK or in Critical right now? If this happens irregularly, I'd add a timeout value. Perhaps when the server is busy it responds slower and the check times out. What's the name of the server?
Re: NagiosXI keeps sending notification "CRITICAL - x.x.x.x:
I see the following entry in the mariadb log:
Version: '5.5.52-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
171212 10:24:47 [Warning] IP address '172.30.61.50' could not be resolved: Name or service not known
Does this IP exist on your server? Have you changed the IP of your Nagios XI server lately?
Also, you have multiple nagios processes running, which can cause the issue:
nagios 5064 1 0 2017 ? 03:06:27 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 5420 1 0 2017 ? 02:58:58 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 8652 1 0 Jan19 ? 00:19:29 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12187 1 0 2017 ? 03:03:52 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12219 1 0 2017 ? 03:03:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12714 1 0 2017 ? 02:56:43 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 19675 1 0 2017 ? 02:59:47 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 28688 1 0 2017 ? 03:05:13 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 31439 1 0 2017 ? 02:43:15 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 31711 1 0 2017 ? 02:57:23 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
In order to resolve the issue, run the following commands from the command line:
Code: Select all
service nagios stop
killall nagios
service nagios start
Note: This problem was more common with some of the older versions of Nagios XI. The issue is rarely seen in the newer version of XI. I would recommend that you upgrade to the latest version of Nagios XI.
Let us know if this helped. Thank you!
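For anyone who wants to check for the duplicate-daemon condition described in this post, here is a minimal sketch. It assumes the `ps -ef` column layout shown in the listing (PPID in column 3, command in column 8 onward); `count_nagios_daemons` is a hypothetical helper name, not part of Nagios XI. It counts only `nagios -d` processes whose parent is PID 1, because a healthy daemon may also have child processes that share the same command line.

```shell
# Hypothetical helper (not shipped with Nagios XI): reads `ps -ef` style
# output on stdin and prints how many top-level `nagios -d` daemons it sees.
# Assumes the column layout UID PID PPID C STIME TTY TIME CMD.
count_nagios_daemons() {
    # $3 is the parent PID, $8 the executable path, $9 its first argument.
    awk '$3 == 1 && $8 ~ /\/nagios$/ && $9 == "-d" { n++ } END { print n + 0 }'
}
```

A typical use would be `ps -ef | count_nagios_daemons`. A result greater than 1 suggests that several instances are running at once, as in the listing above. After the stop/killall/start sequence the count should be 1.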
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
support.lta
- Posts: 10
- Joined: Mon Jul 03, 2017 9:28 am
Re: NagiosXI keeps sending notification "CRITICAL - x.x.x.x:
npolovenko wrote:@support.lta, Is the host check OK or in Critical right now? If this happens irregularly, I'd add a timeout value. Perhaps when the server is busy it responds slower and the check times out. What's the name of the server?
It is not critical at the moment because it recovers quickly, within seconds. That is frustrating because the notification is sent across the team and it is false.
This server is our application server, and the interface in question is the one going to the backup.
How do I add or change the timeout value? We are quite new to Nagios XI.
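One way to try the timeout suggestion by hand before changing any configuration is sketched below. The "rta nan, lost 100%" text looks like output from the `check_icmp` plugin, so the sketch assumes that plugin is installed under `/usr/local/nagios/libexec` (an assumption; adjust `PLUGIN_DIR` if yours differs). `icmp_check_cmd` is a hypothetical helper, and the warning/critical thresholds are illustrative values only. The `-H`, `-w`, `-c`, `-n` and `-t` options are standard `check_icmp` flags for host, warning threshold, critical threshold, packet count and timeout in seconds.

```shell
# Hypothetical helper: prints (does not run) a check_icmp command line with an
# explicit timeout, so it can be reviewed and then run from the Nagios server.
# Arguments: host, timeout in seconds (default 10), packet count (default 5).
icmp_check_cmd() {
    host=$1
    timeout=${2:-10}
    packets=${3:-5}
    # Assumed plugin location; override by setting PLUGIN_DIR.
    plugin_dir=${PLUGIN_DIR:-/usr/local/nagios/libexec}
    # Thresholds are illustrative: warn at 3000 ms or 80% loss,
    # go critical at 5000 ms or 100% loss.
    printf '%s/check_icmp -H %s -w 3000.0,80%% -c 5000.0,100%% -n %s -t %s\n' \
        "$plugin_dir" "$host" "$packets" "$timeout"
}
```

For example, `icmp_check_cmd 10.20.20.58 30` prints a command with a 30-second timeout that can be run as the nagios user. If the manual run reports OK while the scheduled check still flaps, the timeout is probably not the cause.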
-
support.lta
- Posts: 10
- Joined: Mon Jul 03, 2017 9:28 am
Re: NagiosXI keeps sending notification "CRITICAL - x.x.x.x:
lmiltchev wrote:I see the following entry in the mariadb log:
Version: '5.5.52-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
171212 10:24:47 [Warning] IP address '172.30.61.50' could not be resolved: Name or service not known
Does this IP exist on your server? Have you changed the IP of your Nagios XI server lately?
Also, you have multiple nagios processes running, which can cause the issue:
nagios 5064 1 0 2017 ? 03:06:27 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 5420 1 0 2017 ? 02:58:58 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 8652 1 0 Jan19 ? 00:19:29 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12187 1 0 2017 ? 03:03:52 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12219 1 0 2017 ? 03:03:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 12714 1 0 2017 ? 02:56:43 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 19675 1 0 2017 ? 02:59:47 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 28688 1 0 2017 ? 03:05:13 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 31439 1 0 2017 ? 02:43:15 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 31711 1 0 2017 ? 02:57:23 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
In order to resolve the issue, run the following commands from the command line:
Code: Select all
service nagios stop
killall nagios
service nagios start
Note: This problem was more common with some of the older versions of Nagios XI. The issue is rarely seen in the newer version of XI. I would recommend that you upgrade to the latest version of Nagios XI.
Let us know if this helped. Thank you!
Thanks for looking into this.
The IP in question is 10.20.20.58, and we have not changed any IP on our Nagios XI server.
Regarding these processes, we will look into them as well.
Please note, though, that we have not applied any updates to our Nagios XI for a year.
We did not encounter any of these issues until 2-3 months ago. Is there any permanent solution for this?
This is a production server, and everyone is alarmed whenever a notification comes in, especially a CRITICAL one.
I will propose this upgrade to our team. Is it something we can simply run, or does it involve several steps and configuration changes?
Again, thank you for the response.
Re: NagiosXI keeps sending notification "CRITICAL - x.x.x.x:
support.lta wrote:We did not encounter any of these issues until 2-3 months ago. Is there any permanent solution for this?
It is possible that you added more checks, some of which may take a very long time to execute. If nagios doesn't exit "cleanly", and you have some timeouts, you may end up with multiple nagios processes (instances of nagios) running on the same server. It is hard to say what caused the issue after the fact. If you start experiencing the issue again, open a new ticket here: https://support.nagios.com/tickets and upload your profile (Admin > System Profile > Download Profile).
support.lta wrote:I will propose this upgrade to our team. Is it something we can simply run, or does it involve several steps and configuration changes?
I would recommend fixing the issue with the multiple nagios instances first.
Code: Select all
service nagios stop
killall nagios
service nagios start
https://assets.nagios.com/downloads/nag ... ctions.pdf