I am running Nagios XI 2014R2.7 with NSClient++ on the Windows servers, and I am starting to see more and more service checks fire off critical alerts with "Return code of 255 is out of bounds". The server I am running Nagios on is a VM with 16 vCPUs and 8GB of RAM. I am currently monitoring 1985 hosts and 11244 service checks. The NSClient++ version is 0.4.3.131 (2015-02-15), and the agents themselves seem to be doing fine.
I am not sure if this is network related, an agent issue, or if I need to add more CPU or RAM to my Nagios XI server. When I first implemented this, things were going well, but recently I get a few of these a day, which really gets people barking mad when they are woken up and find the service is all right.
Thoughts?
(Return code of 255 is out of bounds)
Re: (Return code of 255 is out of bounds)
Is it the same kind of check each time this happens?
Former Nagios Employee.
Re: (Return code of 255 is out of bounds)
They vary, which makes me believe my Nagios server is getting overworked or there is network congestion.
One time it could happen on ServerA for disk check, another on ServerB for memory and the next day nothing then the following day completely different.
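For context, Nagios plugins are expected to exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN); any other exit code is what gets reported as out of bounds. Below is a minimal sketch of a stopgap wrapper that retries a plugin when its exit code falls outside that range and reports UNKNOWN if it never recovers. The names `run_check` and `main`, the retry count, and the delays are hypothetical and not part of Nagios XI or NSClient++; this hides the symptom rather than fixing the cause.

```python
import subprocess
import sys
import time

VALID_CODES = {0, 1, 2, 3}  # OK, WARNING, CRITICAL, UNKNOWN
UNKNOWN = 3


def run_check(command, retries=2, delay=1.0, timeout=30):
    """Run a plugin command (list of args); retry if its exit code is not 0-3."""
    code, output = None, ""
    for attempt in range(retries + 1):
        try:
            proc = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout
            )
            code, output = proc.returncode, proc.stdout.strip()
        except (subprocess.TimeoutExpired, OSError) as exc:
            # Timeouts and launch failures are treated like an invalid result.
            code, output = None, str(exc)
        if code in VALID_CODES:
            return code, output
        if attempt < retries:
            time.sleep(delay)
    # Every attempt failed: report UNKNOWN instead of an out-of-bounds code.
    message = "UNKNOWN: last exit code %s was out of bounds (%s)" % (code, output)
    return UNKNOWN, message


def main(argv):
    """Hypothetical entry point: argv is the plugin path followed by its args."""
    if not argv:
        print("usage: wrapper.py <plugin> [args...]", file=sys.stderr)
        return UNKNOWN
    code, output = run_check(argv)
    print(output)
    return code
```

To use it as a check command, the script would end with `sys.exit(main(sys.argv[1:]))` and the command definition would point at the wrapper with the real plugin as its first argument.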
Re: (Return code of 255 is out of bounds)
There are some performance tweaks you're going to want to do when you get to that level of checks.
First off, I would bump your RAM up to *at least* 16GB.
Implementing a RAM disk is also something that can really enhance your performance, and it is generally recommended by us when you have a large number of checks.
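Once a RAM disk is in place, it can be worth confirming that the directory Nagios writes to really sits on it. A rough sketch, assuming a Linux host where mount information can be read from `/proc/mounts`; the function name `filesystem_type` and the example path are hypothetical and should be replaced with whatever path your configuration uses.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts (device, mount point,
    type, options, ...). The longest matching mount point wins.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later mount on the same point shadow an earlier one.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_ramdisk(path, mounts_file="/proc/mounts"):
    """True if `path` lives on a tmpfs or ramfs mount (Linux only)."""
    with open(mounts_file) as handle:
        return filesystem_type(path, handle.read()) in ("tmpfs", "ramfs")
```

For example, `is_on_ramdisk("/path/to/your/ramdisk/spool")` should return `True` if the mount took effect; a `False` here would suggest the check results are still going to regular disk.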
Re: (Return code of 255 is out of bounds)
I'll see what we can bump the RAM up to. I'll try to get 32GB, but no less than 16GB. If that doesn't resolve the issue I'll look at adding the RAM disk, or do you think I should do that anyway?
Re: (Return code of 255 is out of bounds)
I think it would be a good idea, but if resources are limited, bumping to at least 16GB would be a good place to start.
Re: (Return code of 255 is out of bounds)
I have created the RAM disk and increased the RAM on the VM to 32GB, and I am still getting clients that send critical notifications with this message. What else can I do?
Re: (Return code of 255 is out of bounds)
You could upgrade the NSClient++ agent on the Windows host and see if that helps.
One thing you can check is whether the Windows system is dropping the network connections from the Nagios server.
That could be causing the intermittent failures.
Take a look at this link and see if that helps.
https://support.microsoft.com/en-us/kb/2553549
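One way to look for dropped connections independently of Nagios is to open a series of plain TCP connections to the agent port from the Nagios server and log any that fail. This is only a sketch: the function name `probe` is hypothetical, and the port is an assumption (12489 is the usual default for the NSClient++ check_nt listener and 5666 for NRPE, so use whichever your checks actually talk to).

```python
import socket
import time


def probe(host, port, attempts=20, timeout=5.0, pause=0.5):
    """Open `attempts` TCP connections to host:port and collect failures.

    Returns a list of (attempt_index, seconds_elapsed, error_text) tuples,
    one per failed connection; an empty list means every connect succeeded.
    """
    failures = []
    for index in range(attempts):
        start = time.monotonic()
        try:
            # Connect and close immediately; only the handshake is tested.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError as exc:
            failures.append((index, time.monotonic() - start, str(exc)))
        time.sleep(pause)
    return failures
```

Running something like `probe("serverA.example.com", 12489, attempts=200)` (hostname hypothetical) during a period when the alerts tend to fire would show whether connects are being refused or timing out at the network level.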
Re: (Return code of 255 is out of bounds)
Our servers get rebooted every month during our maintenance window, so the Windows article doesn't appear to be relevant.
Are you recommending I update all my clients? Also, what is the maximum number of hosts and service checks a single Nagios XI instance can handle, or is recommended to handle? Is there a way to load balance Nagios XI instances, or is there something that will help offload some of the work from the primary Nagios server? Just digging here, as these are questions I know management will come back with.
scottwilkerson (DevOps Engineer, Nagios Enterprises)
Re: (Return code of 255 is out of bounds)
First, can we go back and understand your configuration? Is XI actively checking the hosts, or are they sending passive results?
What is the check frequency of these checks?
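The check frequency matters because it sets how many active checks the server has to launch per second. A back-of-the-envelope sketch using the counts from the first post (1985 hosts, 11244 services); the 5-minute interval is purely an assumption for illustration, since the actual interval has not been stated in this thread.

```python
def checks_per_second(num_checks, interval_minutes):
    """Average active checks the server must start each second."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return num_checks / (interval_minutes * 60.0)


# Counts taken from the original post.
HOSTS = 1985
SERVICES = 11244

# Assumed interval (not stated in the thread); adjust to your real setting.
ASSUMED_INTERVAL_MINUTES = 5

service_rate = checks_per_second(SERVICES, ASSUMED_INTERVAL_MINUTES)
total_rate = checks_per_second(SERVICES + HOSTS, ASSUMED_INTERVAL_MINUTES)

print("service checks per second: %.2f" % service_rate)
print("host + service checks per second: %.2f" % total_rate)
```

Halving the interval doubles the rate, so a handful of 1-minute checks spread across thousands of services can change the load picture considerably.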