Nagios windows agent issue

Posted: Mon Mar 12, 2018 10:18 am
by padu_3891
Hello Team,

My Nagios server is a virtual machine. All of a sudden, "Unable to establish communication with Agent" alerts were triggered for 100 servers. When I ran the check command manually, the first execution returned results, but the second execution (immediately after the first) failed with "Unable to establish communication with Agent", and the pattern kept repeating.

The issue persisted for 3 consecutive days; now everything is back to normal.

What could be the cause of the issue?

Is it related to the network?

We have checked the Nagios server load, CPU, etc., and everything looks fine. We also checked with the network team, and they found no issues either.

Please let us know how we can find the cause of this issue so that we can take preventive action.

Thank you,
Padma Muthu

Re: Nagios windows agent issue

Posted: Tue Mar 13, 2018 9:05 am
by mcapra
I am going to assume the agent you are using is NSClient++. Please correct me if I am wrong.

Which plugin is being used on the Nagios Core side of things to reach out to NSClient++? What version of that plugin are you using?

Which version of NSClient++ is being used on your machines? Do you have a standard NSClient++ configuration these machines use and, if so, could you share it?
padu_3891 wrote: We have checked the Nagios server load, CPU, etc., and everything looks fine.
Did you also check the Nagios Core machine's available file descriptors, open file limits, and available sockets?
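
For anyone wanting to capture these numbers, here is a minimal sketch, assuming the Nagios Core machine runs Linux and exposes the usual /proc files (/proc/sys/fs/file-nr, /proc/<pid>/limits, /proc/net/sockstat). The function names are hypothetical and are not part of Nagios or any plugin; the script only reads kernel counters.

```python
from pathlib import Path


def parse_file_nr(text):
    """Parse /proc/sys/fs/file-nr: allocated handles, unused handles, system maximum."""
    allocated, unused, maximum = (int(x) for x in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def parse_open_files_limit(text):
    """Return the (soft, hard) 'Max open files' limits from /proc/<pid>/limits."""
    for line in text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            # Kept as strings because a limit may be the word "unlimited".
            return soft, hard
    return None


def parse_sockstat(text):
    """Parse /proc/net/sockstat into {section: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        section, _, rest = line.partition(":")
        fields = rest.split()
        stats[section] = {k: int(v) for k, v in zip(fields[::2], fields[1::2])}
    return stats


def snapshot(pid="self"):
    """Collect a one-off view of descriptor and socket usage (Linux only).

    Pass the PID of the nagios process instead of "self" to inspect it;
    reading another process's fd directory usually needs root or the same user.
    """
    proc = Path("/proc")
    return {
        "file_nr": parse_file_nr((proc / "sys/fs/file-nr").read_text()),
        "open_fds": len(list((proc / str(pid) / "fd").iterdir())),
        "open_files_limit": parse_open_files_limit(
            (proc / str(pid) / "limits").read_text()
        ),
        "sockstat": parse_sockstat((proc / "net/sockstat").read_text()),
    }
```

Calling `print(snapshot())` every minute or so during an outage and comparing the output with a quiet period should show whether open descriptors are approaching the limit, or whether the TCP `tw` (time-wait) count is climbing.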

Re: Nagios windows agent issue

Posted: Tue Mar 13, 2018 12:05 pm
by kyang
Thanks for the help @mcapra!

Re: Nagios windows agent issue

Posted: Fri Mar 16, 2018 3:54 pm
by padu_3891
mcapra wrote: I am going to assume the agent you are using is NSClient++.
You are correct.

mcapra wrote: Which plugin is being used on the Nagios Core side of things to reach out to NSClient++? What version of that plugin are you using?
check_nrpe, version 2.12.

mcapra wrote: Which version of NSClient++ is being used on your machines? Do you have a standard NSClient++ configuration these machines use and, if so, could you share it?
NSClient++ version 4.3.1. Yes, it is a standard configuration. Do you want me to share the nsclient.ini file?

mcapra wrote: Did you also check the Nagios Core machine's available file descriptors, open file limits, and available sockets?
Yes, everything is fine; no issues found.

Re: Nagios windows agent issue

Posted: Mon Mar 19, 2018 10:15 am
by mcapra
Can you share the full historical nagios log that contains these ~100 or so failures? Typically the historical logs can be found here:

/usr/local/nagios/var/archives
I'd like to see the full log from a given day if possible, not just a handful of entries demonstrating the error message.
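
If sharing a full day's log is awkward, a rough summary can still help. The sketch below assumes the standard Nagios Core log line layout (`[epoch] SERVICE ALERT: host;service;state;type;attempt;output`) and uses the error text quoted earlier in this thread as the search string. The function name and the sample host names in the usage note are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches the start of HOST ALERT / SERVICE ALERT lines: "[epoch] TYPE ALERT: host;"
ALERT_RE = re.compile(r"^\[(\d+)\] (?:SERVICE|HOST) ALERT: ([^;]+);")


def failures_per_hour(lines, needle="Unable to establish communication"):
    """Count alert lines containing `needle`, bucketed by UTC hour.

    Returns (Counter keyed by 'YYYY-MM-DD HH:00', set of affected host names).
    """
    per_hour = Counter()
    hosts = set()
    for line in lines:
        match = ALERT_RE.match(line)
        if not match or needle not in line:
            continue
        when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
        per_hour[when.strftime("%Y-%m-%d %H:00")] += 1
        hosts.add(match.group(2))
    return per_hour, hosts
```

Feeding it an open archive file, for example `failures_per_hour(open(path))` with `path` pointing at one of the files in the archives directory, gives a per-hour count that shows whether the failures came in bursts or were spread evenly, plus the set of hosts involved.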

Which OS and version of that OS is this machine using? Which hypervisor is hosting the VM?

Also, if you happen to have a copy of your system's primary log file (/var/log/messages on CentOS/RHEL) from that same time period, that may be useful.

I'm fairly confident this is some sort of system/network related issue rather than a failure of NSClient++ or the check_nrpe plugin specifically (I could be wrong). I've seen setups executing ~100 or so simultaneous check_nrpe calls to various agents (mostly NSClient++) without totally tanking. Besides that, given how check_nrpe functions, I don't think it would make sense for a few hundred agents to simultaneously stop responding unless there was some sort of network/system issue that prevented check_nrpe from correctly establishing a connection.
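
One way to separate a network/system problem from a plugin problem is to test the plain TCP connection on its own. This is a minimal sketch, assuming the agents listen on the default NRPE port 5666. It does not speak the NRPE protocol or perform the SSL handshake, so a successful connect only shows that the TCP layer is reachable. The function names are hypothetical.

```python
import socket
import time


def probe(host, port=5666, timeout=5.0):
    """Attempt one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers refused, unreachable, and timed-out connects
        return False, time.monotonic() - start, str(exc)


def probe_repeatedly(host, port=5666, attempts=10, delay=1.0, timeout=5.0):
    """Run several probes back to back, mirroring the first-works/second-fails test."""
    results = []
    for i in range(attempts):
        results.append(probe(host, port, timeout))
        if i < attempts - 1:
            time.sleep(delay)
    return results
```

Running `probe_repeatedly` against a few affected agents with `delay=0` approximates the back-to-back manual executions described in the first post. If the raw connects fail intermittently too, that points away from check_nrpe and NSClient++ and toward the network or the Nagios host itself.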

Re: Nagios windows agent issue

Posted: Mon Mar 19, 2018 10:25 am
by tmcdonald
Thanks for the assist, @mcapra!

@padu_3891, let us know if you have further (related) questions.

Re: Nagios windows agent issue

Posted: Wed Mar 21, 2018 7:35 am
by padu_3891
@mcapra Thanks a lot for your suggestion. As you said, I found both issues. My server's CPU utilisation was high, which may be one cause.


I am going to increase the server resources for now, and I will let you know if I face more issues.

Just one more query.

Will running the Nagios server in a VMware environment cause any issues? Which would you suggest: a standalone machine or a virtual machine?

Re: Nagios windows agent issue

Posted: Wed Mar 21, 2018 10:11 am
by tmcdonald
Core can run equally well in a VM or on physical hardware. The differences in performance are minor for the most part, and really don't show themselves until an environment becomes quite large.