Hi
I have Nagios monitoring a number of servers, one of which keeps reporting the following problem, which is emailed as this alert:
***** Nagios XI Alert *****
Nagios has detected a problem with this service.
Notification Type: PROBLEM
Service: CCMS Master Service
Host: AACC Co-Res Server 1
Address: 10.10.10.2
State: WARNING
Info:
No data was received from host!
Date/Time: 05/02/2014 19:01:47
Respond: http://10.10.10.3/nagiosxi//rr.php?uid= ... 2729adabd3
Nagios URL: http://10.10.10.3/nagiosxi/
The problem then resolves itself, with the following alert:
***** Nagios XI Alert *****
Nagios has detected this service has recovered.
Notification Type: RECOVERY
Service: CCMS Master Service
Host: AACC Co-Res Server 1
Address: 10.10.10.2
State: OK
Info:
ccmssrvc.exe: Running
Date/Time: 05/02/2014 19:02:22
Respond: http://10.10.10.3/nagiosxi//rr.php?uid= ... 2729adabd3
Nagios URL: http://10.10.10.3/nagiosxi/
Nagios XI 2012R1.6
Please can I have some pointers on how to isolate and resolve this problem.
I am monitoring whether a number of services and applications are running on this Windows box, and I am getting alerts against all of them: each shows "No data was received from host!" and then, later, all is okay.
Thanks!
No data was received from host!
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: No data was received from host!
Are you using NSClient++ to monitor this server? Are there somehow multiple copies of NSClient++ running? Is something else periodically grabbing port 5666 or 12489 (which NSClient++ uses)?
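As a rough way to check from the Nagios server whether those two ports are answering at the moment a check fails, something like the following sketch could be run repeatedly. The function names `port_open` and `probe` and the 3-second timeout are illustrative assumptions, not part of Nagios or NSClient++; the port numbers are the ones mentioned above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(host, ports=(5666, 12489)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}


# Example call (address taken from the alert in this thread):
#   probe("10.10.10.2")
```

A successful connect only shows that something is listening on the port; it does not prove that the listener is NSClient++ or that it would return check data.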
Re: No data was received from host!
Hi!
Thanks for the prompt response, I appreciate that.
I've looked in Task Manager and there is only one instance of nscp.exe running, which I believe is NSClient++.
Looking at netstat, my interpretation is that nscp.exe is permanently listening on ports 12489 and 5666,
where 10.10.10.3 is Nagios, 10.10.10.2 is the Windows server itself, and the PID of nscp.exe is 3720.
Code: Select all
TCP 10.10.10.2:12489 10.10.10.3:52447 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52447 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52477 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52479 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52480 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52481 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52482 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52483 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52485 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52487 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52544 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52549 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52551 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52552 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52553 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52560 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52565 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52567 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52569 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52572 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52666 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52696 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52698 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52700 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52701 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52702 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52703 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52705 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52709 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52751 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52757 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52761 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52766 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52767 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52768 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52773 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52775 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52776 TIME_WAIT 0
TCP 10.10.10.2:12489 10.10.10.3:52780 TIME_WAIT 0
TCP [::]:5666 [::]:0 LISTENING 3720
TCP 0.0.0.0:5666 0.0.0.0:0 LISTENING 3720
TCP 0.0.0.0:12489 0.0.0.0:0 LISTENING 3720
-
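To double-check a reading of netstat output like the listing above, a small sketch such as this one can tally the connection states and list which PID is listening on which port. It assumes the five-column layout shown in this post (protocol, local address, remote address, state, PID); `summarize_netstat` is a hypothetical helper name.

```python
from collections import Counter


def summarize_netstat(lines):
    """Tally TCP states and collect (port, pid) pairs for listening sockets."""
    states = Counter()
    listeners = set()
    for line in lines:
        fields = line.split()
        # Skip anything that is not a 5-column TCP row.
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        _proto, local, _remote, state, pid = fields
        states[state] += 1
        if state == "LISTENING":
            # rsplit keeps IPv6 forms such as "[::]:5666" intact.
            port = int(local.rsplit(":", 1)[1])
            listeners.add((port, int(pid)))
    return states, listeners
```

If a second PID ever showed up as a listener on 5666 or 12489, that would point at another process holding the port.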
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: No data was received from host!
Hmm, are this server's checks constantly throwing this result? Or is this sporadic?
Re: No data was received from host!
Hi, can I provide you with any more information to make further progress on this?
Re: No data was received from host!
rpope wrote: Hi, can I provide you with any more information to make further progress on this?
Yes, please answer:
slansing wrote: Hmm, are this server's checks constantly throwing this result? Or is this sporadic?
Former Nagios employee
Re: No data was received from host!
That's a coincidence, our posts were at exactly the same time!
Yes, it's sporadic. For example, I got approx. 80 alerts overnight, and none all day today from the server.
Thanks
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: No data was received from host!
Haha, it happens..
So are these issues commonly occurring overnight? Is there something in your network between Nagios and that host, or on that host, that happens at night? Windows does some silly things.
Re: No data was received from host!
Actually, it's so sporadic that it's not limited to either day or night. For example, I received the alerts all day yesterday and all night last night, but had none at all today.
It's completely random when the alerts occur; I can't even begin to predict when they will start up again. Nothing of note changes on these servers, because they are highly critical to business function and any change to them has to be approved by a ridiculous number of people. I can get some log files from the problem server itself if they would be helpful.
Note, in case it's not clear: there is actually nothing wrong with the server, just these "false" alarms.
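One way to test whether the alerts really are random is to bucket their Date/Time values by hour of day and see if any hours stand out. The sketch below assumes the timestamps are copied from the alert emails and are in day/month/year order; that order is a guess (the sample in this thread would also parse as month/day/year), so adjust `ALERT_TIME_FORMAT` if needed. `alerts_per_hour` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

# Assumption: day/month/year, as in "05/02/2014 19:01:47".
ALERT_TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def alerts_per_hour(timestamps, fmt=ALERT_TIME_FORMAT):
    """Count alerts per hour of day (0-23) from timestamp strings."""
    hours = Counter()
    for text in timestamps:
        hours[datetime.strptime(text, fmt).hour] += 1
    return hours
```

A cluster around particular hours would suggest a scheduled job (backup, antivirus scan, and so on) on the host or on the network path; an even spread would point elsewhere.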
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: No data was received from host!
It would be a good idea to make sure that logging is enabled in the NSC.ini or nsclient.ini file on that server. Then, when these issues pop up again, grab that log file and send it in with the ticket you will be opening; that should give us a better view.
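As a quick way to see what logging settings the agent's INI file currently contains, a sketch like this could be pointed at a copy of the file. The section names in `ASSUMED_LOG_SECTIONS` are assumptions about how NSClient++ versions name their logging section and should be verified against the actual file; `find_log_settings` is a hypothetical helper name.

```python
import configparser

# Assumed section names; check them against the real nsclient.ini or NSC.ini.
ASSUMED_LOG_SECTIONS = ("/settings/log", "log")


def find_log_settings(ini_text):
    """Return {section: {key: value}} for any assumed logging sections found."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # tolerate bare lines with no "=" (e.g. module lists)
        strict=False,          # tolerate repeated sections or keys
        interpolation=None,    # treat "%" in values literally
    )
    parser.read_string(ini_text)
    found = {}
    for section in ASSUMED_LOG_SECTIONS:
        if parser.has_section(section):
            found[section] = dict(parser.items(section))
    return found
```

An empty result means none of the assumed sections exist in the file, which may simply mean that this version uses a different section name.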