Could not read host and service status information
Posted: Wed Apr 15, 2015 2:35 pm
Hello. I have an instance of Nagios Core that was inherited by my team a couple of years ago. It has been working fine until this week. Yesterday there were alerts going out for a service that did not appear to exist in the web UI, but it was found in the configs. That particular service wasn't really needed, so I just commented it out and all was well.
However, today the same thing happened. A host that should have many services applied to it is showing no services at all, but alerts are still going out for them.
I restarted Nagios, and now I get this from the web UI when I click on anything:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Whoops!
Error: Could not read host and service status information!
The most common cause of this error message (especially for new users), is the fact that Nagios is not actually running. If Nagios is indeed not running, this is a normal error message. It simply indicates that the CGIs could not obtain the current status of hosts and services that are being monitored. If you've just installed things, make sure you read the documentation on starting Nagios.
Some other things you should check in order to resolve this error include:
Check the Nagios log file for messages relating to startup or status data errors.
Always verify configuration options using the -v command-line option before starting or restarting Nagios!
Make sure you read the documentation on installing, configuring and running Nagios thoroughly before continuing. If all else fails, try sending a message to one of the mailing lists. More information can be found at http://www.nagios.org.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I found this issue already posted in a few places, but none of them seem to be the same situation. I am not running SELinux. I have tried running in daemon mode. In nagios.log I see this, but not much else, on startup:
[1429126364] Nagios 3.2.2 starting... (PID=16852)
[1429126364] Local time is Wed Apr 15 14:32:44 CDT 2015
[1429126364] LOG VERSION: 2.0
[1429126364] Finished daemonizing... (New PID=16853)
[1429126364] Warning: File '/dev/shm/host-perfdata' could not be opened - host performance data will not be written to file!
[1429126364] Warning: File '/dev/shm/service-perfdata' could not be opened - service performance data will not be written to file!
[1429126365] Error: Unable to open file '/dev/shm/status.dat' for writing: Bad file descriptor
[1429126365] Error: Unable to rename file '/var/nagios/nagios.tmpAyUGm6' to '/dev/shm/status.dat': Bad file descriptor
[1429126365] Error: Unable to update status data file '/dev/shm/status.dat': Bad file descriptor
Checks appear to be running fine in the background when I look at the processes running under the nagios user.
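Since every warning and error in that log names a file under /dev/shm, one thing worth checking is whether that directory exists, has free space and inodes, and can actually accept a new file. Below is a minimal sketch, assuming Python 3 is available on the host. The function name `check_writable_dir` is only illustrative and is not part of Nagios, and this checks the directory itself, not anything specific to how Nagios opens status.dat.

```python
import os
import tempfile


def check_writable_dir(path):
    """Return a dict describing whether `path` can hold new files."""
    report = {"path": path, "exists": os.path.isdir(path)}
    if not report["exists"]:
        return report

    # Free space and free inodes available to an unprivileged user.
    stats = os.statvfs(path)
    report["free_bytes"] = stats.f_bavail * stats.f_frsize
    report["free_inodes"] = stats.f_favail

    try:
        # Actually create and remove a file; permission bits alone can mislead.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        report["writable"] = True
        report["error"] = None
    except OSError as exc:
        report["writable"] = False
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    # /dev/shm is the directory named in the log errors above.
    print(check_writable_dir("/dev/shm"))
```

Running it as the same user the daemon runs as (for example with `sudo -u nagios`, assuming sudo is set up) should give a more representative result than running it as root.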
This is very urgent to my company, because although inherited, this is a critical enterprise instance. We are migrating it to a new XI instance in the same datacenter, but we need this working until that work is completed. Any assistance much appreciated.