last check = null, wrong system time and time zone ++
Hi,
We are in the process of implementing Nagios to monitor a number of servers.
We are using the Nagios XI VMware appliance, version 2011R2.2.
However, I have struck a couple of problems.
1. The server time appears to be wrong, and probably the timezone as well. The standard CentOS tools for fixing this, e.g. system-config-date and system-config-time, are missing or not installed.
2. The "last check" time on the majority of hosts is showing as null, i.e. 1970-01-01 00:00:00. The next scheduled check time always appears to be correct, e.g. 2012-05-03 22:35:41.
3. While working through configuration teething issues, I see that there are orphan host configuration files in /usr/local/nagios/etc/hosts. This usually occurs when a configuration change fails to commit and errors out. If I read the error log I can identify them, and when I delete the orphans the configuration then commits. However, they seem to return, which suggests that they are stored in the database and recreated on every commit, yet they don't always cause issues. One orphan file had the same spelling as another file, differing only in case.
Is it possible to access the database and perform a purge so we are only left with valid services? I can then recreate the hosts using the GUI and link them to the appropriate services.
Cheers,
KB.
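For the orphan whose name differs from another file only by case, a quick listing can flag such pairs before a commit. This is only a sketch: the function name `case_collisions` is made up for illustration, and it assumes each host has its own file directly under the directory passed in (e.g. /usr/local/nagios/etc/hosts) on a case-sensitive filesystem.

```shell
# Hypothetical helper: print lower-cased file names that occur more than
# once in a directory, i.e. names that differ only by letter case.
case_collisions() {
    dir="$1"
    ls -1 "$dir" | tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example call (path taken from the post above):
# case_collisions /usr/local/nagios/etc/hosts
```

Any name it prints corresponds to two or more files that would need to be checked by hand; it does not say which one is the orphan.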
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: last check = null, wrong system time and time zone ++
1 & 2
There is a post here
http://support.nagios.com/forum/viewtop ... 26&p=24962
that goes over this pretty well. You also need to set the timezone in /etc/php.ini; instructions are here
http://support.nagios.com/wiki/index.ph ... e.22_Error
For 3.
I think the orphaned host problem is fixed in 2011R2.4
There is currently no way to purge invalid services via the database.
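The php.ini part of this comes down to a single `date.timezone` line. The following is a rough sketch of one way to set it from the shell, not the procedure from the linked instructions: the helper name `set_php_timezone` is hypothetical, and Australia/Sydney is only an example zone. It rewrites an existing (possibly commented-out) setting, or appends one at the end of the file if none is present.

```shell
# Hypothetical helper: set date.timezone in a php.ini-style file.
# Usage: set_php_timezone /etc/php.ini Australia/Sydney
set_php_timezone() {
    ini_file="$1"
    tz="$2"
    tmp_file="$(mktemp)" || return 1

    if grep -q '^[;[:space:]]*date\.timezone[[:space:]]*=' "$ini_file"; then
        # Replace the existing (possibly commented-out) setting.
        sed "s|^[;[:space:]]*date\.timezone[[:space:]]*=.*|date.timezone = $tz|" \
            "$ini_file" > "$tmp_file"
    else
        # No setting present: copy the file and append one.
        cat "$ini_file" > "$tmp_file"
        printf 'date.timezone = %s\n' "$tz" >> "$tmp_file"
    fi

    # Write back through the original path so ownership and mode are kept.
    cat "$tmp_file" > "$ini_file"
    rm -f "$tmp_file"
}
```

It is probably wise to back up /etc/php.ini first, and Apache has to be restarted (service httpd restart) before PHP picks up the change.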
Re: last check = null, wrong system time and time zone ++
Hi,
Thanks for the link Scott.
The timezone is now correct and php.ini has also been updated.
However, I have not been able to install NTP, as the Nagios server is in a secure zone and cannot access the internet. It will be getting its time from the ESXi host, which in turn gets its time from our NTP server. Is this sufficient?
I have also noticed that instead of the last check time being "1970-01-01 00:00:00", the hosts all now show "N/A" and are not updating from there. Any ideas?
Cheers,
C.
-
scottwilkerson
Re: last check = null, wrong system time and time zone ++
Can you go to Admin -> System Profile and post back here what is displayed in the "Date/Time" section?
Re: last check = null, wrong system time and time zone ++
Hi,
Information as follows:
Date/Time
PHP Timezone: Australia/Sydney
PHP Time: Tue, 08 May 2012 08:25:19 +1000
System Time: Tue, 08 May 2012 08:25:19 +1000
Rgds,
C.
-
scottwilkerson
Re: last check = null, wrong system time and time zone ++
What is listed as the Next check time for your services that aren't updated?
Re: last check = null, wrong system time and time zone ++
Hi,
In the Home:Host Detail view the "Last Check" column contains the string "N/A". The Status information usually contains " Host check is pending... Check is scheduled for YYYY-MM-DD HH:MM:SS" where YYYY-MM-DD HH:MM:SS is the time for the next polling cycle.
If I click on a hostname and go to its "Host Status Detail" view, I see the following Status Details:
Host State: Pending
Duration: nd nnh nnm nns
Host Stability: Unchanging (stable)
Last Check: Never
Next Check: YYYY-MM-DD HH:MM:SS (where YYYY-MM-DD HH:MM:SS is the time for the next polling cycle)
cheers
KB
Re: last check = null, wrong system time and time zone ++
Are you having issues with a few services or all of them?
Is the check in question an active or a passive one?
Can you "Schedule an immediate check"?
Have you tried restarting your Nagios server?
Can you post the output of the following commands?
Code:
service ndo2db status
ll /usr/local/nagiosxi
tail /usr/local/nagios/var/nagios.log
tail /var/log/messages
tail /var/log/httpd/error_log
Re: last check = null, wrong system time and time zone ++
Hi,
Thanks for the questions, it got me thinking.
The only server that was showing a correct last check was the localhost, i.e. nagios, and we never set that up.
We set up our servers by running the Windows Server wizard once, generalizing the service configurations it created, and then adding more hosts to them.
As I can't compare them to localhost, I ran the wizard again and added a known good server. It reports OK. I then compared this host to all the others and noticed that the _xiwizard_windowshost template was missing from the Common Settings tab, along with the _xiwizard/windowsserver free variable definition from Misc Settings.
After I added these to one of our 'broken' hosts and waited, it corrected itself on the next polling cycle and turned green.
I am currently working through applying these to the remaining hosts. Once done, I will reboot the server and report back.
Cheers,
C.
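To see which hosts still lack the wizard template, the written-out host files can be scanned for a matching `use` line. This is only a sketch under a few assumptions: the function name `hosts_missing_template` is made up, each host is assumed to live in its own .cfg file in the directory given, and TEMPLATE_NAME is a placeholder for whatever name appears on the `use` line of a host the wizard created correctly.

```shell
# Hypothetical helper: list host config files whose "use" line does not
# mention the given template name.
hosts_missing_template() {
    cfg_dir="$1"
    template="$2"
    for cfg in "$cfg_dir"/*.cfg; do
        [ -e "$cfg" ] || continue   # directory had no .cfg files
        # Take only the "use" directive lines, then look for the template name.
        if ! grep '^[[:space:]]*use[[:space:]]' "$cfg" | grep -qF -- "$template"; then
            printf '%s\n' "$cfg"
        fi
    done
}

# Example call (TEMPLATE_NAME is a placeholder):
# hosts_missing_template /usr/local/nagios/etc/hosts TEMPLATE_NAME
```

It only reads the files; the actual fix is still made through the GUI as described above.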
Re: last check = null, wrong system time and time zone ++
Confirmed. All the hosts are now showing as green and the last check time is updating correctly.
Thanks for your help!
Our first Nagios support experience has been a really good one!
KB.