Nagiosxi not loading the host and services after reboot.
Posted: Fri Dec 09, 2016 9:13 am
by monit_burb
Hi,
I have a clone of my production environment that I was using for some tests, and I realized that when I reboot the server Nagios doesn't work properly. The website loads fine, and "System Status" and "Monitoring Engine Status" show everything OK in green, but the checks are not being updated. If I go to "System Profile", the "Nagios XI Data" section says that the total number of hosts and services are both 0, even though I can see all my hosts if I do a search or look at any of the HOME sections such as Quick View or Details.
I can solve the issue just by doing a "service nagios stop" and "service nagios start", but I'd like to figure out why this is happening. The only possible cause I can see in the logs is the following two lines, which are repeated constantly:
Code: Select all
ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_logentries SET instance_id='1', logentry_time=FROM_UNIXTIME(1481289430), entry_time=FROM_UNIXTIME(1481289430), entry_time_usec='994723', logentry_type='262144', logentry_data='wproc: Core Worker 2455: job 1167 \(pid=3259\) timed out\. Killing it', realtime_data='1', inferred_data_extracted='1''
Code: Select all
ndo2db: mysql_error: 'Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed'
My guess is that there is some sort of connection problem with the MySQL database, which I have offloaded to a secondary server.
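A quick way to see which tables ndo2db is complaining about is to pull the table names out of error lines like the one above. The following is only a minimal sketch, not something shipped with Nagios: `crashed_tables` is a hypothetical helper name, and it assumes the messages use the same wording as the `mysql_error` line quoted in this post.

```shell
#!/bin/sh
# crashed_tables: hypothetical helper, not part of Nagios XI or NDOUtils.
# Reads log text from the given file(s) or from stdin and prints each
# database/table that MySQL reported as "marked as crashed", once each.
crashed_tables() {
    sed -n "s/.*Table '\.\/\([^']*\)' is marked as crashed.*/\1/p" "$@" | sort -u
}
```

It could be used as `crashed_tables /var/log/messages`, where the log path is an assumption; point it at whichever file ndo2db writes to on your system.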
Re: Nagiosxi not loading the host and services after reboot.
Posted: Fri Dec 09, 2016 12:57 pm
by dwhitfield
You should be able to run something like the following from the Nagios machine:
Code: Select all
mysqlcheck -f -r -u nagiosxi -pnagios --databases nagiosxi -h xxx.xxx.xxx.xxx
mysqlcheck -f -r -u nagios -pnagios --databases nagios -h xxx.xxx.xxx.xxx
mysqlcheck -f -r -u nagiosql -pnagios --databases nagiosql -h xxx.xxx.xxx.xxx
If that doesn't work for you, can you PM me your Profile? You can download it by going to Admin > System Config > System Profile and click the Download Profile (not Show Profile) button in the top right corner. If for whatever reason you cannot download the profile, please put the output of Show Profile in the thread (that will at least get us some info).
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.
UPDATE: Profile received and shared with techs.
Re: Nagiosxi not loading the host and services after reboot.
Posted: Mon Dec 12, 2016 7:10 am
by monit_burb
Hi, PM sent with the profile attached.
I have an old installation where the nagiosxi database is still on PostgreSQL on localhost, so I can't check it with that command, but the other two, for nagios and nagiosql, work fine. Is there an equivalent command for pgsql I can use? I've never used pgsql before, but I tried to list the privileges with the following commands.
Code: Select all
nagiosxi=> \du
            List of roles
 Role name | Attributes  | Member of
-----------+-------------+-----------
 nagiosxi  |             | {}
 postgres  | Superuser   | {}
           : Create role
           : Create DB
Code: Select all
nagiosxi=> \l
                                  List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 nagiosxi  | nagiosxi | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
(4 rows)
I inherited this installation so I don't know how it was set up.
Re: Nagiosxi not loading the host and services after reboot.
Posted: Mon Dec 12, 2016 11:57 am
by dwhitfield
What error are you getting? My understanding (based on https://assets.nagios.com/downloads/nag ... OUtils.pdf) is that currently only MySQL is supported with NDOUtils. Furthermore, both of those errors specifically mention MySQL. I just want to make sure I am tackling the right problem. Thanks!
Re: Nagiosxi not loading the host and services after reboot.
Posted: Mon Dec 12, 2016 12:59 pm
by monit_burb
Hi,
I think the MySQL error I saw before was just temporary, perhaps something that hadn't started yet, because I can't see it anymore. I rebooted the whole server again and the behavior is still the same, as expected.
I can't see any strange errors in the logs but, as I said, in the System Profile the "Nagios XI Data" section says that the total number of hosts and services are both 0. Also, a lot of the hosts stay grey forever with the status "Service Check is pending...." or, in the case of a host, "Host Check is pending...".
I do have NDOUtils installed and pointing to my MySQL DB. I have a mixed setup, with the nagiosxi database running on PostgreSQL and the NDOUtils and nagiosql databases on MySQL. As I said, I can make my system work if I do a service nagios restart after the reboot, but I don't know why that restart is necessary.
Re: Nagiosxi not loading the host and services after reboot.
Posted: Mon Dec 12, 2016 4:02 pm
by ssax
Is the nagios service running right after reboot?
Code: Select all
service nagios status
chkconfig --list |grep "ndo\|nagios"
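To make the `chkconfig` output easier to read at a glance, it can be reduced to an on/off flag per service for a single runlevel. This is a rough sketch only: it assumes the usual `name 0:off 1:off ... 6:off` layout that `chkconfig --list` prints on SysV-init systems, and `services_on_at_runlevel` is a made-up helper name.

```shell
#!/bin/sh
# services_on_at_runlevel: hypothetical helper, not a Nagios or chkconfig tool.
# $1    = runlevel to inspect (for example 3)
# stdin = text in the format printed by `chkconfig --list`
# Prints "<service> on|off" for every service whose name contains
# "ndo" or "nagios".
services_on_at_runlevel() {
    awk -v want="$1:on" '
        $1 ~ /ndo|nagios/ {
            state = "off"
            for (i = 2; i <= NF; i++)
                if ($i == want) state = "on"
            print $1, state
        }
    '
}
```

For example: `chkconfig --list | services_on_at_runlevel 3`.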
Re: Nagiosxi not loading the host and services after reboot.
Posted: Tue Dec 13, 2016 3:24 am
by monit_burb
Yes, Nagios and all the other Nagios-related services are running after a reboot, and if I do a tail -f /usr/local/nagios/var/nagios.log I can see some checks being executed. If I look up one of the servers that is executing a check according to nagios.log, I can see it being updated, as shown here where it was updated a minute ago.
na1.PNG
But then if I open it, it says "Service Check is pending", as it does for all of them.
na2.PNG
I also get a "You are not authorized to access this feature." message if I go to the Performance Graph section for that check.
Re: Nagiosxi not loading the host and services after reboot.
Posted: Tue Dec 13, 2016 6:01 pm
by ssax
Please PM me these files:
Code: Select all
/usr/local/nagios/var/objects.cache
/usr/local/nagios/var/retention.dat
Also, are you seeing anything in /var/log/messages or in the dmesg command output around the time of reboot that could be related?
Re: Nagiosxi not loading the host and services after reboot.
Posted: Wed Dec 14, 2016 3:43 am
by monit_burb
PM sent.
I don't see any strange errors. The strangest thing I can see in messages is the following error, but it seems to resolve itself after 15 seconds or so, and afterwards the first checks start to show up in the log.
Code: Select all
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod: Could not open data sink! I'll keep trying, but some output may get lost...
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for process data
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for log data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for system command data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for event handler data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for notification data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for comment data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for downtime data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for flapping data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for program status data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for host status data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for service status data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for adaptive program data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for adaptive host data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for adaptive service data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for external command data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for aggregated status data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for retention data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for contact data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for contact notification data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for acknowledgement data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for state change data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for contact status data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: ndomod registered for adaptive contact data'
Dec 14 08:15:59 ESBARLMONAPP05 nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Dec 14 08:16:03 ESBARLMONAPP05 nagios: Successfully launched command file worker with pid 2534
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARVAPP;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARLLOG02;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: SERVICE DOWNTIME ALERT: ESBARLHANA03;Disk state;STARTED; Service has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARLCPOC01;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARLCPOC02;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: SERVICE DOWNTIME ALERT: ESBARLHANA01;Disk state;STARTED; Service has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARLCPOC03;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: SERVICE DOWNTIME ALERT: ESBARLXAPP05;/var Disk Usage;STARTED; Service has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: SERVICE DOWNTIME ALERT: ESBARLHANA05;Disk state;STARTED; Service has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARLXAPP05;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: SERVICE DOWNTIME ALERT: ESBARLXAPP02;/var Disk Usage;STARTED; Service has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: CNPEKWPRN01;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:03 ESBARLMONAPP05 nagios: HOST DOWNTIME ALERT: ESBARWTXTTEMP01;STARTED; Host has entered a period of scheduled downtime
Dec 14 08:16:08 ESBARLMONAPP05 nagios: SERVICE ALERT: TMS02;SQL compilations;WARNING;SOFT;1;WARNING - 1.45 initial compilations / sec
Dec 14 08:16:15 ESBARLMONAPP05 nagios: ndomod: Successfully connected to data sink. 24658 items lost, 5000 queued items to flush.
Dec 14 08:16:15 ESBARLMONAPP05 nagios: ndomod: Successfully flushed 5000 queued items to data sink.
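How often ndomod failed to open the data sink, and how many items it reports as lost once it reconnects, can be summarised straight from a log like the one above. This is a small sketch that only counts what ndomod itself prints; `ndomod_sink_summary` is a hypothetical helper name, and the message wording is assumed to match the lines in this post.

```shell
#!/bin/sh
# ndomod_sink_summary: hypothetical helper, not part of NDOUtils.
# Reads syslog-style text from the given file(s) or from stdin and prints
# one summary line of ndomod data-sink events.
ndomod_sink_summary() {
    awk '
        /ndomod: Could not open data sink/ { failed++ }
        /ndomod: Successfully connected to data sink/ {
            connected++
            # The lost count is the field right before the words "items lost,".
            for (i = 1; i < NF; i++)
                if ($(i + 1) == "items" && $(i + 2) == "lost,")
                    lost += $i
        }
        END {
            printf "sink_open_failures=%d reconnects=%d items_lost=%d\n",
                   failed, connected, lost
        }
    ' "$@"
}
```

For example: `ndomod_sink_summary /var/log/messages`. A large `items_lost` value right after boot would be consistent with status data never reaching the database until Nagios is restarted, though that is an inference and not something confirmed in this thread.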
Re: Nagiosxi not loading the host and services after reboot.
Posted: Wed Dec 14, 2016 4:52 pm
by ssax
Also, please PM me these files as well:
Code: Select all
/usr/local/nagiosxi/html/config.inc.php
/var/www/html/nagiosql/config/settings.php
/usr/local/nagiosxi/etc/components/ccm_config.inc.php