The monitoring engine shows: not running
I recently took over our Nagios eval, and we have it partially configured. The current issue seems to be that the hosts to be monitored don't show up in the web interface.
The monitoring engine shows up on the system status page, but it is listed as not running in the monitoring engine status, and starting it doesn't seem to work.
Ultimately, there seem to be hosts that are discovered / monitored, but I don't see them in any of the operations or host status pages.
The reconfigure command line seems to work:
./reconfigure_nagios.sh
--- reset_config_perms.sh ------------
> Setting script permissions
> Setting CCM script permissions
> Setting special script permissions
> Setting special component script permissions
> Setting configuration file/directory permissions
> Setting perfdata directory and RRD permissions
> Setting libexec directory permissions
> Setting Nagios XI config permissions
> Setting NOM checkpoint user:group permissions
> + Setting Recurring Downtime file user:group permissions
> + Setting BPI configuration file user:group permissions
--------------------------------------
--- ccm_import.php -------------------
> Setting import directory: /usr/local/nagios/etc/import/
> Importing config files into the CCM
No files to import
--------------------------------------
--- ccm_export.php -------------------
> Writing CCM configuration to Nagios files
Finished writing out configuraton
--------------------------------------
--------------------------------------
> Verifying configuration with Nagios Core
> Output:
Nagios Core 4.4.5
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-08-20
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 826 services.
Checked 274 hosts.
Checked 1 host groups.
Checked 1 service groups.
Checked 4 contacts.
Checked 2 contact groups.
Checked 132 commands.
Checked 11 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 274 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 11 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
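A clean pre-flight check only says that the configuration parses; it does not show whether the daemon is actually running. As a minimal sketch, the totals can be pulled out of the verification output with a small helper. The function name `preflight_totals` is made up for illustration; `-v` is the Nagios Core verify flag, and the binary and config paths in the usage line are the ones that show up in the process listing later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Nagios Core verification output on stdin and
# print the warning and error totals on one line.
preflight_totals() {
    awk -F': *' '
        /^Total Warnings:/ { w = $2 }
        /^Total Errors:/   { e = $2 }
        END {
            # Fail if either total line was not present in the input.
            if (w == "" || e == "") exit 1
            printf "warnings=%d errors=%d\n", w, e
        }'
}

# Usage (assumes the default Nagios XI paths):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_totals
```

If this reports zero errors while the UI still says the engine is not running, the problem is more likely in how the status is collected than in the object configuration.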
Re: The monitoring engine shows: not running
Thank you for posting the issue here, and welcome to the forum!
First thing, can you show me what happens if you just run systemctl restart nagios.service ? I'm wondering if it's just quietly throwing some kind of error.
After that, if you could download a system profile and send that in, that would be great. You can get a system profile from Admin -> System Profile -> Download Profile. You can send that to me in a private message.
Once we get the monitoring engine up and running, we'll work on getting the status of the configured hosts and services.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: The monitoring engine shows: not running
No errors from the restart. I sent the profile.zip, thanks.
Re: The monitoring engine shows: not running
I'm seeing a lot of errors related to Apache. Do you know if the Apache configs were customized at all?
Also can you show me the output of ls -lh /usr/local/nagiosxi/var/ ?
Re: The monitoring engine shows: not running
ls -lh /usr/local/nagiosxi/var/
total 24K
drwxrwxr-x 2 nagios nagios 66 Nov 13 05:21 certs
drwsrwsr-x 3 apache nagios 83 Jan 14 18:01 components
drwxrwxr-x 2 nagios nagios 20 Nov 13 05:19 keys
-rw-r--r-- 1 apache apache 1.1K Nov 13 13:21 load_url.log
-rw-rw-rw- 1 nagios nagios 0 Nov 13 05:19 NXTI_Write_Test
drwxrwxr-x 2 nagios nagios 22 Nov 13 20:41 subsys
drwxrwxr-x 2 nagios nagios 6 Nov 13 05:19 upgrades
-rw-r--r-- 1 apache apache 0 Nov 13 05:19 wkhtmltox.log
-rw-r--r-- 1 nagios nagios 7 Nov 13 05:19 xi-itype
-rw-r--r-- 1 nagios nagios 6.5K Nov 13 05:19 xi-sys.cfg
-rw-r--r-- 1 nagios nagios 37 Nov 13 05:19 xi-uuid
-rw-r--r-- 1 nagios nagios 196 Nov 13 05:19 xiversion
Re: The monitoring engine shows: not running
That looks good. Let's get some more information. Can you get me the output of these commands:
cat /etc/hosts
ps -aux | grep nagios.cfg
grep nagios /etc/group
chage -l nagios
Re: The monitoring engine shows: not running
I found the notify me button, so this should go faster now.
----------------------------------------------------------------------
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
ps -aux | grep nagios.cfg
nagios 322 0.2 0.0 87680 51748 ? Ss Jan14 3:15 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 371 0.0 0.0 85880 4204 ? S Jan14 0:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
grep nagios /etc/group
nagcmd:x:1001:apache,nagios
nagios:x:1003:apache,nagios
chage -l nagios
Last password change : Nov 13, 2019
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
----------------------------------------------------------------------
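The group membership shown in the `/etc/group` output above can also be checked mechanically. This is a minimal sketch, assuming the standard `name:password:GID:members` layout of the group file; `is_member` is a hypothetical helper name, and it only sees supplementary members listed in the fourth field, not a user's primary group from `/etc/passwd`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if USER is listed as a member of GROUP.
# Usage: is_member USER GROUP [GROUP_FILE]   (GROUP_FILE defaults to /etc/group)
is_member() {
    local user="$1" group="$2" file="${3:-/etc/group}"
    awk -F: -v g="$group" -v u="$user" '
        $1 == g {
            # Field 4 is a comma-separated list of supplementary members.
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit (found ? 0 : 1) }' "$file"
}

# Usage:
#   is_member apache nagios && echo "apache is in the nagios group"
#   is_member apache nagcmd && echo "apache is in the nagcmd group"
```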
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: The monitoring engine shows: not running
I'm seeing these in the syslog on the profile you sent:
Jan 14 17:57:07 bolamon1 sudo: PAM unable to dlopen(/usr/lib64/security/pam_fprintd.so): /usr/lib64/security/pam_fprintd.so: cannot open shared object file: No such file or directory
This leads me to believe there may be something wrong with sudo on your system. sudo is required for the crons to run successfully, and those crons populate the status in the UI.
Can you show the output of the following:
tail -20 /var/log/cron
Re: The monitoring engine shows: not running
It looks like it is mostly complaining about the lack of a home directory, but I thought that was established as OK?
[root@bolamon1 ~]# tail -20 /var/log/cron
Jan 16 21:37:01 bolamon1 CROND[29916]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29917]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php >> /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29922]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/snmptt_service_results.php >> /usr/local/nagiosxi/var/snmptt_service_results.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29918]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php >> /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29923]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29915]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29914]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29919]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php >> /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29916]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29920]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29921]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29924]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29917]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29923]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29922]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29918]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29920]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29921]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29919]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29924]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
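With this many interleaved lines, it can help to count the `chdir failed` errors per directory rather than read them one by one. A minimal sketch, assuming the log line format shown above; `summarize_chdir_failures` is a hypothetical name, and paths containing spaces are not handled. It only counts the errors and makes no claim about whether cron still ran the job afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "chdir failed" cron errors per directory.
# Usage: summarize_chdir_failures LOGFILE
summarize_chdir_failures() {
    local logfile="$1"
    # Keep only the 'ERROR chdir failed (PATH)' fragment of each matching
    # line, strip everything but PATH, then count occurrences per path.
    grep -o 'ERROR chdir failed ([^)]*)' "$logfile" \
        | sed -e 's/^ERROR chdir failed (//' -e 's/)$//' \
        | sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Usage:
#   summarize_chdir_failures /var/log/cron
```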
Re: The monitoring engine shows: not running
Does the nagios user home directory exist?
ls -al /home/nagios
Can you also confirm the sudoers syntax?
visudo -c
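The home directory question can also be answered from the account record itself rather than by assuming `/home/nagios`. A minimal sketch, assuming a standard colon-separated passwd file where the sixth field is the home directory; `home_dir_of` and `check_home` are hypothetical helper names, and because the sketch reads the file directly it will not see accounts that come from LDAP or similar sources.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the home directory recorded for USER.
# Usage: home_dir_of USER [PASSWD_FILE]   (defaults to /etc/passwd)
home_dir_of() {
    local user="$1" file="${2:-/etc/passwd}"
    awk -F: -v u="$user" '
        $1 == u { print $6; found = 1 }
        END { exit (found ? 0 : 1) }' "$file"
}

# Hypothetical helper: report whether that home directory exists.
# Returns 0 if it exists, 1 if it is missing, 2 if the user is unknown.
check_home() {
    local user="$1" file="${2:-/etc/passwd}" home
    home=$(home_dir_of "$user" "$file") || { echo "no such user: $user"; return 2; }
    if [ -d "$home" ]; then
        echo "ok: $home exists"
    else
        echo "missing: $home"
        return 1
    fi
}

# Usage:
#   check_home nagios
```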