The monitoring engine shows: not running

mcd
Posts: 9
Joined: Mon Jan 13, 2020 5:22 pm

The monitoring engine shows: not running

Post by mcd »

I recently took over our Nagios evaluation, which is partially configured. The current issue seems to be that the hosts to be monitored don't show up in the web interface.
The monitoring engine appears on the system status page, but the monitoring engine status shows it as not running, and starting it doesn't seem to work.
Ultimately, there seem to be hosts that are discovered and monitored, but I don't see them on any of the operations or host status pages.

The reconfigure command line seems to work.
./reconfigure_nagios.sh

--- reset_config_perms.sh ------------
> Setting script permissions
> Setting CCM script permissions
> Setting special script permissions
> Setting special component script permissions
> Setting configuration file/directory permissions
> Setting perfdata directory and RRD permissions
> Setting libexec directory permissions
> Setting Nagios XI config permissions
> Setting NOM checkpoint user:group permissions
> + Setting Recurring Downtime file user:group permissions
> + Setting BPI configuration file user:group permissions
--------------------------------------

--- ccm_import.php -------------------
> Setting import directory: /usr/local/nagios/etc/import/
> Importing config files into the CCM
No files to import
--------------------------------------

--- ccm_export.php -------------------
> Writing CCM configuration to Nagios files
Finished writing out configuraton
--------------------------------------

--------------------------------------
> Verifying configuration with Nagios Core
> Output:
Nagios Core 4.4.5
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-08-20
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
Checked 826 services.
Checked 274 hosts.
Checked 1 host groups.
Checked 1 service groups.
Checked 4 contacts.
Checked 2 contact groups.
Checked 132 commands.
Checked 11 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 274 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 11 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
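
For anyone following along, the pre-flight summary above can also be checked from a script instead of by eye. This is only a minimal sketch: it assumes the verification output has been saved to a file, and the function name `preflight_totals` and the path `/tmp/preflight.txt` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "<warnings> <errors>" from saved pre-flight output.
# Assumption: the file contains the "Total Warnings:" and "Total Errors:" lines
# exactly as shown in the output above. A missing line is reported as 0.
preflight_totals() {
    # $1: file holding the output of the Nagios Core verification
    awk -F': *' '
        /^Total Warnings:/ { w = $2 }
        /^Total Errors:/   { e = $2 }
        END { print w + 0, e + 0 }
    ' "$1"
}

# Example use (paths are the ones that appear in this thread):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg > /tmp/preflight.txt
#   preflight_totals /tmp/preflight.txt
```

A clean pre-flight check only says the configuration parses; it does not say whether the daemon is running, which is why the engine status still needs to be looked at separately.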
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: The monitoring engine shows: not running

Post by mbellerue »

Thank you for posting the issue here, and welcome to the forum!

First thing, can you show me what happens if you just run systemctl restart nagios.service ? I'm wondering if it's just quietly throwing some kind of error.

After that, if you could download a system profile and send that in, that would be great. You can get a system profile from Admin -> System Profile -> Download Profile. You can send that to me in a private message.

Once we get the monitoring engine up and running, we'll work on getting the status of the configured hosts and services.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
mcd
Posts: 9
Joined: Mon Jan 13, 2020 5:22 pm

Re: The monitoring engine shows: not running

Post by mcd »

No errors from the restart. I sent the profile.zip, thanks.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: The monitoring engine shows: not running

Post by mbellerue »

I'm seeing a lot of errors related to Apache. Do you know if the Apache configs were customized at all?

Also can you show me the output of ls -lh /usr/local/nagiosxi/var/ ?
mcd
Posts: 9
Joined: Mon Jan 13, 2020 5:22 pm

Re: The monitoring engine shows: not running

Post by mcd »

ls -lh /usr/local/nagiosxi/var/
total 24K
drwxrwxr-x 2 nagios nagios 66 Nov 13 05:21 certs
drwsrwsr-x 3 apache nagios 83 Jan 14 18:01 components
drwxrwxr-x 2 nagios nagios 20 Nov 13 05:19 keys
-rw-r--r-- 1 apache apache 1.1K Nov 13 13:21 load_url.log
-rw-rw-rw- 1 nagios nagios 0 Nov 13 05:19 NXTI_Write_Test
drwxrwxr-x 2 nagios nagios 22 Nov 13 20:41 subsys
drwxrwxr-x 2 nagios nagios 6 Nov 13 05:19 upgrades
-rw-r--r-- 1 apache apache 0 Nov 13 05:19 wkhtmltox.log
-rw-r--r-- 1 nagios nagios 7 Nov 13 05:19 xi-itype
-rw-r--r-- 1 nagios nagios 6.5K Nov 13 05:19 xi-sys.cfg
-rw-r--r-- 1 nagios nagios 37 Nov 13 05:19 xi-uuid
-rw-r--r-- 1 nagios nagios 196 Nov 13 05:19 xiversion
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: The monitoring engine shows: not running

Post by mbellerue »

That looks good. Let's get some more information. Can you get me the output of these commands:

Code:

cat /etc/hosts
ps aux | grep nagios.cfg
grep nagios /etc/group
chage -l nagios
mcd
Posts: 9
Joined: Mon Jan 13, 2020 5:22 pm

Re: The monitoring engine shows: not running

Post by mcd »

I found the "notify me" button, so this should go faster now. :)
----------------------------------------------------------------------

cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6


ps -aux | grep nagios.cfg
nagios 322 0.2 0.0 87680 51748 ? Ss Jan14 3:15 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 371 0.0 0.0 85880 4204 ? S Jan14 0:04 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg


grep nagios /etc/group
nagcmd:x:1001:apache,nagios
nagios:x:1003:apache,nagios


chage -l nagios
Last password change : Nov 13, 2019
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: The monitoring engine shows: not running

Post by scottwilkerson »

I'm seeing these in the syslog on the profile you sent

Code:

Jan 14 17:57:07 bolamon1 sudo: PAM unable to dlopen(/usr/lib64/security/pam_fprintd.so): /usr/lib64/security/pam_fprintd.so: cannot open shared object file: No such file or directory
This leads me to believe something may be wrong with sudo on your system. Sudo is required for the cron jobs that populate the status in the UI to run successfully.

Can you show the output of the following:

Code:

tail -20 /var/log/cron
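
If there are many of these PAM messages, it can help to reduce them to the distinct module paths that failed to load. A rough sketch, assuming the messages have the same shape as the syslog line quoted above; the function name `missing_pam_modules` is made up, and which log file holds these messages depends on your syslog configuration.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct shared objects PAM could not dlopen.
# Assumption: lines look like "... PAM unable to dlopen(<path>): ...".
missing_pam_modules() {
    # $1: syslog-style file to scan
    sed -n 's/.*PAM unable to dlopen(\([^)]*\)).*/\1/p' "$1" | sort -u
}

# Example use (the log path is an assumption, adjust to your system):
#   missing_pam_modules /var/log/secure
```

Each path printed can then be checked with `ls -l` to see whether the file really is absent.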
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com
mcd
Posts: 9
Joined: Mon Jan 13, 2020 5:22 pm

Re: The monitoring engine shows: not running

Post by mcd »

It looks like it is mostly complaining about the missing home directory, but I thought that had been established as OK?


[root@bolamon1 ~]# tail -20 /var/log/cron
Jan 16 21:37:01 bolamon1 CROND[29916]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29917]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php >> /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29922]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/snmptt_service_results.php >> /usr/local/nagiosxi/var/snmptt_service_results.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29918]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php >> /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29923]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29915]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29914]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29919]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php >> /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29916]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29920]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29921]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29924]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 16 21:37:01 bolamon1 CROND[29917]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29923]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29922]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29918]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29920]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29921]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29919]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
Jan 16 21:37:01 bolamon1 CROND[29924]: (CRON) ERROR chdir failed (/home/nagios): No such file or directory
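
A quick way to see how often cron hits this, and for which directory, is to count the `chdir failed` lines per path. This is a minimal sketch that assumes the log format shown above; the function name `cron_chdir_failures` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "chdir failed" errors per directory in a cron log.
# Assumption: lines look like "(CRON) ERROR chdir failed (<dir>): <reason>".
cron_chdir_failures() {
    # $1: cron log file, e.g. /var/log/cron
    sed -n 's/.*ERROR chdir failed (\([^)]*\)).*/\1/p' "$1" | sort | uniq -c
}

# Example use:
#   cron_chdir_failures /var/log/cron
```

Each output line is a count followed by the directory cron could not change into.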
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: The monitoring engine shows: not running

Post by scottwilkerson »

Does the nagios user home directory exist?

Code:

ls -al /home/nagios
Can you also confirm the sudoers syntax:

Code:

visudo -c
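
The home directory check can also be done against whatever path the account actually has on record, rather than assuming `/home/nagios`. A minimal sketch, assuming the nagios user is a local account listed in `/etc/passwd` (accounts from LDAP or similar would need `getent passwd` instead); the function names `home_of` and `check_home` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the home directory recorded for a user.
home_of() {
    # $1: user name, $2: passwd-format file (defaults to /etc/passwd)
    awk -F: -v u="$1" '$1 == u { print $6 }' "${2:-/etc/passwd}"
}

# Hypothetical helper: report whether that home directory exists.
# Returns 0 if it exists, 1 if it is missing, 2 if the user has no entry.
check_home() {
    dir=$(home_of "$1" "${2:-/etc/passwd}")
    if [ -z "$dir" ]; then
        echo "no passwd entry for $1"
        return 2
    fi
    if [ -d "$dir" ]; then
        echo "$dir exists"
    else
        echo "$dir is missing"
        return 1
    fi
}

# Example use:
#   check_home nagios
```

If the directory is reported missing, that matches the `chdir failed` errors cron logs for the nagios user.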