My server got the DEVIL in it!

Posted: Tue Nov 27, 2018 4:24 pm
by benhank
So, as the following shows:
1. I killed and restarted the nagios service, but it isn't running.
2. While it was not running, I was able to do an Apply Config.
3. The Nagios GUI says everything is running fine.
4. On a side note, the Apply Config log shows a number of warnings, but the warning count is 0.
Behold!

Code:

[root@lkennagiost01 etc]# service nagios stop
Stopping nagios: kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
done.
[root@lkennagiost01 etc]# killall -9 nagios
[root@lkennagiost01 etc]# service nagios start
Starting nagios: done.
[root@lkennagiost01 etc]# service nagios status
nagios is not running
[root@lkennagiost01 etc]#  tail -f /usr/local/nagiosxi/var/cmdsubsys.log

PROCESSED 0 COMMANDS

PROCESSED 0 COMMANDS

PROCESSED 0 COMMANDS

PROCESSED 0 COMMANDS

PROCESSED 0 COMMANDS
APPLYING NAGIOSCORE CONFIG...
CMDLINE=cd /usr/local/nagiosxi/scripts && ./reconfigure_nagios.sh
No entry for terminal type "unknown";
using dumb terminal settings.

--- reset_config_perms.sh ------------
> Setting CCM script permissions
> Setting script permissions
> Setting special component script permissions
> Setting configuration file/directory permissions
> Setting perfdata directory and RRD permissions
> Setting NOM checkpoint user:group permissions
> + Setting CCM configuration file user:group permissions
> + Setting Recurring Downtime file user:group permissions
> + Setting BPI configuration file user:group permissions
--------------------------------------

--- ccm_import.php -------------------
> Setting import directory: /usr/local/nagios/etc/import/
> Importing config files into the CCM
  No files to import
--------------------------------------

--- ccm_export.php -------------------
> Writing CCM configuration to Nagios files
  Finished writing out configuraton
--------------------------------------

--------------------------------------
> Verifying configuration with Nagios Core
> Output:
Nagios Core 4.4.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2018-08-16
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
Warning: Duplicate definition found for service 'App - EPIC EPS Printing - Load Balancer - EPS_ATR' on host 'eps_atr.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_epic_printing_server_ping_wkenatrepsp04.xxx.org.cfg', starting on line 16)
Warning: Duplicate definition found for service 'App - EPIC EPS Printing - Load Balancer - EPS_ATR' on host 'eps_atr.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_epic_printing_server_ping_wkenatrepsp05.xxx.org.cfg', starting on line 16)
Warning: Duplicate definition found for service 'App - EPIC EPS Printing - Load Balancer - EPS_ATR' on host 'eps_atr.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_epic_printing_server_ping_wkenatrepsp02.xxx.org.cfg', starting on line 16)
Warning: Duplicate definition found for service 'App - EPIC EPS Printing - Load Balancer - EPS_ATR' on host 'eps_atr.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_epic_printing_server_ping_wkenatrepsp06.xxx.org.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NT: Memory Usage' on host 'WKENPHAPOSP01.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_windows_memory_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENEXCHP04.xxx.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENEXCHP03.xxx.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENEXCHP02.xxx.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENEXCHP01.xxx.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENBIDBT01.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WVKENBIDBP01.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives' on host 'WKENMUSEP01.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives - VNA' on host 'SVR-SQL03.vnacarenetwork.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage_VNA.cfg', starting on line 16)
Warning: Duplicate definition found for service 'NSClient: NRPE: Disk Usage: All local drives - VNA' on host 'SVR-ARCHIVEONE.vnacarenetwork.net' (config file '/usr/local/nagios/etc/services/xxx_windows_all_disk_usage_VNA.cfg', starting on line 16)
Warning: Duplicate definition found for service 'Interface Table Status - core network devices' on host 'KEN-GLBXINET1' (config file '/usr/local/nagios/etc/services/xxx_interface_table_core.cfg', starting on line 16)
Warning: Duplicate definition found for service 'Linux Current Load' on host 'LKENPESP4.xxx.org' (config file '/usr/local/nagios/etc/services/xxx_linux_current_load.cfg', starting on line 16)


   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 12143 services.
        Checked 2295 hosts.
        Checked 276 host groups.
        Checked 1 service groups.
        Checked 129 contacts.
        Checked 17 contact groups.
        Checked 339 commands.
        Checked 143 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 2295 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 143 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
--------------------------------------
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Stopping nagios: done.
Starting nagios: done.
OUTPUT=Starting nagios: done.
RETURNCODE=0

PROCESSED 1 COMMANDS
CMDLINE=php /usr/local/nagiosxi/html/includes/components/nagiosbpi/api_tool.php --cmd=syncall
CMD: syncall
PHP Notice:  Undefined variable: err in /usr/local/nagiosxi/html/includes/components/nagiosbpi/api_tool.php on line 146
MSG: Could not get data for objects. NDO or Core may not be running.
OUTPUT=MSG: Could not get data for objects. NDO or Core may not be running.
RETURNCODE=0

^C
[root@lkennagiost01 etc]# service nagios status
nagios is not running
[root@lkennagiost01 etc]#
[Attachment: Capture.PNG]

Re: My server got the DEVIL in it!

Posted: Tue Nov 27, 2018 5:08 pm
by npolovenko
Hi, @benhank. Have you upgraded this XI instance recently? What version are you currently running? The devil in your server might've changed the path to the nagios.lock file in the /etc/init.d/nagios script.
Please run the following command first:
cat /usr/local/nagios/etc/nagios.cfg | grep lock_file
And make sure that this path corresponds to the path in the init file:
/etc/init.d/nagios
If it's different, change the path in the init script, kill all Nagios processes, and start Nagios again.
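The comparison described above can be sketched as a small shell helper. This is only a rough sketch: the function names are made up for illustration, and it assumes both files carry plain `lock_file=` and `NagiosRunFile=` lines like the ones quoted in this thread. It compares the literal strings and does not expand variables such as `${prefix}`.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of Nagios XI.

# Print the value of lock_file= from a nagios.cfg-style file.
cfg_lock_path() {
    sed -n 's/^lock_file=//p' "$1" | tail -n 1
}

# Print the value of NagiosRunFile= from an init-script-style file.
init_lock_path() {
    sed -n 's/^NagiosRunFile=//p' "$1" | tail -n 1
}

# Compare the two literal values; returns 0 on match, 1 on mismatch.
compare_lock_paths() {
    cfg=$(cfg_lock_path "$1")
    init=$(init_lock_path "$2")
    if [ "$cfg" = "$init" ]; then
        echo "MATCH: $cfg"
    else
        echo "MISMATCH: nagios.cfg=$cfg init=$init"
        return 1
    fi
}

# Usage on the server discussed here (paths taken from this thread):
# compare_lock_paths /usr/local/nagios/etc/nagios.cfg /etc/init.d/nagios
```

A mismatch would be consistent with "service nagios status" reporting that Nagios is not running even though it was started, since the init script looks for the lock file in a different place than Nagios writes it.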

Re: My server got the DEVIL in it!

Posted: Wed Nov 28, 2018 11:14 am
by benhank
Yeah, I upgraded from 5.4.12 to 5.5.7.
Here's the result of that command:

Code:

cat /usr/local/nagios/etc/nagios.cfg | grep lock_file
lock_file=/usr/local/nagios/var/nagios.lock
Here is the nagios init file snippet:

Code:

# Our install-time configuration.
prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfgFile=${prefix}/etc/nagios.cfg
NagiosCfgtestFile=${prefix}/var/nagios.configtest
NagiosStatusFile=${prefix}/var/status.dat
NagiosRetentionFile=${prefix}/var/retention.dat
NagiosCommandFile=${prefix}/var/rw/nagios.cmd
NagiosVarDir=${prefix}/var
NagiosRunFile=/var/run/nagios.lock
NagiosCGIDir=${exec_prefix}/sbin
NagiosUser=nagios
NagiosGroup=nagios
checkconfig="true"
I don't touch init files much. Which line do I change?

Re: My server got the DEVIL in it!

Posted: Wed Nov 28, 2018 11:58 am
by benhank
Scratch that, we made the change and it's working now. My final question is: why or how did the lock file location change?

Re: My server got the DEVIL in it!

Posted: Wed Nov 28, 2018 12:10 pm
by sigmainformatique
Hi,

Same issue here. Twice now, my lock file path has changed from /usr/local/nagios/... to /var/lock/... (without the needed permissions). I was disappointed because XI said I had an issue in my configuration even though my config was OK (I verified it with a nagios checkconfig).
I think this issue is related to the mechanism that restores old configurations.

Re: My server got the DEVIL in it!

Posted: Wed Nov 28, 2018 3:36 pm
by npolovenko
@benhank, the lock file location was changed in XI 5.5. Normally the upgrade script takes care of the init file and updates it with the correct path. In rare instances it doesn't, and the init script needs to be modified by hand. Is everything working as it should now?
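For anyone making that manual change, here is a minimal sketch of it. It assumes the init script has a single `NagiosRunFile=` line, as in the snippet posted earlier, and that the value from nagios.cfg is the one to keep. The function name is hypothetical, and the new path must not contain `|` or `&`, because it is passed straight into the sed expression.

```shell
#!/bin/sh
# Sketch only: set_init_lock_path is a hypothetical helper, not an XI script.

# Point NagiosRunFile in an init script at a new lock file path.
# A backup is written to <script>.bak first; the original file is then
# rewritten in place, so its mode and ownership are kept.
set_init_lock_path() {
    init_script=$1
    new_path=$2
    cp -p "$init_script" "$init_script.bak" || return 1
    sed "s|^NagiosRunFile=.*|NagiosRunFile=$new_path|" "$init_script.bak" > "$init_script"
}

# Usage with the paths quoted in this thread, followed by the kill and
# restart steps suggested earlier:
# set_init_lock_path /etc/init.d/nagios /usr/local/nagios/var/nagios.lock
# killall -9 nagios
# service nagios start
```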

@sigmainformatique, please read my response above. Are you saying this has happened to you a few times? If so, we may need to open a separate thread for your issue and investigate further.

Re: My server got the DEVIL in it!

Posted: Mon Dec 03, 2018 12:04 pm
by sigmainformatique
In some cases (wrong configurations...), the lock file goes back to its old path. Very disturbing...

Re: My server got the DEVIL in it!

Posted: Mon Dec 03, 2018 4:57 pm
by ssax
There was a bugfix in XI 5.5.7 that impacted people with a downgraded Core: when an Apply Config fails or you revert to a config snapshot on XI 5.5.7, it replaces the lock file path in /usr/local/nagios/etc/nagios.cfg.

Please edit these files and make sure the lockfile line matches your lock file path:

Code:

/usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint.sh
/usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint_specific.sh
Specifically, this line:

Code:

lockfile="/var/run/nagios.lock"
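A quick way to check both restore scripts before editing them might look like the sketch below. The function name is made up, and it assumes each script has one `lockfile="..."` line in the form shown above. It only reports differences and does not change anything.

```shell
#!/bin/sh
# Sketch only: check_restore_scripts is a hypothetical helper, not an XI script.

# Usage: check_restore_scripts EXPECTED_PATH SCRIPT...
# Prints OK or DIFF for each script; returns 1 if any script differs.
check_restore_scripts() {
    expected=$1
    shift
    status=0
    for script in "$@"; do
        # Pull the quoted value out of the lockfile="..." line.
        found=$(sed -n 's/^[[:space:]]*lockfile="\(.*\)"[[:space:]]*$/\1/p' "$script" | tail -n 1)
        if [ "$found" = "$expected" ]; then
            echo "OK $script"
        else
            echo "DIFF $script: $found"
            status=1
        fi
    done
    return $status
}

# Usage with the files named above; the first argument should be the
# lock_file value from your own nagios.cfg:
# check_restore_scripts /usr/local/nagios/var/nagios.lock \
#     /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint.sh \
#     /usr/local/nagiosxi/scripts/nom_restore_nagioscore_checkpoint_specific.sh
```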

Re: My server got the DEVIL in it!

Posted: Tue Dec 04, 2018 12:04 pm
by benhank
Thanks for the fixes and info, fellas. You can lock it up, I'm all set.

Re: My server got the DEVIL in it!

Posted: Tue Dec 04, 2018 12:44 pm
by npolovenko
@benhank, Sounds good! Closing the thread.