host dependencies configurations
Posted: Mon Dec 05, 2016 1:04 am
by chandasandeep
Moderator Edit: This thread has been split from another - https://support.nagios.com/forum/viewto ... =7&t=35913
In the future, please create a new thread and link to the old one instead of adding on.
Hi Team,
Could you please help me with this issue?
We have Linux servers in two data centers on different networks. My Nagios server is in the xyz data center, and we get server-down alerts for the ABC data center servers whenever the VPN disconnects, because they are on a different network.
I went through the following link and configured Nagios accordingly:
https://assets.nagios.com/downloads/nag ... 1478502227
I have configured the host dependencies as given below.
Code: Select all
define hostdependency{
    host_name                       AZMRDC01
    dependent_host_name             10.175.23.1
    notification_failure_criteria   d,
}
Code: Select all
define hostdependency{
    host_name                       TDMRDC01
    dependent_host_name             10.175.23.1
    notification_failure_criteria   d,u
}
However, I am getting the error messages given below.
Code: Select all
[root@azmrns01 hosts]# service nagios restart
Restarting nagios (via systemctl): Job for nagios.service failed because the control process exited with error code. See "systemctl status nagios.service" and "journalctl -xe" for details.
[FAILED]
[root@azmrns01 hosts]# tail -f /var/log/messages
Dec 5 04:31:42 azmrns01 nagios: Check your configuration file(s) to ensure that they contain valid
Dec 5 04:31:42 azmrns01 nagios: directives and data defintions. If you are upgrading from a previous
Dec 5 04:31:42 azmrns01 nagios: version of Nagios, you should be aware that some variables/definitions
Dec 5 04:31:42 azmrns01 nagios: may have been removed or modified in this version. Make sure to read
Dec 5 04:31:42 azmrns01 nagios: the HTML documentation regarding the config files, as well as the
Dec 5 04:31:42 azmrns01 nagios: 'Whats New' section to find out what has changed.
Dec 5 04:31:42 azmrns01 systemd: nagios.service: control process exited, code=exited status=8
Dec 5 04:31:42 azmrns01 systemd: Failed to start LSB: Starts and stops the Nagios monitoring server.
Dec 5 04:31:42 azmrns01 systemd: Unit nagios.service entered failed state.
Dec 5 04:31:42 azmrns01 systemd: nagios.service failed.
Re: host dependencies configurations
Posted: Mon Dec 05, 2016 3:35 pm
by tgriep
Can you run the following verification command on the Nagios server and post the output here?
Code: Select all
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Thanks
Re: host dependencies configurations
Posted: Mon Dec 05, 2016 11:29 pm
by chandasandeep
Hi, thanks for your reply.
Here is the output of the verification command for Nagios:
Code: Select all
[root@azmrns01 hosts]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXSQL01.cfg', starting on line 218)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXSQL01.cfg', starting on line 209)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRAXDEVSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXDEVSQL01.cfg', starting on line 218)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRAXDEVSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXDEVSQL01.cfg', starting on line 209)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL01.cfg', starting on line 233)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL01.cfg', starting on line 224)
Warning: Duplicate definition found for service 'data buffer cache hit ratio' on host 'TDMRSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL01.cfg', starting on line 198)
Warning: Duplicate definition found for service 'SQL connection time' on host 'TDMRSQL01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL01.cfg', starting on line 190)
Warning: Duplicate definition found for service 'disk write queue length' on host 'AZMRAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/AZMRAXSQL01.cfg', starting on line 213)
Warning: Duplicate definition found for service 'disk read queue length' on host 'AZMRAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/AZMRAXSQL01.cfg', starting on line 204)
Warning: Duplicate definition found for service 'sql service CPU privileged time' on host 'TDMRAXDW01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXDW01.cfg', starting on line 295)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRAXDW01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXDW01.cfg', starting on line 204)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRAXDW01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXDW01.cfg', starting on line 195)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRSRS01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSRS01.cfg', starting on line 249)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRSRS01' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSRS01.cfg', starting on line 240)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRSQL02' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL02.cfg', starting on line 221)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRSQL02' (config file '/usr/local/nagios/etc/objects/hosts/TDMRSQL02.cfg', starting on line 212)
Warning: Duplicate definition found for service 'disk write queue length' on host 'AZMRDEVAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/AZMRDEVAXSQL01.cfg', starting on line 199)
Warning: Duplicate definition found for service 'disk read queue length' on host 'AZMRDEVAXSQL01' (config file '/usr/local/nagios/etc/objects/hosts/AZMRDEVAXSQL01.cfg', starting on line 190)
Warning: Duplicate definition found for service 'disk write queue length' on host 'TDMRAXSQL02' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXSQL02.cfg', starting on line 220)
Warning: Duplicate definition found for service 'disk read queue length' on host 'TDMRAXSQL02' (config file '/usr/local/nagios/etc/objects/hosts/TDMRAXSQL02.cfg', starting on line 211)
Error: Invalid max_check_attempts value for host 'gateway'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/hosts/host-gateway.cfg', starting on line 2)
Error processing object config files!
***> One or more problems was encountered while processing the config files...
Check your configuration file(s) to ensure that they contain valid
directives and data defintions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.
Code: Select all
[root@azmrns01 hosts]# more host-gateway.cfg
# a host definition for the gateway of the default route
define host {
    host_name   gateway
    alias       Default Gateway
    address     10.175.23.1
    use         generic-host
}
Note: My requirement is a network dependency between two data centers. The servers are in two data centers, and Nagios is in the xyz data center. The servers in the second data center raise down alerts when the VPN disconnects, even though they are up and running fine when we check them manually. We are trying to configure Nagios for this scenario. Could you please advise on the best way to fix this issue?
Thanks,
Sandeep
Re: host dependencies configurations
Posted: Tue Dec 06, 2016 12:49 pm
by tgriep
First, to fix the max_check_attempts error, you could either add that entry directly to that host's settings or you can edit the generic-host template and add it there.
Once one of those is done, that should fix the error for you.
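For illustration, the first option might look like the sketch below. It reuses the host definition posted above; the value 5 is an assumed example, not a figure recommended anywhere in this thread.

```
# host-gateway.cfg (sketch): the posted definition plus the missing directive
define host {
    host_name            gateway
    alias                Default Gateway
    address              10.175.23.1
    use                  generic-host
    max_check_attempts   5    ; assumed value, any positive integer should do
}
```

Adding the same line to the generic-host template instead would apply it to every host that inherits from that template. Re-running the verification command from earlier in the thread should show whether the error is gone.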
BTW, you may also need to set up these required settings.
As for the notifications you receive when the VPN link is down: you can set up a parent/child relationship for the remote site, so that when Nagios detects that the link is down, it stops sending notifications for the remote hosts.
Take a look at this link for more details on that.
https://assets.nagios.com/downloads/nag ... ility.html
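As a rough sketch of that parent/child setup: the parents directive on each remote host names the device Nagios has to pass through to reach it. The example assumes the host named gateway is the VPN endpoint on the path to the remote site, and the address shown is only a placeholder.

```
define host {
    host_name   TDMRAXDEVSQL01
    use         generic-host
    address     192.0.2.10    ; placeholder address, not taken from this thread
    parents     gateway       ; assumes 'gateway' sits between Nagios and this host
}
```

In practice the parents line would be added to the existing definition of each remote host rather than creating a second definition. With it in place, Nagios should report the remote host as UNREACHABLE instead of DOWN while the gateway is down, and UNREACHABLE notifications are only sent if the notification options of the host and of the contact include u.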
Re: host dependencies configurations
Posted: Thu Dec 08, 2016 3:21 am
by chandasandeep
Hi, thank you for your assistance.
I have successfully configured the gateway:
[root@azmrns01 hosts]# cat host-gateway.cfg
# a host definition for the gateway of the default route
define host{
    host_name                      Gateway
    alias                          Bogus Router #1
    address                        10.175.23.81
    check_command                  check-host-alive
    check_interval                 1
    retry_interval                 1
    max_check_attempts             5
    check_period                   24x7
    process_perf_data              0
    retain_nonstatus_information   0
    contact_groups                 admins
    notification_interval          2
    notification_period            24x7
    notification_options           d,u,r
}
And here are the dependencies in the same gateway config file:
define hostdependency{
    host_name                       AZMRDC01
    dependent_host_name             Gateway
    notification_failure_criteria   d
}
define hostdependency{
    host_name                       TDMRAXDEVSQL01
    dependent_host_name             Gateway
    notification_failure_criteria   d,u
}
Note: My question is about what happens when the VPN between the AZMRDC01 and TDMRAXDEVSQL01 networks is disconnected. In that case we should get host unreachable (or similar) alerts for the TDMRAXDEVSQL01 server. Please let me know where and how I should set this up.
Thanks in advance.
Sandeep
Re: host dependencies configurations
Posted: Thu Dec 08, 2016 2:45 pm
by tgriep
If you want to receive the unreachable notification for that host, you will have to make sure the contact is set up to receive the unreachable messages.
So edit that contact and make sure the host_notification_options are set up correctly.
Re: host dependencies configurations
Posted: Mon Dec 12, 2016 12:07 am
by chandasandeep
Hi
I edited the contact as given below and set host_notification_options too.
Code: Select all
define contact{
    contact_name                reddalerts        ; Short name of user
    use                         generic-contact   ; Inherit default values from generic-contact template (defined above)
    alias                       redd Admin        ; Full name of user
    host_notification_options   d,u,r
    email                       [email protected]
}
On the ABC host I have set up the configuration given below, which raises alerts when the host goes down:
Code: Select all
define hostdependency{
    host_name                       AZMRDC01
    dependent_host_name             TDMRAXDEVSQL01
    notification_failure_criteria   d,u
}
In the gateway configuration file I have set up the host dependencies given below:
Code: Select all
define hostdependency{
    host_name                       AZMRDC01
    dependent_host_name             Gateway
    notification_failure_criteria   d
}
Code: Select all
define hostdependency{
    host_name                       TDMRAXDEVSQL01
    dependent_host_name             Gateway
    notification_failure_criteria   d,u
}
I am a bit confused about the host dependency configuration. Could you please advise?
Re: host dependencies configurations
Posted: Mon Dec 12, 2016 1:48 pm
by dwhitfield
@chandasandeep, it's unclear to me where we are with this. You mention you get alerts when the host goes down. That sounds good.
Then, you mention you are confused about the host dependency configuration. Are you still getting error messages? If so, which ones? What, if anything, is not working?
If you are just looking for a bit more information on how it all fits together, we can do that, but I just want to make sure we are addressing the correct issue. Thanks!
In general, you can find information on dependencies at
https://assets.nagios.com/downloads/nag ... ncies.html
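One detail that is easy to get backwards: in a hostdependency definition, host_name is the master host (the one being depended upon) and dependent_host_name is the host whose notifications get suppressed. A minimal sketch, assuming TDMRAXDEVSQL01 can only be reached through the host named Gateway:

```
define hostdependency {
    host_name                       Gateway          ; master: the host being depended upon
    dependent_host_name             TDMRAXDEVSQL01   ; dependent: notifications are suppressed for this host
    notification_failure_criteria   d,u              ; suppress while the master is DOWN or UNREACHABLE
}
```

Whether Gateway really is the right master depends on the network layout, which this thread does not fully describe.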