AD/Witness Server Restart : Critical Alarm for URL Checks
-
- Posts: 72
- Joined: Wed Feb 06, 2019 3:22 pm
AD/Witness Server Restart : Critical Alarm for URL Checks
Every time I restart my AD/Witness server, my URL checks fail, go critical, and send out my emails/texts. I thought it was because our secondary DNS was not set up properly. There was a network issue with allowing DNS traffic, but that was resolved, and we went over our secondary DNS as well. But once we restart the primary DNS, the checks fail. Why is that? What could be causing the issue?
Below are the cfg files:
define host {
    host_name               http://prod1.yyyy.com/api/api.asmx
    use                     generic-host
    address                 prod1.yyyy.com
    check_command           check_tcp!80!
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
    check_period            24x7
    register                1
}
define service {
    host_name               http://prod1.yyyy.com/api/api.asmx
    service_description     URL Status
    use                     generic-service
    check_command           check_service_http! -f follow -u '/api/api.asmx'
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    contact_groups          admins production
    register                1
}
-------------------------------------
define host {
    host_name               http://prod.xxxx.com/api/api.asmx
    use                     generic-host
    address                 prod.xxxx.com
    check_command           check_tcp!80!
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    contact_groups          admins
    notification_interval   60
    notification_period     24x7
    check_period            24x7
    register                1
}
define service {
    host_name               http://prod.xxxx.com/api/api.asmx
    service_description     URL Status
    use                     generic-service
    check_command           check_service_http! -f follow -u '/api/api.asmx'
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    contact_groups          admins production
    register                1
}
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: AD/Witness Server Restart : Critical Alarm for URL Check
Can you show the errors you are getting?
-
- Posts: 72
- Joined: Wed Feb 06, 2019 3:22 pm
Re: AD/Witness Server Restart : Critical Alarm for URL Check
Can I PM you the message? Too much sensitive info
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: AD/Witness Server Restart : Critical Alarm for URL Check
Can the Nagios server reach these URLs when this happens?
How fast is the DNS switching to the fail-over?
curl "http://prod1.yyyy.com/api/api.asmx"
-
- Posts: 72
- Joined: Wed Feb 06, 2019 3:22 pm
Re: AD/Witness Server Restart : Critical Alarm for URL Check
[root@computer name]# curl -v http://prod1.yyyy.com/api/api/asmx
* About to connect() to prod1.yyyy.com port 80 (#0)
* Trying xx.xx.96.108...
* Connected to prod1.yyyy.com (xx.xx.96.108) port 80 (#0)
> GET /api/api/asmx HTTP/1.1
> User-Agent: curl/7.29.0
> Host: prod1.yyyy.com
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 301 Moved Permanently
< location: https://prod1.yyyy.com/api/api/asmx
< Server: BigIP
* HTTP/1.0 connection set to keep alive!
< Connection: Keep-Alive
< Content-Length: 0
<
* Connection #0 to host prod1.yyyy.com left intact
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: AD/Witness Server Restart : Critical Alarm for URL Check
scottwilkerson wrote:How fast is the DNS switching to the fail-over?
-
- Posts: 72
- Joined: Wed Feb 06, 2019 3:22 pm
Re: AD/Witness Server Restart : Critical Alarm for URL Check
It switches over right away. We monitored that when we restarted the Witness/AD server.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: AD/Witness Server Restart : Critical Alarm for URL Check
The only other thing I can suggest, since you are getting a socket timeout, is to check what prod1.yyyy.com resolves to from the Nagios server when this happens.
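One way to capture that during the restart window is a small script run on the Nagios server. The following is only a rough sketch, not something posted in this thread: the function names (`resolve_once`, `format_line`, `watch`) are made up for illustration, and it assumes Python 3 is available on the Nagios host. It records which addresses the name resolves to through the system resolver and how long each lookup takes.

```python
import socket
import time
from datetime import datetime, timezone


def resolve_once(host, port=80, resolver=socket.getaddrinfo, clock=time.monotonic):
    """Resolve host once via the system resolver.

    Returns (sorted unique addresses, seconds taken, error text or None).
    The resolver and clock parameters exist only so the function can be
    exercised without real DNS (hypothetical convenience, not required).
    """
    start = clock()
    try:
        infos = resolver(host, port, proto=socket.IPPROTO_TCP)
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        addresses = []
        error = str(exc)
    return addresses, clock() - start, error


def format_line(stamp, host, addresses, elapsed, error):
    """Build one log line: timestamp, host, result, lookup time."""
    result = ",".join(addresses) if addresses else f"FAILED ({error})"
    return f"{stamp} {host} {result} {elapsed:.3f}s"


def watch(hosts, count, interval=5.0):
    """Print one line per host, `count` times, sleeping `interval` seconds between rounds."""
    for _ in range(count):
        for host in hosts:
            stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
            print(format_line(stamp, host, *resolve_once(host)), flush=True)
        time.sleep(interval)
```

Calling something like `watch(["prod1.yyyy.com", "prod.xxxx.com"], count=60)` just before restarting the AD/Witness server would give a timestamped record to compare against the check failures. If the addresses stay the same but individual lookups take several seconds, that may point to the resolver waiting on the restarted primary before it falls back to the secondary, which could be enough to push a check past its timeout.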