Issue last week regarding hard state / parent-child (XI 2.5)
Posted: Mon May 04, 2015 9:29 am
We had an episode last week, on the 27th, where a store switch went down and we still alerted on all the children (servers) in the store.
My question: are the timestamps on these entries from the initial check or from the last check? The parent seemed to hit its hard state, yet the children still ran their checks and alerted.
We did change the host check on the servers from its default ping to a check of port 445. Would that break any parent/child logic by chance?
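For anyone trying to picture the setup, here is a rough sketch of the kind of host definitions in question. It is not our actual config: the `generic-switch` and `generic-server` template names are hypothetical, `check-host-alive` and `check_tcp` are assumed to be the stock sample command definitions, and the attempt counts are simply taken from the "3 of 3" and "5 of 5" values in the state history.

```
# Parent: the store switch (host name taken from the state history)
define host {
    use                 generic-switch      ; hypothetical template name
    host_name           xxxxxANGWY
    check_command       check-host-alive    ; assumed default ping-style host check
    max_check_attempts  3                   ; matches "3 of 3" in the history
}

# Child: one of the store servers, pointing at the switch as its parent
define host {
    use                 generic-server      ; hypothetical template name
    host_name           xxxxxASHYP02
    parents             xxxxxANGWY          ; parent/child relationship
    check_command       check_tcp!445       ; assumed form of the port 445 host check
    max_check_attempts  5                   ; matches "5 of 5" in the history
}
```

The only difference from a default setup is meant to be the `check_command` on the servers; the `parents` directive is what the reachability logic depends on.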
State History Reporting screen: (ANGWY is the store switch / parent, the rest are the servers / children)
2015-04-27 14:32:12 xxxxxASHYP02 UNREACHABLE HARD 5 of 5 CRITICAL - xxxxxASHYP02: rta nan, lost 100%
2015-04-27 14:32:12 xxxxxASHYC01 UNREACHABLE HARD 5 of 5 CRITICAL - xxxxxASHYC01: rta nan, lost 100%
2015-04-27 14:32:12 xxxxxASRET01 UNREACHABLE HARD 5 of 5 CRITICAL - xxxxxASRET01: rta nan, lost 100%
2015-04-27 14:32:11 xxxxxASHYP01 UNREACHABLE HARD 5 of 5 CRITICAL - xxxxxASHYP01: rta nan, lost 100%
2015-04-27 14:31:43 xxxxxASRFI01 UNREACHABLE HARD 5 of 5 CRITICAL - xxxxxASRFI01: Host unreachable @ xxxxx.1.162. rta nan, lost 100%
2015-04-27 14:29:26 xxxxxANGWY DOWN HARD 3 of 3 CRITICAL - xxxxxANGWY: Host unreachable @ xxxxx.1.162. rta nan, lost 100%