State Type Hard Up vs Soft Up

olmgroup
Posts: 14
Joined: Tue Jul 03, 2018 5:30 am


Post by olmgroup »

Hi all,
Earlier this week we were patching some servers that are monitored by Nagios XI. We scheduled downtime in Nagios for these servers, as is part of our patching regime. Two of these monitored servers have identical Nagios configurations: both use check_tcp to poll a server on port 1790. Once the maintenance was completed, one of the servers returned with "State Type OK Hard" and the other with "State Type OK Soft", and this is affecting our service availability reports.

So 2 questions:

1) How can we fix the service availability report so that it accurately shows when the service was available?
2) How can we stop this issue from occurring in the future?

Screenshots shown at this URL
https://photos.google.com/share/AF1QipP ... R0ajF6UUVn

Running Nagios XI Installed Version: 5.5.9
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: State Type Hard Up vs Soft Up

Post by cdienger »

The data comes from nagios.log and the files in /usr/local/nagios/var/archives/, so we could edit those if necessary, but I'm not sure why the two servers would show different results like that. I'd like to get the logs covering the 26th through the 29th, as well as a profile from Admin > System Config > System Profile > Download Profile. Please PM these to me, and compress the log files if they are not already compressed.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: State Type Hard Up vs Soft Up

Post by cdienger »

The logs show the -swas machine was actually unreachable (no route to host) while it was in downtime. When a host is in a non-OK state, its services go straight into a HARD state if they are also in a non-OK state:
[Mon May 27 20:00:04 PDT 2019] SERVICE DOWNTIME ALERT: -swas; LOGIN;STARTED; Service has entered a period of scheduled downtime
[Mon May 27 20:00:06 PDT 2019] HOST DOWNTIME ALERT: -swas;STARTED; Host has entered a period of scheduled downtime


[Mon May 27 21:30:44 PDT 2019] HOST ALERT: -swas;DOWN;SOFT;1;CRITICAL - 172.31.120.14: rta nan, lost 100%

[Mon May 27 21:31:33 PDT 2019] SERVICE ALERT: -swas; LOGIN;CRITICAL;HARD;1;connect to address 172.31.120.14 and port 1790: No route to host

[Mon May 27 21:31:45 PDT 2019] HOST ALERT: -swas;DOWN;SOFT;2;CRITICAL - 172.31.120.14: Host unreachable @ 79.99.65.57. rta nan, lost 100%
[Mon May 27 21:32:46 PDT 2019] HOST ALERT: -swas;DOWN;SOFT;3;CRITICAL - 172.31.120.14: Host unreachable @ 79.99.65.57. rta nan, lost 100%
[Mon May 27 21:33:47 PDT 2019] HOST ALERT: -swas;DOWN;SOFT;4;CRITICAL - 172.31.120.14: Host unreachable @ 79.99.65.57. rta nan, lost 100%
[Mon May 27 21:34:45 PDT 2019] HOST ALERT: -swas;UP;SOFT;1;OK - 172.31.120.14: rta 0.819ms, lost 0%

[Mon May 27 21:41:26 PDT 2019] SERVICE ALERT: -swas; LOGIN;OK;SOFT;1;TCP OK - 0.001 second response time on 172.31.120.14 port 1790
Are you using the "Hide scheduled downtime" option found under Advanced when you run the SLA reports? Does this option make a difference?
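To make the SOFT/HARD sequence in the excerpt above easier to compare between two hosts, here is a minimal Python sketch that pulls the state, state type, and attempt number out of HOST ALERT and SERVICE ALERT lines. It assumes the line layout shown in this thread (a bracketed timestamp followed by semicolon-separated fields); the names `Alert` and `parse_alert` are made up for illustration and are not part of Nagios.

```python
import re
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    timestamp: str          # kept as the raw text between the brackets
    kind: str               # "HOST" or "SERVICE"
    host: str
    service: Optional[str]  # None for host alerts
    state: str              # e.g. UP, DOWN, OK, CRITICAL
    state_type: str         # SOFT or HARD
    attempt: int
    output: str


# Matches "[timestamp] HOST ALERT: ..." and "[timestamp] SERVICE ALERT: ...".
# Downtime lines ("HOST DOWNTIME ALERT") deliberately do not match.
_ALERT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<kind>HOST|SERVICE) ALERT:\s*(?P<body>.*)$"
)


def parse_alert(line: str) -> Optional[Alert]:
    """Return an Alert for a HOST/SERVICE ALERT line, or None otherwise."""
    m = _ALERT_RE.match(line.strip())
    if m is None:
        return None
    fields = m.group("body").split(";")
    if m.group("kind") == "HOST":
        # Layout: host;state;state_type;attempt;output
        if len(fields) < 5:
            return None
        host, state, state_type, attempt = fields[:4]
        service = None
        output = ";".join(fields[4:])
    else:
        # Layout: host;service;state;state_type;attempt;output
        if len(fields) < 6:
            return None
        host, service, state, state_type, attempt = fields[:5]
        service = service.strip()
        output = ";".join(fields[5:])
    if not attempt.strip().isdigit():
        return None
    return Alert(m.group("ts"), m.group("kind"), host.strip(), service,
                 state.strip(), state_type.strip(), int(attempt), output)
```

Running each log line through `parse_alert` and printing `kind`, `state`, and `state_type` side by side for both servers should show where their histories diverge.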
olmgroup
Posts: 14
Joined: Tue Jul 03, 2018 5:30 am

Re: State Type Hard Up vs Soft Up

Post by olmgroup »

I have just tried the "Hide Scheduled Downtime" option in the report, but the output is the same.
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: State Type Hard Up vs Soft Up

Post by lmiltchev »

> I have just tried the "Hide Scheduled Downtime" in the report but the output is the same
The output shouldn't be the same unless the host is still in downtime. Both the downtime start and end times must be known for the math in the reports to work, and I am not sure that is the case here.
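As a rough illustration of why both ends of the downtime window matter, here is a small Python sketch of one plausible way to compute availability with and without scheduled downtime hidden. This is not Nagios XI's actual report code. The function names and the example numbers (minutes from the start of a made-up report window) are hypothetical, and the sketch assumes the intervals within each list do not overlap one another.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end), end > start, same time unit


def clip(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the part of interval a that lies inside b, or None if empty."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def total_length(intervals: List[Interval]) -> float:
    return sum(end - start for start, end in intervals)


def availability_percent(window: Interval,
                         outages: List[Interval],
                         downtimes: List[Interval],
                         hide_downtime: bool) -> Optional[float]:
    """Percent of the window spent in an OK state.

    With hide_downtime=True, scheduled downtime is removed from the window
    and any outage time inside it is not counted against availability.
    """
    outs = [c for c in (clip(o, window) for o in outages) if c]
    total = window[1] - window[0]
    bad = total_length(outs)
    if hide_downtime:
        dts = [c for c in (clip(d, window) for d in downtimes) if c]
        total -= total_length(dts)
        # Subtract the portion of each outage that falls inside downtime.
        excused = [c for o in outs for d in dts for c in [clip(o, d)] if c]
        bad -= total_length(excused)
    if total <= 0:
        return None  # the whole window was scheduled downtime
    return 100.0 * (total - bad) / total


# Hypothetical numbers: a 600-minute window, downtime for the first
# 120 minutes, and a 10-minute outage that falls inside the downtime.
window = (0, 600)
outages = [(90, 100)]
downtimes = [(0, 120)]
print(availability_percent(window, outages, downtimes, hide_downtime=False))
print(availability_percent(window, outages, downtimes, hide_downtime=True))
```

If the downtime end time were missing, the downtime interval could not be built at all, and the outage would be counted in full, as in the first call.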
Be sure to check out our Knowledgebase for helpful articles and solutions!
olmgroup
Posts: 14
Joined: Tue Jul 03, 2018 5:30 am

Re: State Type Hard Up vs Soft Up

Post by olmgroup »

Sorry, I rechecked the output and tweaked the report times, and I can confirm that once I isolated the period when the work took place and excluded scheduled downtime, the report does show 100% availability for the service.

Thanks for your assistance!
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: State Type Hard Up vs Soft Up

Post by lmiltchev »

Great! Let us know if it is OK to close the topic then. Thank you!