Open Service Problem Dashlet CRITICAL
Posted: Thu Nov 05, 2020 4:53 pm
by blariv
hi
I'm wondering whether this dashlet can be changed to show CRITICAL only when the current check is HARD at 5 of 5, instead of showing CRITICAL as soon as the first check fails.
screenshot attached.
Re: Open Service Problem Dashlet CRITICAL
Posted: Fri Nov 06, 2020 10:23 am
by dchurch
Don't be put off by a CRITICAL state. Depending on the type of service being monitored, a host or service will be in an "up" or "down" state. In this case a CRITICAL status just means the host or service is down.
Other times, a host or service will report an abnormally high value, e.g. a high ping time or a high CPU load, in which case it's a WARNING state instead of a CRITICAL. If you don't have a service that is measured like that, then it's only ever going to be in a CRITICAL or OK state. An example of such a check would be "is sshd running on my host?"
Short of writing your own plugin that remembers previous results, what you're probably looking to key off of and report on is HARD vs. SOFT states. A HARD state just means "this is a real problem", e.g. an HTTP request timed out X times in a row.
Here's a document that explains the HARD and SOFT states that Nagios uses.
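To make the SOFT/HARD distinction concrete, here is a minimal Python sketch of how the attempt counter could be modelled. It is not Nagios code: the `Service` class and `record` method are hypothetical names, and it simplifies recovery by treating any OK result as an immediate HARD OK (real Nagios also has a SOFT recovery step).

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Simplified, hypothetical model of SOFT/HARD state tracking."""
    max_check_attempts: int = 5
    state: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1

    def record(self, result: str) -> str:
        if result == "OK":
            # Simplification: any OK result resets straight to HARD OK.
            self.state, self.state_type, self.attempt = "OK", "HARD", 1
        else:
            if self.state == "OK":
                # First failure after OK starts a new run of attempts.
                self.attempt = 1
            elif self.attempt < self.max_check_attempts:
                self.attempt += 1
            self.state = result
            # The problem only becomes HARD once every attempt has failed.
            if self.attempt >= self.max_check_attempts:
                self.state_type = "HARD"
            else:
                self.state_type = "SOFT"
        return f"{self.state} {self.state_type} {self.attempt}/{self.max_check_attempts}"


svc = Service()
history = [svc.record("CRITICAL") for _ in range(5)]
for line in history:
    print(line)
```

With five consecutive CRITICAL results, the first four are reported as SOFT and only the fifth is HARD, which is the "5 of 5" case described in the original question.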
Re: Open Service Problem Dashlet CRITICAL
Posted: Fri Nov 06, 2020 11:30 am
by blariv
I am sorry, I don't think I worded that correctly. I know about the HARD and SOFT states, but my question was that I would like only the last critical (the HARD one) to show on the dashboard, not the first bad state, because I know it can recover in the next 4 checks. I just do not want my NOC to overreact.
Re: Open Service Problem Dashlet CRITICAL
Posted: Fri Nov 06, 2020 4:39 pm
by dchurch
You could use a passive check with a freshness threshold to achieve the desired effect.
Code:
define service {
    host_name               192.168.23.45
    service_description     Process Count
    use                     xiwizard_passive_service
    # Set the state=WARNING for this service. It's only run when the freshness check fails.
    # I.e. if Nagios XI hasn't received any passive check result for $freshness_threshold seconds.
    check_command           check_dummy!1!!!!!!!
    max_check_attempts      1
    check_interval          1
    retry_interval          1
    check_freshness         1
    # Multiply the expected passive interval by the number of times it can
    # be missing before considering it a warning.
    # E.g. if ncpa.cfg's sleep = 300, and you want it to be a warning after 5 missed checks, set this to 1500.
    freshness_threshold     1500
    notification_interval   60
    register                1
}
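The threshold arithmetic in the comments above can be sketched in Python as follows. The function names are hypothetical, and the staleness test is a simplification of what Nagios does (it ignores any latency allowance Nagios may add).

```python
def freshness_threshold(passive_interval_s: int, missed_allowed: int) -> int:
    """Seconds without a passive result before the service counts as stale."""
    return passive_interval_s * missed_allowed


def is_stale(seconds_since_last_result: int, threshold_s: int) -> bool:
    """Simplified staleness test: the last result is older than the threshold."""
    return seconds_since_last_result > threshold_s


# ncpa.cfg sleep = 300 seconds, tolerate 5 missed passive results.
threshold = freshness_threshold(300, 5)
print(threshold)
print(is_stale(1200, threshold), is_stale(1600, threshold))
```

Here `threshold` works out to the 1500 used in the service definition; a result 1200 seconds old is still fresh, while one 1600 seconds old would trigger the freshness check.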
Re: Open Service Problem Dashlet CRITICAL
Posted: Mon Nov 09, 2020 4:25 pm
by blariv
Sorry, you're still not getting my meaning.
My ops screen on Nagios XI shows nothing as critical until it hits HARD at 5 of 5; however, the Nagios Fusion dashboards show something as critical at every attempt: 1 of 5, 2 of 5, etc.
I'm wondering if there is something that needs to be changed in the code of the dashboard to reflect this.
Re: Open Service Problem Dashlet CRITICAL
Posted: Tue Nov 10, 2020 5:05 pm
by ssax
The functionality to do that would need to be added by development; Fusion doesn't currently pass or use the state_type. I have submitted a feature request on your behalf with a link back to this thread:
FR: Fusion - Add state_type to host/service data and add the ability to only show hard states in the NOC page/dashlets
Please keep in mind that the decision to implement the enhancement is at the discretion of our development team.
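For illustration only, the requested behavior amounts to filtering problems on state_type. The Python sketch below is not Fusion code: the record layout is an assumption that borrows field names from Nagios Core's status data (`current_state`, `state_type`, with 0 = SOFT and 1 = HARD), and the sample services are made up.

```python
# Numeric codes as used in Nagios Core's status data.
STATE_OK = 0
STATE_TYPE_HARD = 1


def hard_problems(services):
    """Keep only services that are non-OK and in a HARD state."""
    return [
        s for s in services
        if s["current_state"] != STATE_OK and s["state_type"] == STATE_TYPE_HARD
    ]


# Hypothetical sample data: current_state 2 = CRITICAL.
services = [
    {"name": "HTTP", "current_state": 2, "state_type": 0, "current_attempt": 1},
    {"name": "Process Count", "current_state": 2, "state_type": 1, "current_attempt": 5},
    {"name": "Ping", "current_state": 0, "state_type": 1, "current_attempt": 1},
]

shown = [s["name"] for s in hard_problems(services)]
print(shown)
```

Only "Process Count" would be shown, since "HTTP" is still in a SOFT state at attempt 1 and "Ping" is OK.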
Re: Open Service Problem Dashlet CRITICAL
Posted: Tue Nov 10, 2020 5:07 pm
by blariv
Great thank you!
Re: Open Service Problem Dashlet CRITICAL
Posted: Wed Nov 11, 2020 7:21 am
by scottwilkerson
blariv wrote: Great thank you!
Locking thread