NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 10:27 am
by snapier3
NagiosXI interface is inverting the host exit states.
down-as-unr.PNG
unr-as-down.PNG
When I validate the status via the API for host checks submitted via NRDP, the data is correct.
Down-NRDP.PNG
Unreachable-NRDP.PNG
Please provide a patch for this issue ASAP.

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 10:37 am
by swolf
Hi @snapier, thanks for reaching out - I'm trying to reproduce this issue. Is this something you're only seeing when reporting host check results via NCPA + NRDP?

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 10:40 am
by snapier3
I am seeing this when submitting via XI -> Submit Passive Checks: the states are inverted in the DB.

When submitting via NRDP, the states are correct in the DB, but the exit states are still being inverted in the UI.

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 10:47 am
by snapier3
This is also in Core 4.4.13
core2.PNG

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 11:43 am
by swolf
Based on my own testing, I don't think there's a bug here, but I do think the behavior is somewhat unintuitive.

For hosts, the result codes are 0 for UP, 1 for DOWN, and 2 for UNREACHABLE. On my system, this matches the behavior of NRDP. My guess is that you expected the codes to match those for services (0=OK/1=WARNING/2=CRITICAL/3=UNKNOWN). Submitting a result of 3 will also show DOWN, making it look like the result codes were reversed. I don't personally like this behavior, but I don't think it's a critical bug - results outside of 0-2 for hosts should cause an alert, whereas UNREACHABLE results are usually suppressed.
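To make the two numbering schemes concrete, here is a minimal Python sketch of the mapping described above. It is an illustration only, not Nagios source code: the function names are made up, and treating every host code outside 0-2 as DOWN is an assumption based on the observation that a result of 3 shows DOWN.

```python
# Illustrative lookup tables; these mirror the codes listed in this post.
HOST_STATES = {0: "UP", 1: "DOWN", 2: "UNREACHABLE"}
SERVICE_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def passive_host_state(code: int) -> str:
    """Return the host state shown for a submitted host result code.

    Assumption: any code outside 0-2 is displayed as DOWN, as observed
    for a result of 3.
    """
    return HOST_STATES.get(code, "DOWN")


def service_state(code: int) -> str:
    """Return the service state for a plugin return code (hypothetical helper)."""
    return SERVICE_STATES.get(code, "UNKNOWN")


if __name__ == "__main__":
    # Print both interpretations side by side for codes 0..3.
    for code in range(4):
        print(code, passive_host_state(code), service_state(code))
```

Reading the output side by side shows why a host result of 2 (UNREACHABLE) can look "inverted" to someone expecting the service meaning of 2 (CRITICAL).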

Hopefully this helps - if you think you're seeing something else, please let me know.

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 12:12 pm
by snapier3
So is NRDP wrong or the UI?
Remember I get both versions of output

2 = Down
1 = Unreachable
0 = Up
3 = Unknown

1 = Down
2 = Unreachable
0 = Up
3 = Unknown

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 12:14 pm
by snapier3
All good, I fixed it

Re: NagiosXI 5.11.3 | Host Monitoring Bug | Inverted Exit States

Posted: Mon Nov 27, 2023 12:20 pm
by snapier3

Host States

Hosts that are checked can be in one of three different states:

UP
DOWN
UNREACHABLE

Host State Determination

Host checks are performed by plugins, which can return a state of OK, WARNING, UNKNOWN, or CRITICAL. How does Nagios translate these plugin return codes into host states of UP, DOWN, or UNREACHABLE? Let's see...

The table below shows how plugin return codes correspond with preliminary host states. Some post-processing (which is described later) is done which may then alter the final host state.

Plugin Result    Preliminary Host State
OK               UP
WARNING          UP or DOWN*
UNKNOWN          DOWN
CRITICAL         DOWN

Note: WARNING results usually mean the host is UP. However, WARNING results are interpreted to mean the host is DOWN if the use_aggressive_host_checking option is enabled.
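
The plugin-result table and the note about use_aggressive_host_checking can be sketched in Python as follows. This is only an illustration of the documented mapping, not Nagios code; the function name and the boolean parameter standing in for the config option are assumptions.

```python
def preliminary_host_state(plugin_result: str,
                           use_aggressive_host_checking: bool = False) -> str:
    """Map a plugin result name to the preliminary host state.

    The boolean parameter is a stand-in for the use_aggressive_host_checking
    option; how Nagios reads that option internally is not modeled here.
    """
    if plugin_result == "OK":
        return "UP"
    if plugin_result == "WARNING":
        # WARNING is UP by default, DOWN only with aggressive checking.
        return "DOWN" if use_aggressive_host_checking else "UP"
    if plugin_result in ("UNKNOWN", "CRITICAL"):
        return "DOWN"
    raise ValueError(f"unexpected plugin result: {plugin_result!r}")


if __name__ == "__main__":
    for result in ("OK", "WARNING", "UNKNOWN", "CRITICAL"):
        print(result,
              preliminary_host_state(result),
              preliminary_host_state(result, use_aggressive_host_checking=True))
```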

If the preliminary host state is DOWN, Nagios will attempt to see if the host is really DOWN or if it is UNREACHABLE. The distinction between DOWN and UNREACHABLE host states is important, as it allows admins to determine the root cause of network outages faster. The following table shows how Nagios makes a final state determination based on the state of the host's parent(s). A host's parents are defined in the parents directive in the host definition.

Preliminary Host State    Parent Host State                             Final Host State
DOWN                      At least one parent is UP                     DOWN
DOWN                      All parents are either DOWN or UNREACHABLE    UNREACHABLE
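
The parent-based decision in this table can be sketched as below. Again this is a hedged illustration rather than the Nagios implementation: the function name is hypothetical, and the handling of a host with no parents (left as DOWN) is an assumption, since the table does not cover that case.

```python
from typing import Sequence


def final_host_state(preliminary: str, parent_states: Sequence[str]) -> str:
    """Resolve a preliminary host state into a final one using parent states."""
    if preliminary != "DOWN":
        # Only a preliminary DOWN is re-examined against the parents.
        return preliminary
    if not parent_states:
        # Assumption: with no parents defined, the host stays DOWN.
        return "DOWN"
    if any(state == "UP" for state in parent_states):
        # At least one parent is UP, so the host itself is the problem.
        return "DOWN"
    # All parents are DOWN or UNREACHABLE, so the host cannot be reached.
    return "UNREACHABLE"


if __name__ == "__main__":
    print(final_host_state("DOWN", ["UP", "DOWN"]))
    print(final_host_state("DOWN", ["DOWN", "UNREACHABLE"]))
```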

More information on how Nagios distinguishes between DOWN and UNREACHABLE states can be found here.

Host State Changes

As you are probably well aware, hosts don't always stay in one state. Things break, patches get applied, and servers need to be rebooted. When Nagios checks the status of hosts, it will be able to detect when a host changes between UP, DOWN, and UNREACHABLE states and take appropriate action. These state changes result in different state types (HARD or SOFT), which can trigger event handlers to be run and notifications to be sent out. Detecting and dealing with state changes is what Nagios is all about.

When hosts change state too frequently they are considered to be "flapping". A good example of a flapping host would be a server that keeps spontaneously rebooting as soon as the operating system loads. That's always a fun scenario to have to deal with. Nagios can detect when hosts start flapping, and can suppress notifications until flapping stops and the host's state stabilizes. More information on the flap detection logic can be found here.

Posted: Mon Nov 27, 2023 12:22 pm
by snapier3
Options and stuff. Not a bug, just forgotten settings.