Nagios Support Forum

Posted: **Thu Sep 24, 2020 6:51 am**

Hi

We have a question regarding passive checks and NRDP.

Our Nagios XI 5.7.2 is running on Red Hat 6 (64-bit) in VMware.

We have created a host with 10 checks in Nagios XI, all passive.

They are updated from a client using a custom script.

If we run the script to update one service one time, all works as expected,
and if we run the script to update all 10 services once, all works as well.
But if we update the same service with a new status or status information more than once in quick succession,
it seems random which update is displayed.

If we delay updates of the same service by 10 seconds, the last update is shown, almost all the time.

The question is:
How fast or slow can we update the same service and expect the last of the updates to be shown, every time?

We have even tried to use send_nrdp.sh as a test, to rule out any error in our custom code.

% cat test10.sh
PAUSE=10
for t in 1 2 3 4 5 6 7 8 9 10; do
    ./send_nrdp.sh -u https://nagiosserver/nrdp/ -t "<very secret token>" -H testserver -s "batch job 1" -S 1 -o "Job run warning $t"
    sleep "$PAUSE"
    ./send_nrdp.sh -u https://nagiosserver/nrdp/ -t "<very secret token>" -H testserver -s "batch job 1" -S 0 -o "Job run ok $t"
    sleep "$PAUSE"
done

This runs with similar results: a PAUSE under 10 seems to give random results.
10 and above seems to work as expected; the last update is shown.

I have ensured that send_nrdp.sh reports 1 check sent every time:

Sent 1 checks to https://nagiosserver/nrdp/
Sent 1 checks to https://nagiosserver/nrdp/
Sent 1 checks to https://nagiosserver/nrdp/
Sent 1 checks to https://nagiosserver/nrdp/
......

Please advise on the inner workings of Nagios, so we can figure out reasonable timings.

Regards.

Henrik

Posted: **Thu Sep 24, 2020 4:28 pm**

This is governed by how frequently Nagios processes the passive check results. The default in nagios.cfg is every 10 seconds:

Code:

check_result_reaper_frequency=10

https://assets.nagios.com/downloads/nag ... _frequency
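To confirm which interval is actually in effect, the directive can be read straight from the configuration file. The following is a minimal sketch, not an official tool: the helper name `reaper_frequency` is made up for illustration, and the path `/usr/local/nagios/etc/nagios.cfg` is an assumed install location that may differ on your server.

```shell
#!/bin/sh
# Hypothetical helper: print the value of check_result_reaper_frequency
# from a nagios.cfg-style file. $1 = path to the config file.
# Commented-out lines are ignored; if the directive appears more than once,
# the last occurrence is printed.
reaper_frequency() {
    sed -n 's/^[[:space:]]*check_result_reaper_frequency[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Assumed location of the main config file; override with NAGIOS_CFG.
cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
if [ -r "$cfg" ]; then
    echo "check_result_reaper_frequency=$(reaper_frequency "$cfg")"
fi
```

If nothing is printed after the `=` sign, the directive is not set explicitly in that file.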

Posted: **Fri Sep 25, 2020 2:16 am**

Yes, that sounds correct - but why do we not get all the statuses? It misses 40% of the status updates if we update every second - and status updates within 10s are inserted in a random order, which we can live with, if they are just inserted correctly.

Posted: **Fri Sep 25, 2020 5:45 am**

mrmit wrote: Yes, that sounds correct - but why do we not get all the statuses? It misses 40% of the status updates if we update every second - and status updates within 10s are inserted in a random order, which we can live with, if they are just inserted correctly.

As mrmit describes:
If we send 10 updates in rapid succession, Nagios at first shows only one of them, and which one seems random.
I guess the inner workings of Nagios are responsible for this. Nagios polls for a status internally.

One would expect that Nagios would show the last update (of the 10) the next time the web interface updates, but it does not.
It seems like the rest of the 10 updates are forgotten or deleted after the first update.

Regards

Djarner

Posted: **Fri Sep 25, 2020 7:58 am**

You can drop this to 1

Code:

check_result_reaper_frequency=1

Then restart Nagios.

This will process check results every second. Updates sent more frequently than that are still subject to the behavior you are seeing, because the check results are placed in a queue directory that is emptied at the interval set above, and they are not processed in any specific order.
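The effect of a queue directory that is emptied without regard to arrival order can be illustrated with a toy model. This sketch is not Nagios code: it assumes, purely for illustration, that each result lands in a randomly named file and that the files are later read in name order, so the update applied last is not necessarily the one sent last.

```shell
#!/bin/sh
# Toy model only: simulates a queue directory, not the real Nagios reaper.
queue=$(mktemp -d)

# "Send" 10 updates in order; each lands in a file with a random name.
for t in 1 2 3 4 5 6 7 8 9 10; do
    f=$(mktemp "$queue/cXXXXXX")
    printf 'update %s\n' "$t" > "$f"
done

# "Reap" the queue: files are visited in name order, which is unrelated
# to the order in which they were written.
processed=0
last=""
for f in "$queue"/c*; do
    last=$(cat "$f")
    processed=$((processed + 1))
done
rm -rf "$queue"

# All 10 results are processed, but the one applied last is arbitrary.
echo "processed=$processed last_applied=$last"
```

Running it a few times typically prints a different `last_applied` value each time, while `processed` stays at 10.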

Posted: **Mon Sep 28, 2020 2:15 am**

We understand the async queue and the random order in which check results are inserted; that is not the issue.

The issue is that it loses 40% of the status updates - that is a consistency problem. Why does it lose data when pushing 1 update per second?

Posted: **Mon Sep 28, 2020 7:31 am**

mrmit wrote: We understand the async queue and the random order in which check results are inserted; that is not the issue.

The issue is that it loses 40% of the status updates - that is a consistency problem. Why does it lose data when pushing 1 update per second?

Are you sure it is losing them instead of just not processing them in the correct order?

I would set the following in your nagios.cfg

Code:

log_passive_checks=1

Then restart Nagios.

You should then be able to see it process each check in the Nagios log.
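Once passive check logging is on, the number of results Nagios processed can be compared with the number sent. The sketch below assumes the log entries contain `PASSIVE SERVICE CHECK: host;service;state;output` and that the log lives at `/usr/local/nagios/var/nagios.log`; both should be checked against your installation. The helper name `count_passive_results` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count logged passive results for one service.
# $1 = log file, $2 = host name, $3 = service description.
count_passive_results() {
    # -F treats the pattern as a fixed string, -c prints the number of matching lines.
    grep -c -F "PASSIVE SERVICE CHECK: $2;$3;" "$1"
}

# Assumed log location; override with NAGIOS_LOG.
log=${NAGIOS_LOG:-/usr/local/nagios/var/nagios.log}
if [ -r "$log" ]; then
    count_passive_results "$log" testserver "batch job 1"
fi
```

If the count matches the number of updates sent by the test script, the results are being processed and only their order or visibility differs.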

Posted: **Thu Oct 01, 2020 1:56 am**

OK, I think I understand now. Of course, when all the check statuses arrive in random order, it will not show the ones where the status does not change (from OK to warning, for example). That is of course why it seems to lose some check statuses.
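The reasoning above can be made concrete by counting state changes. This is a small sketch with a made-up helper, `count_state_changes`, and it assumes that only a change of state between consecutive results shows up as a visible event. The second sequence is just one possible processing order, chosen by hand.

```shell
#!/bin/sh
# Hypothetical helper: count how often the state changes between
# consecutive results. Arguments are states in processing order.
count_state_changes() {
    changes=0
    prev=""
    for state in "$@"; do
        if [ -n "$prev" ] && [ "$state" != "$prev" ]; then
            changes=$((changes + 1))
        fi
        prev=$state
    done
    echo "$changes"
}

# Six alternating updates, processed in the order they were sent.
count_state_changes 1 0 1 0 1 0   # prints 5

# The same six updates (three warnings, three OKs) in a shuffled order.
count_state_changes 1 1 0 0 0 1   # prints 2
```

All six results are processed in both cases, but the shuffled order produces fewer state changes, which can look like missing updates.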

I would think that enabling volatile on the service would fix that, so we could see all the updates. The description says exactly that, but it doesn't seem to work.

When we are using a template, options set on the service do take effect, right?

Posted: **Thu Oct 01, 2020 5:51 pm**

You likely want to use State Stalking for the logging as well:

https://assets.nagios.com/downloads/nag ... lking.html

Correct, anytime you set something directly on the host/service it will override anything defined in a template.
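For reference, a service that overrides its template with volatile and stalking settings might look roughly like the object definition below. This is a sketch in Nagios Core object syntax; the template name `generic-service` is a placeholder assumption, and in Nagios XI these options are normally set through the configuration interface rather than by editing files by hand.

```
define service {
    use                     generic-service   ; placeholder template name (assumption)
    host_name               testserver
    service_description     batch job 1
    active_checks_enabled   0                 ; results arrive passively via NRDP
    passive_checks_enabled  1
    is_volatile             1                 ; set on the service, overrides the template
    stalking_options        o,w,u,c           ; stalk OK, WARNING, UNKNOWN and CRITICAL
}
```

Directives set here take precedence over the same directives inherited from the template named in `use`.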

Posted: **Mon Oct 05, 2020 2:54 am**

I don't intentionally want to drag this out, but it still doesn't seem to work.
We have enabled volatile, stalking and obsessing, but when I push OK status checks to the service, nothing appears in the service history. Won't it show up there?

Passive checks reams to miss a beat