Sounds like realtime data isn't getting updated. This could be due to multiple Nagios processes or caching. Are there multiple Nagios processes running on any of the machines? "ps -ef | grep nagios.cfg" should only show two processes similar to:
nagios 11232 1 3 12:27 ? 00:07:33 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 11276 11232 0 12:27 ? 00:00:01 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
Anything more should be killed. The caching option can be found under Admin > System Config > Performance Settings > Backend Cache.
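The process check described above could be scripted roughly as follows. This is only a sketch: the function name `nagios_daemon_pids` and the `SAMPLE` text are illustrative, and it assumes the usual eight-column `ps -ef` layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
# Sample `ps -ef` text: the two process lines from the post, plus a header
# and a grep line that the helper is expected to ignore.
SAMPLE = """\
UID PID PPID C STIME TTY TIME CMD
nagios 11232 1 3 12:27 ? 00:07:33 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 11276 11232 0 12:27 ? 00:00:01 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root 12000 11900 0 12:40 pts/0 00:00:00 grep nagios.cfg
"""


def nagios_daemon_pids(ps_output, cfg="nagios.cfg"):
    """Return (pid, ppid) pairs for processes whose command line mentions cfg."""
    found = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the full command stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        cmd = fields[7]
        # Skip unrelated processes and the grep command itself.
        if cfg not in cmd or cmd.startswith("grep"):
            continue
        found.append((int(fields[1]), int(fields[2])))
    return found


pairs = nagios_daemon_pids(SAMPLE)
if len(pairs) > 2:
    print("More than two Nagios processes found:", pairs)
```

To run it against a live machine, the text could come from `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout` instead of `SAMPLE`.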
Inconsistent NRDP performance
Re: Inconsistent NRDP performance
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Inconsistent NRDP performance
I checked all 3 of the active Nagios servers and there are indeed only 2 processes with nagios.cfg in the command line, and the same was true of the passive server.
Also, I checked and the caching settings are not enabled on any of the servers, so that's not what is driving this. Where else should I be looking at this point?
Re: Inconsistent NRDP performance
I think I found something, but I'm not sure what the heck to make of it. I happened to spot these lines in my nagios.log:
Code:
[1521057945] Warning: The results of service 'Datastore - Usage' on host 'some_host_name' are stale by 0d 0h 0m 32s (threshold=0d 0h 25m 0s). I'm forcing an immediate check of the service.
What's interesting about that is that the Nagios passive server sees that I've set the freshness threshold to 25 minutes, but is considering the checks stale after mere seconds. And it's not just that one check, the log was literally littered with them timing out after as little as 1 second. As you can see by the log output, I've set the freshness threshold but it's being ignored. Thoughts?
Oh, and in case it helps, I looked at the settings for one of the failing checks in objects.cache and it looks like this:
Code:
define service {
    host_name some_host_name
    service_description Datastore - Usage
    check_period xi_timeperiod_24x7
    check_command check_dummy!2!"Data not received from $_HOSTNAGHOST$"!!!!!!
    contacts nagiosadmin
    notification_period xi_timeperiod_24x7
    initial_state o
    importance 0
    check_interval 10.000000
    retry_interval 1.000000
    max_check_attempts 5
    is_volatile 0
    parallelize_check 1
    active_checks_enabled 1
    passive_checks_enabled 1
    obsess 1
    event_handler_enabled 1
    low_flap_threshold 0.000000
    high_flap_threshold 0.000000
    flap_detection_enabled 1
    flap_detection_options a
    freshness_threshold 1500
    check_freshness 1
    notification_options a
    notifications_enabled 0
    notification_interval 60.000000
    first_notification_delay 0.000000
    stalking_options n
    process_perf_data 0
    retain_status_information 1
    retain_nonstatus_information 1
}
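For anyone comparing several of these blocks, here is a small sketch that pulls the freshness directives out of a `define service { ... }` block and converts the threshold from seconds to minutes. The parser is a simplified assumption (one directive per line, name first, then the value) and is not how Nagios itself reads the file; the function names are made up for illustration.

```python
# A shortened copy of the block above, kept to the directives of interest.
SERVICE_BLOCK = """\
define service {
    host_name some_host_name
    service_description Datastore - Usage
    check_interval 10.000000
    active_checks_enabled 1
    passive_checks_enabled 1
    freshness_threshold 1500
    check_freshness 1
}
"""


def parse_define_block(text):
    """Parse one 'define <type> { ... }' block into a dict of directive -> value."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("define ") or line == "}":
            continue
        # Directive name is the first token; everything after it is the value.
        parts = line.split(None, 1)
        directives[parts[0]] = parts[1] if len(parts) > 1 else ""
    return directives


def freshness_summary(directives):
    """Return (freshness checking enabled?, threshold in minutes)."""
    enabled = directives.get("check_freshness") == "1"
    minutes = int(directives.get("freshness_threshold", "0")) / 60
    return enabled, minutes


print(freshness_summary(parse_define_block(SERVICE_BLOCK)))
```

With the values from the block above, a `freshness_threshold` of 1500 seconds works out to the 25 minutes reported in the log message.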
Re: Inconsistent NRDP performance
I'm curious about how you're testing this against the log. The message indicates that no result has come in for the threshold plus the stale value, or 25 minutes 32 seconds in this case. That would be in line with the behavior we've been seeing. On the other hand, if the check is going stale BEFORE the 25-minute threshold is reached, that would be a problem.
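To make that arithmetic concrete, here is a rough sketch that parses a staleness warning and adds the two durations together. It assumes the log line format quoted earlier in the thread and the reading given above (time since the last result = threshold + "stale by" value); the function names are illustrative only.

```python
import re

LOG_LINE = (
    "[1521057945] Warning: The results of service 'Datastore - Usage' on host "
    "'some_host_name' are stale by 0d 0h 0m 32s (threshold=0d 0h 25m 0s). "
    "I'm forcing an immediate check of the service."
)

# Matches the "stale by" and "threshold" durations in the warning.
STALE_RE = re.compile(
    r"are stale by (\d+)d (\d+)h (\d+)m (\d+)s "
    r"\(threshold=(\d+)d (\d+)h (\d+)m (\d+)s\)"
)


def to_seconds(days, hours, minutes, seconds):
    """Convert a d/h/m/s duration to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def seconds_since_last_result(line):
    """Return (stale_by, threshold, total) in seconds, or None if no match."""
    match = STALE_RE.search(line)
    if match is None:
        return None
    numbers = [int(n) for n in match.groups()]
    stale_by = to_seconds(*numbers[:4])
    threshold = to_seconds(*numbers[4:])
    return stale_by, threshold, stale_by + threshold


print(seconds_since_last_result(LOG_LINE))
```

For the quoted line this gives 32 seconds past a 1500-second threshold, or 1532 seconds (25 minutes 32 seconds) in total.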
If the above doesn't help you find the problem, I'd like to take a look at the systems in a remote session. In that case, please open a ticket at http://support.nagios.com/tickets/.