freshness checks stop working periodically
Hello,
So I set up freshness checks for this passive service, which sends about 20 critical passive events a day. I set the threshold to 300 seconds. This seemed to work fine, but this is now the fourth time it has suddenly stopped working: the service is no longer reset after 5 minutes.
Please check the screenshot for more information. As you can see, the last critical passive check arrived at 07:00. I did a manual reset at 11:23.
Now the weird thing is that it only seems to fail with the first event after 07:00. At 07:00 an automatic apply configuration is done with Reactor and the REST API.
Please advise how to keep the freshness check working as intended (resetting critical states after 5 minutes).
Grtz
Willem
Nagios XI 5.8.1
https://outsideit.net
- Box293
Re: freshness checks stop working periodically
Can you look in your objects.cache file and post one of the service definitions that is not working correctly?
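In case it helps, something like this should pull just the matching stanza out of objects.cache (a sketch; the path assumes a default source install, and the service description is the one from this thread):

```shell
# Print any "define service { ... }" block that mentions EVT_Cash_Quota.
# Buffer each stanza and emit it only if it matches.
awk '
  /^define service \{/ { inblock = 1; buf = "" }
  inblock              { buf = buf $0 "\n" }
  /^\}/ && inblock     { if (buf ~ /EVT_Cash_Quota/) printf "%s", buf; inblock = 0 }
' /usr/local/nagios/var/objects.cache
```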
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: freshness checks stop working periodically
Here you go, host and service definitions below. Let me know if you need any more info.
Host:
Code:
define host {
host_name cash0001
alias cash0001
address 10.54.86.128
check_period xi_timeperiod_24x7
check_command check_xi_host_ping!3000.0!20%!5000.0!80%!!!!
contacts steven.reenders,nagiosadmin,nagiosadmin
contact_groups cg_dummy
notification_period xi_timeperiod_24x7
initial_state o
importance 0
check_interval 5.000000
retry_interval 1.000000
max_check_attempts 7
active_checks_enabled 1
passive_checks_enabled 1
obsess 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options a
freshness_threshold 0
check_freshness 0
notification_options a
notifications_enabled 1
notification_interval 1440.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
icon_image win_server.png
statusmap_image win_server.png
retain_status_information 1
retain_nonstatus_information 1
_XIWIZARD windowsserver
}
Service:
Code:
define service {
host_name cash0003
service_description EVT_Cash_Quota
display_name EVT_System
check_period xi_timeperiod_24x7
check_command check_dummy!0!"Dummy check passed"!!!!!!
contacts steven.reynders,nagiosadmin,nagiosadmin
contact_groups cg_dummy
notification_period xi_timeperiod_24x7
initial_state o
importance 0
check_interval 1440.000000
retry_interval 1.000000
max_check_attempts 1
is_volatile 0
parallelize_check 1
active_checks_enabled 0
passive_checks_enabled 1
obsess 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 0
flap_detection_options a
freshness_threshold 300
check_freshness 1
notification_options a
notifications_enabled 1
notification_interval 1440.000000
first_notification_delay 0.000000
stalking_options o,w,u,c
process_perf_data 0
icon_image windowseventlog.png
retain_status_information 1
retain_nonstatus_information 1
_XIWIZARD windowseventlog
}
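As a side note on why a forced freshness check flips this service back to OK: `check_dummy` just exits with the state and text it is given, which matches the "OK: Dummy check passed" lines in the logs below. A rough sketch of that behavior (not the actual plugin source):

```shell
# Sketch of check_dummy: print "<STATE>: <text>" and exit with the state code.
check_dummy_sketch() {
  state="$1"; text="$2"
  case "$state" in
    0) echo "OK: $text" ;;
    1) echo "WARNING: $text" ;;
    2) echo "CRITICAL: $text" ;;
    *) echo "UNKNOWN: $text" ;;
  esac
  return "$state"
}

# As invoked by the check command above:
check_dummy_sketch 0 "Dummy check passed"
```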
Nagios XI 5.8.1
https://outsideit.net
https://outsideit.net
Re: freshness checks stop working periodically
Might wanna see some logging if there's nothing sensitive:
grep "cash0001" /usr/local/nagios/var/nagios.log | tail -100
Former Nagios employee
Re: freshness checks stop working periodically
This is the result for cash0001:
And for cash0002:
I'm not really sure what the results mean. As you can see, the threshold is (threshold=0d 0h 5m 0s). So why does it sometimes force a check at weird times, such as "Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 5h 3m 5s (threshold=0d 0h 5m 0s)" and "Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 1s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service."?
Please let me know if this means anything to you.
Code:
grep "cash0001" /usr/local/nagios/var/nagios.log | tail -100
[1462831200] CURRENT HOST STATE: cash0001;DOWN;HARD;7;CRITICAL - 10.54.86.128: rta nan, lost 100%
[1462831200] CURRENT SERVICE STATE: cash0001;DRV_C_Load;OK;HARD;1;OK: Drive C: Avg of 5 samples: {Rate (Read: 0.00000MB/s)(Write: 0.01867MB/s)} {Avg Nr of (Reads: 0.00000r/s)(Writes: 2.18149w/s)} {Latency (Read: 0.00000ms)(Write: 1.77600ms)} {Queue Length (Read: 0.00000ql)(Write: 0.00580ql)}
[1462831200] CURRENT SERVICE STATE: cash0001;DRV_C_Usage;OK;HARD;1;OK: C:: Total: 55.8G - Used: 25.2G (45%) - Free: 30.6G (55%)
[1462831200] CURRENT SERVICE STATE: cash0001;EVT_Application;OK;HARD;1;OK: Dummy check passed
[1462831200] CURRENT SERVICE STATE: cash0001;EVT_Cash_Quota;OK;HARD;1;OK: Dummy check passed
[1462831200] CURRENT SERVICE STATE: cash0001;EVT_System;OK;HARD;1;OK - Manual Reset
[1462831200] CURRENT SERVICE STATE: cash0001;NET_Connections;OK;HARD;1;OK: {TCP: (Total: 00037)(Established: 7)(Listening: 30)(Time_Wait: 0)(Close_Wait: 0)(Other: 0)}{UDP: (Total: 15)}
[1462831200] CURRENT SERVICE STATE: cash0001;NET_Load;OK;HARD;1;OK: Realtek PCIe GBE Family Controller: Avg of 2 seconds: {Total Link Utilisation: 0,00012%}{Rate (Total: 0,00014 MB/sec)(Received: 0,00000 MB/sec)(Sent: 0,00014 MB/sec)}
[1462831200] CURRENT SERVICE STATE: cash0001;PRC_Tracs;OK;HARD;1;OK: All processes are running.
[1462831200] CURRENT SERVICE STATE: cash0001;SRV_CPU_Usage;OK;HARD;1;OK: 1m: 0%, 5m: 2%, 15m: 2%
[1462831200] CURRENT SERVICE STATE: cash0001;SRV_Certificates;OK;HARD;1;All certificates are OK.
[1462831200] CURRENT SERVICE STATE: cash0001;SRV_Memory;OK;HARD;1;OK: physical memory: Total: 3.39G - Used: 1.1G (32%) - Free: 2.29G (68%), paged bytes: Total: 6.78G - Used: 1.01G (14%) - Free: 5.78G (86%)
[1462831200] CURRENT SERVICE STATE: cash0001;SRV_Ping;CRITICAL;SOFT;1;CRITICAL - 10.54.86.128: rta nan, lost 100%
[1462831200] CURRENT SERVICE STATE: cash0001;SRV_Uptime;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 20 seconds.
[1462831200] CURRENT SERVICE STATE: cash0001;SVC_McAfee;OK;HARD;1;OK: All services are in their appropriate state.
[1462831200] CURRENT SERVICE STATE: cash0001;SVC_Windows;OK;HARD;1;OK: All services are in their appropriate state.
[1462831566] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0001' are stale by 0d 5h 6m 13s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462856484] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0001' are stale by 0d 12h 1m 31s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
2.00 EUR: 0 stuk(s). ERT: cash0001;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
[1462856519] HOST ALERT: cash0001;UP;HARD;7;OK - 10.54.86.128: rta 0.355ms, lost 0%
[1462856519] HOST NOTIFICATION: nagiosadmin;cash0001;UP;xi_host_notification_handler;OK - 10.54.86.128: rta 0.355ms, lost 0%
[1462856519] HOST NOTIFICATION: steven.reynders;cash0001;UP;xi_host_notification_handler;OK - 10.54.86.128: rta 0.355ms, lost 0%
[1462856735] SERVICE ALERT: cash0001;SRV_Ping;OK;SOFT;2;OK - 10.54.86.128: rta 0.355ms, lost 0%
[1462856786] SERVICE ALERT: cash0001;SRV_Uptime;OK;SOFT;2;OK: uptime: 0:6
Code:
grep "cash0002" /usr/local/nagios/var/nagios.log | tail -100
[1462831200] CURRENT HOST STATE: cash0002;DOWN;HARD;7;CRITICAL - 10.54.86.148: rta nan, lost 100%
[1462831200] CURRENT SERVICE STATE: cash0002;DRV_C_Load;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 60 seconds.
[1462831200] CURRENT SERVICE STATE: cash0002;DRV_C_Usage;OK;HARD;1;OK: C:: Total: 55.8G - Used: 21.5G (38%) - Free: 34.3G (62%)
[1462831200] CURRENT SERVICE STATE: cash0002;EVT_Application;OK;HARD;1;OK: Dummy check passed
* Cutter OK: Trueeend: True STATE: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
[1462831200] CURRENT SERVICE STATE: cash0002;EVT_System;OK;HARD;1;OK - Manual Reset
[1462831200] CURRENT SERVICE STATE: cash0002;NET_Connections;OK;HARD;1;OK: {TCP: (Total: 00045)(Established: 11)(Listening: 32)(Time_Wait: 1)(Close_Wait: 1)(Other: 0)}{UDP: (Total: 15)}
[1462831200] CURRENT SERVICE STATE: cash0002;NET_Load;CRITICAL;SOFT;1;Timeout while attempting connection
[1462831200] CURRENT SERVICE STATE: cash0002;PRC_Tracs;CRITICAL;SOFT;1;CHECK_NRPE: Socket timeout after 60 seconds.
[1462831200] CURRENT SERVICE STATE: cash0002;SRV_CPU_Usage;OK;HARD;1;OK: 1m: 2%, 5m: 2%, 15m: 2%
[1462831200] CURRENT SERVICE STATE: cash0002;SRV_Certificates;OK;HARD;1;All certificates are OK.
[1462831200] CURRENT SERVICE STATE: cash0002;SRV_Memory;OK;HARD;1;OK: physical memory: Total: 3.39G - Used: 1.12G (33%) - Free: 2.27G (67%), paged bytes: Total: 6.78G - Used: 1.01G (14%) - Free: 5.77G (86%)
[1462831200] CURRENT SERVICE STATE: cash0002;SRV_Ping;CRITICAL;SOFT;1;CRITICAL - 10.54.86.148: rta nan, lost 100%
[1462831200] CURRENT SERVICE STATE: cash0002;SRV_Uptime;OK;HARD;1;OK: uptime: 9:38
[1462831200] CURRENT SERVICE STATE: cash0002;SVC_McAfee;OK;HARD;1;OK: All services are in their appropriate state.
[1462831200] CURRENT SERVICE STATE: cash0002;SVC_Windows;OK;HARD;1;OK: All services are in their appropriate state.
[1462831566] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 5h 3m 5s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462831742] HOST ALERT: cash0002;UP;HARD;7;OK - 10.54.86.148: rta 0.361ms, lost 0%
[1462831742] HOST NOTIFICATION: nagiosadmin;cash0002;UP;xi_host_notification_handler;OK - 10.54.86.148: rta 0.361ms, lost 0%
[1462831742] HOST NOTIFICATION: steven.reynders;cash0002;UP;xi_host_notification_handler;OK - 10.54.86.148: rta 0.361ms, lost 0%
2.00 EUR: 0 stuk(s). ERT: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
[1462831832] SERVICE ALERT: cash0002;NET_Load;OK;SOFT;2;OK: Realtek PCIe GBE Family Controller: Avg of 2 seconds: {Total Link Utilisation: 0,00018%}{Rate (Total: 0,00021 MB/sec)(Received: 0,00021 MB/sec)(Sent: 0,00000 MB/sec)}
[1462831847] SERVICE ALERT: cash0002;SRV_Ping;OK;SOFT;2;OK - 10.54.86.148: rta 0.316ms, lost 0%
[1462831871] SERVICE ALERT: cash0002;DRV_C_Load;OK;SOFT;2;OK: Drive C: Avg of 5 samples: {Rate (Read: 0.16296MB/s)(Write: 15.47346MB/s)} {Avg Nr of (Reads: 21.64613r/s)(Writes: 18.79993w/s)} {Latency (Read: 1.25207ms)(Write: 7.92500ms)} {Queue Length (Read: 0.05148ql)(Write: 0.23789ql)}
[1462831907] SERVICE ALERT: cash0002;PRC_Tracs;OK;SOFT;2;OK: All processes are running.
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
[1462856424] SERVICE ALERT: cash0002;EVT_Cash_Quota;OK;HARD;1;OK: Dummy check passed
[1462856424] SERVICE NOTIFICATION: nagiosadmin;cash0002;EVT_Cash_Quota;OK;xi_service_notification_handler;OK: Dummy check passed
[1462856784] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
[1462856958] SERVICE NOTIFICATION: nagiosadmin;cash0002;EVT_Cash_Quota;CRITICAL;xi_service_notification_handler;error 47 PayTracs: Ticket printer * Cutter OK: Trueeend: True
[1462856958] SERVICE NOTIFICATION: steven.reynders;cash0002;EVT_Cash_Quota;CRITICAL;xi_service_notification_handler;error 47 PayTracs: Ticket prin * Cutter OK: Trueeend: True
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
[1462857324] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462857324] SERVICE ALERT: cash0002;EVT_Cash_Quota;OK;HARD;1;OK: Dummy check passed
[1462857324] SERVICE NOTIFICATION: nagiosadmin;cash0002;EVT_Cash_Quota;OK;xi_service_notification_handler;OK: Dummy check passed
[1462857684] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462858044] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462858404] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462858764] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462859124] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462859483] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462859844] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 1s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462860204] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
* Cutter OK: Trueeend: Truecash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Ticket printer melding:
[1462860561] SERVICE NOTIFICATION: nagiosadmin;cash0002;EVT_Cash_Quota;CRITICAL;xi_service_notification_handler;error 47 PayTracs: Ticket printer * Cutter OK: Trueeend: True
[1462860561] SERVICE NOTIFICATION: steven.reynders;cash0002;EVT_Cash_Quota;CRITICAL;xi_service_notification_handler;error 47 PayTracs: Ticket prin * Cutter OK: Trueeend: True
2.00 EUR: 106 stuk(s). T: cash0002;EVT_Cash_Quota;CRITICAL;HARD;1;error 47 PayTracs: Hopperniveau kritisch
[1462860924] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462860924] SERVICE ALERT: cash0002;EVT_Cash_Quota;OK;HARD;1;OK: Dummy check passed
[1462860924] SERVICE NOTIFICATION: nagiosadmin;cash0002;EVT_Cash_Quota;OK;xi_service_notification_handler;OK: Dummy check passed
[1462861283] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462861643] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462861944] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 1s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462862303] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462862604] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 1s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462862963] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 59s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
[1462863323] Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 1m 0s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service.
Nagios XI 5.8.1
https://outsideit.net
https://outsideit.net
Re: freshness checks stop working periodically
Your threshold is 5 minutes, so when it says "Warning: The results of service 'EVT_Cash_Quota' on host 'cash0002' are stale by 0d 0h 0m 1s (threshold=0d 0h 5m 0s). I'm forcing an immediate check of the service." it means the result was 5 minutes and 1 second old when the freshness check ran, so it's considered stale. That looks normal to me. The freshness checks will only run on the scheduled check_interval, so the last check would have been 5 minutes ago.
Edit: I don't think that's right, it should check based on the host/service_freshness_check_interval in your nagios.cfg
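The staleness arithmetic itself is simple; a rough sketch (not the actual Core code, values taken from this thread's config, including the 15-second additional_freshness_latency):

```shell
# A result is considered stale roughly when:
#   now - last_check > freshness_threshold + additional_freshness_latency
now=$(date +%s)
last_check=$((now - 316))          # pretend the last passive result is 316s old
freshness_threshold=300            # from the service definition
additional_freshness_latency=15    # from nagios.cfg
age=$((now - last_check))
if [ "$age" -gt $((freshness_threshold + additional_freshness_latency)) ]; then
  echo "stale by $((age - freshness_threshold))s"
else
  echo "fresh"
fi
```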
I have no idea why you are getting stale by 0d 5h 3m 5s, what do you have set in your /usr/local/nagios/etc/nagios.cfg for these:
Code:
additional_freshness_latency
check_host_freshness
check_service_freshness
host_freshness_check_interval
service_freshness_check_interval
Re: freshness checks stop working periodically
Sean,
Not sure what's going on. My nagios.cfg:
Code:
# MODIFIED
admin_email=root@localhost
admin_pager=root@localhost
translate_passive_host_checks=1
log_event_handlers=0
use_large_installation_tweaks=1
enable_environment_macros=0
# NDOUtils module
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
# Mod Gearman module
broker_module=/usr/lib64/mod_gearman/mod_gearman.o config=/etc/mod_gearman/mod_gearman_neb.conf
# PNP settings - bulk mode with NCPD
process_performance_data=1
# service performance data
service_perfdata_file=/var/nagiosramdisk/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tSERVICEOUTPUT::$SERVICEOUTPUT$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-bulk
# host performance data
host_perfdata_file=/var/nagiosramdisk/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tHOSTOUTPUT::$HOSTOUTPUT$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-bulk
# OBJECTS - UNMODIFIED
#cfg_file=/usr/local/nagios/etc/objects/commands.cfg
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
#cfg_file=/usr/local/nagios/etc/objects/templates.cfg
#cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
# STATIC OBJECT DEFINITIONS (THESE DON'T GET EXPORTED/IMPORTED BY NAGIOSQL)
cfg_dir=/usr/local/nagios/etc/static
# OBJECTS EXPORTED FROM NAGIOSQL
cfg_file=/usr/local/nagios/etc/contacttemplates.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/commands.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/servicegroups.cfg
cfg_file=/usr/local/nagios/etc/hosttemplates.cfg
cfg_file=/usr/local/nagios/etc/servicetemplates.cfg
cfg_file=/usr/local/nagios/etc/servicedependencies.cfg
cfg_file=/usr/local/nagios/etc/serviceescalations.cfg
cfg_file=/usr/local/nagios/etc/hostdependencies.cfg
cfg_file=/usr/local/nagios/etc/hostescalations.cfg
cfg_file=/usr/local/nagios/etc/hostextinfo.cfg
cfg_file=/usr/local/nagios/etc/serviceextinfo.cfg
cfg_dir=/usr/local/nagios/etc/hosts
cfg_dir=/usr/local/nagios/etc/services
# GLOBAL EVENT HANDLERS
global_host_event_handler=xi_host_event_handler
global_service_event_handler=xi_service_event_handler
# UNMODIFIED
accept_passive_host_checks=1
accept_passive_service_checks=1
additional_freshness_latency=15
auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
bare_update_check=0
cached_host_check_horizon=15
cached_service_check_horizon=15
check_external_commands=1
check_for_orphaned_hosts=1
check_for_orphaned_services=1
check_for_updates=1
check_host_freshness=0
check_result_path=/var/nagiosramdisk/spool/checkresults
check_result_reaper_frequency=10
check_service_freshness=1
#command_check_interval=-1
command_file=/usr/local/nagios/var/rw/nagios.cmd
daemon_dumps_core=0
date_format=us
debug_file=/usr/local/nagios/var/nagios.debug
debug_level=0
debug_verbosity=1
#enable_embedded_perl=1
enable_event_handlers=1
enable_flap_detection=1
enable_notifications=1
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
event_broker_options=-1
event_handler_timeout=30
execute_host_checks=1
execute_service_checks=1
#external_command_buffer_slots=4096
high_host_flap_threshold=20.0
high_service_flap_threshold=20.0
host_check_timeout=30
host_freshness_check_interval=60
host_inter_check_delay_method=s
illegal_macro_output_chars=`~$&|'"<>
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
interval_length=60
lock_file=/usr/local/nagios/var/nagios.lock
log_archive_path=/usr/local/nagios/var/archives
log_external_commands=0
log_file=/usr/local/nagios/var/nagios.log
log_host_retries=1
log_initial_states=0
log_notifications=1
log_passive_checks=0
log_rotation_method=d
log_service_retries=1
low_host_flap_threshold=5.0
low_service_flap_threshold=5.0
max_check_result_file_age=3600
max_check_result_reaper_time=30
max_concurrent_checks=0
max_debug_file_size=1000000
max_host_check_spread=30
max_service_check_spread=30
nagios_group=nagios
nagios_user=nagios
notification_timeout=30
object_cache_file=/var/nagiosramdisk/objects.cache
obsess_over_hosts=0
obsess_over_services=0
ocsp_timeout=5
#p1_file=/usr/local/nagios/bin/p1.pl
passive_host_checks_are_soft=0
perfdata_timeout=5
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
retained_host_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_service_attribute_mask=0
retain_state_information=1
retention_update_interval=60
service_check_timeout=250
service_freshness_check_interval=60
service_inter_check_delay_method=s
service_interleave_factor=s
#sleep_time=0.25
soft_state_dependencies=0
state_retention_file=/usr/local/nagios/var/retention.dat
status_file=/var/nagiosramdisk/status.dat
status_update_interval=10
temp_file=/usr/local/nagios/var/nagios.tmp
temp_path=/var/nagiosramdisk/tmp
use_aggressive_host_checking=0
#####use_embedded_perl_implicitly=1
use_regexp_matching=0
use_retained_program_state=1
use_retained_scheduling_info=1
use_syslog=1
use_true_regexp_matching=0
host_down_disable_service_checks=1
Or the freshness settings only:
Code:
cat /usr/local/nagios/etc/nagios.cfg | grep freshness
additional_freshness_latency=15
check_host_freshness=0
check_service_freshness=1
host_freshness_check_interval=60
service_freshness_check_interval=60
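As an aside, since the goal is resetting criticals: a passive OK can also be pushed by hand through the external command file with the standard PROCESS_SERVICE_CHECK_RESULT command (host/service names from this thread; the path matches the command_file setting in the nagios.cfg above):

```shell
# Submit a passive OK (state 0) for the service via Nagios' external command file.
now=$(date +%s)
printf '[%s] PROCESS_SERVICE_CHECK_RESULT;cash0002;EVT_Cash_Quota;0;OK - Manual Reset\n' "$now" \
  > /usr/local/nagios/var/rw/nagios.cmd
```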
Nagios XI 5.8.1
https://outsideit.net
Re: freshness checks stop working periodically
Are you able to replicate this at all? I'm wondering what it says if you enable debug logging. I talked to the developer and he was saying that maybe something got stuck in the queue.
Do you have multiple message queues? Does this output anything?
Code:
ipcs -q
Code:
echo "select * from nagios_timedevents; select * from nagios_timedeventqueue;" | mysql -pnagiosxi nagios
Re: freshness checks stop working periodically
Nope, sorry, can't replicate this. Unless it happens again on Saturday, as I noticed it happened the last two Saturdays starting from 07:00. I'll let you know next week.
ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xc2010002 1310720 nagios 600 4611072 4503
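If the theory is that the queue backs up around the 07:00 apply configuration, sampling the message count over time would show it. A sketch that pulls the messages column (6th field) for the nagios-owned queue from the ipcs output above:

```shell
# Print the pending-message count of the IPC queue owned by "nagios".
# Column layout assumed: key msqid owner perms used-bytes messages
ipcs -q | awk '$3 == "nagios" { print $6 }'
```

Running that from cron every minute (or under `watch`) around 07:00 would show whether the queue depth spikes when Reactor applies the configuration.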
Code:
echo "select * from nagios_timedevents; select * from nagios_timedeventqueue;" | mysql -pnagiosxi nagios
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
Nagios XI 5.8.1
https://outsideit.net
Re: freshness checks stop working periodically
Ok.
Did you offload your DB or change the root MySQL pass?