HIGH Host and service check latency
Posted: Fri Jan 15, 2021 7:58 am
by erkanerturk
Hi,
I have a server.
I have almost 30K checks on a 15-minute interval. It is a VM. Most of the checks are SNMP checks.
I see that the average host check execution time is 0.26 sec,
but the average host check latency is 165 sec.
Similarly:
service check exec time: 0.81 sec (avg), BUT
service check latency: 160 sec (avg)
I also noticed that, from time to time, last check times are not updated. I suspect this is because of the latency issue.
How can I correct the problem?
TIA
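For a rough sense of scale, the numbers above imply a minimum sustained check rate. A minimal shell sketch, assuming the roughly 30,000 checks are spread evenly across the 15-minute interval (the function name `required_rate` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: sustained checks/sec needed to finish
# <checks> checks within <interval_minutes> minutes, assuming an even spread.
required_rate() {
    awk -v c="$1" -v m="$2" 'BEGIN { printf "%.1f\n", c / (m * 60) }'
}

required_rate 30000 15   # prints 33.3
```

If the engine cannot sustain roughly that rate, checks can queue up and latency can grow even though each individual check executes quickly.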
Re: HIGH Host and service check latency
Posted: Fri Jan 15, 2021 6:17 pm
by ssax
Please PM me a copy of your profile.zip, you can download it from Admin > System Profile by clicking the Download Profile button.
Additionally, please send the output of these commands:
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and -pnagiosxi in the first command if your DB is offloaded to another server and/or you've changed the root mysql password
Code:
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
This next command may fail. That's okay, since not all systems run PostgreSQL; send the output anyway:
Code:
echo "SELECT relname as Table, pg_size_pretty(pg_total_relation_size(relid)) As Size, pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as ExternalSize FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;" | psql nagiosxi nagiosxi
Re: HIGH Host and service check latency
Posted: Mon Jan 18, 2021 7:15 am
by erkanerturk
Hi,
I could not upload our profile.zip because of our new organizational policy, sorry about that. I could not strip out the organizational data.
I have seen no errors in the mariadb.log file,
and I see at least 30% CPU idle time.
If you want me to send you specific data, please specify.
The query results are the following:
PostgreSQL Query Result:
Code:
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
MySQL Query Result:
+--------------------------------------------+------------+
| Table | Size in MB |
+--------------------------------------------+------------+
| nagios_acknowledgements | 0.00 |
| nagios_commands | 0.02 |
| nagios_commenthistory | 23.40 |
| nagios_comments | 0.00 |
| nagios_configfiles | 0.01 |
| nagios_configfilevariables | 0.01 |
| nagios_conninfo | 0.27 |
| nagios_contact_addresses | 0.00 |
| nagios_contact_notificationcommands | 0.01 |
| nagios_contactgroup_members | 0.01 |
| nagios_contactgroups | 0.00 |
| nagios_contactnotificationmethods | 478.19 |
| nagios_contactnotifications | 505.58 |
| nagios_contacts | 0.01 |
| nagios_contactstatus | 0.01 |
| nagios_customvariables | 1.36 |
| nagios_customvariablestatus | 1.32 |
| nagios_dbversion | 0.00 |
| nagios_downtimehistory | 0.00 |
| nagios_eventhandlers | 0.20 |
| nagios_externalcommands | 0.01 |
| nagios_flappinghistory | 9.75 |
| nagios_host_contactgroups | 0.08 |
| nagios_host_contacts | 0.10 |
| nagios_host_parenthosts | 0.00 |
| nagios_hostchecks | 0.56 |
| nagios_hostdependencies | 0.00 |
| nagios_hostescalation_contactgroups | 0.00 |
| nagios_hostescalation_contacts | 0.00 |
| nagios_hostescalations | 0.00 |
| nagios_hostgroup_members | 0.08 |
| nagios_hostgroups | 0.00 |
| nagios_hosts | 0.46 |
| nagios_hoststatus | 0.97 |
| nagios_instances | 0.00 |
| nagios_logentries | 2428.10 |
| nagios_notifications | 426.48 |
| nagios_objects | 5.96 |
| nagios_processevents | 0.22 |
| nagios_programstatus | 0.00 |
| nagios_runtimevariables | 0.00 |
| nagios_scheduleddowntime | 0.00 |
| nagios_service_contactgroups | 1.21 |
| nagios_service_contacts | 0.41 |
| nagios_service_parentservices | 0.00 |
| nagios_servicechecks | 6.56 |
| nagios_servicedependencies | 0.00 |
| nagios_serviceescalation_contactgroups | 0.00 |
| nagios_serviceescalation_contacts | 0.00 |
| nagios_serviceescalations | 0.00 |
| nagios_servicegroup_members | 0.00 |
| nagios_servicegroups | 0.00 |
| nagios_services | 7.61 |
| nagios_servicestatus | 15.91 |
| nagios_statehistory | 462.82 |
| nagios_systemcommands | 0.03 |
| nagios_timedeventqueue | 0.00 |
| nagios_timedevents | 0.00 |
| nagios_timeperiod_timeranges | 0.03 |
| nagios_timeperiods | 0.01 |
| tbl_command | 0.06 |
| tbl_contact | 0.03 |
| tbl_contactgroup | 0.03 |
| tbl_contacttemplate | 0.03 |
| tbl_domain | 0.03 |
| tbl_host | 0.50 |
| tbl_hostdependency | 0.03 |
| tbl_hostescalation | 0.03 |
| tbl_hostextinfo | 0.03 |
| tbl_hostgroup | 0.03 |
| tbl_hosttemplate | 0.03 |
| tbl_info | 0.17 |
| tbl_lnkContactToCommandHost | 0.02 |
| tbl_lnkContactToCommandService | 0.02 |
| tbl_lnkContactToContactgroup | 0.02 |
| tbl_lnkContactToContacttemplate | 0.02 |
| tbl_lnkContactToVariabledefinition | 0.02 |
| tbl_lnkContactgroupToContact | 0.02 |
| tbl_lnkContactgroupToContactgroup | 0.02 |
| tbl_lnkContacttemplateToCommandHost | 0.02 |
| tbl_lnkContacttemplateToCommandService | 0.02 |
| tbl_lnkContacttemplateToContactgroup | 0.02 |
| tbl_lnkContacttemplateToContacttemplate | 0.02 |
| tbl_lnkContacttemplateToVariabledefinition | 0.02 |
| tbl_lnkHostToContact | 0.09 |
| tbl_lnkHostToContactgroup | 0.08 |
| tbl_lnkHostToHost | 0.02 |
| tbl_lnkHostToHostgroup | 0.02 |
| tbl_lnkHostToHosttemplate | 0.09 |
| tbl_lnkHostToVariabledefinition | 0.08 |
| tbl_lnkHostdependencyToHost_DH | 0.02 |
| tbl_lnkHostdependencyToHost_H | 0.02 |
| tbl_lnkHostdependencyToHostgroup_DH | 0.02 |
| tbl_lnkHostdependencyToHostgroup_H | 0.02 |
| tbl_lnkHostescalationToContact | 0.02 |
| tbl_lnkHostescalationToContactgroup | 0.02 |
| tbl_lnkHostescalationToHost | 0.02 |
| tbl_lnkHostescalationToHostgroup | 0.02 |
| tbl_lnkHostgroupToHost | 0.06 |
| tbl_lnkHostgroupToHostgroup | 0.02 |
| tbl_lnkHosttemplateToContact | 0.02 |
| tbl_lnkHosttemplateToContactgroup | 0.02 |
| tbl_lnkHosttemplateToHost | 0.02 |
| tbl_lnkHosttemplateToHostgroup | 0.02 |
| tbl_lnkHosttemplateToHosttemplate | 0.02 |
| tbl_lnkHosttemplateToVariabledefinition | 0.02 |
| tbl_lnkServiceToContact | 0.14 |
| tbl_lnkServiceToContactgroup | 0.31 |
| tbl_lnkServiceToHost | 1.52 |
| tbl_lnkServiceToHostgroup | 0.02 |
| tbl_lnkServiceToServicegroup | 0.02 |
| tbl_lnkServiceToServicetemplate | 1.48 |
| tbl_lnkServiceToVariabledefinition | 0.42 |
| tbl_lnkServicedependencyToHost_DH | 0.02 |
| tbl_lnkServicedependencyToHost_H | 0.02 |
| tbl_lnkServicedependencyToHostgroup_DH | 0.02 |
| tbl_lnkServicedependencyToHostgroup_H | 0.02 |
| tbl_lnkServicedependencyToService_DS | 0.02 |
| tbl_lnkServicedependencyToService_S | 0.02 |
| tbl_lnkServicedependencyToServicegroup_DS | 0.02 |
| tbl_lnkServicedependencyToServicegroup_S | 0.02 |
| tbl_lnkServiceescalationToContact | 0.02 |
| tbl_lnkServiceescalationToContactgroup | 0.02 |
| tbl_lnkServiceescalationToHost | 0.02 |
| tbl_lnkServiceescalationToHostgroup | 0.02 |
| tbl_lnkServiceescalationToService | 0.02 |
| tbl_lnkServiceescalationToServicegroup | 0.02 |
| tbl_lnkServicegroupToService | 0.02 |
| tbl_lnkServicegroupToServicegroup | 0.02 |
| tbl_lnkServicetemplateToContact | 0.02 |
| tbl_lnkServicetemplateToContactgroup | 0.02 |
| tbl_lnkServicetemplateToHost | 0.02 |
| tbl_lnkServicetemplateToHostgroup | 0.02 |
| tbl_lnkServicetemplateToServicegroup | 0.02 |
| tbl_lnkServicetemplateToServicetemplate | 0.02 |
| tbl_lnkServicetemplateToVariabledefinition | 0.02 |
| tbl_lnkTimeperiodToTimeperiod | 0.02 |
| tbl_logbook | 0.27 |
| tbl_mainmenu | 0.02 |
| tbl_permission | 0.02 |
| tbl_permission_inactive | 0.02 |
| tbl_service | 7.52 |
| tbl_servicedependency | 0.03 |
| tbl_serviceescalation | 0.03 |
| tbl_serviceextinfo | 0.03 |
| tbl_servicegroup | 0.03 |
| tbl_servicetemplate | 0.03 |
| tbl_session | 0.02 |
| tbl_session_locks | 0.02 |
| tbl_settings | 0.03 |
| tbl_submenu | 0.02 |
| tbl_timedefinition | 0.06 |
| tbl_timeperiod | 0.03 |
| tbl_user | 0.03 |
| tbl_variabledefinition | 1.52 |
| xi_auditlog | 3198.27 |
| xi_auth_tokens | 35.70 |
| xi_cmp_ccm_backups | 0.02 |
| xi_cmp_favorites | 0.03 |
| xi_cmp_nagiosbpi_backups | 0.06 |
| xi_cmp_trapdata | 0.16 |
| xi_cmp_trapdata_log | 0.03 |
| xi_commands | 0.05 |
| xi_deploy_agents | 0.02 |
| xi_deploy_jobs | 0.02 |
| xi_eventqueue | 0.03 |
| xi_events | 3.31 |
| xi_incidents | 0.00 |
| xi_meta | 41.06 |
| xi_mibs | 0.05 |
| xi_options | 0.03 |
| xi_sessions | 0.03 |
| xi_sysstat | 0.03 |
| xi_usermeta | 3.84 |
| xi_users | 0.08 |
+--------------------------------------------+------------+
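The largest tables are easier to spot once this output is sorted by size. A minimal sketch, assuming the table above was saved to a text file (the file name `table_sizes.txt` and the function name `top_tables` are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: read `mysql --table` output on stdin and print
# the N largest tables (default 5) as "<size in MB> <table name>".
# Border lines, the header row, and 0.00 MB tables are skipped.
top_tables() {
    awk -F'|' 'NF >= 4 && $3 + 0 > 0 {
        gsub(/ /, "", $2)              # strip padding around the table name
        printf "%.2f %s\n", $3, $2
    }' | sort -rn | head -n "${1:-5}"
}

# Example (the file name is an assumption):
# top_tables 5 < table_sizes.txt
```

For the output above, the five largest tables are xi_auditlog (3198.27 MB), nagios_logentries (2428.10 MB), nagios_contactnotifications (505.58 MB), nagios_contactnotificationmethods (478.19 MB), and nagios_statehistory (462.82 MB).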
Re: HIGH Host and service check latency
Posted: Tue Jan 19, 2021 3:48 am
by erkanerturk
Hi, I have noticed something.
Under Monitoring Engine Status > Monitoring Engine Check Statistics, the counters decreased and then showed 0. At the same time, I noticed that the last check time stopped at 10:44 and stayed at that value for 30 minutes.
From the Linux CLI, I can see that Nagios continues to run checks, but in the GUI the last check times stay the same.
Should I increase the npcd load threshold?
Please advise.
Re: HIGH Host and service check latency
Posted: Tue Jan 19, 2021 6:11 pm
by ssax
Please send me your /usr/local/nagios/etc/nagios.cfg.
What XI version are you running? You can find it on the bottom left hand side of the web interface.
Are you seeing any errors in /var/log/messages, /var/log/httpd/error_log, /var/log/httpd/ssl_error_log, or /var/log/dmesg?
Include the output of these commands as root:
Code:
sar
ps aux
ulimit -a
su -s /bin/bash -c 'ulimit -a' nagios
su -s /bin/bash -c 'ulimit -a' mysql
Attach your /etc/php.ini file as well.
Re: HIGH Host and service check latency
Posted: Wed Jan 20, 2021 11:57 am
by erkanerturk
Hi,
I have sent the files via PM.
Waiting for your reply.
Re: HIGH Host and service check latency
Posted: Wed Jan 20, 2021 12:29 pm
by ssax
Please edit your /etc/php.ini file and change these:
Code:
max_execution_time = 60
max_input_vars = 5000
memory_limit = 256M
To these:
Code:
max_execution_time = 600
max_input_vars = 50000
memory_limit = 1024M
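The three changes above could also be applied with sed. A minimal sketch, assuming GNU sed and that each directive already appears uncommented at the start of a line (the function name `update_php_ini` is hypothetical; it keeps a `.bak` copy of the original first):

```shell
#!/bin/sh
# Hypothetical helper: set the three php.ini values from the post above.
# Assumes GNU sed (-i) and uncommented directives at the start of a line.
update_php_ini() {
    ini="$1"
    cp "$ini" "$ini.bak" || return 1     # keep a backup of the original
    sed -i \
        -e 's/^max_execution_time[[:space:]]*=.*/max_execution_time = 600/' \
        -e 's/^max_input_vars[[:space:]]*=.*/max_input_vars = 50000/' \
        -e 's/^memory_limit[[:space:]]*=.*/memory_limit = 1024M/' \
        "$ini"
}

# Example, as root:
# update_php_ini /etc/php.ini
```

Directives that are commented out or missing are left untouched by this sketch, so it is worth checking the file by hand afterwards.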
Then restart apache:
Then take the attached zip file, transfer it to your XI server, and run these commands as root against it:
- Or you can upgrade to XI 5.8.1 and it will update your NDO3 as well
Code:
unzip ndo-master.zip
cd ndo-master
./configure
make all
make install
If you have an offloaded database or changed the default MySQL passwords, you will need to edit your /usr/local/nagios/etc/ndo.cfg file and update these before running the next command to start it up:
- You can get the info for the ndoutils database from your /usr/local/nagiosxi/html/config.inc.php
Then restart the nagios service:
Then apply configuration and see if that alleviates the issue.
If it doesn't, please create a ticket for this and include a link back to this forum thread so we can get a remote session setup:
https://support.nagios.com/tickets/
Re: HIGH Host and service check latency
Posted: Thu Jan 21, 2021 7:40 am
by erkanerturk
Hi,
First, thanks for your response, but part of the problem persists.
Our service/host check latencies definitely decreased, from >500 seconds to 150 seconds (for service check latency) and 60 seconds (for hosts).
But I have seen that when I apply the configuration, our last check times stop updating for a while (approx. 30 minutes). Active service checks (1-min, 5-min, and 15-min) drop to 0, while active host checks show (I think) correct numbers. This was also the case before applying your changes.
When I see the following log entries, the GUI returns to normal (last check times update), though this may be just a coincidence:
[1611215866] NDO-3: Ended downtime thread
[1611215866] NDO-3: Ended acknowledgement thread
[1611215866] NDO-3: Ended comment thread
[1611215866] NDO-3: Ended flapping thread
[1611215866] NDO-3: Ended statechange thread
[1611215866] NDO-3: Ended event_handler thread
[1611215867] NDO-3: Ended notification thread
[1611215929] NDO-3: Ended service_check thread
[1611215929] NDO-3: Ended timed_event thread
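The bracketed numbers in these entries are Unix epoch timestamps. A minimal sketch for making them readable, assuming GNU date (the function name `humanize_nagios_log` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: rewrite "[epoch] message" lines on stdin as
# "YYYY-MM-DD HH:MM:SS message" in UTC. Assumes GNU date (-d @epoch).
humanize_nagios_log() {
    while IFS= read -r line; do
        ts=${line#\[}          # drop the leading "["
        ts=${ts%%\]*}          # keep only the digits before "]"
        msg=${line#*\] }       # text after "] "
        printf '%s %s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$msg"
    done
}

# Example:
# echo '[1611215866] NDO-3: Ended downtime thread' | humanize_nagios_log
```

In the entries above, the service_check and timed_event threads end at 1611215929, which is 62 seconds after the notification thread (1611215867).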
Anyway, if you want, we can continue with the ticket.
Re: HIGH Host and service check latency
Posted: Thu Jan 21, 2021 7:22 pm
by ssax
Please create a ticket for this and include a link back to this forum thread so we can get a remote session setup:
https://support.nagios.com/tickets/
Re: HIGH Host and service check latency
Posted: Fri Jan 22, 2021 2:00 pm
by ssax
Ticket received, locking thread. We will continue support through the ticket.
Thank you!