Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Tue Jul 13, 2021 5:42 pm
by gormank
Hi,
I have a service created long ago by a wizard and then made generic. It now warns pretty frequently but eventually clears. It's been doing this since 5/18/21.
The system running the check is 5.8.4 (updated last week) and the 2 hosts being checked are 5.7.3. Both of the checked hosts have the same warning alerts on Nagios_Remote_Jobs.
Running dbmaint.php manually only takes a second or two.
The service is Nagios_Remote_Jobs. After executing it from CCM and massaging the command a bit, I have the following test. The warning typically clears a bit before 600 seconds.

Code: Select all

# address=10.133.134.85; /usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=$address --url="https://$address/nagiosxi/" --apikey='fake' --mode=jobs
Database Maintenance (dbmaint) stale (505 seconds old)
# address=10.133.134.85; /usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=$address --url="https://$address/nagiosxi/" --apikey='fake' --mode=jobs
Database Maintenance (dbmaint) stale (541 seconds old)
# address=10.133.134.85; /usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=$address --url="https://$address/nagiosxi/" --apikey='fake' --mode=jobs
All jobs are running okay.
The dbmaint cron runs every 5 minutes and there's nothing in the log.

Code: Select all

CREATING: /usr/local/nagiosxi/var/dbmaint.lock
CLEANING ndoutils TABLE 'commenthistory'...
SQL: DELETE FROM nagios_commenthistory WHERE entry_time < FROM_UNIXTIME(1563141301)
CLEANING ndoutils TABLE 'processevents'...
SQL: DELETE FROM nagios_processevents WHERE event_time < FROM_UNIXTIME(1594677301)
CLEANING ndoutils TABLE 'externalcommands'...
SQL: DELETE FROM nagios_externalcommands WHERE entry_time < FROM_UNIXTIME(1625608501)
CLEANING ndoutils TABLE 'logentries'...
SQL: DELETE FROM nagios_logentries WHERE logentry_time < FROM_UNIXTIME(1618437301)
CLEANING ndoutils TABLE 'notifications'...
SQL: DELETE FROM nagios_notifications WHERE start_time < FROM_UNIXTIME(1618437301)
CLEANING ndoutils TABLE 'contactnotifications'...
SQL: DELETE FROM nagios_contactnotifications WHERE start_time < FROM_UNIXTIME(1618437301)
CLEANING ndoutils TABLE 'contactnotificationmethods'...
SQL: DELETE FROM nagios_contactnotificationmethods WHERE start_time < FROM_UNIXTIME(1618437301)
CLEANING ndoutils TABLE 'statehistory'...
SQL: DELETE FROM nagios_statehistory WHERE state_time < FROM_UNIXTIME(1563141301)
CLEANING ndoutils TABLE 'timedevents'...
SQL: DELETE FROM nagios_timedevents WHERE event_time < FROM_UNIXTIME(1626213001)
CLEANING ndoutils TABLE 'systemcommands'...
SQL: DELETE FROM nagios_systemcommands WHERE start_time < FROM_UNIXTIME(1626213001)
CLEANING ndoutils TABLE 'servicechecks'...
SQL: DELETE FROM nagios_servicechecks WHERE start_time < FROM_UNIXTIME(1626213001)
CLEANING ndoutils TABLE 'hostchecks'...
SQL: DELETE FROM nagios_hostchecks WHERE start_time < FROM_UNIXTIME(1626213001)
CLEANING ndoutils TABLE 'eventhandlers'...
SQL: DELETE FROM nagios_eventhandlers WHERE start_time < FROM_UNIXTIME(1626213001)
LASTOPT:  1626210901
INTERVAL: 60
NOW:      1626213301
OPTTIME:  1626214501
CLEANING nagiosxi TABLE 'commands'...
SQL: DELETE FROM xi_commands WHERE processing_time < FROM_UNIXTIME(1626184501) AND status_code = 2
CLEANING nagiosxi TABLE 'events'...
SQL: DELETE FROM xi_events WHERE processing_time < FROM_UNIXTIME(1626184501) AND status_code = 2
CLEANING nagiosxi TABLE 'auth_tokens'...
SQL: DELETE FROM xi_auth_tokens WHERE auth_valid_until < FROM_UNIXTIME(1626126901)
CLEANING nagiosxi TABLE 'cmp_trapdata_log'...
SQL: DELETE FROM xi_cmp_trapdata_log WHERE trapdata_log_datetime < FROM_UNIXTIME(1618437301)
SQL1: SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL
SQL2: Deleted 0 (DELETE FROM xi_meta WHERE meta_id IN (SELECT xi_meta.meta_id FROM xi_meta LEFT JOIN xi_events ON xi_meta.metaobj_id=xi_events.event_id WHERE metatype_id='1' AND event_id IS NULL))
CLEANING nagiosxi TABLE 'auditlog'...
SQL: DELETE FROM xi_auditlog WHERE log_time < FROM_UNIXTIME(1623621301)
CLEANING nagiosql TABLE 'logbook'...
SQL: DELETE FROM tbl_logbook WHERE time < FROM_UNIXTIME(1626184501)
Repair Complete: Removing Lock File
I've checked the DB, saw no errors, and ran the repair script.

Code: Select all

mysqlcheck -r -f -uroot -pnagiosxi --all-databases
/usr/local/nagiosxi/scripts/repair_databases.sh
The check runs every 5 minutes and has 4 retries, and it still sometimes sends notifications.
I've done a good bit of research on this and can't find a reason for it to have a problem. Any ideas?

Thanks

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Wed Jul 14, 2021 11:36 am
by gsmith
Hi

On the 5.8.4 box, could you run the command from the CLI with the debug option turned on:

Code: Select all

/usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=$address --url="https://$address/nagiosxi/" --apikey='fake' --mode=jobs --debug=1
Please send me that output.

Secondly, on one of the 5.7.3 boxes, run the same command (above) against the other 5.7.3 machine and send me that output too.

Thanks

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Wed Jul 14, 2021 12:25 pm
by gormank
We also got a page last night with the following indicating that the issue might be getting worse. Here's the flow of alerts and timeline. While we get warning emails pretty often, this is the critical that caused a page, meaning it was critical for >20 minutes.

In checking my emails, I see ~100 alerts on Nagios_Remote_Jobs since 7/7, if that gives any perspective on how often it alerts. There are 4 retries and a 15-minute notification delay.

Code: Select all

2021-07-14 07:15:07	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;OK;HARD;4;All jobs are running okay.
Service Critical	2021-07-14 07:13:44	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;CRITICAL;HARD;4;Error: Could not parse JSON from https://10.133.134.84/nagiosxi/ (false
Service Notification	2021-07-14 07:13:44	SERVICE NOTIFICATION: 1vzw.net.cdsp-sms;txslm2mlnag001;Nagios_Remote_Jobs;CRITICAL;xi_service_notification_handler;Error: Could not parse JSON from https://10.133.134.84/nagiosxi/ (false
Service Notification	2021-07-14 07:13:44	SERVICE NOTIFICATION: 1vzw.net.cdsp-mail;txslm2mlnag001;Nagios_Remote_Jobs;CRITICAL;xi_service_notification_handler;Error: Could not parse JSON from https://10.133.134.84/nagiosxi/ (false
Service Notification	2021-07-14 07:03:50	SERVICE NOTIFICATION: 1vzw.net.cdsp-mail;txslm2mlnag001;Nagios_Remote_Jobs;WARNING;xi_service_notification_handler;Database Maintenance (dbmaint) stale (528 seconds old)
Service Notification	2021-07-14 06:58:49	SERVICE NOTIFICATION: 1vzw.net.cdsp-mail;txslm2mlnag002;Nagios_Remote_Jobs;WARNING;xi_service_notification_handler;Database Maintenance (dbmaint) stale (523 seconds old)
Service Warning	2021-07-14 06:44:05	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;WARNING;HARD;4;Database Maintenance (dbmaint) stale (543 seconds old)
Service Warning	2021-07-14 06:43:05	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;WARNING;SOFT;3;Database Maintenance (dbmaint) stale (483 seconds old)
Service Warning	2021-07-14 06:42:04	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;WARNING;SOFT;2;Database Maintenance (dbmaint) stale (422 seconds old)
Service Warning	2021-07-14 06:41:03	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;WARNING;SOFT;1;Database Maintenance (dbmaint) stale (361 seconds old)
Service Recovery	2021-07-14 06:40:03	SERVICE ALERT: txslm2mlnag001;Nagios_Remote_Jobs;OK;SOFT;2;All jobs are running okay.
Here's the output from the remote 5.8.4 host.

Code: Select all

ACCESSING URL: https://10.133.134.84/nagiosxi/api/v1/system/statusdetail?apikey=J8eTtlRoIYHGJdqtWU260TZsTG8N7GM6NZgAEnfjlVfk8J74D9pT2JFcl4fLJ07M
RESULT:
Array
(
    [headers] => Array
        (
            [Date] => Wed, 14 Jul 2021 17:11:32 GMT
            [Server] => Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16
            [X-Powered-By] => PHP/5.4.16
            [Access-Control-Allow-Origin] => *
            [Access-Control-Allow-Methods] => POST, GET, OPTIONS, DELETE, PUT
            [Content-Length] => 2298
            [Content-Type] => application/json
        )

    [body] => {"nom":{"last_check":"1626282661"},"cleaner":{"last_check":"1626282662"},"deadpool_reaper":{"last_check":"1626282662"},"iostat":{"updated":"2021-07-14 17:11:27","user":"10.35","nice":"0.00","system":"2.21","iowait":"0.00","steal":"0.00","idle":"87.44"},"sysstat":{"last_check":"1626282682"},"eventman":{"last_check":"1626282692"},"cmdsubsys":{"last_check":"1626282692"},"dbmaint":{"last_check":"1626282603"},"perfdataprocessor":{"last_check":"1626282691"},"dbbackend":{"last_checkin":"2020-09-24 07:53:43","bytes_processed":"14702208","entries_processed":"24861","connect_time":"2020-09-24 07:33:17","disconnect_time":"0000-00-00 00:00:00"},"load":{"updated":"2021-07-14 17:11:22","load1":"1.45","load5":"0.99","load15":"0.80"},"memory":{"updated":"2021-07-14 17:11:22","total":"64266","used":"1691","free":"493","shared":"3212","buffers":"62080","cached":"58842"},"swap":{"updated":"2021-07-14 17:11:22","total":"32767","used":"8","free":"32759"},"feedprocessor":{"last_check":"1626282682"},"reportengine":{"last_check":"1626282661"},"daemons":{"updated":"2021-07-14 17:11:22","daemon":[{"@attributes":{"id":"nagioscore"},"name":"nagios","output":"           ??32405 \/usr\/bin\/perl -w \/usr\/local\/nagios\/libexec\/check_hp -H 2001:4888:a03:311f:c0:a:0:413 --timeout=45 --community=sp1der --exclude=cpqFcaHostCntlrStatus","return_code":"0","status":"0"},{"@attributes":{"id":"pnp"},"name":"npcd","output":"           ??13922 \/usr\/local\/nagios\/bin\/npcd -d -f \/usr\/local\/nagios\/etc\/pnp\/npcd.cfg","return_code":"0","status":"0"}]},"nagioscore":{"updated":"2021-07-14 
17:11:22","activehostchecks":{"val1":"52","val5":"308","val15":"308"},"passivehostchecks":{"val1":"0","val5":"0","val15":"0"},"activeservicechecks":{"val1":"419","val5":"2096","val15":"2098"},"passiveservicechecks":{"val1":"1","val5":"1","val15":"1"},"activehostcheckperf":{"min_latency":"0","max_latency":"1.4704060554504395","avg_latency":"0.02912612356584495","min_execution_time":"0.0035369999999999998","max_execution_time":"4.120271","avg_execution_time":"3.960959032467532"},"activeservicecheckperf":{"min_latency":"0","max_latency":"1.472885012626648","avg_latency":"0.03401063694047463","min_execution_time":"0.0028250000000000003","max_execution_time":"13.892618","avg_execution_time":"0.6476608702471491"}}}

    [info] => Array
        (
            [url] => https://10.133.134.84/nagiosxi/api/v1/system/statusdetail?apikey=J8eTtlRoIYHGJdqtWU260TZsTG8N7GM6NZgAEnfjlVfk8J74D9pT2JFcl4fLJ07M
            [content_type] => application/json
            [http_code] => 200
            [header_size] => 311
            [request_size] => 189
            [filetime] => -1
            [ssl_verify_result] => 0
            [redirect_count] => 0
            [total_time] => 0.47397
            [namelookup_time] => 6.8E-5
            [connect_time] => 0.025447
            [pretransfer_time] => 0.253035
            [size_upload] => 0
            [size_download] => 2298
            [speed_download] => 4848
            [speed_upload] => 0
            [download_content_length] => 2298
            [upload_content_length] => 0
            [starttransfer_time] => 0.473957
            [redirect_time] => 0
            [certinfo] => Array
                (
                )

            [primary_ip] => 10.133.134.84
            [primary_port] => 443
            [local_ip] => 10.136.243.84
            [local_port] => 52232
            [redirect_url] =>
        )

)
XML DATA LOOKS OK
CHECKING JOB reportengine (Report Engine)
CHECKING JOB sysstat (System Statistics)
CHECKING JOB eventman (Event Manager)
CHECKING JOB feedprocessor (Feed Processor)
CHECKING JOB cmdsubsys (Command Subsystem)
CHECKING JOB nom (Nonstop Operations Manager)
CHECKING JOB dbmaint (Database Maintenance)
CHECKING JOB cleaner (Cleaner)
All jobs are running okay.
Here's the output from the local 5.7.3 host.

Code: Select all

ACCESSING URL: https://10.133.134.84/nagiosxi/api/v1/system/statusdetail?apikey=J8eTtlRoIYHGJdqtWU260TZsTG8N7GM6NZgAEnfjlVfk8J74D9pT2JFcl4fLJ07M
RESULT:
Array
(
    [headers] => Array
        (
            [Date] => Wed, 14 Jul 2021 17:10:49 GMT
            [Server] => Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16
            [X-Powered-By] => PHP/5.4.16
            [Access-Control-Allow-Origin] => *
            [Access-Control-Allow-Methods] => POST, GET, OPTIONS, DELETE, PUT
            [Content-Length] => 2261
            [Content-Type] => application/json
        )

    [body] => {"nom":{"last_check":"1626282602"},"cleaner":{"last_check":"1626282603"},"deadpool_reaper":{"last_check":"1626282602"},"iostat":{"updated":"2021-07-14 17:10:48","user":"11.13","nice":"0.00","system":"3.26","iowait":"0.00","steal":"0.00","idle":"85.61"},"sysstat":{"last_check":"1626282643"},"eventman":{"last_check":"1626282648"},"cmdsubsys":{"last_check":"1626282649"},"dbmaint":{"last_check":"1626282603"},"perfdataprocessor":{"last_check":"1626282642"},"dbbackend":{"last_checkin":"2020-09-24 07:53:43","bytes_processed":"14702208","entries_processed":"24861","connect_time":"2020-09-24 07:33:17","disconnect_time":"0000-00-00 00:00:00"},"load":{"updated":"2021-07-14 17:10:43","load1":"0.61","load5":"0.79","load15":"0.73"},"memory":{"updated":"2021-07-14 17:10:43","total":"64266","used":"1630","free":"556","shared":"3212","buffers":"62079","cached":"58903"},"swap":{"updated":"2021-07-14 17:10:43","total":"32767","used":"8","free":"32759"},"feedprocessor":{"last_check":"1626282643"},"reportengine":{"last_check":"1626282602"},"daemons":{"updated":"2021-07-14 17:10:43","daemon":[{"@attributes":{"id":"nagioscore"},"name":"nagios","output":"           ??31381 \/usr\/local\/nagios\/libexec\/check_nrpe -H 10.133.31.237 --v2-packets-only --unknown-timeout -t 59 3 -c check_init_service -a vasd","return_code":"0","status":"0"},{"@attributes":{"id":"pnp"},"name":"npcd","output":"           ??13922 \/usr\/local\/nagios\/bin\/npcd -d -f \/usr\/local\/nagios\/etc\/pnp\/npcd.cfg","return_code":"0","status":"0"}]},"nagioscore":{"updated":"2021-07-14 
17:10:43","activehostchecks":{"val1":"55","val5":"308","val15":"308"},"passivehostchecks":{"val1":"0","val5":"0","val15":"0"},"activeservicechecks":{"val1":"427","val5":"2098","val15":"2098"},"passiveservicechecks":{"val1":"1","val5":"1","val15":"1"},"activehostcheckperf":{"min_latency":"0","max_latency":"1.4704060554504395","avg_latency":"0.0297475943374015","min_execution_time":"0.003146","max_execution_time":"4.120271","avg_execution_time":"3.9612676363636363"},"activeservicecheckperf":{"min_latency":"0","max_latency":"1.472885012626648","avg_latency":"0.029339193409710605","min_execution_time":"0.0028250000000000003","max_execution_time":"13.892618","avg_execution_time":"0.648569172528517"}}}

    [info] => Array
        (
            [url] => https://10.133.134.84/nagiosxi/api/v1/system/statusdetail?apikey=J8eTtlRoIYHGJdqtWU260TZsTG8N7GM6NZgAEnfjlVfk8J74D9pT2JFcl4fLJ07M
            [content_type] => application/json
            [http_code] => 200
            [header_size] => 311
            [request_size] => 189
            [filetime] => -1
            [ssl_verify_result] => 0
            [redirect_count] => 0
            [total_time] => 0.318283
            [namelookup_time] => 5.4E-5
            [connect_time] => 0.000247
            [pretransfer_time] => 0.116056
            [size_upload] => 0
            [size_download] => 2261
            [speed_download] => 7103
            [speed_upload] => 0
            [download_content_length] => 2261
            [upload_content_length] => 0
            [starttransfer_time] => 0.318274
            [redirect_time] => 0
            [certinfo] => Array
                (
                )

            [primary_ip] => 10.133.134.84
            [primary_port] => 443
            [local_ip] => 10.133.134.85
            [local_port] => 33306
            [redirect_url] =>
        )

)
XML DATA LOOKS OK
CHECKING JOB reportengine (Report Engine)
CHECKING JOB sysstat (System Statistics)
CHECKING JOB eventman (Event Manager)
CHECKING JOB feedprocessor (Feed Processor)
CHECKING JOB cmdsubsys (Command Subsystem)
CHECKING JOB nom (Nonstop Operations Manager)
CHECKING JOB dbmaint (Database Maintenance)
CHECKING JOB cleaner (Cleaner)
All jobs are running okay.
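
For reference, the age of a job can be reproduced by hand from the `last_check` values in the body above. This is only a rough sketch, assuming the plugin subtracts `last_check` from the current time on the machine running the check. `extract_last_check` and `job_age` are illustrative helper names, not part of check_nagiosxiserver.php, and the JSON is a trimmed copy of the response.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading a job timestamp out of the statusdetail JSON.

# Print the last_check epoch for a job name ($1) from JSON text on stdin.
extract_last_check() {
    sed -n 's/.*"'"$1"'":{"last_check":"\([0-9]*\)".*/\1/p'
}

# Age in seconds: reference "now" epoch ($1) minus last_check epoch ($2).
job_age() {
    echo $(( $1 - $2 ))
}

# Trimmed sample of the body shown above.
body='{"nom":{"last_check":"1626282661"},"dbmaint":{"last_check":"1626282603"},"eventman":{"last_check":"1626282692"}}'

last=$(printf '%s\n' "$body" | extract_last_check dbmaint)

# 1626282692 is the eventman timestamp, used here as a stand-in for the
# remote server's own "now" (it matches the Date header, 17:11:32 GMT).
job_age 1626282692 "$last"   # prints 89
```

Measured against the remote server's own clock, dbmaint was only 89 seconds old in that response, well inside its 5-minute cron interval.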

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Wed Jul 14, 2021 3:55 pm
by gormank
I sent profiles in case you need them.

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Thu Jul 15, 2021 1:58 pm
by gsmith
Hi,

I am still digging through the profiles, thanks for sending them.

Could you check the time on each server using the "date" command please?

Thanks

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Thu Jul 15, 2021 4:38 pm
by gormank
Good catch. The clock on the host that's checking the remote hosts is off (fast) by 4-5 minutes. I'll look into that, since the time is supposedly synced.
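
To illustrate why a fast clock on the checking host could produce these warnings, here is a small sketch. It assumes the check computes the age as its own current time minus the remote `last_check`. The 270-second skew is just an example within the 4-5 minute range, and `apparent_age` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: how clock skew on the checking host inflates the apparent job age.
# Assumption: age = checker's "now" - remote last_check.

# $1 = true age on the remote host (seconds)
# $2 = checker clock offset (seconds, positive = checker is fast)
apparent_age() {
    echo $(( $1 + $2 ))
}

skew=270   # example value: checker running 4.5 minutes fast

# dbmaint runs from cron every 5 minutes, so its true age cycles
# between roughly 0 and 300 seconds.
for true_age in 0 150 300; do
    echo "true age ${true_age}s -> apparent age $(apparent_age "$true_age" "$skew")s"
done
```

With that example skew the reported age would range from about 270 to 570 seconds, which is in line with the 361-543 second values in the alerts. The job looks stale to the checker even though cron is running it on schedule.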

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Thu Jul 15, 2021 5:24 pm
by gsmith
Hey,

Was looking at your command and I don't see how it's working:

Code: Select all

/usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=$address --url="https://$address/nagiosxi/" --apikey='fake' --mode=jobs --debug=1
Look what I get:

Code: Select all

[root@gs-cent8-23-82 libexec]# /usr/bin/php check_nagiosxiserver.php --address=$ADDRESS$ --url='https://$ADDRESS$/nagiosxi/' --apikey='hHFJriqiXXb2dtRWqctKOJnCVu0WNZv4Wq4HLNfZKphWc89GjFGoE3ZvRgLEvAIs' --mode=jobs --warn=600
Error: Could not parse JSON from https://$ADDRESS$/nagiosxi/ ()
Plus the fact that the API key is different for each server - it doesn't look like you are accounting for that.

For now you should try running dedicated services for each host you want to monitor - like:
Image3.jpg
where $ARG1$ is:
--url='https://192.168.23.81/nagiosxi/' --apikey='hHFJriqiXXb2dtRWqctKOJnCVu0WNZv4Wq4HLNfZKphWc89GjFGoE3ZvRgLEvAIs' --mode=jobs --warn=600
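
As a rough sketch of what per-host arguments might look like, the helper below builds the `$ARG1$` string for one server. `build_args` is a hypothetical name, the second address and both keys are placeholders, and only flags already shown in this thread are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the $ARG1$ string for one monitored XI server.
# $1 = server address
# $2 = that server's own API key
# $3 = warning threshold in seconds
build_args() {
    printf -- "--url='https://%s/nagiosxi/' --apikey='%s' --mode=jobs --warn=%s\n" "$1" "$2" "$3"
}

# One dedicated service per server, each with its own key (placeholder values).
build_args 192.168.23.81 'KEY_FOR_SERVER_1' 600
build_args 192.0.2.10 'KEY_FOR_SERVER_2' 600
```

Keeping the address and key together per service avoids reusing one server's key against another.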


Thanks

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Fri Jul 16, 2021 2:49 pm
by gormank
The ntp service was dead, so the time drifted. After syncing the time, the issue is gone.
Thanks!

Re: Nagios_Remote_Jobs: Database Maintenance (dbmaint) stale

Posted: Fri Jul 16, 2021 5:20 pm
by gsmith
Nice!

Have a good weekend!