Several devices have a "Sync Missed" status

Posted: Thu Aug 04, 2011 4:46 pm
by cwscribner
Hi all.

I have about 1600 devices being monitored, several hundred of which have been "Sync Missed" since they were added. I've applied configuration at least 30 times in the past few days, but several devices still show "Sync Missed", and the changes are not showing up in the statusmap or in the lists of available hosts (i.e. for parent/child).

Re: Several devices have a "Sync Missed" status

Posted: Thu Aug 04, 2011 4:59 pm
by mguthrie
What results do you get from using the Core Config Manager->Write Config Tool?

Can you verify that your system time is up to date?

http://support.nagios.com/wiki/index.ph ... e.22_Error

Re: Several devices have a "Sync Missed" status

Posted: Thu Aug 04, 2011 5:49 pm
by cwscribner
When I try to write the config, it just hangs. I just let it sit for ~40 minutes and it still hadn't processed.

Re: Several devices have a "Sync Missed" status

Posted: Fri Aug 05, 2011 9:26 am
by mguthrie
If that's hanging up I'm concerned something is amiss. I'd like to have you check a few things for us.

What does your CPU load look like on your system?

Have you increased the PHP memory limits on your machine?
http://support.nagios.com/wiki/index.ph ... _Completes
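
Both of those can be checked quickly from a shell on the XI server. A minimal sketch, assuming a Linux host with /proc/loadavg and a php.ini at /etc/php.ini (the usual CentOS location, but an assumption here); the function names are made up for illustration:

```shell
# Print the 1-, 5- and 15-minute load averages (Linux only).
load_averages() {
    cut -d ' ' -f 1-3 /proc/loadavg
}

# Print the memory_limit value from a php.ini-style file.
# The default path /etc/php.ini is an assumption; pass another path if needed.
php_memory_limit() {
    ini_file="${1:-/etc/php.ini}"
    # Match an uncommented "memory_limit = VALUE" line and print VALUE.
    # If the directive appears more than once, the last one is reported.
    sed -n 's/^[[:space:]]*memory_limit[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$ini_file" | tail -n 1
}
```

Running `load_averages` followed by `php_memory_limit` prints the three load averages and the configured limit. Apache has to be restarted before a changed limit takes effect.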

Can you show the output from the following?

Code: Select all

ll /usr/local/nagios/var/rw
and

Code: Select all

ll /usr/local/nagiosxi/scripts
and

Code: Select all

ll /usr/local/nagiosxi/cron
And also run the following just to be safe.
/usr/local/nagiosxi/scripts/reset_config_perms
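
Note that `ll` is normally just a shell alias for `ls -l`, so it may not exist in every shell. A rough equivalent that lists all three directories in one go (the helper name `list_dirs` is hypothetical):

```shell
# List each given directory in long format, with a header line per directory.
# Errors (for example a missing directory) are printed inline, not hidden.
list_dirs() {
    for dir in "$@"; do
        echo "== $dir"
        ls -l "$dir" 2>&1
    done
}
```

For the directories above that would be `list_dirs /usr/local/nagios/var/rw /usr/local/nagiosxi/scripts /usr/local/nagiosxi/cron`.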

Let's also make sure PostgreSQL is running smoothly. (This will display some error output, which is normal.)

Code: Select all

pgsql nagiosxi nagiosxi
vacuum;
vacuum analyze;
vacuum full;

Re: Several devices have a "Sync Missed" status

Posted: Fri Aug 05, 2011 9:48 am
by cwscribner
CPU load has never been over 5. It normally stays around 0.8-1.4.

I've increased the limits in my php.ini file.

Code: Select all

prw-rw---- 1 nagios nagcmd 0 Aug  5 01:41 nagios.cmd

Code: Select all

-rwxr-xr-x 1 nagios nagios    2757 Jul 29 21:48 backup_xi.sh
-rwxr-xr-x 1 nagios nagios     352 Jul 29 21:48 export_nagiosql.sh
-rwxr-xr-x 1 nagios nagios    1017 Jul 29 21:48 fixperms.sh
-rwxr-xr-x 1 nagios nagios     821 Jul 29 21:48 handle_nagioscore_event.php
-rwxr-xr-x 1 nagios nagios     829 Jul 29 21:48 handle_nagioscore_notification.php
-rwxr-xr-x 1 nagios nagios     259 Jul 29 21:48 import_nagiosql.sh
-rwxr-xr-x 1 nagios nagios     149 Jul 29 21:48 kill_rrdtool.sh
-rw-r--r-- 1 nagios nagios     153 Aug  4 18:29 nagiosql.cookies
-rwxr-xr-x 1 nagios nagios  253324 Jul 29 21:48 nagiosql_defaults.sql
-rwxr-xr-x 1 nagios nagios     858 Jul 29 21:48 nagiosql_delete_contact.php
-rwxr-xr-x 1 nagios nagios     849 Jul 29 21:48 nagiosql_delete_host.php
-rwxr-xr-x 1 nagios nagios     505 Jul 29 21:48 nagiosql_delete_object.sh
-rwxr-xr-x 1 nagios nagios     858 Jul 29 21:48 nagiosql_delete_service.php
-rwxr-xr-x 1 nagios nagios     881 Jul 29 21:48 nagiosql_delete_timeperiod.php
-rw-r--r-- 1 nagios nagios    7400 Aug  4 18:30 nagiosql.export.additional
-rwxr-xr-x 1 nagios nagios     964 Jul 29 21:48 nagiosql_exportall.php
-rw-r--r-- 1 nagios nagios    4788 Aug  4 18:30 nagiosql.export.monitoring
-rwxr-xr-x 1 nagios nagios    1105 Jul 29 21:48 nagiosql_importall.php
-rw-r--r-- 1 nagios nagios 1342766 Aug  4 10:20 nagiosql.import.monitoring
-rw-r--r-- 1 nagios nagios    5286 Aug  4 18:29 nagiosql.login
-rwxr-xr-x 1 nagios nagios    1255 Jul 29 21:48 nagiosql_login.php
-rwxr-xr-x 1 nagios nagios     258 Jul 29 21:48 nagiosql_trim_backups.sh
-rwxr-xr-x 1 nagios nagios     515 Jul 29 21:48 nom_create_nagioscore_checkpoint_cond.sh
-rwxr-xr-x 1 nagios nagios     716 Jul 29 21:48 nom_create_nagioscore_checkpoint.sh
-rwxr-xr-x 1 nagios nagios     631 Jul 29 21:48 nom_create_nagioscore_errorpoint.sh
-rwxr-xr-x 1 nagios nagios     784 Jul 29 21:48 nom_restore_nagioscore_checkpoint.sh
-rwxr-xr-x 1 nagios nagios    2104 Jul 29 21:48 nom_trim_nagioscore_checkpoints.sh
-rwxr-xr-x 1 nagios nagios    4271 Jul 29 21:48 parse_core_eventlog.php
-rwxr-xr-x 1 nagios nagios    3795 Jul 29 21:48 patch_ndoutils.php
-rw-r--r-- 1 root   root      2148 Aug  4 10:05 reconfig.txt
-rwxr-xr-x 1 nagios nagios     246 Jul 29 21:48 reconfigure_nagios.sh
-rwxr-xr-x 1 nagios nagios     982 Jul 29 21:48 repairmysql.sh
-rwsr-xr-x 1 root   nagios    5258 Jul 29 21:48 reset_config_perms
-rwxr-xr-x 1 nagios nagios     280 Jul 29 21:48 reset_config_perms.c
-rwsr-xr-x 1 root   nagios     494 Jul 29 21:48 reset_config_perms.sh
-rwxr-xr-x 1 nagios nagios    1155 Jul 29 21:48 reset_nagiosadmin_password.php
-rwxr-xr-x 1 nagios nagios     850 Jul 29 21:48 restart_nagios_with_export.sh
-rwxr-xr-x 1 nagios nagios     803 Jul 29 21:48 restore_defaults.sh
-rwxr-xr-x 1 nagios nagios    4010 Jul 29 21:48 restore_xi.sh
-rw-r--r-- 1 root   root   1250244 Aug  4 10:33 subsys.txt

Code: Select all

-rwxr-xr-x 1 nagios nagios  1991 Jul 29 21:48 cleaner.php
-rwxr-xr-x 1 nagios nagios 11587 Jul 29 21:48 cmdsubsys.php
-rwxr-xr-x 1 nagios nagios   216 Jul 29 21:48 cookie.txt
-rwxr-xr-x 1 nagios nagios 11107 Jul 29 21:48 dbmaint.php
-rwxr-xr-x 1 nagios nagios  3698 Jul 29 21:48 eventman.php
-rwxr-xr-x 1 nagios nagios  1544 Jul 29 21:48 feedproc.php
-rwxr-xr-x 1 nagios nagios  2072 Jul 29 21:48 nom.php
-rwxr-xr-x 1 nagios nagios 10318 Jul 29 21:48 perfdataproc.php
-rwxr-xr-x 1 nagios nagios 16388 Jul 29 21:48 recurringdowntime.pl
-rwxr-xr-x 1 nagios nagios  1321 Jul 29 21:48 reportengine.php
-rwxr-xr-x 1 nagios nagios  9798 Jul 29 21:48 sysstat.php
Ran /usr/local/nagiosxi/scripts/reset_config_perms

Code: Select all

# pgsql nagiosxi nagiosxi
-bash: pgsql: command not found

Re: Several devices have a "Sync Missed" status

Posted: Fri Aug 05, 2011 11:22 am
by mguthrie
Oops, typo on that last one.

Code: Select all

psql nagiosxi nagiosxi 
vacuum;
vacuum analyze;
vacuum full;
Did I see on a different thread that you're running something special on the network, like a proxy, SSL, VPN, or NAT?
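
If you would rather not type the vacuum statements at the psql prompt, they can be run one at a time with `-c`. This is only a sketch: the database and user names are the ones used above, `PSQL_CMD` and `vacuum_nagiosxi` are hypothetical names, and psql may still prompt for the nagiosxi password.

```shell
# Command used to reach the database; set PSQL_CMD=echo for a dry run
# that only prints the arguments instead of contacting PostgreSQL.
PSQL_CMD="${PSQL_CMD:-psql}"

# Run the three maintenance statements one at a time against the
# nagiosxi database as the nagiosxi user, stopping at the first failure.
vacuum_nagiosxi() {
    for stmt in "VACUUM;" "VACUUM ANALYZE;" "VACUUM FULL;"; do
        echo "Running: $stmt"
        $PSQL_CMD -d nagiosxi -U nagiosxi -c "$stmt" || return 1
    done
}
```

Call `vacuum_nagiosxi` to run it. Keep in mind that VACUUM FULL takes an exclusive lock on each table while it works, so it is best run while the interface is otherwise idle.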

Re: Several devices have a "Sync Missed" status

Posted: Fri Aug 05, 2011 11:36 am
by cwscribner
I'm accessing Nagios over a VPN. I'm offsite from Nagios.

Code: Select all

nagiosxi=> vacuum;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM

nagiosxi=> vacuum analyze;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM

nagiosxi=> vacuum full;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM

Re: Several devices have a "Sync Missed" status

Posted: Mon Aug 08, 2011 11:39 am
by cwscribner
Any movement on this issue?

Re: Several devices have a "Sync Missed" status

Posted: Mon Aug 08, 2011 6:08 pm
by nscott
We wrote a script to troubleshoot backend calls that go wrong over special connections such as a VPN. Can you follow this post

http://support.nagios.com/forum/viewtop ... 079#p13079

and post its output? If there is a call that's going wrong, that will help us troubleshoot it more effectively.

Re: Several devices have a "Sync Missed" status

Posted: Mon Aug 08, 2011 8:19 pm
by cwscribner

Code: Select all

Testing System Profile
get_base_uri returns: http://172.22.2.123/nagiosxi/
get_base_url returns: http://172.22.2.123/nagiosxi/
get_backend_url(internal_call=false) returns: http://172.22.2.123/nagiosxi/profile.php
get_backend_url(internal_call=true) returns: http://172.22.2.123/nagiosxi/backend/
SERVER INFO DUMP
Array
(
    [HTTP_HOST] => 172.22.2.123
    [HTTP_CONNECTION] => keep-alive
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.124 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => ISO-8859-1,utf-8;q=0.7,*;q=0.3
    [HTTP_COOKIE] => nagiosxi=v3nul14fl83dhnu4qetr3q2pg6
    [PATH] => /sbin:/usr/sbin:/bin:/usr/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.3 (CentOS) Server at 172.22.2.123 Port 80</address>

    [SERVER_SOFTWARE] => Apache/2.2.3 (CentOS)
    [SERVER_NAME] => 172.22.2.123
    [SERVER_ADDR] => 172.22.2.123
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 172.22.240.9
    [DOCUMENT_ROOT] => /var/www/html
    [SERVER_ADMIN] => root@localhost
    [SCRIPT_FILENAME] => /usr/local/nagiosxi/html/profile.php
    [REMOTE_PORT] => 60821
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => 
    [REQUEST_URI] => /nagiosxi/profile.php
    [SCRIPT_NAME] => /nagiosxi/profile.php
    [PHP_SELF] => /nagiosxi/profile.php
    [REQUEST_TIME] => 1312852699
)
1

PING LOCALHOST
RUNNING: '/bin/ping -c 3 localhost 2>&1
'PING healthone.org (127.0.0.1) 56(84) bytes of data.
64 bytes from healthone.org (127.0.0.1): icmp_seq=1 ttl=64 time=0.098 ms
64 bytes from healthone.org (127.0.0.1): icmp_seq=2 ttl=64 time=0.029 ms
64 bytes from healthone.org (127.0.0.1): icmp_seq=3 ttl=64 time=0.024 ms

--- healthone.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.024/0.050/0.098/0.034 ms


WGET LOCALHOST CCM

WGET FROM URL: http://localhost/nagiosql/index.php
RUNNING: /usr/bin/wget http://localhost/nagiosql/index.php
--2011-08-08 21:18:21--  http://localhost/nagiosql/index.php
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5259 (5.1K) [text/html]
Saving to: `/tmp/nagiosql_index.tmp'

     0K .....                                                 100%  247M=0s

2011-08-08 21:18:22 (247 MB/s) - `/tmp/nagiosql_index.tmp' saved [5259/5259]