Several devices have a "Sync Missed" status

cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Several devices have a "Sync Missed" status

Post by cwscribner »

Hi all.

I have roughly 1,600 devices being monitored, several hundred of which have shown "Sync Missed" since they were added. I've applied the configuration at least 30 times in the past few days, but several devices still show "Sync Missed", and the changes do not appear in the statusmap or in the lists of available hosts (i.e. for parent/child).
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

What results do you get from using the Core Config Manager->Write Config Tool?

Can you verify that your system time is up to date?

http://support.nagios.com/wiki/index.ph ... e.22_Error
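
A rough way to check the clock is to compare the server's epoch time with a value taken from a machine you trust. This is only a minimal sketch; `clock_skew` is a hypothetical helper, not something that ships with Nagios XI, and where the reference timestamp comes from is left to you.

```shell
# Hypothetical helper: absolute difference in seconds between two epoch timestamps.
clock_skew() {
    skew=$(( $1 - $2 ))
    # Flip the sign if negative so the argument order does not matter.
    if [ "$skew" -lt 0 ]; then
        skew=$(( 0 - skew ))
    fi
    echo "$skew"
}

# Example call (reference_epoch is an assumed variable holding the trusted time):
# clock_skew "$(date +%s)" "$reference_epoch"
```

A result of more than a few seconds would suggest the system time is worth correcting before digging further.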
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

When I try to write the config, it just hangs. I just let it sit for ~40 minutes and it still hadn't processed.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

If that's hanging up I'm concerned something is amiss. I'd like to have you check a few things for us.

What does your CPU load look like on your system?
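
If it helps, on Linux the load averages can be read from `/proc/loadavg`. The snippet below is a small sketch that pulls out the 1-minute figure; `one_minute_load` is a made-up helper name used for illustration.

```shell
# Hypothetical helper: print the 1-minute load average from a /proc/loadavg-style line.
one_minute_load() {
    # The first whitespace-separated field is the 1-minute average.
    echo "$1" | awk '{ print $1 }'
}

# Example call on a live system:
# one_minute_load "$(cat /proc/loadavg)"
```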

Have you increased the PHP memory limits on your machine?
http://support.nagios.com/wiki/index.ph ... _Completes
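
To confirm what limit is actually set, something like the following can read the value out of php.ini. This is a sketch only: `php_memory_limit` is a hypothetical helper, and the `/etc/php.ini` path in the example call is an assumption that varies by distribution.

```shell
# Hypothetical helper: print the memory_limit value set in a php.ini-style file.
php_memory_limit() {
    # Match an uncommented "memory_limit = value" line and print only the value;
    # if the directive appears more than once, keep the last occurrence.
    sed -n 's/^[[:space:]]*memory_limit[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$1" | tail -n 1
}

# Example call (path is an assumption):
# php_memory_limit /etc/php.ini
```

Apache normally has to be restarted before a changed value takes effect.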

Can you show the output from the following?

Code: Select all

ll /usr/local/nagios/var/rw
and

Code: Select all

ll /usr/local/nagiosxi/scripts
and

Code: Select all

ll /usr/local/nagiosxi/cron
And also run the following just to be safe.
/usr/local/nagiosxi/scripts/reset_config_perms

Let's also make sure postgresql is running smoothly (this will display some error output, which is normal).

Code: Select all

pgsql nagiosxi nagiosxi
vacuum;
vacuum analyze;
vacuum full;
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

CPU load has never been over 5. It normally stays around 0.8-1.4.

I've increased the limits in my php.ini file.

Code: Select all

prw-rw---- 1 nagios nagcmd 0 Aug  5 01:41 nagios.cmd

Code: Select all

-rwxr-xr-x 1 nagios nagios    2757 Jul 29 21:48 backup_xi.sh
-rwxr-xr-x 1 nagios nagios     352 Jul 29 21:48 export_nagiosql.sh
-rwxr-xr-x 1 nagios nagios    1017 Jul 29 21:48 fixperms.sh
-rwxr-xr-x 1 nagios nagios     821 Jul 29 21:48 handle_nagioscore_event.php
-rwxr-xr-x 1 nagios nagios     829 Jul 29 21:48 handle_nagioscore_notification.php
-rwxr-xr-x 1 nagios nagios     259 Jul 29 21:48 import_nagiosql.sh
-rwxr-xr-x 1 nagios nagios     149 Jul 29 21:48 kill_rrdtool.sh
-rw-r--r-- 1 nagios nagios     153 Aug  4 18:29 nagiosql.cookies
-rwxr-xr-x 1 nagios nagios  253324 Jul 29 21:48 nagiosql_defaults.sql
-rwxr-xr-x 1 nagios nagios     858 Jul 29 21:48 nagiosql_delete_contact.php
-rwxr-xr-x 1 nagios nagios     849 Jul 29 21:48 nagiosql_delete_host.php
-rwxr-xr-x 1 nagios nagios     505 Jul 29 21:48 nagiosql_delete_object.sh
-rwxr-xr-x 1 nagios nagios     858 Jul 29 21:48 nagiosql_delete_service.php
-rwxr-xr-x 1 nagios nagios     881 Jul 29 21:48 nagiosql_delete_timeperiod.php
-rw-r--r-- 1 nagios nagios    7400 Aug  4 18:30 nagiosql.export.additional
-rwxr-xr-x 1 nagios nagios     964 Jul 29 21:48 nagiosql_exportall.php
-rw-r--r-- 1 nagios nagios    4788 Aug  4 18:30 nagiosql.export.monitoring
-rwxr-xr-x 1 nagios nagios    1105 Jul 29 21:48 nagiosql_importall.php
-rw-r--r-- 1 nagios nagios 1342766 Aug  4 10:20 nagiosql.import.monitoring
-rw-r--r-- 1 nagios nagios    5286 Aug  4 18:29 nagiosql.login
-rwxr-xr-x 1 nagios nagios    1255 Jul 29 21:48 nagiosql_login.php
-rwxr-xr-x 1 nagios nagios     258 Jul 29 21:48 nagiosql_trim_backups.sh
-rwxr-xr-x 1 nagios nagios     515 Jul 29 21:48 nom_create_nagioscore_checkpoint_cond.sh
-rwxr-xr-x 1 nagios nagios     716 Jul 29 21:48 nom_create_nagioscore_checkpoint.sh
-rwxr-xr-x 1 nagios nagios     631 Jul 29 21:48 nom_create_nagioscore_errorpoint.sh
-rwxr-xr-x 1 nagios nagios     784 Jul 29 21:48 nom_restore_nagioscore_checkpoint.sh
-rwxr-xr-x 1 nagios nagios    2104 Jul 29 21:48 nom_trim_nagioscore_checkpoints.sh
-rwxr-xr-x 1 nagios nagios    4271 Jul 29 21:48 parse_core_eventlog.php
-rwxr-xr-x 1 nagios nagios    3795 Jul 29 21:48 patch_ndoutils.php
-rw-r--r-- 1 root   root      2148 Aug  4 10:05 reconfig.txt
-rwxr-xr-x 1 nagios nagios     246 Jul 29 21:48 reconfigure_nagios.sh
-rwxr-xr-x 1 nagios nagios     982 Jul 29 21:48 repairmysql.sh
-rwsr-xr-x 1 root   nagios    5258 Jul 29 21:48 reset_config_perms
-rwxr-xr-x 1 nagios nagios     280 Jul 29 21:48 reset_config_perms.c
-rwsr-xr-x 1 root   nagios     494 Jul 29 21:48 reset_config_perms.sh
-rwxr-xr-x 1 nagios nagios    1155 Jul 29 21:48 reset_nagiosadmin_password.php
-rwxr-xr-x 1 nagios nagios     850 Jul 29 21:48 restart_nagios_with_export.sh
-rwxr-xr-x 1 nagios nagios     803 Jul 29 21:48 restore_defaults.sh
-rwxr-xr-x 1 nagios nagios    4010 Jul 29 21:48 restore_xi.sh
-rw-r--r-- 1 root   root   1250244 Aug  4 10:33 subsys.txt

Code: Select all

-rwxr-xr-x 1 nagios nagios  1991 Jul 29 21:48 cleaner.php
-rwxr-xr-x 1 nagios nagios 11587 Jul 29 21:48 cmdsubsys.php
-rwxr-xr-x 1 nagios nagios   216 Jul 29 21:48 cookie.txt
-rwxr-xr-x 1 nagios nagios 11107 Jul 29 21:48 dbmaint.php
-rwxr-xr-x 1 nagios nagios  3698 Jul 29 21:48 eventman.php
-rwxr-xr-x 1 nagios nagios  1544 Jul 29 21:48 feedproc.php
-rwxr-xr-x 1 nagios nagios  2072 Jul 29 21:48 nom.php
-rwxr-xr-x 1 nagios nagios 10318 Jul 29 21:48 perfdataproc.php
-rwxr-xr-x 1 nagios nagios 16388 Jul 29 21:48 recurringdowntime.pl
-rwxr-xr-x 1 nagios nagios  1321 Jul 29 21:48 reportengine.php
-rwxr-xr-x 1 nagios nagios  9798 Jul 29 21:48 sysstat.php
Ran /usr/local/nagiosxi/scripts/reset_config_perms

Code: Select all

# pgsql nagiosxi nagiosxi
-bash: pgsql: command not found
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

Oops, typo on that last one.

Code: Select all

psql nagiosxi nagiosxi 
vacuum;
vacuum analyze;
vacuum full;
Did I see on a different thread that you're running something special on the network, like a proxy, SSL, VPN, or NAT?
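
The vacuum statements above can also be run without an interactive session by passing each one to psql with `-c`. The wrapper below is a sketch under assumptions: `vacuum_nagiosxi` and the `DRY_RUN` switch are hypothetical names, while `-U`, `-d`, and `-c` are standard psql options equivalent to the positional form used above.

```shell
# Hypothetical wrapper: run the three maintenance statements non-interactively.
# Set DRY_RUN=1 to print the commands instead of executing them.
vacuum_nagiosxi() {
    for stmt in "VACUUM;" "VACUUM ANALYZE;" "VACUUM FULL;"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "psql -U nagiosxi -d nagiosxi -c \"$stmt\""
        else
            # Database and user are both "nagiosxi", as in the commands above.
            psql -U nagiosxi -d nagiosxi -c "$stmt"
        fi
    done
}

# Example calls:
# DRY_RUN=1; vacuum_nagiosxi   # print what would be run
# DRY_RUN=0; vacuum_nagiosxi   # actually run it
```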
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

I'm accessing Nagios over a VPN. I'm offsite from Nagios.

Code: Select all

nagiosxi=> vacuum;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM
nagiosxi=> vacuum;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM

nagiosxi=> vacuum analyze;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM

nagiosxi=> vacuum full;
WARNING:  skipping "pg_authid" --- only table or database owner can vacuum it
WARNING:  skipping "pg_tablespace" --- only table or database owner can vacuum it
WARNING:  skipping "pg_pltemplate" --- only table or database owner can vacuum it
WARNING:  skipping "pg_shdepend" --- only table or database owner can vacuum it
WARNING:  skipping "pg_auth_members" --- only table or database owner can vacuum it
WARNING:  skipping "pg_database" --- only table or database owner can vacuum it
VACUUM
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

Any movement on this issue?
nscott
Posts: 1040
Joined: Wed May 11, 2011 8:54 am

Re: Several devices have a "Sync Missed" status

Post by nscott »

We wrote a script to troubleshoot backend calls going wrong over special connections such as a VPN. Can you follow the post

http://support.nagios.com/forum/viewtop ... 079#p13079

And post its output? If there is a call that's going wrong, that will help us troubleshoot it more effectively.
Nicholas Scott
Former Nagios employee
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

Code: Select all

Testing System Profile
get_base_uri returns: http://172.22.2.123/nagiosxi/
get_base_url returns: http://172.22.2.123/nagiosxi/
get_backend_url(internal_call=false) returns: http://172.22.2.123/nagiosxi/profile.php
get_backend_url(internal_call=true) returns: http://172.22.2.123/nagiosxi/backend/
SERVER INFO DUMP
Array
(
    [HTTP_HOST] => 172.22.2.123
    [HTTP_CONNECTION] => keep-alive
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.124 Safari/534.30
    [HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    [HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
    [HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
    [HTTP_ACCEPT_CHARSET] => ISO-8859-1,utf-8;q=0.7,*;q=0.3
    [HTTP_COOKIE] => nagiosxi=v3nul14fl83dhnu4qetr3q2pg6
    [PATH] => /sbin:/usr/sbin:/bin:/usr/bin
    [SERVER_SIGNATURE] => <address>Apache/2.2.3 (CentOS) Server at 172.22.2.123 Port 80</address>

    [SERVER_SOFTWARE] => Apache/2.2.3 (CentOS)
    [SERVER_NAME] => 172.22.2.123
    [SERVER_ADDR] => 172.22.2.123
    [SERVER_PORT] => 80
    [REMOTE_ADDR] => 172.22.240.9
    [DOCUMENT_ROOT] => /var/www/html
    [SERVER_ADMIN] => root@localhost
    [SCRIPT_FILENAME] => /usr/local/nagiosxi/html/profile.php
    [REMOTE_PORT] => 60821
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => 
    [REQUEST_URI] => /nagiosxi/profile.php
    [SCRIPT_NAME] => /nagiosxi/profile.php
    [PHP_SELF] => /nagiosxi/profile.php
    [REQUEST_TIME] => 1312852699
)
1

PING LOCALHOST
RUNNING: '/bin/ping -c 3 localhost 2>&1
'PING healthone.org (127.0.0.1) 56(84) bytes of data.
64 bytes from healthone.org (127.0.0.1): icmp_seq=1 ttl=64 time=0.098 ms
64 bytes from healthone.org (127.0.0.1): icmp_seq=2 ttl=64 time=0.029 ms
64 bytes from healthone.org (127.0.0.1): icmp_seq=3 ttl=64 time=0.024 ms

--- healthone.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.024/0.050/0.098/0.034 ms


WGET LOCALHOST CCM

WGET FROM URL: http://localhost/nagiosql/index.php
RUNNING: /usr/bin/wget http://localhost/nagiosql/index.php
--2011-08-08 21:18:21--  http://localhost/nagiosql/index.php
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5259 (5.1K) [text/html]
Saving to: `/tmp/nagiosql_index.tmp'

     0K .....                                                 100%  247M=0s

2011-08-08 21:18:22 (247 MB/s) - `/tmp/nagiosql_index.tmp' saved [5259/5259]
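
For repeat checks, the HTTP status can be pulled out of wget log output like the above with a small filter. This is only a sketch; `wget_status` is a hypothetical helper and is not part of the troubleshooting script that produced this output.

```shell
# Hypothetical helper: print the HTTP status code from wget's log output on stdin.
wget_status() {
    # wget logs a line such as "HTTP request sent, awaiting response... 200 OK";
    # keep the last one in case redirects produced several.
    sed -n 's/^HTTP request sent, awaiting response\.\.\. \([0-9][0-9][0-9]\).*/\1/p' | tail -n 1
}

# Example call (wget writes its log to stderr, hence the redirect):
# wget -O /dev/null http://localhost/nagiosql/index.php 2>&1 | wget_status
```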