Several devices have a "Sync Missed" status

mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

Can you ping this IP from the XI server and show us the return times?

ping 172.22.2.123
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

Code: Select all

ping 172.22.2.123 -c 5
PING 172.22.2.123 (172.22.2.123) 56(84) bytes of data.
64 bytes from 172.22.2.123: icmp_seq=1 ttl=64 time=0.050 ms
64 bytes from 172.22.2.123: icmp_seq=2 ttl=64 time=0.115 ms
64 bytes from 172.22.2.123: icmp_seq=3 ttl=64 time=0.062 ms
64 bytes from 172.22.2.123: icmp_seq=4 ttl=64 time=0.068 ms
64 bytes from 172.22.2.123: icmp_seq=5 ttl=64 time=0.081 ms

--- 172.22.2.123 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.050/0.075/0.115/0.022 ms
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

Let us know if things improve after removing some of the extra software that was bundled with GNOME.

Also, if the write config tool is still failing, the issue almost has to be permissions related. Can you run the following and show us the output?

Code: Select all

ls -l /var/www/html/nagiosql
and also run:

Code: Select all

/usr/local/nagiosxi/scripts/reset_config_perms
and then verify that the files in /usr/local/nagios/etc are owned by apache:nagios (user apache, group nagios)?
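
For the ownership check, a small shell sketch like the one below may save some scrolling. It is only an illustration: `list_wrong_owner` is a hypothetical helper name, not part of Nagios XI, and it relies on the standard `find` tests `-user` and `-group`. It prints any path whose owner or group differs from the expected one, so no output would mean everything matches.

```shell
# list_wrong_owner PATH USER GROUP
# Hypothetical helper (not shipped with Nagios XI).
# Prints every path at or under PATH whose owner is not USER
# or whose group is not GROUP. Prints nothing if all match.
list_wrong_owner() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Example for the directory discussed above:
# list_wrong_owner /usr/local/nagios/etc apache nagios
```

Anything it lists would be a candidate for another pass of the permissions reset script.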
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

This was my response on the other thread (which I marked as solved).
So I removed the Gnome group and setroubleshoot and that freed up a ton of RAM and CPU. Now Nagios, MySQL, and httpd are the top 3 and RAM usage is down 75%. Seems like setroubleshoot really hogs resources... I'll have more device additions in the upcoming weeks so I'll close this thread for now as solved and keep you posted on further developments.

Code: Select all

ls -l /var/www/html/nagiosql/
total 72
drwxr-xr-x 2 apache apache 4096 Apr 21 12:11 admin
-rwxr-xr-x 1 apache apache 1572 Jul 29 21:49 admin.php
drwxr-xr-x 3 apache apache 4096 Apr 21 12:11 config
-rwxr-xr-x 1 apache apache 1150 Jul 29 21:49 favicon.ico
drwxr-xr-x 4 apache apache 4096 Apr 21 12:11 functions
drwxr-xr-x 2 apache apache 4096 Apr 21 12:11 images
-rwxr-xr-x 1 apache apache 3851 Jul 29 21:49 index.php
drwxr-xr-x 8 apache apache 4096 Apr 21 12:11 install
drwxr-xr-x 4 apache apache 4096 Apr 21 12:11 templates

Code: Select all

ls -l /usr/local/nagios/etc/
total 444
-rw-rw-r-- 1 apache nagios    954 Jul  1 14:30 cgi.cfg
-rw-rw-r-- 1 apache nagios  16824 Aug 10 17:24 commands.cfg
-rw-rw-r-- 1 apache nagios    931 Aug 10 17:24 contactgroups.cfg
-rw-rw-r-- 1 apache nagios   1782 Aug 10 17:24 contacts.cfg
-rw-rw-r-- 1 apache nagios   1382 Aug 10 17:24 contacttemplates.cfg
-rw-rw-r-- 1 apache nagios    642 Aug 10 17:24 hostdependencies.cfg
-rw-rw-r-- 1 apache nagios    644 Aug 10 17:24 hostescalations.cfg
-rw-rw-r-- 1 apache nagios    662 Aug 10 17:24 hostextinfo.cfg
-rw-rw-r-- 1 apache nagios   8158 Aug 10 17:24 hostgroups.cfg
-rw-rw-r-- 1 apache nagios  17895 Jul 21 23:38 hostgroups.cfg.orig
drwsrwsr-x 2 apache nagios 106496 Aug 11 01:43 hosts
-rw-rw-r-- 1 apache nagios   7307 Aug 10 17:24 hosttemplates.cfg
drwsrwsr-x 2 apache nagios  12288 Aug  8 21:33 import
-rw-rw-r-- 1 apache nagios   5928 Aug  9 10:35 nagios.cfg
-rwxrwxr-x 1 apache nagios   2229 Apr 21 12:10 ndo2db.cfg
-rwxrwxr-x 1 apache nagios   4723 Apr 21 12:10 ndomod.cfg
-rw-rw-r-- 1 apache nagios   7207 Apr 21 12:11 nrpe.cfg
-rwxrwxr-x 1 apache nagios   5345 Apr 21 12:11 nsca.cfg
drwxrwxr-x 4 apache nagios   4096 Aug  5 18:41 pnp
-rw-rw-r-- 1 apache nagios      0 May 18 15:17 recurringdowntime.cfg
-rwxrwxr-x 1 apache nagios    210 Apr 21 12:08 resource.cfg
-rwxrwxr-x 1 apache nagios   1627 Apr 21 12:11 send_nsca.cfg
-rw-rw-r-- 1 apache nagios    648 Aug 10 17:24 servicedependencies.cfg
-rw-rw-r-- 1 apache nagios    650 Aug 10 17:24 serviceescalations.cfg
-rw-rw-r-- 1 apache nagios    668 Aug 10 17:24 serviceextinfo.cfg
-rw-rw-r-- 1 apache nagios    638 Aug 10 17:24 servicegroups.cfg
drwsrwsr-x 2 apache nagios  36864 Aug 10 17:24 services
-rw-rw-r-- 1 apache nagios  10642 Aug 10 17:24 servicetemplates.cfg
drwsrwsr-x 2 apache nagios   4096 Aug  5 18:41 static
-rw-rw-r-- 1 apache nagios   3350 Aug 10 17:24 timeperiods.cfg
Just to reiterate the current problems I'm having...

Auto Discovery Wizard: The interface is incredibly slow and frequently fails at various steps during the process, especially the last step of applying the config.
Missed Sync/Slow device intake: The database seems very slow to add devices. I have devices with missed sync statuses that were added several days to almost a week ago. Many of the changes are not showing up in NagVis or on the Statusmap.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

Do you have anything telling in the /usr/local/nagios/var/nagios.log file? Normally it logs state changes and events, but if you see anything in there related to ndomod or ndoutils, can you post that log data?
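
If it helps, a search along these lines should pull out the relevant entries. This is a rough sketch: `find_ndo_lines` is a made-up helper name, and the archive location in the usage comment (/usr/local/nagios/var/archives) is an assumption based on a default install, so adjust it if your logs are rotated elsewhere.

```shell
# find_ndo_lines FILE...
# Hypothetical helper: prints every line that mentions ndomod or
# ndoutils (case-insensitive). With more than one file, grep
# prefixes each match with the name of the file it came from.
find_ndo_lines() {
    grep -iE 'ndomod|ndoutils' "$@"
}

# Example (the archive path is an assumption, see above):
# find_ndo_lines /usr/local/nagios/var/nagios.log /usr/local/nagios/var/archives/*.log
```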
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

I did a grep for ndoutils and ndomod but only got a return from the archives for ndomod. Attached is the output.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Several devices have a "Sync Missed" status

Post by mguthrie »

Ok, that appears to be clean of sync issues with ndoutils.

So I'm noticing there are multiple issues/symptoms on your install; some have been resolved and some haven't. Can you give us a recap of where things stand on your system, and what problems still exist? Were you able to offload the MySQL DB to another system? I'd like to make sure we have a clear understanding of what still isn't working.
cwscribner
Posts: 316
Joined: Thu Mar 31, 2011 9:54 am
Location: Patten, ME

Re: Several devices have a "Sync Missed" status

Post by cwscribner »

I haven't been able to offload the MySQL database for my customer yet. We've put in an order for a second server to move it onto, so that's on the horizon.

Recap:
  • Autodiscovery wizard is incredibly slow and often fails to input devices on the last steps. It tends to just time out and display a blank page.
  • Lag in device application. There are devices that persistently miss the sync. Several changes never apply to the Network Status Map. Some of these changes are weeks old and still haven't shown up on the status map.