Ram disk and config errors

jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Post by jbennett »

I'm struggling with an issue with my ramdisk filling up.

Currently I use templates for my hosts and services. I have assigned alert contact groups in those templates rather than in each host and service.

This was working fine until yesterday, when I needed to test creating a second contact group and assigning it to only a certain set of hosts, along with all services on those hosts. I removed the contact groups from the templates, saved and applied the changes. I then went to the bulk modification tool and added contact groups to select hosts and all services. I had contact group A assigned to one set of hosts and contact group B assigned to another set of hosts.

I tested it out and it appears to work fine.

I then needed to revert the testing, so I decided to just add contact group A to all hosts and remove contact group B from its assigned hosts. Keeping in mind that we have 1500-something hosts and 22000 service checks, this appears to have caused the DB to crash. I also ran out of inodes in /var. I repaired the DB and cleared out the inodes (from all of the checks that backed up, I suppose).

I then started Nagios back up and found that the ramdisk was filling up. When I ran the check config script, I saw the following:

Code: Select all

Warning: Host 'xxxx' has no default contacts or contactgroups defined!

Yet when I check the host and the template assigned to it, I see my contact groups.

What am I missing?

I'm on Nagios XI 2014R1.5.

EDIT: I also have the following when I check configs:

Code: Select all

Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 40) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 285) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 314) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 64) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 103) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 368) 
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 401) 
Read object config files okay... 
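
With this many repeated warnings, it can help to collapse them into one line per directive and config file. The following is a minimal sketch, not an official Nagios tool: `summarize_obsolete_warnings` is a hypothetical helper name, and the verification command in the comment assumes the default `/usr/local/nagios` install paths seen in the output above.

```shell
#!/bin/sh
# Summarise "is obsoleted" warnings from config verification output.
# Reads the output on stdin and prints one line per directive/file pair:
#   <directive> <config file> <count>
# Assumption: config file paths contain no spaces.
summarize_obsolete_warnings() {
    sed -n "s/^Warning: \([A-Za-z_]*\) is obsoleted.*(config file '\([^']*\)'.*/\1 \2/p" \
        | sort \
        | uniq -c \
        | awk '{ print $2, $3, $1 }'
}

# Possible usage (paths are assumptions based on a default install):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#       | summarize_obsolete_warnings
```

Lines that are not "obsoleted" warnings, such as the "no default contacts" warning, are ignored by this helper and need to be reviewed separately.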
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Ram disk and config errors

Post by lmiltchev »

Run the following commands and show us the output:

Code: Select all

uptime
service npcd status
df -h
df -i
ls /var/nagiosramdisk/spool/xidpe | wc -l
ls /var/nagiosramdisk/spool/perfdata/ | wc -l
ls /var/nagiosramdisk/spool/checkresults/ | wc -l

Note: Modify the path to "nagiosramdisk" if it is in a different location.
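
If a spool directory holds a very large number of files, `ls` can be slow because it sorts its output. As a rough alternative, the three counts can be gathered with `find`. This is only a sketch: `count_spool_entries` is a made-up helper name, the default root is the path used in this thread, and `-mindepth`/`-maxdepth` are GNU/BSD `find` extensions rather than POSIX.

```shell
#!/bin/sh
# Print the number of entries in each spool directory under a ramdisk root.
# $1: ramdisk root (defaults to /var/nagiosramdisk, the path used in this thread)
# Note: unlike plain "ls", find also counts hidden (dot) entries.
count_spool_entries() {
    root=${1:-/var/nagiosramdisk}
    for sub in xidpe perfdata checkresults; do
        dir=$root/spool/$sub
        if [ -d "$dir" ]; then
            # find streams entries without sorting them first
            n=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        else
            n=missing
        fi
        printf '%s %s\n' "$sub" "$n"
    done
}
```

Calling `count_spool_entries` with no argument checks the default location; pass a different root if the ramdisk is mounted elsewhere.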
Be sure to check out our Knowledgebase for helpful articles and solutions!
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Ram disk and config errors

Post by jbennett »

Code: Select all

[xxx]# uptime
 12:21:21 up  1:24,  1 user,  load average: 2.24, 2.02, 1.64
[xxx]# /etc/init.d/npcd status
NPCD running (pid 3469).
[xxx]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00_ROOT
                       48G   36G  8.9G  81% /
/dev/mapper/VolGroup00-LogVol00
                      3.0G  255M  2.6G   9% /tmp
/dev/mapper/VolGroup00-LogVol00_VAR
                      5.7G  4.7G  750M  87% /var
/dev/hda1             190M   40M  141M  23% /boot
tmpfs                 5.9G     0  5.9G   0% /dev/shm
tmpfs                 125M  125M     0 100% /var/nagiosramdisk
10.100.3.220:/kickstart
                      190G  130G   51G  73% /kickstart
[xxx]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00_ROOT
                     12799776  218098 12581678    2% /
/dev/mapper/VolGroup00-LogVol00
                      793600    6192  787408    1% /tmp
/dev/mapper/VolGroup00-LogVol00_VAR
                     1540096   90296 1449800    6% /var
/dev/hda1              50200      50   50150    1% /boot
tmpfs                1538707       1 1538706    1% /dev/shm
tmpfs                1538707  801816  736891   53% /var/nagiosramdisk
xx.xx.xx.xx:/kickstart
                     51216384  252260 50964124    1% /kickstart
[xxx]# ls /var/nagiosramdisk/spool/xidpe | wc -l
0
[xxx]# ls /var/nagiosramdisk/spool/perfdata/ | wc -l
78
[xxx]# ls /var/nagiosramdisk/spool/checkresults/ | wc -l
806626
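
To get a feel for how much of that checkresults backlog is old, the files can be counted by age without deleting anything. A minimal sketch, assuming GNU or BSD `find` (the `-mmin` test is not POSIX); `count_stale_files` is a hypothetical helper name and the 60-minute threshold in the usage comment is only an example value.

```shell
#!/bin/sh
# Count regular files under a directory that were last modified
# more than N minutes ago. Nothing is deleted.
# $1: directory to inspect
# $2: age threshold in minutes
count_stale_files() {
    find "$1" -type f -mmin +"$2" | wc -l | tr -d ' '
}

# Possible usage (path taken from this thread, threshold is an example):
#   count_stale_files /var/nagiosramdisk/spool/checkresults 60
```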
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Ram disk and config errors

Post by jbennett »

From another thread I ran the following:

Code: Select all

[xxx checkresults]$ sudo tail -25 /usr/local/nagios/var/perfdata.log
2014-10-08 13:01:21 [20827] [0] *** process_perfdata.pl terminated on signal ALRM
2014-10-08 13:01:53 [21268] [0] *** TIMEOUT: Timeout after 15 secs. ***
2014-10-08 13:01:53 [21268] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-10-08 13:01:53 [21268] [0] *** TIMEOUT: Please check your npcd.cfg
2014-10-08 13:01:53 [21268] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791283-PID-21268 deleted
2014-10-08 13:01:53 [21268] [0] *** Timeout while processing Host: "yyy" Service: "zzz"
2014-10-08 13:01:53 [21268] [0] *** process_perfdata.pl terminated on signal ALRM
2014-10-08 13:01:54 [21267] [0] *** TIMEOUT: Timeout after 15 secs. ***
2014-10-08 13:01:54 [21267] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-10-08 13:01:54 [21267] [0] *** TIMEOUT: Please check your npcd.cfg
2014-10-08 13:01:54 [21267] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791267-PID-21267 deleted
2014-10-08 13:01:54 [21267] [0] *** Timeout while processing Host: "yyy" Service: "zzz"
2014-10-08 13:01:54 [21267] [0] *** process_perfdata.pl terminated on signal ALRM
2014-10-08 13:02:25 [21637] [0] *** TIMEOUT: Timeout after 15 secs. ***
2014-10-08 13:02:25 [21637] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-10-08 13:02:25 [21637] [0] *** TIMEOUT: Please check your npcd.cfg
2014-10-08 13:02:25 [21637] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791297-PID-21637 deleted
2014-10-08 13:02:25 [21637] [0] *** Timeout while processing Host: "yyy" Service: "zzz"
2014-10-08 13:02:25 [21637] [0] *** process_perfdata.pl terminated on signal ALRM
2014-10-08 13:04:30 [23012] [0] *** TIMEOUT: Timeout after 15 secs. ***
2014-10-08 13:04:30 [23012] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-10-08 13:04:30 [23012] [0] *** TIMEOUT: Please check your npcd.cfg
2014-10-08 13:04:30 [23012] [0] *** TIMEOUT: /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791447-PID-23012 deleted
2014-10-08 13:04:30 [23012] [0] *** Timeout while processing Host: "yyy" Service: "zzz"
2014-10-08 13:04:30 [23012] [0] *** process_perfdata.pl terminated on signal ALRM
[xxx checkresults]$ tail -25 /usr/local/nagios/var/npcd.log
[10-08-2014 12:46:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790369'
[10-08-2014 12:46:54] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:46:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//host-perfdata.1412790369'
[10-08-2014 12:49:15] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:49:15] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790518'
[10-08-2014 12:49:45] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:49:45] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790532'
[10-08-2014 12:51:42] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:51:42] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790668'
[10-08-2014 12:52:54] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:52:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790742'
[10-08-2014 12:55:25] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 12:55:25] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412790894'
[10-08-2014 13:01:21] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:01:21] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791253'
[10-08-2014 13:01:21] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:01:21] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791237'
[10-08-2014 13:01:53] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:01:53] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791283'
[10-08-2014 13:01:54] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:01:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791267'
[10-08-2014 13:02:25] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:02:25] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791297'
[10-08-2014 13:04:30] NPCD: ERROR: Executed command exits with return code '7'
[10-08-2014 13:04:30] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//service-perfdata.1412791447'
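
Since every timeout in this excerpt names the same host and service, it may be worth checking whether that holds across the whole log. The sketch below counts timeouts per host/service pair; `summarize_perfdata_timeouts` is a hypothetical helper name, and it relies only on the "Timeout while processing" line format shown above.

```shell
#!/bin/sh
# Count perfdata processing timeouts per host/service pair.
# Reads perfdata.log content on stdin and prints lines of the form:
#   <count> <host>|<service>
summarize_perfdata_timeouts() {
    sed -n 's/.*Timeout while processing Host: "\([^"]*\)" Service: "\([^"]*\)".*/\1|\2/p' \
        | sort \
        | uniq -c \
        | sed 's/^ *//'
}

# Possible usage (log path as used in this thread):
#   summarize_perfdata_timeouts < /usr/local/nagios/var/perfdata.log
```

If one pair dominates the counts, that service's perfdata is a reasonable place to start looking.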
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Ram disk and config errors

Post by jbennett »

I think I may have found the answer.

A day ago I added sbin to my $PATH, as suggested here: http://support.nagios.com/forum/viewtop ... 16&t=29427

I started retracing all of my steps and removed :/sbin from my profile's path and, lo and behold, my ramdisk isn't filling up any more.
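
For anyone retracing the same step, here is a small sketch of dropping one entry from a PATH-style string without hand-editing it. `path_without` is a made-up helper name; it only prints the result and does not modify the profile or the running environment.

```shell
#!/bin/sh
# Print a PATH-style string with every exact occurrence of one directory removed.
# $1: directory to drop (matched as a whole entry, so /sbin does not match /usr/sbin)
# $2: PATH-style string (defaults to the current $PATH)
path_without() {
    printf '%s\n' "${2-$PATH}" \
        | tr ':' '\n' \
        | grep -vxF -- "$1" \
        | paste -s -d: -
}

# Possible usage:
#   PATH=$(path_without /sbin)
```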

However, I still have the warnings in my configs.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
Contact:

Re: Ram disk and config errors

Post by Box293 »

Please post the warnings that are still appearing.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Ram disk and config errors

Post by jbennett »

Code: Select all

Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 40)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 285)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/hosttemplates.cfg', starting at line 314)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 64)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 103)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 368)
Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting at line 401)
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
Contact:

Re: Ram disk and config errors

Post by Box293 »

You have some configs in your templates that need "failure_prediction_enabled" removed from them.
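
One way to confirm whether the directive is still present in the files on disk is to search the config directory directly. This is a sketch under assumptions: `find_directive` is a hypothetical helper name, and the directory in the usage comment is the one reported in the warnings.

```shell
#!/bin/sh
# List every occurrence of a directive in the .cfg files below a directory.
# Output format: <file>:<line number>:<matching line>
# $1: directive name
# $2: config directory
find_directive() {
    # /dev/null forces grep to print file names even if only one file matches
    find "$2" -type f -name '*.cfg' -exec grep -n -- "$1" /dev/null {} +
}

# Possible usage:
#   find_directive failure_prediction_enabled /usr/local/nagios/etc
```

No output means the directive was not found in any `.cfg` file under that directory.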
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Ram disk and config errors

Post by jbennett »

OK, I see that this directive was removed from object definitions as part of Core 4.0.

I'm checking host templates and I don't see this anywhere.

Where do I have to remove it?
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
Contact:

Re: Ram disk and config errors

Post by Box293 »

You're right. I talked to a dev, and these should have been removed when you upgraded to 2014, so you won't be able to do this in the CCM.

However try this:

Go into CCM
Tools > Write Config Files
Click the Write Button
It will show an output of all the files it creates
Click the Verify button
The output should end with "Total Errors: 0"
Quick Tools > Apply Configuration
Click the Apply Configuration button

Now do these warning messages still appear?