Tactical Overview, Ops Center, Ops Screen Problems

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

Just to clarify, the box that is out of space is the one I am trying to migrate FROM, for this reason.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by scottwilkerson »

OK, let's run this to remove the files we just created in the backup process:

Code: Select all

cd /usr/local/nagios/share/perfdata
rm -rf *.rrd.xml
Then you will likely need to restart the following:

Code: Select all

service postgresql restart
service mysqld restart
Finally, run the commands abrist suggested so we can find files that can be removed to free up space.
Former Nagios employee
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

abrist wrote: If the XI box is out of space, you will have issues with many parts of XI. Run the following commands from the CLI and post the output in a code wrap. This will help us track down where the space went.
I bet the problems are in /usr or /var; they always are.

Code: Select all

cd /
df -i
du -hsx * | sort -rh | head -10
find . -type f -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
find . -type d -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}

Code: Select all

[root@nagiosxivm /]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     7122336  206839 6915497    3% /
tmpfs                 160609       1  160608    1% /dev/shm
/dev/sda1              25688      57   25631    1% /boot
tmpfs                 160609      71  160538    1% /var/nagiosramdisk

[root@nagiosxivm /]# du -hsx * | sort -rh | head -10
du: cannot access `proc/26395/task/26395/fd/4': No such file or directory
du: cannot access `proc/26395/task/26395/fdinfo/4': No such file or directory
du: cannot access `proc/26395/fd/4': No such file or directory
du: cannot access `proc/26395/fdinfo/4': No such file or directory
du: cannot access `proc/26398': No such file or directory
du: cannot access `proc/26399': No such file or directory
du: cannot access `proc/26400': No such file or directory
du: cannot access `proc/26402': No such file or directory
du: cannot access `proc/26404': No such file or directory
du: cannot access `proc/26406': No such file or directory
du: cannot access `proc/26408': No such file or directory
du: cannot access `proc/26409': No such file or directory
du: cannot access `proc/26410': No such file or directory
du: cannot access `proc/26412': No such file or directory
du: cannot access `proc/26413': No such file or directory
22G     usr
16G     var
2.3G    store
290M    lib
209M    tmp
81M     boot
58M     etc
24M     root
9.7M    sbin
6.3M    bin

[root@nagiosxivm /]# find . -type f -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
{a whole list of 'No such file or directory' items in addition to the below:}
1.1G    ./var/lib/mysql/mysql-bin.000003
1.1G    ./var/lib/mysql/mysql-bin.000004
1.1G    ./var/lib/mysql/mysql-bin.000005
1.1G    ./var/lib/mysql/mysql-bin.000006
1.1G    ./var/lib/mysql/mysql-bin.000007
1.1G    ./var/lib/mysql/mysql-bin.000008
1.1G    ./var/lib/mysql/mysql-bin.000010
1.1G    ./var/lib/mysql/mysql-bin.000011
1.1G    ./var/lib/mysql/mysql-bin.000012
1.1G    ./store/backups/nagiosxi/1358874081.tar.gz

[root@nagiosxivm /]# find . -type d -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
{a whole list of 'No such file or directory' items in addition to the below:}
6.7G    ./usr/local/nagios/var
13G     ./var/lib/mysql
13G     ./var/lib
14G     ./usr/local/nagios/share/perfdata
14G     ./usr/local/nagios/share
16G     ./var
20G     ./usr/local/nagios
21G     ./usr/local
22G     ./usr
41G     .
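The "No such file or directory" noise above comes from walking /proc, where entries disappear while du and find are still reading them. A rough sketch of an equivalent report that stays on one filesystem is shown here. It assumes GNU du, sort and find, and the function names `largest_dirs` and `largest_files` are illustrative only.

```shell
#!/bin/sh
# Sketch: largest directories and files on a single filesystem.
# Assumes GNU du, sort and find; function names are hypothetical.

# Print the total for DIR and its first-level subdirectories, largest first.
largest_dirs() {
    dir=$1
    count=${2:-10}
    # -x keeps du on DIR's filesystem (so /proc is skipped when DIR is /);
    # 2>/dev/null hides errors from files that vanish mid-scan.
    du -x -h --max-depth=1 "$dir" 2>/dev/null | sort -rh | head -n "$count"
}

# Print the COUNT largest regular files under DIR as "bytes<TAB>path".
largest_files() {
    dir=$1
    count=${2:-10}
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null | sort -n | tail -n "$count"
}
```

For example, `largest_dirs / 10` and `largest_files / 10` run as root would cover the same ground as the commands quoted above.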
I have added some space via the VM, but I have not yet been able to extend the partition in the OS. It's not entirely straightforward, given that it's on LVM.
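For reference, the usual LVM sequence can be sketched as below. This only prints the commands for review and runs nothing. The volume group and logical volume names come from the df output in this thread; the new device `/dev/sdb` and the ext3/ext4 filesystem (hence `resize2fs`) are assumptions, and the steps differ if the existing virtual disk was enlarged rather than a new disk added.

```shell
#!/bin/sh
# Sketch: print (not run) the typical steps for growing an LVM-backed filesystem.
# The device name and the ext3/ext4 filesystem type are assumptions.

print_lvm_extend_plan() {
    new_pv=$1   # hypothetical new disk or partition, e.g. /dev/sdb
    vg=$2       # volume group, e.g. VolGroup00
    lv=$3       # logical volume, e.g. LogVol00
    echo "pvcreate $new_pv"                     # initialise the new space as a physical volume
    echo "vgextend $vg $new_pv"                 # add it to the volume group
    echo "lvextend -l +100%FREE /dev/$vg/$lv"   # grow the logical volume into the free extents
    echo "resize2fs /dev/$vg/$lv"               # grow an ext3/ext4 filesystem to fill the volume
}

print_lvm_extend_plan /dev/sdb VolGroup00 LogVol00
```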
Last edited by jbennett on Mon Mar 04, 2013 1:05 pm, edited 2 times in total.
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

scottwilkerson wrote: OK, let's run this to remove the files we just created in the backup process:

Code: Select all

cd /usr/local/nagios/share/perfdata
rm -rf *.rrd.xml
Then you will likely need to restart the following:

Code: Select all

service postgresql restart
service mysqld restart
Finally, run the commands abrist suggested so we can find files that can be removed to free up space.
When I run the command to remove the .rrd.xml files that were created, it doesn't appear to do anything. When I remove the switch that suppresses prompting (-f), I get the following:
rm: cannot remove `*.rrd.xml': No such file or directory

When I ls the directory, I see a ton of folders and inside each of those folders, I have the .rrd.xml files.

Would I be correct in using the following to remove these files from all sub-directories?

Code: Select all

find /usr/local/nagios/share/perfdata -type f -iname "*.rrd.xml" -exec rm -f {} \;
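The shell glob in `rm -rf *.rrd.xml` only matches names in the current directory, while find descends into every host folder. A cautious variant of the same idea, sketched under the assumption of GNU find (for `-delete`), lists the matches first and deletes in a separate step. The function names are illustrative.

```shell
#!/bin/sh
# Sketch: preview, then delete, *.rrd.xml files under a perfdata tree.
# Assumes GNU find; function names are hypothetical.

# List matching files without deleting anything.
list_rrd_xml() {
    find "$1" -type f -iname '*.rrd.xml' -print
}

# Delete matching files in all subdirectories.
delete_rrd_xml() {
    find "$1" -type f -iname '*.rrd.xml' -delete
}

# Example (path from this thread):
#   list_rrd_xml /usr/local/nagios/share/perfdata | wc -l
#   delete_rrd_xml /usr/local/nagios/share/perfdata
```

Only names ending in .rrd.xml are matched, so the .rrd and .xml files next to them are left alone.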
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by abrist »

Your command looks right. I just tested it to be sure ;)
Former Nagios employee
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

abrist wrote:Your command looks right. I just tested it to be sure ;)
Thanks! That appears to have done the trick:

Code: Select all

[root@nagiosxivm]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      108G   41G   63G  40% /
tmpfs                 7.4G     0  7.4G   0% /dev/shm
/dev/sda1              97M   82M   11M  89% /boot
tmpfs                  50M   33M   18M  65% /var/nagiosramdisk
One other thing I've noticed when looking in this directory: I have a number of folders that appear to belong to old hosts that have since been renamed or deleted entirely. The folders for the renamed hosts are there under their new names, as they should be. For example, we went to a uniform way of naming hosts. Previously, the admin had some in all caps, some in lower case, etc. When I went back through and changed everything to all caps, it appears the old folders were not removed. I do have the new folders, in all caps. Am I safe to delete these folders? Or are they actually the host groups?
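To spot the leftover folders described above, a sketch along these lines lists each perfdata subdirectory that also exists under an all-caps name. It assumes the renames changed only letter case and that the filesystem is case-sensitive; `list_case_duplicates` is a hypothetical helper, not part of Nagios XI.

```shell
#!/bin/sh
# Sketch: list subdirectories of DIR that have an all-upper-case twin.
# Assumes a case-sensitive filesystem; the function name is hypothetical.

list_case_duplicates() {
    for path in "$1"/*/; do
        [ -d "$path" ] || continue   # skip if the glob matched nothing
        name=$(basename "$path")
        upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
        # Report names that are not already upper case but have an upper-case twin.
        if [ "$name" != "$upper" ] && [ -d "$1/$upper" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example (path from this thread):
#   list_case_duplicates /usr/local/nagios/share/perfdata
```

Folders for hosts that were deleted outright have no twin and will not be listed, so those still need a manual check.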
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

I have since updated my post above so that the output showing the largest files and directories reflects the state after I removed the files mentioned above.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by scottwilkerson »

jbennett wrote:Am I safe to delete these folders? Or are they actually the host groups?
You are safe to remove these old folders, and that would really help reduce the disk space the export needs.

Actually, you could list everything that hasn't been modified in more than 30 days with the following command:

Code: Select all

find /usr/local/nagios/share/perfdata/* -mtime +30 -exec ls -l {} \;
CAREFUL WITH THIS
If the list looks right, remove them with:

Code: Select all

find /usr/local/nagios/share/perfdata/* -mtime +30 -exec rm -rf {} \;
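One caveat when matching directories by -mtime: a directory's modification time changes only when entries are added, removed or renamed inside it, so a folder can look stale while files inside it are still being updated in place. A more conservative sketch, assuming GNU find, restricts the match to regular files and removes directories only once they are empty. The function names are illustrative.

```shell
#!/bin/sh
# Sketch: clean up files not modified in more than DAYS days, files only.
# Assumes GNU find; function names are hypothetical.

# List regular files under DIR matched by -mtime +DAYS.
list_stale_files() {
    find "$1" -type f -mtime +"$2" -print
}

# Delete those files, then remove any subdirectories left empty.
delete_stale_files() {
    find "$1" -type f -mtime +"$2" -delete
    # -mindepth 1 protects DIR itself; -delete processes children before parents.
    find "$1" -mindepth 1 -type d -empty -delete
}

# Example (path from this thread):
#   list_stale_files /usr/local/nagios/share/perfdata 30
#   delete_stale_files /usr/local/nagios/share/perfdata 30
```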
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

I will try this for sure. I am running a backup now just to be sure. Once that is finished, I will try this as well. I think I can clear off a bunch of space.

Once this is done, I will copy over the files to assist with the graphing issue on the new Nagios server. However, will this resolve my initial problem of the data being old on the screens I mentioned and/or no data showing up at all?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by abrist »

Weren't the initial problems due to a change in architecture causing the RRDs to not be read/written? That and the timeout/load issues with npcd. Let's get the RRDs handled and then move on to any other problems once they present themselves.