Tactical Overview, Ops Center, Ops Screen Problems

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

The latest log entry is from about 2 hours ago. I see the same basic entry over and over again:

Code: Select all

[Wed Feb 27 12:10:29 2013] [error] [client xxx.xxx.xxx.xxx] PHP Warning:  fopen(/usr/local/nagios/var/graphapi.log) [<a href='function.fopen'>function.fopen</a>]: failed to open stream: Permission denied in /usr/local/nagiosxi/html/includes/components/perfdata/graphApi.php on line 70, referer: http://nagiosserver/nagiosxi/perfgraphs/
[Wed Feb 27 12:10:29 2013] [error] [client xxx.xxx.xxx.xxx] PHP Warning:  fwrite(): supplied argument is not a valid stream resource in /usr/local/nagiosxi/html/includes/components/perfdata/graphApi.php on line 72, referer: http://lnttavmnag1/nagiosxi/perfgraphs/

Code: Select all

[Wed Feb 27 12:11:24 2013] [error] [client xxx.xxx.xxx.xxx] PHP Warning:  fclose(): supplied argument is not a valid stream resource in /usr/local/nagiosxi/html/includes/components/perfdata/graphApi.php on line 73, referer: http://nagiosserver/nagiosxi/perfgraphs/?&host=examplehost&service=&source=1&view=1&start=&end=&startdate=&enddate=&mode=1&host_id=10504
ERROR: This RRD was created on another architecture
I'm guessing that these were all generated when I tried to view the graphs. This seems like it might point to the issues with the graphs, but not my original issue?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by scottwilkerson »

What version of XI are you running?
Former Nagios employee
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

2012R1.4
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by mguthrie »

ERROR: This RRD was created on another architecture
This is the actual error that's causing the issue with the red X. It appears that some of these RRDs were migrated from another install with a different architecture.

http://support.nagios.com/wiki/index.ph ... Install.3F

I'm betting that if you've got RRDs from a different architecture, then the NPCD daemon hasn't been able to update them. I'm not sure what kind of error gets thrown in that circumstance.
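
For anyone wanting to find out which files are affected, here is a minimal sketch. It assumes that `rrdtool info` exits with a non-zero status when it cannot read a file (as with the architecture error above). The `RRDTOOL` variable and the function name `list_unreadable_rrds` are made up for illustration.

```shell
# Sketch only: list RRD files that rrdtool cannot read on this machine.
# RRDTOOL and list_unreadable_rrds are illustrative names, not part of Nagios XI.
RRDTOOL="${RRDTOOL:-rrdtool}"

# Print the path of every *.rrd under the given directory for which
# "rrdtool info" fails (assumed to include the "another architecture" error).
list_unreadable_rrds() {
    find "$1" -type f -name '*.rrd' -exec sh -c '
        tool=$1; shift
        for rrd in "$@"; do
            "$tool" info "$rrd" >/dev/null 2>&1 || printf "%s\n" "$rrd"
        done' sh "$RRDTOOL" {} +
}
```

It could be called as `list_unreadable_rrds /usr/local/nagios/share/perfdata`. That path is an assumption about where the perfdata lives and is not stated in this thread, so adjust it for your install.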
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

Ah ha! This is indeed the case on the migration.

I have moved from a CentOS build (32-bit) to a Red Hat build (64-bit).

I followed the document for restoring from a backup, where the Things to Consider section mentions different OS builds, but it didn't touch on this.

I take it I would need to do these steps in addition to the Things to Consider section of that document?

Also, does any of this change with the use of the ramdisk on one or both machines?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by scottwilkerson »

jbennett wrote:I take it I would need to do these steps in addition to the Things to Consider section of that document?
correct
jbennett wrote:Also, does any of this change with the use of the ramdisk on one or both machines?
I do not believe that use of a RAM disk will have any effect on this, UNLESS your RAM disks are in different locations on the two servers.
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

I would like to understand what these steps are doing.

Code: Select all

for i in `find -name "*.rrd"`; do rrdtool dump $i > $i.xml; done
tar -cvzf perfdata.tar.gz */*.rrd.xml
for i in `find -name "*.rrd.xml"`; do rm -f $i; done
Would it be possible to get an explanation?

I am trying to run the first line, but it seems to just be sitting there. I'm assuming that since I have a fair number of hosts, it is going through all of the folders and checking all of the .rrd files and dumping them into a .xml file? Tarring the files into a single file and then?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by scottwilkerson »

jbennett wrote:Would it be possible to get an explanation?
Sure, this process can take a long time if you have a large setup
jbennett wrote:I'm assuming that since I have a fair number of hosts, it is going through all of the folders and checking all of the .rrd files and dumping them into a .xml file?
Correct, it is creating *.rrd.xml files with all the information from the *.rrd files. These will be required to restore the performance data on the new system. Because .rrd files are architecture-specific, if we change architecture we need to dump the data in the files to XML, and then restore them on the new system.
jbennett wrote: Tarring the files into a single file and then?
Yep, this will create a file perfdata.tar.gz which will be moved to the new system for restore
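
To make the two halves of the process concrete, here is a rough sketch of the dump and the matching restore. It is not the exact procedure from the wiki page: the function names and the `RRDTOOL` variable are hypothetical, and it uses `find -exec` rather than a backtick loop so that paths containing spaces are handled. `rrdtool dump` and `rrdtool restore` are the standard rrdtool subcommands. The `-f` on restore overwrites an existing .rrd, so only use it where that is what you want.

```shell
# Sketch only: dump on the old server, restore on the new one.
# RRDTOOL, dump_rrds and restore_rrds are illustrative names.
RRDTOOL="${RRDTOOL:-rrdtool}"

# OLD server: write FILE.rrd.xml next to every FILE.rrd under the directory.
dump_rrds() {
    find "$1" -type f -name '*.rrd' -exec sh -c '
        tool=$1; shift
        for rrd in "$@"; do
            "$tool" dump "$rrd" > "$rrd.xml" || echo "dump failed: $rrd" >&2
        done' sh "$RRDTOOL" {} +
}

# NEW server, after unpacking the archive: rebuild FILE.rrd from FILE.rrd.xml.
restore_rrds() {
    find "$1" -type f -name '*.rrd.xml' -exec sh -c '
        tool=$1; shift
        for xml in "$@"; do
            # ${xml%.xml} strips the trailing ".xml" to get the .rrd name
            "$tool" restore -f "$xml" "${xml%.xml}" || echo "restore failed: $xml" >&2
        done' sh "$RRDTOOL" {} +
}
```

Run `dump_rrds .` from the perfdata directory on the old box, tar up the .rrd.xml files as described above, unpack them in the same location on the new box, and then run `restore_rrds .` there.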
jbennett
Posts: 522
Joined: Mon Apr 16, 2012 3:00 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by jbennett »

Turns out I've run into a little problem. I'm out of space thanks to this little activity...which is probably a symptom of another issue.

I've run the following as suggested: for i in `find -name "*.rrd"`; do rrdtool dump $i > $i.xml; done

When I go to tar the files, I get a notice that I've run out of disk space.

Code: Select all

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      108G  108G     0 100% /
tmpfs                 7.4G     0  7.4G   0% /dev/shm
/dev/sda1              97M   82M   11M  89% /boot
tmpfs                  50M     -     -   -  /var/nagiosramdisk
I'm trying to figure out what's taking up all of the space, but still being a relative Linux newbie, I'm not getting very far. I'm guessing that in doing the above, I've simply taken up all of the space on the drive.
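
One way to test that guess is to compare how much disk the original .rrd files use against the new .rrd.xml dumps, since XML dumps are plain text and usually a good deal larger than the binary files. A minimal sketch follows. The function name `kb_used` is made up for illustration, and it only relies on `find`, `du -k` and `awk`.

```shell
# Sketch only: total disk usage, in kilobytes, of the files under a directory
# whose names match a pattern. kb_used is an illustrative name.
kb_used() {
    # du -k prints "<kilobytes><TAB><path>" per file; awk sums the first column
    # and prints 0 when nothing matched.
    find "$1" -type f -name "$2" -exec du -k {} + |
        awk '{ s += $1 } END { print s + 0 }'
}
```

From the perfdata directory, `kb_used . '*.rrd'` and `kb_used . '*.rrd.xml'` give the two totals to compare.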

Does it make sense to extend the logical volume, say, following this example? http://www.ehowstuff.com/how-to-increas ... -on-linux/

I should note that in our environment, I need to do this via command line. I could go into all of the details, but suffice it to say that things are locked down from the user side.

Also, while this is an issue, it isn't the primary concern at this point.

I need to get to where the overview screens are usable. Currently, they aren't.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Tactical Overview, Ops Center, Ops Screen Problems

Post by abrist »

If the XI box is out of space, you will have issues with many parts of XI. Run the following commands from the cli and post the output in a code wrap. This will help us track down where the space went.
I bet the problems are in /usr or /var, they always are.

Code: Select all

cd /
df -i
du -hsx * | sort -rh | head -10
find . -type f -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
find . -type d -print0 | xargs -0 du -s | sort -n | tail -10 | cut -f2 | xargs -I{} du -sh {}
Former Nagios employee