Nagios XI - Import hangs without any diagnostic messages
Posted: Tue Dec 07, 2010 4:36 pm
Issue:
Nagios XI - Import hangs without any diagnostic messages
Active Nagios Core user attempting INITIAL import of functioning Nagios Core configuration into Nagios XI.
Nagios Core configuration functional, but GUI is sluggish due to large configuration and corresponding processing of large flat files (see details below).
Attempting to replace the Core front-end with Nagios XI to improve GUI performance, allow for target user "role" implementation, and use other XI-specific features.
Nagios XI import process hangs without displaying root cause or progress details.
Can import process logging verbosity be enabled and/or increased to determine root cause?
Can any timers be adjusted or parameters altered?
Background (Nagios Core):
Currently running Nagios Core 3.2.3 and Plugins 1.4.15
Fully functional distributed environment with three distributed servers sending checks via nsca to a single central server.
Central server has roughly 15,000 hosts defined and 50,000 services defined.
Distributed servers each have a subset of the full configuration and checks are split between the three.
Central server successfully processes roughly 30,000 passive check results per 5 minutes.
Central server active checks are restricted to Nagios server/daemon health checks executed via NRPE, roughly 300 per 5 minutes (these could also be migrated to a distributed server if desired).
Central server objects.cache file consists of 2.5 million lines.
Central server status.dat file consists of 3.7 million lines.
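To put numbers like these next to an import attempt, the object definitions in a file can be counted per type. The following is a minimal sketch, not part of any Nagios tooling: the function name count_objects is hypothetical, and the objects.cache path in the usage comment is an assumption based on a default source install.

```shell
#!/bin/sh
# Count "define <type> {" blocks per object type in a Nagios object file.
# count_objects is a hypothetical helper name, not a Nagios command.
count_objects() {
    awk '
        /^[ \t]*define[ \t]+[a-z]+[ \t]*[{]/ {
            type = $2
            sub(/[{].*/, "", type)   # handle "define host{" written without a space
            counts[type]++
        }
        END { for (t in counts) printf "%s %d\n", t, counts[t] }
    ' "$1" | sort
}

# Usage (path is an assumption, adjust to the actual install):
#   count_objects /usr/local/nagios/var/objects.cache
```

Comparing the per-type counts before the import with what ends up in XI afterwards would show whether whole object types, such as host templates, went missing.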
Background (Nagios XI):
NagiosXI
uname -a
Linux localhost.localdomain 2.6.18-164.9.1.el5 #1 SMP Tue Dec 15 21:04:57 EST 2009 i686 i686 i386 GNU/Linux
NagiosXI is running as a VM under Oracle VM VirtualBox version 3.2.10 r66523.
I am using the nagiosxi-2009r1.3g-vmware.zip file, converted to VirtualBox as described by the NagiosXI instructions.
The VM failed to boot initially, and the initrd had to be modified because VirtualBox uses SATA drives instead of IDE drives.
Following the instructions at http://support.nagios.com/forum/viewtop ... lbox#p3063 I was able to create the new initrd file:
mkinitrd --allow-missing --preload=ahci --force-scsi-probe /boot/initrd-`uname -r`-custom.img `uname -r`
Then edit /boot/grub.conf:
from:
initrd /initrd-2.6.18-164.9.1.el5.img
to:
initrd /initrd-2.6.18-164.9.1.el5-custom.img
At this point everything boots and works with the defaults.
Using the instructions from http://library.nagios.com/library/produ ... -prep-tool I was able to import all the config files into the cfgprep directory. One file did generate some PHP notifications, but from looking at the error I don’t think this is an issue.
The problem appears when following the instructions at
http://library.nagios.com/library/produ ... es-into-xi
After selecting the files in cfgprep with "overwrite database" checked, the page just times out and returns nothing. There are no messages indicating that anything is being done. The browser either times out or returns a blank frame after several minutes.
After giving it plenty of time I moved on to "Write monitoring data". Same result here: the page does not tell me it is doing anything and tends to time out or return a blank frame.
Moving on to "Write additional data", I do see output as indicated by the instructions, and this step finishes successfully.
At this point a checkconfig fails because parts of the config appear to be missing, even though I verified the files were there. In this particular instance it was missing a host template.
I also attempted to import the objects.cache file (2.5 million lines) from our central Nagios server into the database, with basically the same results: the import and write process through the web page just seems to time out.
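Since the symptom is a web page that times out or comes back blank, one thing that could be checked is the PHP limits on the XI server. This is only a sketch under assumptions: it is not confirmed that the XI import page is bound by these settings, show_php_limits is a hypothetical helper name, and the /etc/php.ini and /var/log/httpd/error_log paths are the usual CentOS defaults rather than anything verified on this VM.

```shell
#!/bin/sh
# Print the active (uncommented) time and memory limits from a php.ini file.
# show_php_limits is a hypothetical helper name.
show_php_limits() {
    grep -E '^[[:space:]]*(max_execution_time|max_input_time|memory_limit)[[:space:]]*=' "$1"
}

# Usage (paths are assumptions for a CentOS-based install):
#   show_php_limits /etc/php.ini
#   tail -f /var/log/httpd/error_log   # watch for PHP errors while the import runs
```

If the import request is killed by max_execution_time or memory_limit, the web server error log is the most likely place for a message to show up, which would at least confirm or rule out this theory.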
If there are logs somewhere that we could look at to see what is going on while it is trying to process the data, that would be helpful. I am thinking our config is just too big for Nagios XI to handle. I really think the import process is hanging. The documentation notes in red that importing in a certain order is recommended ("commands -> timeperiods -> contacttemplates", etc.), but with our files I don’t think that is possible. Is this really necessary? If so, it could be why the import process is failing.
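If the ordered import does turn out to be necessary, a combined file such as objects.cache could in principle be split into one file per object type, so that each type can be imported on its own. The following is a minimal sketch under that assumption: extract_type and all.cfg are hypothetical names, and templates cannot be separated this way because they use the same define type as regular objects (with "register 0").

```shell
#!/bin/sh
# Print every "define <type> { ... }" block of one object type from a file.
# extract_type is a hypothetical helper name, not a Nagios command.
extract_type() {
    awk -v want="$1" '
        /^[ \t]*define[ \t]+[a-z]+[ \t]*[{]/ {
            type = $2
            sub(/[{].*/, "", type)   # handle "define host{" written without a space
            keep = (type == want)
        }
        keep { print }               # print lines of the wanted block, closing brace included
        /^[ \t]*[}]/ { keep = 0 }    # end of the current block
    ' "$2"
}

# Usage (all.cfg is a placeholder for the combined config file):
#   for t in command timeperiod; do
#       extract_type "$t" all.cfg > "$t.cfg"
#   done
```

Smaller per-type files would also make it easier to see which import step is the one that hangs.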