Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 10:38 am
by CFT6Server
I think I found the issue, and I was able to get configurations to apply properly again. I suspected the DB was causing this lag, so I reviewed the logs and saw that the buffer for mysqld was set to 8M, which could be too small for our environment. Since we have plenty of RAM, I gave it 512M of buffer to see what would happen, and voila, everything is now working and I can continue to add and make changes and apply configurations.
Changes were made to /etc/my.cnf:
Code:
innodb_buffer_pool_size=512M
innodb_additional_mem_pool_size=128M
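To double-check what is actually set in the file, something like the following could help. It is only a sketch: the helper names `cnf_value` and `to_bytes` are made up for illustration, and it assumes the setting appears as a plain `key=value` line in /etc/my.cnf.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MySQL or Nagios XI.

# Print the last value assigned to a key in a my.cnf-style file.
# $1 = key, $2 = file
cnf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Convert a size such as 8M or 512M to bytes (K, M and G suffixes only).
to_bytes() {
    value=$1
    number=${value%[KkMmGg]}
    case $value in
        *[Kk]) echo $((number * 1024)) ;;
        *[Mm]) echo $((number * 1024 * 1024)) ;;
        *[Gg]) echo $((number * 1024 * 1024 * 1024)) ;;
        *)     echo "$number" ;;
    esac
}
```

For example, `to_bytes "$(cnf_value innodb_buffer_pool_size /etc/my.cnf)"` prints the configured pool size in bytes. The value the running server is using can be compared with the output of `SHOW VARIABLES LIKE 'innodb_buffer_pool_size';` in the mysql client, since mysqld only picks up my.cnf changes after a restart.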
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 11:02 am
by WillemDH
Very nice CFT6! Thanks for posting your solution. I'm also going to look into this...

Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 11:47 am
by CFT6Server
Hmm, I may have jumped the gun here. It worked for a while, but after some additional configuration changes it is showing the same symptoms again, with the message "Caught SIGSEGV, shutting down..."
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 1:13 pm
by CFT6Server
Really seeking some guidance here from the Nagios guys. I cannot make any changes without running into this issue, and it is starting to impact us. Any ideas?
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 1:46 pm
by tgriep
Let's fix the MySQL database errors by running the following in a shell on the XI system.
Code:
cd /usr/local/nagiosxi/scripts
./repair_databases.sh
Your system is monitoring a lot of hosts and services, so it may need more time to shut down when an Apply Configuration is run.
On line 209 of the /etc/init.d/nagios file, could you change this line from
Code:
for i in 1 2 3 4 5 6 7 8 9 10 ; do
to
Code:
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 ; do
See if that helps out.
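To see how long the process really takes to go away, a wait loop along these lines can be run by hand. This is a rough sketch, not the code from /etc/init.d/nagios: the function name `wait_for_exit` is made up, and it assumes a check once per second is good enough.

```shell
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds for process $1 to exit.
# Returns 0 once the process is gone, 1 if it is still there after the timeout.
wait_for_exit() {
    pid=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # kill -0 delivers no signal; it only tests whether the PID can be signalled.
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check after the final sleep.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

For example, `wait_for_exit "$(cat /usr/local/nagios/var/nagios.lock)" 30` waits up to 30 seconds, which matches the longer loop above. Run it as root, because `kill -0` also fails when you lack permission to signal the process.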
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 2:35 pm
by CFT6Server
I ran that repair yesterday, and that's why I thought it worked, but no luck.
I've also edited the file as per your instructions, but that line seems to be for waiting for Nagios to stop when applying configurations. Right now the Nagios service won't even start after a reboot, so I'm not sure that will help, and I still cannot get the nagios service to run.
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 4:44 pm
by Box293
Box293 wrote:When you hit Apply Config, it runs the script reconfigure_nagios.sh. You can run this script yourself:
Code:
cd /usr/local/nagiosxi/scripts
./reconfigure_nagios.sh
Can you please run the script and post the output here.
When you run this script at the CLI, does nagios remain running, or does it die like it has normally been?
Can you please upload the file:
/etc/sudoers
Also, any files in
/etc/sudoers.d/
Can you please try these steps and report back.
Re: Blank Services and Active checks disabled
Posted: Thu Sep 17, 2015 4:56 pm
by CFT6Server
Sorry, I thought I had posted that.
Re: Blank Services and Active checks disabled
Posted: Fri Sep 18, 2015 12:53 pm
by tgriep
What is puzzling is this line from your output.txt file.
/etc/init.d/nagios: line 141: kill: (19280) - No such process
Do you have multiple people accessing your system and applying the config?
Could you run the following and post back the results?
Code:
cat /usr/local/nagios/var/nagios.lock
ps -ef | grep bin/nagios
Also, the following line is missing from the /etc/sudoers file. Can you add it and see if that helps?
Code:
NAGIOSXI ALL = NOPASSWD:/usr/local/nagiosxi/scripts/reset_config_perms.sh
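The "No such process" message would fit a lock file that still holds the PID of a process that has already exited. A small check like this can confirm whether that is the case. It is only a sketch: the function name `lock_pid_alive` is made up, and it assumes the lock file holds a single PID on its first line.

```shell
#!/bin/sh
# Hypothetical helper: test whether the PID stored in a lock file is alive.
# Returns 0 if the process exists, 1 if the PID is stale,
# 2 if the file is missing, unreadable, or holds no PID.
lock_pid_alive() {
    lockfile=$1
    [ -r "$lockfile" ] || return 2
    # Keep only the digits from the first line.
    pid=$(head -n 1 "$lockfile" | tr -cd '0-9')
    [ -n "$pid" ] || return 2
    # kill -0 delivers no signal; run as root so a permission error
    # is not mistaken for a missing process.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 1
}
```

For example, `lock_pid_alive /usr/local/nagios/var/nagios.lock; echo $?` printing 1 would mean the lock file is stale.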
Re: Blank Services and Active checks disabled
Posted: Fri Sep 18, 2015 12:58 pm
by jdalrymple
CFT6Server wrote:Really seeking some guidance here from the Nagios guys. I cannot make any changes without running into this issue, and it is starting to impact us. Any ideas?
One thing that I'm a bit vague on - how are you (ever) getting it back up? It sounds like you're unable to get the service up right now?
To answer your earlier question about what the reconfigure script does:
1) look for files that require updating
2) create and download the appropriate files from nagiosql
3) config verify
4) nagios restart
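Steps 3 and 4 can be reproduced by hand with something like the sketch below. This is not the contents of reconfigure_nagios.sh: the function name `verify_then_restart` is made up, and the paths are the defaults mentioned in this thread. The only Nagios option used is `-v`, which verifies the configuration without starting the daemon.

```shell
#!/bin/sh
# Assumed default locations; override the variables if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: run the given restart command only if the
# configuration verifies cleanly.
verify_then_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$@"
    else
        echo "config verification failed; not restarting" >&2
        return 1
    fi
}
```

For example, `verify_then_restart service nagios restart`. Running the `-v` check on its own, without the redirect, prints the full verification output, which may show more than the web interface does.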
When you try to start the nagios service, does it just tank with nothing useful showing up in nagios.log? Is ndo2db starting and running OK? Anything in dmesg or /var/log/messages?
What do you get if you try to start nagios interactively?
Code:
/usr/local/nagios/bin/nagios -c /usr/local/nagios/etc/nagios.cfg