Blank Services and Active checks disabled

jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Post by jdalrymple »

Well that sounds like a whole different problem.

To expand on why I think what I said earlier is interesting: I've seen installations where the apply config might take about 30 seconds, but the difference between parent and child is like 5 minutes. In those instances, it genuinely took Nagios that long to come up and get going. I've seen it cause all sorts of weirdness, none of which I can recall at this exact moment :)

That said, something else is going on in your environment. nagios.log usually says why it crapped out if it does. Anything?
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Post by CFT6Server »

WillemDH wrote: How many hosts / services do you have if I may ask?
I have 1535 hosts and 21872 services.
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Post by CFT6Server »

jdalrymple wrote: That said, something else is going on in your environment. nagios.log usually says why it crapped out if it does. Anything?
Just this right now. Is there a verbosity level setting?

Code:

[1442267646] Successfully launched command file worker with pid 18934
[1442267646] Caught SIGSEGV, shutting down...
It started with the symptom you describe: applying a configuration could take a while, and then the services would take a few minutes or more to show up. But now it just crashes the nagios process.
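
For anyone lining these entries up against /var/log/messages: the bracketed number at the start of each nagios.log line is a Unix epoch timestamp in seconds. A minimal sketch for making it readable is below. The helper name `humanize` is made up for illustration, and it renders UTC rather than the server's local time zone.

```python
import re
from datetime import datetime, timezone

# Matches the leading "[<epoch seconds>]" that nagios.log puts on each line.
_STAMP = re.compile(r"^\[(\d+)\]\s*(.*)$")


def humanize(line: str) -> str:
    """Return the log line with its epoch stamp rendered as ISO 8601 (UTC)."""
    match = _STAMP.match(line)
    if match is None:
        return line  # no stamp found, leave the line untouched
    when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return f"{when.isoformat()} {match.group(2)}"


if __name__ == "__main__":
    print(humanize("[1442267646] Caught SIGSEGV, shutting down..."))
```

Feeding the file through this line by line should make it easier to match a crash in nagios.log with the syslog entries from the same second.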
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

In the nagios.cfg file, you can change the debug_level from 0 to 15 and then restart nagios.
Check this file for any errors.

Code:

/usr/local/nagios/var/nagios.debug
Let us know what you find in there.
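
As background on what a given debug_level turns on: the value is a bitmask, with the individual flags listed in the comments of the stock nagios.cfg. The sketch below decodes a level into those categories. The flag table is copied from that comment block as I remember it, so treat it as an assumption and check it against your own nagios.cfg. The function name `decode_debug_level` is made up for illustration.

```python
from typing import List

# Flag values as listed in the comment block of the stock nagios.cfg
# (assumed to be unchanged in your version; verify against your own file).
DEBUG_FLAGS = {
    1: "functions",
    2: "configuration",
    4: "process information",
    8: "scheduled events",
    16: "host/service checks",
    32: "notifications",
    64: "event broker",
    128: "external commands",
    256: "commands",
    512: "scheduled downtime",
    1024: "comments",
    2048: "macros",
}


def decode_debug_level(level: int) -> List[str]:
    """List the debug categories enabled by a debug_level value."""
    if level == -1:  # -1 means "log everything"
        return list(DEBUG_FLAGS.values())
    return [name for bit, name in DEBUG_FLAGS.items() if level & bit]


if __name__ == "__main__":
    for level in (7, 15, -1):
        print(level, decode_debug_level(level))
```

If that table is right, 15 covers functions, configuration, process information and scheduled events, and 7 is a subset of 15, so a lower number does not produce more detail. A level of -1 would log every category.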
Be sure to check out our Knowledgebase for helpful articles and solutions!
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Post by CFT6Server »

So I just ran a small network wizard and applied the configuration. It is stuck, or the Nagios server stopped. Attached is the debug log.

Again, this is what I see in the nagios.log file:

Code:

[1442341702] Successfully launched command file worker with pid 25726
[1442341702] Caught SIGSEGV, shutting down...
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

Debug file didn't catch anything. Try changing the debug_level to 7 and see if that catches anything.
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Post by CFT6Server »

Give this one a try. It was captured with debug level 7.
I basically applied the configuration, but no luck. The Nagios service is not running, or doesn't stay running.

It doesn't look like there are any errors. Is there something else we can check, perhaps on the DB side?
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

The only other log files you could check are the following.

Code:

/var/log/messages
/var/log/mysqld.log
Let's run a Nagios config verification to be sure. Run this to see if the configs have any errors.

Code:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
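
If the full verification output is too long or too sensitive to post, a small wrapper can reduce it to the totals. This is only a sketch: it assumes the command above ends its output with "Total Warnings:" and "Total Errors:" summary lines, and the names `summarize` and `run_check` are made up for illustration.

```python
import re
import subprocess

# Paths taken from the command quoted above.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"

# Assumed summary lines, e.g. "Total Warnings: 3" and "Total Errors:   0".
_TOTAL = re.compile(r"^Total (Warnings|Errors):\s*(\d+)", re.MULTILINE)


def summarize(output: str) -> dict:
    """Pull the warning and error totals out of the verification output."""
    return {kind.lower(): int(count) for kind, count in _TOTAL.findall(output)}


def run_check() -> dict:
    """Run the config verification and return only its totals."""
    result = subprocess.run(
        [NAGIOS_BIN, "-v", NAGIOS_CFG],
        capture_output=True,
        text=True,
    )
    return summarize(result.stdout)
```

Calling `run_check()` on the Nagios server should return something like a dict with `warnings` and `errors` keys, which can be shared without exposing host names.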
CFT6Server
Posts: 506
Joined: Wed Apr 15, 2015 4:21 pm

Post by CFT6Server »

I don't see any errors. The clue here is that this was a gradual issue: after applying changes, it took longer and longer for services to start showing up, until now, where it never completes. So what is happening during that process?

messages.log

Code:

Sep 15 12:27:11 kdcnagxi01 rsyslogd-2177: imuxsock begins to drop messages from pid 8112 due to rate-limiting
Sep 15 12:27:24 kdcnagxi01 rsyslogd-2177: imuxsock lost 3223 messages from pid 8112 due to rate-limiting
Sep 15 12:27:24 kdcnagxi01 nagios: Successfully launched command file worker with pid 8250
Sep 15 12:27:24 kdcnagxi01 nagios: Caught SIGSEGV, shutting down...
There are also a bunch of warnings about check intervals and notifications, but I didn't want to include them as they show a lot of our server names etc.

mysqld.log

Code:

150912 18:50:42  InnoDB: Completed initialization of buffer pool
150912 18:50:42  InnoDB: Started; log sequence number 0 44233
150912 18:50:42 [Note] Event Scheduler: Loaded 0 events
150912 18:50:42 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
150915 12:11:44 [Note] /usr/libexec/mysqld: Normal shutdown

150915 12:11:44 [Note] Event Scheduler: Purging the queue. 0 events
150915 12:11:44  InnoDB: Starting shutdown...
150915 12:11:47  InnoDB: Shutdown completed; log sequence number 0 44233
150915 12:11:47 [Note] /usr/libexec/mysqld: Shutdown complete

150915 12:11:48 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150915 12:12:41 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150915 12:12:42  InnoDB: Initializing buffer pool, size = 8.0M
150915 12:12:42  InnoDB: Completed initialization of buffer pool
150915 12:12:42  InnoDB: Started; log sequence number 0 44233
150915 12:12:42 [Note] Event Scheduler: Loaded 0 events
150915 12:12:42 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
I'll PM you the check. It will have similar output to messages, but with the warnings that I was talking about.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

I received the PM. No errors so that is good.
This link talks about broker modules and patches that could be applied to your system.
https://support.nagios.com/forum/viewto ... =7&t=31096

We may need more of the /var/log/messages file, specifically what comes before the SIGSEGV entry, to see what is happening.
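
One way to collect that context without posting the whole file is to keep a rolling window of lines and dump it whenever the crash marker appears. A minimal sketch, with a made-up helper name `context_before` and an arbitrary default of 50 lines:

```python
from collections import deque
from typing import Iterable, List


def context_before(
    lines: Iterable[str],
    marker: str = "Caught SIGSEGV",
    keep: int = 50,
) -> List[List[str]]:
    """For each line containing marker, return it with up to `keep` preceding lines."""
    recent = deque(maxlen=keep)  # rolling window of the most recent lines
    hits = []
    for line in lines:
        if marker in line:
            hits.append(list(recent) + [line])
        recent.append(line)
    return hits


# Usage on the server (reading the file usually needs root):
#     with open("/var/log/messages") as fh:
#         for block in context_before(fh):
#             print("".join(block))
```

Note that the excerpt posted earlier shows rsyslogd dropping messages from a pid due to rate-limiting just before the crash, so whatever was logged in that window may simply be missing from the file.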

Also, if the system is running out of memory, the kernel could shut down the daemon.
How much memory does the system have and how much free memory is available?
Try adding more memory and see if that helps.
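
To answer the memory questions with numbers, the sketch below reads /proc/meminfo and scans a log for kernel out-of-memory messages. It assumes a Linux host, and the two search strings are the ones the kernel normally logs when the OOM killer runs. Both helper names are made up for illustration.

```python
from typing import Dict, Iterable, List


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field name: value}, usually in kB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def oom_lines(lines: Iterable[str]) -> List[str]:
    """Return log lines that look like kernel OOM-killer activity."""
    needles = ("oom-killer", "Out of memory")
    return [line for line in lines if any(n in line for n in needles)]


# Usage on the server:
#     with open("/proc/meminfo") as fh:
#         info = parse_meminfo(fh.read())
#     print(info.get("MemTotal"), info.get("MemFree"))
#     with open("/var/log/messages") as fh:
#         print("".join(oom_lines(fh)))
```

If `oom_lines` returns nothing for the period around the crashes, memory exhaustion becomes a less likely explanation, though it does not rule it out.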