Performance Issues / fork() errors

Gavin
Posts: 58
Joined: Mon Dec 24, 2012 4:56 am

Post by Gavin »

I'm seeing a lot of these errors in 'nagios.log':

Code:

[1359237728] Warning: The check of service 'service' on host 'host' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1359237518] Warning: The check of service 'service' on host 'host' could not be performed due to a fork() error: 'Resource temporarily unavailable'.  The check will be rescheduled.

The load average on the Nagios server is around 30, despite implementing a combination of a RAM disk and rrdcached. I'm beginning to think the 7.2k RPM hard drives in the system are a limitation, and that our performance issues relate to I/O. We're running approximately 1500 checks, most of which run every minute. I've configured 'dumb' spacing of services, which seemed to help quite a bit.

However, the fork() error above seems to suggest something else is going wrong (it's logged very frequently). Googling suggests this could be caused by the use of embedded Perl, but I don't know enough about Linux / Perl / the inner workings of Nagios to understand whether this is the case, or what we can do to resolve it.
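
For anyone hitting the same message: 'Resource temporarily unavailable' is the text for EAGAIN, which fork() returns when a process or thread limit has been reached. The following is a minimal sketch, not a diagnosis, that compares the number of processes owned by the current user with that user's RLIMIT_NPROC soft limit. It assumes Linux with /proc mounted and that it is run as the account the monitoring daemon uses; the function names are hypothetical. It counts processes only, while the kernel limit also counts threads, so the figure is a lower bound.

```python
import os


def count_processes_for_uid(uid, proc_root="/proc"):
    """Count processes whose /proc/<pid> directory is owned by uid."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            if os.stat(os.path.join(proc_root, entry)).st_uid == uid:
                count += 1
        except OSError:
            # The process exited between listdir() and stat().
            continue
    return count


def headroom(current, soft_limit):
    """Return the slots left before the limit, or None if unlimited."""
    if soft_limit is None:
        return None
    return max(soft_limit - current, 0)


if __name__ == "__main__":
    import resource  # Unix-only module

    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    limit = None if soft == resource.RLIM_INFINITY else soft
    used = count_processes_for_uid(os.getuid())
    print("processes:", used, "soft limit:", limit,
          "headroom:", headroom(used, limit))
```

If the headroom is close to zero while checks are running, the per-user limit is worth a closer look; if it is large, the cause probably lies elsewhere.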

The server is a Quad Core Xeon with 16GB RAM.

Any ideas?

Many thanks,

Gavin
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Performance Issues / fork() errors

Post by scottwilkerson »

This error actually sounds like you could have multiple Nagios processes running:
http://support.nagios.com/wiki/index.ph ... g_Orphaned
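
One way to look for duplicate daemons is sketched below, assuming Linux with /proc mounted and a process name of `nagios` (both assumptions; the helper names are hypothetical). Nagios forks children to run checks, so several processes called `nagios` are normal. The sketch therefore reports only those whose parent is not itself called `nagios`. Short-lived check workers that get re-parented may show up briefly, so run it a few times and look for PIDs that persist.

```python
import os


def parse_stat(raw):
    """Parse one /proc/<pid>/stat line into (pid, ppid, name).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')'.
    """
    head, _, tail = raw.rpartition(")")
    pid_text, _, name = head.partition("(")
    fields = tail.split()  # fields[0] is the state, fields[1] is the ppid
    return int(pid_text), int(fields[1]), name


def read_process_table(proc_root="/proc"):
    """Yield (pid, ppid, name) for every process currently listed."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                yield parse_stat(handle.read())
        except OSError:
            continue  # the process exited while we were scanning


def top_level_daemons(table, name="nagios"):
    """Return PIDs named `name` whose parent is not also named `name`."""
    table = list(table)
    named = {pid for pid, _ppid, proc_name in table if proc_name == name}
    return sorted(pid for pid, ppid, proc_name in table
                  if proc_name == name and ppid not in named)


if __name__ == "__main__":
    daemons = top_level_daemons(read_process_table())
    print("top-level nagios processes:", daemons)
    if len(daemons) > 1:
        print("more than one daemon appears to be running")
```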

As for the performance, first, if you are not running XI 2012R1.4 I would highly recommend upgrading as there have been significant performance increases.

Aside from that, the best thing you could do at this point would be to offload MySQL to another server, which would help the I/O problem a lot.
http://assets.nagios.com/downloads/nagi ... Server.pdf
Former Nagios employee
Gavin
Posts: 58
Joined: Mon Dec 24, 2012 4:56 am

Re: Performance Issues / fork() errors

Post by Gavin »

The fork() error seems to have been a red herring, and is no longer occurring. As for performance, we are considering either offloading MySQL or using SSDs instead.

We are running the latest version of XI on a system with the following specifications:

* Intel Xeon E3-1225, quad-core 3.2 GHz
* 16 GB 1333 MHz DDR3 RAM
* 2x 2 TB Hitachi 7200 RPM SATA3 drives

We're using our XI instance as follows:
* 1500 active service checks every minute, most with performance data
* 110 hosts
* 5x users logged into the portal at the same time

We've got RRDCached (with a delay of five minutes) configured, as well as a RAM Disk (as per the published document).

The server rarely gets a load average lower than 20. Does this seem normal? I know it's difficult to say for sure, but I'd be interested to hear what you think. We're probably going to end up with just under 4,000 active checks, and a further 4,000 passive checks, on this box eventually, so it'd be good to get some idea of the performance we can expect.
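
As a back-of-the-envelope illustration of the figures above, the sketch below converts check counts into an average execution rate. It assumes every active check runs once per minute, which slightly overstates the current load ("most" are every minute) and is only a guess for the planned 4,000; passive checks are left out because the server does not execute them itself.

```python
def checks_per_second(total_checks, interval_seconds=60):
    """Average number of check executions per second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return total_checks / interval_seconds


current = checks_per_second(1500)  # today's active checks, every minute
planned = checks_per_second(4000)  # planned active checks, same interval assumed
print(f"current: {current:.1f}/s, planned: {planned:.1f}/s, "
      f"growth: {planned / current:.2f}x")
# prints: current: 25.0/s, planned: 66.7/s, growth: 2.67x
```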

Thanks again for your help, excellent as always.

Thanks,

Gavin
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Performance Issues / fork() errors

Post by scottwilkerson »

Gavin wrote: The server rarely gets a load average lower than 20. Does this seem normal? I know it's difficult to say for sure, but I'd be interested to hear what you think.
With what you have said, checking most of these every minute, it is probably normal. The remaining thing you could do to reduce the load (which is still likely caused by I/O wait) would be to offload the MySQL server.
http://library.nagios.com/library/produ ... ote-server
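
To check whether I/O wait really is the dominant factor, one option is to sample the aggregate `cpu` line of /proc/stat twice and compare the counters. On Linux the load average also counts tasks blocked in uninterruptible (typically disk) sleep, which is why slow disks can push it far above the core count. This is a minimal sketch assuming a Linux /proc/stat layout; the function names and the five-second sampling window are illustrative choices.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    sample_a = read_cpu_line()
    time.sleep(5)
    sample_b = read_cpu_line()
    print(f"iowait over 5s: {iowait_percent(sample_a, sample_b):.1f}%")
```

A consistently high percentage would support the disk-bound theory; a low one would suggest looking at CPU or process counts instead.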

Going further than that would be a Mod Gearman setup
http://library.nagios.com/library/produ ... -nagios-xi
Gavin
Posts: 58
Joined: Mon Dec 24, 2012 4:56 am

Re: Performance Issues / fork() errors

Post by Gavin »

Thanks Scott. We're looking into those options.

I had assumed the 'Cannot connect to database' error we've been seeing in CCM was due to load, but every time it happens, ndoutils seems to die. If I grep the messages log for ndo, I get...

Code:

Jan 29 21:17:47 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:17:48 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:17:48 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:17:48 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:27:32 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:27:32 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:27:33 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:27:33 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:27:33 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:30:35 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:30:35 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:30:36 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:30:36 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:30:36 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:37:59 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:37:59 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:38:00 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:38:00 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:38:00 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:39:53 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:39:53 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:39:54 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:39:54 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:39:54 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:55:04 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:55:04 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:55:05 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:55:05 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:55:05 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 21:57:39 Nagios nagios: ndomod: Shutdown complete.
Jan 29 21:57:39 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 21:57:40 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 21:57:40 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 21:57:40 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Jan 29 22:01:26 Nagios nagios: ndomod: Shutdown complete.
Jan 29 22:01:26 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
Jan 29 22:01:28 Nagios nagios: ndomod: NDOMOD 1.5.1 (05-15-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Jan 29 22:01:28 Nagios nagios: ndomod: Successfully connected to data sink.  0 queued items to flush.
Jan 29 22:01:28 Nagios nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.

Any ideas? This is really annoying when trying to configure a host or service, and it happens really frequently. I've repaired the database to no avail.
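
To see how often the module is being shut down, the timestamps in a log like the one above can be turned into gaps between consecutive "ndomod: Shutdown complete." lines. This is a rough sketch with several assumptions: the log is at /var/log/messages, the year is 2013 (syslog lines carry no year), month names are in English, and the function names are hypothetical.

```python
from datetime import datetime

MARKER = "ndomod: Shutdown complete."


def shutdown_times(lines, year=2013):
    """Return a datetime for each ndomod shutdown line in a syslog excerpt."""
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        stamp = " ".join(line.split()[:3])  # e.g. "Jan 29 21:27:32"
        times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Seconds elapsed between each pair of consecutive timestamps."""
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    with open("/var/log/messages") as handle:  # assumed log location
        print(gaps_in_seconds(shutdown_times(handle)))
```

For the excerpt above, the shutdowns are between roughly 2 and 15 minutes apart (183, 444, 114, 911, 155 and 227 seconds). Gaps that irregular would point away from a fixed-interval scheduled job, though that is an inference rather than a diagnosis.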

Thanks,

Gavin
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Performance Issues / fork() errors

Post by abrist »

What version of XI are you running?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Gavin
Posts: 58
Joined: Mon Dec 24, 2012 4:56 am

Re: Performance Issues / fork() errors

Post by Gavin »

2012R1.4 on CentOS 6.3

Thanks,

Gavin
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Performance Issues / fork() errors

Post by abrist »

I assume you have verified that you only have 1 nagios process running?
Gavin
Posts: 58
Joined: Mon Dec 24, 2012 4:56 am

Re: Performance Issues / fork() errors

Post by Gavin »

I just did a 'killall -9 nagios' and started it again to be sure, and the database problem recurred very quickly. We're not seeing any of the fork errors any more; that seems to have been a one-off.

Thanks,

Gavin
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Performance Issues / fork() errors

Post by abrist »

Fair enough. I am still digging on this one.
Locked