Nagios XI host check orphaned and duplicate nagios process

tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

The latest XI is 5.2.7 and was just released last week on Wednesday. Want us to keep this open for a bit?
Former Nagios employee
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Post by emartine »

Yes please. I will be updating our production environment once all of the kinks have been worked out in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts with the updated GUI?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

emartine wrote: Yes please. I will be updating our production environment once all of the kinks have been worked out in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts with the updated GUI?
By enabling/disabling, do you mean per user, per host/service, or server-wide?
Former Nagios employee
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Post by emartine »

Per host/service: a tutorial or user guide for putting a host or service into downtime and for disabling a single host or service.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Post by ssax »

Not that I've seen. I've created a couple of documentation tasks for it, though, with a link back to this thread:

Task 8265 Detail: Guide for setting host/service downtime
Task 8266 Detail: Guide for disabling single hosts/services
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Post by emartine »

This issue has come up again. We are on Nagios XI 5.4.4.

At around 5:25 AM this morning, Nagios started sending out false alerts about hosts being down.

Info: (host check orphaned, is the mod-gearman worker on queue host running?)
Date/Time: 2017-07-07 05:24:32

Gearman was restarted, the nagios process had to be killed, and all services were restarted again. The database seems to be somewhat slow as well.

Any idea where to start troubleshooting this?
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Post by emartine »

The audit log shows:

2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)

At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

emartine wrote:Audit log shows

2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)

At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.
ID=1119 is COMMAND_DELETE_SYSTEM_BACKUP, so it deleted a previous backup.
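
For anyone reading audit log lines like the two quoted above, here is a minimal Python sketch that pulls the timestamp and command ID out of each line. The names `parse_audit_line` and `KNOWN_COMMANDS` are made up for illustration, the line layout is assumed from the two sample lines only, and the only ID-to-name mapping taken from this thread is 1119.

```python
import re
from datetime import datetime

# Only ID 1119 is confirmed in this thread; any other ID is reported as "unknown".
KNOWN_COMMANDS = {1119: "COMMAND_DELETE_SYSTEM_BACKUP"}

# Layout assumed from the sample lines:
# "<date> <time> <entry id> ... (ID=<command id>)"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+)\s+.*\(ID=(\d+)\)$"
)


def parse_audit_line(line):
    """Return (timestamp, entry_id, command_id, command_name), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    command_id = int(match.group(3))
    name = KNOWN_COMMANDS.get(command_id, "unknown")
    return timestamp, int(match.group(2)), command_id, name
```

Sorting the parsed tuples by timestamp makes it easier to line audit entries up against the time the false alerts started.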

This shouldn't cause your setup to fail at all. Before you kill off the processes, did you note whether there are multiple nagios processes?

I have seen this happen on a Nagios restart if it has to wait too long for mod_gearman workers to return their results.
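
One rough way to check for multiple nagios processes before killing anything is sketched below in Python. It is not an official tool. It assumes the daemon binary is named `nagios` and is started with `-d`, and that worker processes and forked children of the daemon are normal, so only top-level daemons are counted. The function names `read_ps` and `find_daemon_roots` are hypothetical.

```python
import subprocess


def read_ps():
    """Return the text of `ps -eo pid,ppid,args` for the local machine."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_daemon_roots(ps_output):
    """Return sorted PIDs of top-level `nagios -d` processes in the ps output."""
    daemons = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header row and anything that is not "pid ppid args".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, ppid, args = int(parts[0]), int(parts[1]), parts[2].split()
        # Assumption: the daemon binary is called "nagios" and runs with -d.
        if args and args[0].rsplit("/", 1)[-1] == "nagios" and "-d" in args[1:]:
            daemons[pid] = ppid
    # A forked child shares the daemon's command line, so only count
    # processes whose parent is not itself a nagios daemon.
    return sorted(pid for pid, ppid in daemons.items() if ppid not in daemons)
```

Calling `find_daemon_roots(read_ps())` on the XI server and getting more than one PID back would suggest a duplicate daemon, which is worth noting before restarting anything.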
Former Nagios employee
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

A colleague noted that similar issues could occur if the XI server's machine time is not closely synced with the time on the workers.
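
As a rough illustration of that time check, the Python sketch below compares epoch timestamps read from each worker against the XI server's reading. The function name `out_of_sync`, the host names, and the 5-second tolerance are all assumptions for the example; nothing in this thread states what skew is acceptable, and collecting the readings (at nearly the same moment) is left out.

```python
def clock_offsets(reference_epoch, worker_epochs):
    """Return each worker's clock offset in seconds relative to the reference."""
    return {host: ts - reference_epoch for host, ts in worker_epochs.items()}


def out_of_sync(reference_epoch, worker_epochs, tolerance_s=5.0):
    """Return only the workers whose absolute offset exceeds the tolerance."""
    offsets = clock_offsets(reference_epoch, worker_epochs)
    return {host: off for host, off in offsets.items() if abs(off) > tolerance_s}


if __name__ == "__main__":
    # Hypothetical readings: the server reports 1000.0, two workers report their own time.
    print(out_of_sync(1000.0, {"worker1": 1001.0, "worker2": 1042.0}))
```

An empty result means every worker is within the chosen tolerance of the server.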
Former Nagios employee
emartine
Posts: 660
Joined: Thu Dec 29, 2011 10:47 am

Post by emartine »

The machine time on all 4 workers is in sync with the Nagios server's machine time.
Locked