Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Mon Apr 11, 2016 1:12 pm
by tmcdonald
The latest XI is 5.2.7 and was just released last week on Wednesday. Want us to keep this open for a bit?

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Mon Apr 11, 2016 2:14 pm
by emartine
Yes please. I will be updating our production environment once all of the kinks have been worked out in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts in the updated GUI?

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Mon Apr 11, 2016 4:22 pm
by scottwilkerson
emartine wrote: Yes please. I will be updating our production environment once all of the kinks have been worked out in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts in the updated GUI?
By enabling/disabling, do you mean per user, per host/service, or server-wide?

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Tue Apr 12, 2016 10:08 am
by emartine
Per host/service: a tutorial or user guide for putting a host or service into downtime, and for disabling a single host or service.

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Tue Apr 12, 2016 3:47 pm
by ssax
Not that I've seen. I've created a couple of documentation tasks for it, though, with a link back to this thread:

Task 8265 Detail: Guide for setting host/service downtime

Task 8266 Detail: Guide for disabling single hosts/services

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Fri Jul 07, 2017 10:04 am
by emartine
This issue has come up again. We are on Nagios XI 5.4.4.

At around 5:25 AM this morning, Nagios started sending out false alerts about hosts being down.

Info: (host check orphaned, is the mod-gearman worker on queue host running?)
Date/Time: 2017-07-07 05:24:32

Gearman was restarted, the nagios process had to be killed, and all services were restarted again. The database also seems somewhat slow.

Any idea where to start troubleshooting this?

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Fri Jul 07, 2017 10:16 am
by emartine
Audit log shows

2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)

At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Fri Jul 07, 2017 12:06 pm
by scottwilkerson
emartine wrote:Audit log shows

2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)

At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.
ID=1119 is COMMAND_DELETE_SYSTEM_BACKUP, so it deleted a previous backup.
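
If it helps when scanning a longer audit log, the lines quoted above can be picked apart with a small script. This is only a sketch: the line layout is assumed from the two entries pasted in this thread, `parse_audit_line` and `KNOWN_COMMANDS` are hypothetical names, and the only command ID it labels is the one confirmed here (1119).

```python
import re
from datetime import datetime

# Only the ID confirmed in this thread is named; anything else stays "unknown".
KNOWN_COMMANDS = {1119: "COMMAND_DELETE_SYSTEM_BACKUP"}

# Layout assumed from the pasted entries:
# "<date> <time> <entry id> ... (ID=<command id>)"
AUDIT_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<entry>\d+)\s+.*\(ID=(?P<cmd>\d+)\)\s*$"
)


def parse_audit_line(line):
    """Return a dict for one audit log line, or None if it does not match."""
    match = AUDIT_LINE.match(line.strip())
    if match is None:
        return None
    command_id = int(match.group("cmd"))
    return {
        "time": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "entry": int(match.group("entry")),
        "command_id": command_id,
        "command": KNOWN_COMMANDS.get(command_id, "unknown"),
    }
```

Feeding it the 05:18:31 entry gives command ID 1119 with the name above; the 05:01:02 entry comes back with ID 1117 and the name left as "unknown".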

This shouldn't cause your setup to fail at all. Before you kill off the processes, did you note whether there are multiple nagios processes?

I have seen this happen on a Nagios restart if it has to wait too long for mod_gearman workers to return their results.
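
One rough way to check for a duplicate daemon before killing anything is to look at the process tree rather than just counting lines, since a healthy nagios daemon normally has child processes of its own. The sketch below rests on assumptions: it treats any process whose command line mentions `nagios.cfg` as a daemon, and it expects text in the shape produced by `ps -eo pid,ppid,args`. The function names are hypothetical.

```python
import subprocess


def list_processes():
    """Return the text of `ps -eo pid,ppid,args` (Linux procps assumed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_root_nagios(ps_output):
    """Return PIDs of nagios daemons whose parent is not another nagios daemon.

    Assumption: daemon command lines contain "nagios.cfg".
    """
    daemons = {}  # pid -> ppid
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, ppid, args = parts
        if not (pid.isdigit() and ppid.isdigit()):
            continue  # skips the header row
        if "nagios.cfg" in args:
            daemons[int(pid)] = int(ppid)
    # A daemon forked by another daemon is a normal child, not a duplicate.
    return sorted(pid for pid, ppid in daemons.items() if ppid not in daemons)
```

Calling `find_root_nagios(list_processes())` on the XI server should return a single PID on a healthy system; more than one suggests a second, independent nagios daemon is running.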

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Fri Jul 07, 2017 1:24 pm
by scottwilkerson
A colleague noted that similar issues can occur if the machine time on the XI server is not closely synced with the time on the workers.
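
For comparing clocks, something like the following could flag workers that have drifted. It is a minimal sketch: how the epoch timestamps are collected from each worker is left to you, the 1-second tolerance is an arbitrary assumption rather than a documented mod_gearman limit, and `skewed_workers` is a hypothetical helper.

```python
def skewed_workers(server_epoch, worker_epochs, tolerance_s=1.0):
    """Return {worker: offset_seconds} for workers outside the tolerance.

    server_epoch  -- epoch seconds read on the XI server
    worker_epochs -- dict of worker name -> epoch seconds read on that worker
    tolerance_s   -- allowed absolute difference (assumed value, tune as needed)
    """
    offsets = {}
    for name, worker_epoch in worker_epochs.items():
        offset = worker_epoch - server_epoch
        if abs(offset) > tolerance_s:
            offsets[name] = offset
    return offsets
```

The readings should be taken as close together as possible, otherwise the collection delay itself shows up as skew.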

Re: Nagios XI host check orphaned and duplicate nagios proce

Posted: Fri Jul 07, 2017 4:28 pm
by emartine
The machine time on all 4 workers is in sync with the machine time on the Nagios server.