Nagios XI host check orphaned and duplicate nagios process
Re: Nagios XI host check orphaned and duplicate nagios proce
The latest XI is 5.2.7 and was just released last week on Wednesday. Want us to keep this open for a bit?
Former Nagios employee
Re: Nagios XI host check orphaned and duplicate nagios proce
Yes please. I will be updating our production environment once all of the kinks have been taken care of in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts with the new updated GUI?
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Nagios XI host check orphaned and duplicate nagios proce
emartine wrote: Yes please. I will be updating our production environment once all of the kinks have been taken care of in test and I have documentation to give folks. By the way, is there new documentation available for enabling/disabling notifications and adding/removing hosts with the new updated GUI?
By enabling/disabling, do you mean per user, per host/service, or server-wide?
Re: Nagios XI host check orphaned and duplicate nagios proce
A tutorial-style user guide for putting a host/service into downtime, and for disabling a single host or service.
Re: Nagios XI host check orphaned and duplicate nagios proce
Not that I've seen. I've created a couple of documentation tasks for it, though, with a link back to this thread:
Task 8265 Detail: Guide for setting host/service downtime
Task 8266 Detail: Guide for disabling single hosts/services
Re: Nagios XI host check orphaned and duplicate nagios proce
This issue has come up again. We are on Nagios XI 5.4.4
Sometime at ~5:25 AM this morning nagios started sending out false alerts regarding hosts being down.
Info: (host check orphaned, is the mod-gearman worker on queue host running?)
Date/Time: 2017-07-07 05:24:32
Gearman was restarted, nagios process had to be killed and all services restarted again. The database seems to be somewhat slow as well.
Any idea where to start troubleshooting this?
Re: Nagios XI host check orphaned and duplicate nagios proce
Audit log shows
2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)
At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.
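To line the audit log up against the time of the first false alert, a small script along these lines might help. It is only a sketch: it assumes the audit lines look exactly like the two pasted above (timestamp, entry number, free text, then `(ID=...)`), and the names `parse_audit_lines` and `commands_before` plus the 30-minute window are made up for illustration, not part of Nagios XI.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout, based only on the two lines pasted in this thread:
# "<date> <time> <entry number> ... (ID=<command id>)"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+)\s+.*\(ID=(\d+)\)\s*$"
)


def parse_audit_lines(text):
    """Return (timestamp, entry_number, command_id) tuples, oldest first."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, int(match.group(2)), int(match.group(3))))
    return sorted(entries)


def commands_before(entries, alert_time, window_minutes=30):
    """Return command IDs submitted within window_minutes before alert_time."""
    start = alert_time - timedelta(minutes=window_minutes)
    return [cmd for stamp, _, cmd in entries if start <= stamp <= alert_time]


AUDIT_TEXT = """\
2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)
"""

entries = parse_audit_lines(AUDIT_TEXT)
alert = datetime(2017, 7, 7, 5, 24, 32)  # Date/Time from the false alert
print(commands_before(entries, alert))
```

Narrowing `window_minutes` shows which subsystem commands ran closest to the alert.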
scottwilkerson
Re: Nagios XI host check orphaned and duplicate nagios proce
emartine wrote: Audit log shows
2017-07-07 05:18:31 95296 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1119)
2017-07-07 05:01:02 95295 Nagios XI INFO localhost User submitted a command to the subsystem (ID=1117)
At 5:01 nagios does an SSH backup. I am not sure what the process at 5:18 is.
ID=1119 is COMMAND_DELETE_SYSTEM_BACKUP, so it deleted a previous backup.
This shouldn't cause your setup to fail at all. Before you kill off the processes, did you note whether there were multiple nagios processes running?
I have seen this happen on a Nagios restart if it has to wait too long for the mod_gearman workers to return their results.
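One way to check for multiple nagios processes before killing anything is to look at the process list and count the independent daemon instances. The sketch below is only an illustration: it assumes the daemon is started with the `-d` flag, that the output comes from `ps -eo pid,ppid,args`, and that a daemon whose parent is another nagios daemon is a normal child rather than a duplicate. The function name `find_daemon_roots` and the sample paths are hypothetical.

```python
import os


def find_daemon_roots(ps_output):
    """Return PIDs of nagios daemons whose parent is not another nagios daemon.

    Expects text in the layout of `ps -eo pid,ppid,args`. More than one PID in
    the result suggests separate daemon instances running side by side.
    """
    daemons = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip the header and anything malformed
        pid, ppid, argv = int(parts[0]), int(parts[1]), parts[2].split()
        # Assumption: the daemon binary is named "nagios" and runs with -d.
        if argv and os.path.basename(argv[0]) == "nagios" and "-d" in argv[1:]:
            daemons[pid] = ppid
    return sorted(pid for pid, ppid in daemons.items() if ppid not in daemons)


# Made-up sample output for illustration only.
SAMPLE = """\
  PID  PPID COMMAND
 1200     1 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 1201  1200 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 1202  1200 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
 3400     1 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
"""

roots = find_daemon_roots(SAMPLE)
print(roots)
```

On a live server the same function could be fed the real output, for example from `subprocess.run(["ps", "-eo", "pid,ppid,args"], capture_output=True, text=True).stdout`.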
scottwilkerson
Re: Nagios XI host check orphaned and duplicate nagios proce
A colleague noted to me that you could also experience similar issues if the XI server's machine time is not closely synced with the time on the workers.
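A quick way to sanity-check the time sync is to collect the epoch time from the XI server and from each worker at roughly the same moment and compare them. The sketch below assumes the timestamps have already been gathered somehow (for example by running `date +%s` on each machine). The host names, the function names, and the 5-second tolerance are all placeholders, not values recommended by Nagios.

```python
def clock_offsets(server_epoch, worker_epochs):
    """Return each worker's offset from the server in seconds (worker - server)."""
    return {host: epoch - server_epoch for host, epoch in worker_epochs.items()}


def workers_out_of_sync(server_epoch, worker_epochs, tolerance_seconds=5):
    """Return only the workers whose absolute offset exceeds the tolerance."""
    offsets = clock_offsets(server_epoch, worker_epochs)
    return {h: off for h, off in offsets.items() if abs(off) > tolerance_seconds}


# Made-up readings for illustration only.
server_time = 1499419472
worker_times = {
    "worker1": 1499419472,
    "worker2": 1499419473,
    "worker3": 1499419412,
    "worker4": 1499419480,
}

drifted = workers_out_of_sync(server_time, worker_times)
print(drifted)
```

Because the readings are never taken at exactly the same instant, an offset of a second or two is not meaningful here; only larger gaps are worth chasing.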
Re: Nagios XI host check orphaned and duplicate nagios proce
The machine time on all 4 workers is in sync with the Nagios server's machine time.