Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Tue Aug 19, 2014 12:41 am
by rajasegar
Nagios 2014R1.2 -> 2014R1.4
Mod Gearman services were working fine with 2014R1.2.
We did a test upgrade in Dev and it was fine.
However, when we upgraded production, the scheduling went haywire and could never keep up.
After the upgrade to R1.4, we waited 30 minutes and it was still the same:
19-08-2014 07-35-33 AM.png
After restoring back to R1.2:
19-08-2014 07-55-46 AM.png
No error messages anywhere in logs or in upgrade log.
Please advise on this issue.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Tue Aug 19, 2014 1:33 pm
by slansing
We had this reported by one user who upgraded from 2014 r1.2 to r1.3; the resolution was the following:
After the upgrade -
Code:
service nagios stop
mv /usr/local/nagios/var/retention.dat /usr/local/nagios/var/retention.dat.bak
service nagios start
Your scheduling should be clean now.
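
For anyone who wants to keep more than one old copy around, here is a minimal sketch of the move step as a small shell function. It only changes how the file is renamed: the backup gets a timestamp so a later run does not overwrite an earlier retention.dat.bak. The function name backup_retention is made up for this example, and the stop/start commands stay exactly as posted above.

```shell
#!/bin/sh
# Sketch only: set a Nagios retention file aside under a timestamped name.
# backup_retention is a hypothetical helper name, not part of Nagios.

backup_retention() {
    src="$1"
    # Refuse to continue if the file is not where we expect it.
    if [ ! -f "$src" ]; then
        echo "retention file not found: $src" >&2
        return 1
    fi
    # Example result: retention.dat.20140819-133300.bak
    dest="$src.$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$src" "$dest" || return 1
    # Print the backup path so the caller can log it or restore from it.
    printf '%s\n' "$dest"
}

# Intended use (commented out so that loading this file changes nothing):
#   service nagios stop
#   backup_retention /usr/local/nagios/var/retention.dat
#   service nagios start
```

Keeping the old file means it can be moved back into place (with Nagios stopped) if the clean start turns out to be worse than the original problem.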
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Tue Aug 19, 2014 6:02 pm
by rajasegar
slansing wrote: We had this reported by one user who upgraded from 2014 r1.2 to r1.3; the resolution was the following:
After the upgrade -
Code:
service nagios stop
mv /usr/local/nagios/var/retention.dat /usr/local/nagios/var/retention.dat.bak
service nagios start
Your scheduling should be clean now.
Please explain the side effects of doing this.
The last time you asked us to do this, Nagios sent notifications again for all the services that had been set to notify only once.
This caused a lot of issues at our end.
There must be a better way.
Since this issue was known, why was it not communicated? This would have saved us a lot of unnecessary work.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Wed Aug 20, 2014 4:48 pm
by tmcdonald
You can simply disable notifications, do slansing's suggestion, then re-enable notifications once checking resumes. You shouldn't get emails "queued up" and sent after you re-enable.
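
If you would rather script the disable/re-enable than click through the interface, the following is a rough sketch that writes the DISABLE_NOTIFICATIONS and ENABLE_NOTIFICATIONS external commands to the Nagios command file. The helper name nagios_cmd is made up for this example, and the command file path shown in the usage comment is only the usual default for a /usr/local/nagios install, so check command_file in your nagios.cfg. It also assumes a date that supports +%s. One caveat: the program-wide notification setting may itself be stored in the retention file, so it is worth confirming the notification state after the restart.

```shell
#!/bin/sh
# Sketch only: submit a Nagios external command.
# nagios_cmd is a hypothetical helper name, not part of Nagios.

nagios_cmd() {
    cmd_file="$1"   # path to the Nagios external command file
    command="$2"    # e.g. DISABLE_NOTIFICATIONS or ENABLE_NOTIFICATIONS
    # External commands are single lines of the form "[epoch] COMMAND".
    printf '[%s] %s\n' "$(date +%s)" "$command" > "$cmd_file"
}

# Intended use (commented out; the path below is an assumed default):
#   nagios_cmd /usr/local/nagios/var/rw/nagios.cmd DISABLE_NOTIFICATIONS
#   ... stop Nagios, move retention.dat aside, start Nagios ...
#   nagios_cmd /usr/local/nagios/var/rw/nagios.cmd ENABLE_NOTIFICATIONS
```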
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Wed Aug 20, 2014 10:56 pm
by rajasegar
tmcdonald wrote: You can simply disable notifications, do slansing's suggestion, then re-enable notifications once checking resumes. You shouldn't get emails "queued up" and sent after you re-enable.
That sounds good in theory, but it cannot be applied in an enterprise environment.
We have over 1200 devices and 10,000 services being monitored.
What if some important alert is not sent out during this time?
Most of them are set to notify only once until the status changes.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Thu Aug 21, 2014 10:05 am
by abrist
I understand your concerns, and my apologies for the work needed to disable notifications. This is why it is best to upgrade or perform maintenance in a maintenance window.
Are you referring to losing your acknowledgements, or some other setting?
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Thu Aug 21, 2014 6:41 pm
by rajasegar
abrist wrote: I understand your concerns, and my apologies for the work needed to disable notifications. This is why it is best to upgrade or perform maintenance in a maintenance window.
Are you referring to losing your acknowledgements, or some other setting?
There is no maintenance window for monitoring systems. One is requested only when required.
I am worried about new notifications not being sent out.
Disabling notifications only helps with alerts that were already sent out and that we want to avoid resending.
I suggest that Nagios set up a known issues tracker so that those attempting an upgrade are aware of the issues.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Fri Aug 22, 2014 9:29 am
by slansing
Well, "known issues" implies that the problems are occurring on all systems affected by a certain version. Unfortunately, we cannot use a scare tactic and post every possible issue that may affect someone without verifying it against our systems internally, or against other customer/user systems, as the majority of them are one-offs on independent systems. There is a bug/feature request tracker at http://tracker.nagios.com/ .
It would be nice to upgrade in place; unfortunately, that is really impossible unless you were to upgrade a complete carbon copy of your production server and fail your production server over to it. Even that would introduce some lag and the possibility of dropping monitoring/alerting for a short period of time. There is an expected period of monitoring downtime when you must restart services and change or move files on the server itself. If your hosts/services are still down when you re-enable notifications, or went down during the upgrade, Nagios should send a notification at the interval that you have set for those systems. You could also make use of the Ops Screen or Ops Center pages to keep an eye on the current issues in your monitoring environment. Those should persist visibly through an update, though they may lock up for a moment when Apache/Nagios are restarted.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Posted: Sat Aug 23, 2014 5:58 am
by rajasegar
slansing wrote: Well, "known issues" implies that the problems are occurring on all systems affected by a certain version. Unfortunately, we cannot use a scare tactic and post every possible issue that may affect someone without verifying it against our systems internally, or against other customer/user systems, as the majority of them are one-offs on independent systems. There is a bug/feature request tracker at http://tracker.nagios.com/ .
Sorry, I don't agree with you. I just don't have the time to filter through the bug tracker.
Please close this thread.