Nagios 2014R1.2 -> 2014R1.4
Mod Gearman services were working fine with 2014R1.2.
I did a test upgrade in Dev and it was fine.
However, when I upgraded production, the scheduling went haywire and could never keep up.
After the upgrade to R1.4, I waited 30 minutes and it was still the same.
I restored back to R1.2.
There are no error messages anywhere in the logs or in the upgrade log.
Please advise on this issue.
Mod Gearman Issue post 2014R1.4 Upgrade
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
slansing
Re: Mod Gearman Issue post 2014R1.4 Upgrade
We had this reported by one user who upgraded from 2014 R1.2 to R1.3. The resolution was the following, after the upgrade:
Code:
service nagios stop
mv /usr/local/nagios/var/retention.dat /usr/local/nagios/var/retention.dat.bak
service nagios start
Your scheduling should be clean now.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
slansing wrote:
We had this reported by one user who upgraded from 2014 R1.2 to R1.3. The resolution was the following, after the upgrade:
Code:
service nagios stop
mv /usr/local/nagios/var/retention.dat /usr/local/nagios/var/retention.dat.bak
service nagios start
Your scheduling should be clean now.
Please tell us the side effects of doing this.
The last time you asked us to do this, Nagios sent notifications again for all the services that had been set for a single notification.
This caused a lot of issues at our end.
There must be a better way.
Since this issue was known, why was it not communicated? This would have saved us a lot of unnecessary work.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
You can simply disable notifications, do slansing's suggestion, then re-enable notifications once checking resumes. You shouldn't get emails "queued up" and sent after you re-enable.
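The sequence described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper function names are hypothetical, the command file path `/usr/local/nagios/var/rw/nagios.cmd` is the usual default but may differ on your system, and the timestamped backup name is a variation on the fixed `.bak` name used earlier in the thread. `DISABLE_NOTIFICATIONS` and `ENABLE_NOTIFICATIONS` are standard Nagios external commands. Note that the program-wide notification state is normally kept in retention.dat, so the disabled state may not survive moving that file aside; it is worth confirming on a dev system whether the disable command has to be sent again after the restart.

```shell
#!/bin/sh
# Sketch only: function names and default paths are assumptions, not from the thread.
NAGIOS_CMD_FILE="${NAGIOS_CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"
RETENTION_FILE="${RETENTION_FILE:-/usr/local/nagios/var/retention.dat}"

# Format one external command line in the "[epoch] COMMAND" form Nagios expects.
format_external_command() {
    printf '[%s] %s\n' "$(date +%s)" "$1"
}

# Append an external command to the command file (a named pipe on a live system).
send_external_command() {
    format_external_command "$1" >> "$NAGIOS_CMD_FILE"
}

# Move the retention file aside under a timestamped name and print the new path,
# so that an earlier backup is not overwritten.
backup_retention_file() {
    backup="${RETENTION_FILE}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$RETENTION_FILE" "$backup" && printf '%s\n' "$backup"
}

# Intended order of operations (run by hand, as root, on the Nagios server):
#   send_external_command DISABLE_NOTIFICATIONS
#   service nagios stop
#   backup_retention_file
#   service nagios start
#   ... wait until checks are being scheduled normally again ...
#   send_external_command ENABLE_NOTIFICATIONS
```

Keeping the steps as separate functions makes it possible to try each one against a scratch directory first, by overriding `NAGIOS_CMD_FILE` and `RETENTION_FILE`, before touching production.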
Former Nagios employee
Re: Mod Gearman Issue post 2014R1.4 Upgrade
tmcdonald wrote: You can simply disable notifications, do slansing's suggestion, then re-enable notifications once checking resumes. You shouldn't get emails "queued up" and sent after you re-enable.
That sounds good in theory, but it cannot be applied in an enterprise environment.
We have over 1200 devices and 10,000 services being monitored.
What if some important alert is not sent out during this time?
Most of them are set to notify only once until the status changes.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
I understand your concerns, and my apologies for the work involved in disabling notifications. This is why it is best to upgrade/perform maintenance in a maintenance window.
Are you referring to losing your acknowledgements, or some other setting?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
abrist wrote: I understand your concerns and my apologies for the work to disable notifications. This is why it is best to upgrade/perform maintenance in a maintenance window. Are you referring to losing your acknowledgements, or some other setting?
There is no maintenance window for monitoring systems. One is requested when required.
I am worried about new notifications not being sent out.
Disabling notifications only helps with alerts that have already been sent out and that we want to avoid resending.
I suggest that Nagios set up a known-issues tracker so that those attempting an upgrade are aware of issues.
slansing
Re: Mod Gearman Issue post 2014R1.4 Upgrade
Well, "known issues" implies that the problems are occurring on all systems affected by a certain version. Unfortunately, we cannot pull a scare tactic and post all possible issues that may affect someone without verifying them against our systems internally, or against other customer/user systems, as the majority of them are one-offs on independent systems. There is a bug/feature request tracker at http://tracker.nagios.com/ .
It would be nice to upgrade in place. Unfortunately, that is really impossible unless you were to upgrade a complete carbon copy of your production server and fail your production server over to it. Even that would cause some lag and the possibility of dropping monitoring/alerting for a short period of time. There is an expected period of monitoring downtime when you must restart services and change or move files on the server itself. If your hosts/services are still down when you re-enable notifications, or went down during the upgrade, Nagios should send a notification at the interval that you have set for those systems. You could also use the Ops Screen or Ops Center pages to keep an eye on the current issues in your monitoring environment. Those should remain visible through an update, though they may lock up for a moment when apache/nagios are restarted.
Re: Mod Gearman Issue post 2014R1.4 Upgrade
slansing wrote: Well, known issues insinuates that the problems are occurring on all systems affected by a certain version. Unfortunately, we cannot pull a scare tactic and post all possible issues that may affect someone without verifying them against our systems internally, or other customer/user systems, as the majority of them are one shots on independent systems. There is a bug/feature request tracker at http://tracker.nagios.com/ .
Sorry, I don't agree with you. I just don't have the time to filter through the bug tracker.
Please close this thread.