Service checks notifications switching back to enabled

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Service checks notifications switching back to enabled

Post by krobertson71 »

Red Hat EL6, Nagios XI 2.7

We have been running into a weird issue lately. We have notifications disabled on all of our NCPA swap partition checks: we collect the data, we just do not alert on it. I won't get into the history of that decision. For several weeks now, off and on, we have noticed that notifications for all the swap checks are enabled again, usually because an angry admin tells us they got alerted.

At first I thought it was that some of our Nagios admins were just disabling entire hostgroups instead of using scheduled downtime for patching nights. But I checked this myself last night, and scheduling and unscheduling downtime does not re-enable notifications on the service checks (which is good).

So I have combed through the many logs, subsystem logs, audits, etc., and can find nothing showing when this is happening.

Is there a known bug with service notifications for one reason or another that could be doing this?

If not, can someone point me in the right direction to try to find out how this is occurring?
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Service checks notifications switching back to enabled

Post by Box293 »

Can you download the History tab component:

https://exchange.nagios.org/directory/A ... ab/details

Upload it via Admin > System Extensions > Manage Components

Now go to one of these services and click the History tab.
Is there any information displayed here that may show if Notifications were enabled?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Service checks notifications switching back to enabled

Post by krobertson71 »

First, thanks for the History tab! Some good information there.

Problem:

If I enable and disable notifications on a single service check it will register it on the history tab.

I had to mass disable about 80 service checks yesterday and those are not showing up in the respective history tabs.

I did this by going into the service group information screen and selecting "disable all services in the service group".

I have checked several of those and nothing in the history tab shows this event occurring.

There was a scheduled downtime earlier that day for patching, so everything was put into downtime. I am wondering whether removing the downtime somehow enabled notifications for services that had them disabled before the downtime.

I have gone through all the logs I can think of. I tested this manually last night: I put one host's services into scheduled downtime, with one of the services having its notifications disabled beforehand.

I then went and removed that downtime via Mass Acknowledgement. The service check remained disabled. This is the same process we follow when scheduling and removing downtime.

Any other ideas would be a great help here!
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Service checks notifications switching back to enabled

Post by lmiltchev »

krobertson71 wrote:Problem:

If I enable and disable notifications on a single service check it will register it on the history tab.

I had to mass disable about 80 service checks yesterday and those are not showing up in the respective history tabs.

I did this by going into the service group information screen and selecting "disable all services in the service group".

I have checked several of those and nothing in the history tab shows this event occurring.
You are correct. Disabling notifications for all services in a services group won't show up in the "history tab" component. I will talk to Troy to see if this is something that can be added easily to the component.
krobertson71 wrote:There was a scheduled downtime earlier that day for patching so everything was put into downtime. I am wondering by removing the downtime this somehow enabled the notifications for services that had it disabled before the downtime.
I was not able to recreate this issue in house. You said you went through a bunch of different logs. Have you tried grepping the nagios.log for the name of the "problem" service? Anything that can give us some clues? Also, have you tried disabling notifications in the CCM (Alert Settings tab)? Does this change "stick"?
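One way to narrow the grep down is to look only for external commands that toggle notifications. The Python sketch below is an illustration, not a Nagios-supplied tool. It assumes external command logging is enabled (log_external_commands=1), so that nagios.log contains lines of the form "[timestamp] EXTERNAL COMMAND: NAME;args". The function name find_notification_commands and the needle parameter are made up for this example.

```python
import re
from datetime import datetime, timezone

# Matches lines such as:
# [1428500000] EXTERNAL COMMAND: ENABLE_SVC_NOTIFICATIONS;host01;Swap Usage
PATTERN = re.compile(
    r"^\[(?P<ts>\d+)\] EXTERNAL COMMAND: "
    r"(?P<cmd>(?:ENABLE|DISABLE)_\w*NOTIFICATIONS)(?:;(?P<args>.*))?$"
)


def find_notification_commands(lines, needle=None):
    """Return (utc_time, command, args) for each notification toggle found.

    lines  -- any iterable of log lines (an open file works)
    needle -- optional case-insensitive text that must appear in the args
    """
    hits = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if not match:
            continue
        args = match.group("args") or ""
        if needle and needle.lower() not in args.lower():
            continue
        when = datetime.fromtimestamp(int(match.group("ts")), tz=timezone.utc)
        hits.append((when, match.group("cmd"), args))
    return hits
```

To use it, pass an open log file, for example find_notification_commands(open("/usr/local/nagios/var/nagios.log")); that path is an assumed default and may differ on your install. Leave needle unset on the first pass: hostgroup-wide and servicegroup-wide commands carry only the group name in their arguments, so filtering by the service name would hide them.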
Be sure to check out our Knowledgebase for helpful articles and solutions!
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Service checks notifications switching back to enabled

Post by krobertson71 »

Let me say this better.

When I go through the audit log in XI, all it shows is that a user submitted a command to the subsystem. What they submitted is not presented. I have checked the nagios.log, cmdsubsys log, etc., and I cannot see anything that shows notifications being disabled.

I have grepped the entire log directory with "grep -i disable" and "grep -i notifications" (lots of hits, but not what I was looking for).

I guess another question is: is it possible that when we end downtime, since what downtime really does is disable notifications, it is turning all notifications back on?

We have multiple hostgroups for different applications and services. We also have a LIVE hostgroup that contains all the production hosts and services. When patching time comes around we schedule downtime for the hostgroups and all services.
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Service checks notifications switching back to enabled

Post by krobertson71 »

We are also having another issue around this. I was going to open another thread, but it now seems to fit better here. A couple of our admins were trying out the /import directory to make changes to hosts, like mass updating thresholds for an NCPA CPU check.

What they are doing is taking the service.cfg file for the hosts, doing a mass edit, and importing it again via the import directory. This concerns me, as these are the same config files that state "Do not edit this file", and it is not using the format described in your "Automation of Hosts and Services" documentation. I am wondering if this could be causing some issues with notification settings. Here is why:

The front end (GUI) will have the icon that notifications are disabled:
nagiios-notificaions-ccm-2.png
In the CCM it will show the opposite:
nagiios-notificaions-ccm-1.png
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Service checks notifications switching back to enabled

Post by abrist »

Object config state and runtime state are two separate things. You will most likely find that notifications are disabled in the retention.dat file, even though they are enabled in the CCM. On a restart, Nagios will read the object configs (generated from the CCM) first, writing that information into objects.cache and status.dat. It will then parse the retention.dat file and overwrite any settings that are different in status.dat.
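To see which services carry a retained "notifications off" state, the retention file can be read (not edited) and compared with what the CCM shows. The following Python sketch assumes retention.dat uses "service {" blocks of key=value lines that include host_name, service_description and notifications_enabled. The function names are hypothetical, chosen only for this illustration.

```python
def parse_retention_services(lines):
    """Yield one dict of key=value pairs per `service { ... }` block."""
    current = None
    for raw in lines:
        line = raw.strip()
        if line == "service {":
            current = {}          # start collecting a new service block
        elif line == "}":
            if current is not None:
                yield current     # block finished, hand it back
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value


def services_with_notifications_disabled(lines):
    """Return (host_name, service_description) pairs whose retained
    state has notifications_enabled=0."""
    return [
        (svc.get("host_name", ""), svc.get("service_description", ""))
        for svc in parse_retention_services(lines)
        if svc.get("notifications_enabled") == "0"
    ]
```

Calling services_with_notifications_disabled(open("/usr/local/nagios/var/retention.dat")) would list the affected services; the path is an assumed default location. Any service in that list that shows notifications enabled in the CCM is a case where the runtime state and the object config disagree.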
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Service checks notifications switching back to enabled

Post by Box293 »

krobertson71 wrote:Problem:

If I enable and disable notifications on a single service check it will register it on the history tab.

I had to mass disable about 80 service checks yesterday and those are not showing up in the respective history tabs.

I did this by going into the service group information screen and selecting "disable all services in the service group".

I have checked several of those and nothing in the history tab shows this event occurring.
lmiltchev wrote:You are correct. Disabling notifications for all services in a services group won't show up in the "history tab" component. I will talk to Troy to see if this is something that can be added easily to the component.
I've added this to my "to do list". It might take a while to get to, as I'm busy with some other projects.
CatalystX
Posts: 12
Joined: Tue Mar 31, 2015 2:03 pm

Re: Service checks notifications switching back to enabled

Post by CatalystX »

abrist wrote:Object config state and runtime state are two separate things. You will most likely find that notifications are disabled in the retention.dat file, even though they are enabled in the CCM. On a restart, Nagios will read the object configs (generated from the CCM) first, writing that information into objects.cache and status.dat. It will then parse the retention.dat file and overwrite any settings that are different in status.dat.
Umm ... Can you clarify that? Because if configs are correct and CCM is correct, then objects.cache and status.dat should take precedence, shouldn't they?
krobertson71
Posts: 444
Joined: Tue Feb 11, 2014 10:16 pm

Re: Service checks notifications switching back to enabled

Post by krobertson71 »

So if retention.dat is off, what is the proper procedure to make sure it is set the way we want, given that the top of the file says not to modify it?

Can we delete this file, then make changes to the services we want notifications disabled on, and have it regenerate?