Contacts and Alerts

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jandrews-flhi
Posts: 13
Joined: Wed Aug 02, 2017 9:05 am

Contacts and Alerts

Post by jandrews-flhi »

Hello,

I just received a report that multiple people are getting Nagios alerts they're not supposed to receive. Alerts for these specific systems should only be sent to two individuals. After reviewing the settings, everything appears to be properly configured, yet alerts are being distributed to everyone.

The following is an example of the format used in the API calls that are made when systems are spun up on AWS. As you can see, the contact group is properly set. Yet these systems, and all others, even those deployed with the wizards, send notifications to every contact in the system when an alert is triggered.

Code: Select all

# SERVICE CPU STATS
curl -XPOST "http://nagios.flhi.com/nagiosxi/api/v1/config/service?apikey=${API}&pretty=1" -d "host_name=${ID}&\
service_description=CPU Stats&\
check_command=check_nrpe\!check_cpu_stats\!-a '-w 85 -c 95'&\
check_interval=5&\
retry_interval=1&\
max_check_attempts=5&\
check_period=24x7&\
contact_groups=Linux&\
notification_interval=5&\
notification_period=24x7&\
applyconfig=1"

1. Linux Distribution and version? CentOS Linux release 7.4.1708
2. 32 or 64bit? x64
3. VMware Image or Manual Install of XI? Manual
4. Are there special configurations on your system? No.

The following is an excerpt from the nagios.log located at /usr/local/nagios/libexec/var/nagios.log. It shows the alerts generated by an EC2 instance being destroyed. The instance was set up with the API calls I provided above, so the only ones who should be getting these alerts are the members of the Linux contact group. Instead, as you can see below, it's distributing these alerts to everyone.

Code: Select all

[1513023896] SERVICE ALERT: i-044826cb9a126f379;Winbind Daemon;CRITICAL;SOFT;4;(Return code of 255 is out of bounds)
[1513023961] SERVICE ALERT: i-044826cb9a126f379;System Logging Daemon;CRITICAL;HARD;5;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: fjubas;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: fsuarez;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: jfisher;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: jvelez;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: mmoscater;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: nagiosadmin;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: rweatherbee;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: jandrews;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023961] SERVICE NOTIFICATION: mcole;i-044826cb9a126f379;System Logging Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE ALERT: i-044826cb9a126f379;Winbind Daemon;CRITICAL;HARD;5;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: fjubas;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: fsuarez;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: jfisher;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: jvelez;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: mmoscater;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: nagiosadmin;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: rweatherbee;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: jandrews;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
[1513023983] SERVICE NOTIFICATION: mcole;i-044826cb9a126f379;Winbind Daemon;CRITICAL;xi_service_notification_handler;CHECK_NRPE: Socket timeout after 30 seconds.
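
To make it easier to see exactly who was notified, the log excerpt above can be reduced to a list of distinct contacts per host. The following is only a rough sketch: the function name notified_contacts is made up for illustration, and it assumes each SERVICE NOTIFICATION entry sits on a single line with semicolon-separated fields in the order shown above (contact, host, service, ...).

```shell
# Hypothetical helper: list the distinct contacts notified for one host.
# Usage: notified_contacts HOST [LOGFILE]   (reads stdin when LOGFILE is omitted)
notified_contacts() {
    host="$1"
    shift
    # Field 1 is "[timestamp] SERVICE NOTIFICATION: contact"; field 2 is the host name.
    awk -F';' -v host="$host" '
        $1 ~ /SERVICE NOTIFICATION: / && $2 == host {
            sub(/^.*SERVICE NOTIFICATION: /, "", $1)   # keep only the contact name
            print $1
        }' "$@" | sort -u
}
```

For example, notified_contacts i-044826cb9a126f379 followed by the path to nagios.log prints one contact name per line, which can then be compared against the intended members of the contact group.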
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN
Contact:

Re: Contacts and Alerts

Post by dwhitfield »

Please attach or PM your objects.cache and your contactgroups.cfg.

Of less importance, but possibly useful, would be your profile. You can download it by going to Admin > System Config > System Profile and clicking the ***Download Profile*** button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). The profile gives us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.

You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh

That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.

After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.

If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44

UPDATE: request info shared with techs
Last edited by dwhitfield on Mon Dec 11, 2017 5:56 pm, edited 1 time in total.
Reason: pm received
jandrews-flhi
Posts: 13
Joined: Wed Aug 02, 2017 9:05 am

Re: Contacts and Alerts

Post by jandrews-flhi »

Hello,

PM has been sent with the requested items.

Thank you, Doug.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN
Contact:

Re: Contacts and Alerts

Post by dwhitfield »

I found your problem (at least at a high level). In your objects.cache, you have this:

Code: Select all

define contactgroup {
	contactgroup_name	Linux
	alias	Linux Administrators
	members	mcole,jandrews,rweatherbee,nagiosadmin,mmoscater,jvelez,jfisher,fsuarez,fjubas
	}
In the CCM, if you go to the Linux contact group, does it have another contact group listed? (second button under "Assign Memberships")
jandrews-flhi
Posts: 13
Joined: Wed Aug 02, 2017 9:05 am

Re: Contacts and Alerts

Post by jandrews-flhi »

Doug,

This was literally the first place I checked. Via CCM it only has the two, myself and mcole.
Screen Shot 2017-12-12 at 8.31.54 AM.png
Screen Shot 2017-12-12 at 8.36.00 AM.png
kyang

Re: Contacts and Alerts

Post by kyang »

Let's go to your CCM and, under Tools, click Config File Management. Then click write, delete, write, and verify, in that order.

Then let's see if that removes it from your objects.cache.

You could resend us a new profile or just the objects.cache file.

Thanks!


Update: Profile received!
Last edited by kyang on Wed Dec 13, 2017 10:54 am, edited 2 times in total.
Reason: added profile to teamshare
jandrews-flhi
Posts: 13
Joined: Wed Aug 02, 2017 9:05 am

Re: Contacts and Alerts

Post by jandrews-flhi »

I've sent you the profile.
kyang

Re: Contacts and Alerts

Post by kyang »

Can you show us a screenshot of this icon in the contact group?
Capture.PNG
Click the blue "i" and show us the relationships of the contact group.

I am specifically looking to see if you have a contact template assigned to the Linux contact group.

If it is connected to a contact template, please show us which users are in that contact template.
Capture.PNG
Like this. I ask because even though I removed some of my contacts from a contact group, the objects.cache file still showed all contacts, which it should not.

I saw that my contact group was assigned to a template, which was for xi_contactgroup_all. Once I removed this, my objects.cache file showed the correct users.

Hopefully that is the case for you. Let us know!
jandrews-flhi
Posts: 13
Joined: Wed Aug 02, 2017 9:05 am

Re: Contacts and Alerts

Post by jandrews-flhi »

Per your request.
Screen Shot 2017-12-14 at 9.13.56 AM.png
Screen Shot 2017-12-14 at 9.13.38 AM.png
kyang

Re: Contacts and Alerts

Post by kyang »

Yes, that one. In that same contact template, take out the Linux contact group.

Apply configuration, then run the command below and see whether it outputs the correct members for that contact group. Your objects.cache file is still showing all contacts in that contact group, which it shouldn't.

Hopefully, this resolves it.

Code: Select all

grep -A 3 Linux /usr/local/nagios/var/objects.cache
Please post the output, thanks!
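
Since a plain grep for Linux can also match host names, aliases, or other objects that contain that word, a narrower check may be easier to read. The following is a rough sketch only: contactgroup_members is a made-up helper name, and it assumes the objects.cache layout shown earlier in this thread (a define contactgroup block with contactgroup_name and a comma-separated members line, with no spaces inside either value).

```shell
# Hypothetical helper: print the members of one contact group, one per line.
# Usage: contactgroup_members GROUP [FILE]   (reads stdin when FILE is omitted)
contactgroup_members() {
    group="$1"
    shift
    awk -v group="$group" '
        /^define contactgroup[ \t]*[{]/ { in_block = 1; name = ""; members = ""; next }
        in_block && $1 == "contactgroup_name" { name = $2 }
        in_block && $1 == "members"           { members = $2 }
        in_block && $1 == "}" {
            # End of the block: emit the members only for the requested group.
            if (name == group && members != "") print members
            in_block = 0
        }' "$@" | tr ',' '\n'
}
```

Running contactgroup_members Linux /usr/local/nagios/var/objects.cache after applying configuration should then list only the contacts you expect; if it still lists everyone, the extra members are most likely still being inherited from somewhere.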