Oncall Change Over procedure

griffithusg
Posts: 64
Joined: Sun Nov 07, 2010 7:16 pm

Oncall Change Over procedure

Post by griffithusg »

Hello.
This is not so much a support question as a question for the general Nagios XI community.

In our old system we changed who was on call by having an on-call group that all the alerts go to, and then changing the membership of that group depending on who is on call that week. For the moment we are going to continue to do this in Nagios XI. The main reason is that it allows one person to change who is on call by removing the previous person and adding themselves to the group.

This may just be my understanding of how Nagios XI has its notification system set up, but is the following the only way to achieve the above using the Nagios XI notification handler?

Step 1. The person who was on call logs in to the Nagios XI interface, selects their notification settings, and unchecks the box that says "enable notifications".
Step 2. The new on-call person logs in and checks the notification box.

In a perfect world (and this is a really perfect world), it would be cool to be able to import a CSV with usernames and dates and have XI automatically change who is on call.
I understand this would require some work, but it would still be a cool feature to have.
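
As a rough illustration of the CSV idea, here is a minimal Python sketch that reads a rota and reports who is on call on a given date. The column names (username, start, end), the ISO date format, and the sample usernames are all assumptions made for the example; nothing here is an existing Nagios XI feature, and the step that would actually change notification settings in XI is left out.

```python
import csv
import io
from datetime import date


def who_is_on_call(csv_text, today):
    """Return the usernames whose [start, end] date range includes today.

    Assumes a header row of username,start,end with ISO dates (YYYY-MM-DD).
    This layout is hypothetical, not a format that Nagios XI defines.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["username"]
        for row in reader
        if date.fromisoformat(row["start"]) <= today <= date.fromisoformat(row["end"])
    ]


# Made-up rota: one person per week.
ROTA = """username,start,end
griffithusg,2024-01-01,2024-01-07
jsmith,2024-01-08,2024-01-14
"""

print(who_is_on_call(ROTA, date(2024, 1, 9)))  # ['jsmith']
```

A script like this could be run daily from cron, with its result fed into whatever mechanism ends up switching the on-call contact.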

So this is the part where I ask people out there: what do you do when it comes to changing who is on call?

It would be interesting to see people's processes around this and how they have changed since moving to XI.

Thanks,
GUSG
tonyyarusso
Posts: 1128
Joined: Wed Mar 03, 2010 12:38 pm
Location: St. Paul, MN, USA

Re: Oncall Change Over procedure

Post by tonyyarusso »

There isn't a particularly elegant way to do this in XI currently, but if I were to design such a thing, here is my "perfect world" approach to how I would do it:

The crux of the matter is that I would likely do this on the MTA level. This works if you're doing notifications by e-mail and/or an e-mail-to-SMS gateway, and possibly have a shared pager, but would require some extra work if you're doing SMS notifications out of band through a directly connected cell modem to personal phones.

First, I'd rip out sendmail and replace it with Postfix. Then, since Postfix supports SQL maps, I would configure it to use as its alias map a SELECT of the username and email fields from the xi_users table of the nagiosxi PostgreSQL database. Third, I would create a dummy user in XI for "On Call Person", with a local email address (like "[email protected]" or "oncall@localhost"), and set up Postfix to do alias lookups for whatever the domain of that address is.

Next, I'd either just use the "notification_times" keys in the xi_usermeta table or, if those are being used differently, add rows to the table with a keyname of "oncall_times", in a format similar to "notification_times". Then I'd write an external script that selects all of the "notification_times" (or "oncall_times") values, joins them to the appropriate usernames, parses their content, and marks any that the current time falls into. The script would then take the usernames associated with those time periods, concatenate them with commas, and set the full result as the alias value for the "oncallperson" record (their "email address"). Thus, when Nagios tries to send an email to "oncallperson@localhost", Postfix would expand the alias to "jim@localhost, mary@localhost, tom@localhost", and then in turn expand those to all of their email addresses, so "[email protected], [email protected], [email protected]". My update script would then be set up to run via cron at some reasonable interval.

Finally, depending on exact needs, I'd set up an importer to accept a CSV like you mentioned and write it to the xi_usermeta table, assign permissions for who can edit those values, and so on, and make a nice portion of the web interface to do it.
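
The core of the update script described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the real storage format of "notification_times" is not shown in this thread, so the sketch assumes each user's periods have already been parsed into (weekday, start, end) tuples, and the function names are hypothetical. Reading from xi_usermeta and writing the alias back to the database are deliberately left out.

```python
from datetime import datetime, time


def parse_hhmm(text):
    """Turn an 'HH:MM' string into a datetime.time."""
    hours, minutes = text.split(":")
    return time(int(hours), int(minutes))


def on_call_users(schedule, now):
    """Return the sorted usernames with a time window that contains now.

    schedule maps username -> list of (weekday, start, end) tuples, where
    weekday is 0 for Monday. This structure is an assumption for the
    example, not the actual xi_usermeta format.
    """
    active = []
    for username, windows in schedule.items():
        for weekday, start, end in windows:
            in_window = parse_hhmm(start) <= now.time() < parse_hhmm(end)
            if weekday == now.weekday() and in_window:
                active.append(username)
                break
    return sorted(active)


def build_alias(usernames, domain="localhost"):
    """Join usernames into the comma-separated alias value."""
    return ", ".join(f"{name}@{domain}" for name in usernames)


# Made-up schedule for illustration.
schedule = {
    "jim": [(0, "00:00", "12:00")],
    "mary": [(0, "08:00", "20:00")],
    "tom": [(1, "00:00", "23:59")],
}

now = datetime(2024, 1, 1, 9, 30)  # a Monday morning
print(build_alias(on_call_users(schedule, now)))  # jim@localhost, mary@localhost
```

The printed string is what would be stored as the alias value for the "oncallperson" record on each cron run.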

So there's that. :) Fundamental concept to take away: Use Postfix as the MTA, store alias maps in a SQL database, create a dummy "On Call Person" user, and dynamically update the alias for that user with all of the real users that fit the current timeframe, based on on call time period definitions stored elsewhere.

If you aren't using it for other things, you can also just define each user's "notification times" under their notification settings and leave "enable notifications" checked, so that they only get notified during those times. However, that precludes you from letting them always get notifications about one machine while only getting them for another box during their designated on-call times.
Tony Yarusso
Technical Services
___
TIES
Web: http://ties.k12.mn.us/