Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Alert

ashleyo
Posts: 5
Joined: Thu May 19, 2016 3:13 pm

Post by ashleyo »

I am not sure whether this is a PagerDuty or a Nagios XI issue, but I cannot get an UNKNOWN state in Nagios XI to send a notification to PagerDuty. I am evaluating both products and testing what happens when specific services go down. If I stop an actual service, a notification is sent with no problem. If I disconnect a drive, however, the check reports the state as UNKNOWN. I am using Nagios XI agentless, with WMI, to monitor my server. I have attached a file that shows the Nagios log. The lines I am curious about are:

wproc: SERVICE EVENTHANDLER job 13 from worker Core Worker 7080 is a non-check helper but exited with return code 2
wproc: stderr line 01: usage: pd-nagios [-h] -k SERVICE_KEY -t {PROBLEM, ACKNOWLEDGEMENT, RECOVERY}
wproc: stderr line 02: [-i INCIDENT_KEY] [-f FIELDS] -n {service,host}
wproc: stderr line 03: pd-nagios: error: argument -t/--event-type: invalid choice: ' ' (choose from 'PROBLEM', 'ACKNOWLEDGEMENT', 'RECOVERY')

I configured the service for the drive to alert when it is in an UNKNOWN state. I also called support and was told I could try the negate command, but if more services turn out to have this issue, I am not sure that would be the most efficient fix.
I Drive is now HARD after 5 attempts.JPG
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by tgriep »

Can you post how your PagerDuty command is defined in Nagios XI?

Take a look at the guide from PagerDuty. It says that PagerDuty should ignore unknown status messages.
https://www.pagerduty.com/docs/guides/n ... ion-guide/
Be sure to check out our Knowledgebase for helpful articles and solutions!
ashleyo
Posts: 5
Joined: Thu May 19, 2016 3:13 pm

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by ashleyo »

notify-host-by-pagerduty: /usr/share/pdagent-integrations/bin/pd-nagios -n host -k $CONTACTPAGER$ -t "$NOTIFICATIONTYPE$" -f HOSTNAME="$HOSTNAME$" -f HOSTSTATE="$HOSTSTATE$"

notify-service-by-pagerduty: /usr/share/pdagent-integrations/bin/pd-nagios -n service -k $CONTACTPAGER$ -t "$NOTIFICATIONTYPE$" -f SERVICEDESC="$SERVICEDESC$" -f SERVICESTATE="$SERVICESTATE$" -f HOSTNAME="$HOSTNAME$" -f SERVICEOUTPUT="$SERVICEOUTPUT$"

That guide is the walkthrough I followed when setting up the integration.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by tgriep »

Are the PagerDuty notifications set up as an event handler for those services rather than as a notification handler?
They have to be set up as notifications, because an event handler does not pass the information the PagerDuty scripts need, and that is what causes the error.
Try changing that and see if it works for you.
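
To illustrate the failure in the log: pd-nagios rejected the value it received for -t. The sketch below is not the PagerDuty script. `check_event_type` is a hypothetical stand-in that accepts only the three choices listed in the usage message and returns 2 for anything else. It assumes, consistent with the log but not confirmed in this thread, that "$NOTIFICATIONTYPE$" is not populated when the command runs as an event handler.

```shell
#!/bin/sh
# Hypothetical stand-in for the -t/--event-type check seen in the log output.
# It is NOT the real pd-nagios script; it only mirrors the three accepted values.
check_event_type() {
    case "$1" in
        PROBLEM|ACKNOWLEDGEMENT|RECOVERY)
            echo "accepted event type: $1"
            return 0
            ;;
        *)
            echo "invalid choice: '$1' (choose from 'PROBLEM', 'ACKNOWLEDGEMENT', 'RECOVERY')" >&2
            return 2
            ;;
    esac
}

# Notification context: Nagios fills in the macro, for example with PROBLEM.
check_event_type "PROBLEM"

# Event handler context (assumption): the macro is not filled in, so the value is blank.
check_event_type " " || echo "rejected with exit status $?"
```

The same command line therefore behaves differently depending on where Nagios runs it, which is why moving it from the event handler to the contact's notification commands matters.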
ashleyo
Posts: 5
Joined: Thu May 19, 2016 3:13 pm

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by ashleyo »

The only option available when setting up a service is a drop-down for Event Handlers. There I have selected the command I created, "notify-service-by-pagerduty". I'm not sure what you mean by changing it to a notification handler.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by tgriep »

The notification handlers are tied to the contacts.
Edit the contact, click the Alert Settings tab, and add the PagerDuty notification commands using the Manage Host/Service Notification Commands buttons.
Then set up that contact to receive notifications from those hosts and services.
When one of those hosts or services has an issue, Nagios sends a notification to that contact, and that is how the PagerDuty commands get the data sent to them.
Take a look at section 6 in the PagerDuty link I posted earlier.
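
As a rough sketch of what the contact setup described above amounts to in Nagios object configuration: the script below writes a sample contact definition to a temporary file and checks whether its service_notification_options include `u` (UNKNOWN). In Nagios XI these settings are normally managed through the web interface, so the contact name, time periods, option lists, and service key placeholder are illustrative assumptions, and `has_option` is a hypothetical helper. Whether a missing `u` is the cause of the problem in this thread is not established.

```shell
#!/bin/sh
# Sketch only. The contact name, options, and service key placeholder are
# illustrative assumptions, not values taken from this thread.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
define contact {
    contact_name                   pagerduty
    service_notification_period    24x7
    host_notification_period       24x7
    service_notification_options   w,u,c,r
    host_notification_options      d,r
    service_notification_commands  notify-service-by-pagerduty
    host_notification_commands     notify-host-by-pagerduty
    pager                          YOUR-SERVICE-KEY-HERE
}
EOF

# Hypothetical helper: succeeds if flag $2 appears in the comma-separated list $1.
has_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Pull the options value out of the sample definition and look for "u" (UNKNOWN).
opts=$(awk '$1 == "service_notification_options" { print $2 }' "$cfg")
if has_option "$opts" u; then
    echo "contact accepts UNKNOWN service notifications"
else
    echo "contact filters out UNKNOWN service notifications"
fi
rm -f "$cfg"
```

The service's own notification options are a separate setting from the contact's, so both would need to allow UNKNOWN for such a notification to reach the contact's commands.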
ashleyo
Posts: 5
Joined: Thu May 19, 2016 3:13 pm

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by ashleyo »

I have notify-host-by-pagerduty and notify-service-by-pagerduty listed in the commands for the contact. I am currently receiving PagerDuty notifications for things such as an SSL cert being down, because those show as CRITICAL. Only the UNKNOWN states are not being sent to PagerDuty.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by tgriep »

You may have to contact the people at PagerDuty, but I found this at the bottom of the guide I linked:
PagerDuty can process PROBLEM, ACKNOWLEDGEMENT, and RECOVERY messages. All other messages, including FLAPPINGSTART and FLAPPINGSTOP, are ignored.
I couldn't find whether an UNKNOWN state counts as a PROBLEM message.

You should remove the PagerDuty commands from the event handlers so they stop erroring out.
ashleyo
Posts: 5
Joined: Thu May 19, 2016 3:13 pm

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by ashleyo »

I'm mainly concerned about why a disconnected drive shows up as UNKNOWN. I understand that it is technically UNKNOWN because WMI cannot get the information, but I would expect it to be a PROBLEM because the check cannot find what is defined in the configuration. Is there a way to show an UNKNOWN state as CRITICAL?
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios XI and PagerDuty Integrations UNKNOWN Doesn't Ale

Post by tgriep »

You could use the negate plugin to change the UNKNOWN status returned by the check_wmi_plus plugin to CRITICAL.

./negate -u CRITICAL <command>

From: ./negate --help
-u, --unknown=STATUS
STATUS can be 'OK', 'WARNING', 'CRITICAL' or 'UNKNOWN' without single
quotes. Numeric values are accepted. If nothing is specified, permutes
OK and CRITICAL.
Here is a link to a manual that explains how to use the negate plugin.
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
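
A minimal sketch of the effect that `negate -u CRITICAL <command>` has on the exit code, assuming the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). `unknown_to_critical` and `fake_plugin` are hypothetical names used only for illustration. This is not the negate plugin itself and leaves out its other options, such as timeouts.

```shell
#!/bin/sh
# Illustrative only: mimics the exit-code effect of `negate -u CRITICAL <command>`.
# Nagios plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
unknown_to_critical() {
    "$@"                      # run the wrapped check with its arguments
    rc=$?
    if [ "$rc" -eq 3 ]; then  # UNKNOWN becomes CRITICAL
        rc=2
    fi
    return "$rc"              # every other status passes through unchanged
}

# Hypothetical plugin stand-in that reports UNKNOWN, as the drive check did.
fake_plugin() {
    echo "UNKNOWN - could not query drive"
    return 3
}

unknown_to_critical fake_plugin
echo "exit status seen by Nagios: $?"
```

Only the exit status is remapped in this sketch, so the plugin's output text still says UNKNOWN. Because the wrapper is applied per check command, it would have to be added to each service that can return UNKNOWN, which is the efficiency concern raised in the first post.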