We use an NRPE check to monitor all drives of a server in a single alert. When the state changes for one or more drives, it alerts (critical or warning) and displays the affected drives in the output. When does Nagios close the alert: when one drive is back to normal, when all drives are back to normal, or something else?
Just wondering how Nagios handles this.
Alerting
Re: Alerting
anish wrote: When does Nagios trigger a closure of the alert…when one drive is back to normal or when all are back to normal…or????
In the case of NRPE's check_disk, if the check is set to examine "all disks", then for as long as one (or more) disks is in a warning/critical state, the check will return a warning/critical state.
All Nagios cares about is the result the plugin returns. If the plugin says "CRITICAL", Nagios will treat the service check as being in a critical state.
Former Nagios employee
https://www.mcapra.com/
Re: Alerting
Do we know when the plugin will return OK? Is it when all drives are OK?
Re: Alerting
If you're having it check multiple disks at once, then it will return OK only when all disks meet your criteria. If any one of them hits warning/critical, the status of the entire check changes accordingly.
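The "worst state wins" behaviour described above can be sketched roughly like this. It is only an illustration, not the plugin's actual code: worst_state is a hypothetical helper, and the per-drive codes passed to it are made-up examples using the standard plugin exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Standard plugin exit codes: 0=OK, 1=WARNING, 2=CRITICAL.
# (3=UNKNOWN is left out of this sketch.)

# worst_state (hypothetical helper): print the highest exit code
# among the per-drive results passed as arguments.
worst_state() {
    worst=0
    for code in "$@"; do
        if [ "$code" -gt "$worst" ]; then
            worst=$code
        fi
    done
    echo "$worst"
}

# Example: C: is OK, D: is CRITICAL, E: is WARNING -> combined check is CRITICAL.
worst_state 0 2 1
# Only when every drive is OK does the combined check go back to OK.
worst_state 0 0 0
```

So with one combined check, Nagios sees a recovery only once the last problem drive is back within its thresholds.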
Former Nagios Employee
Re: Alerting
This might be a quick one
Scenario:
Windows NRPE command to check disk utilization of all drives:
./check_nrpe -H xx.xx.xx.xx -t 30 -c checkdrivesize -a CheckAll MinWarn=20% MinCrit=15%
Say the D drive goes critical. After 5 attempts, the service turns into a HARD state.
The service stability shows as Unchanging (stable). I can see the alert in the logs with 4 SOFT state entries and the 5th as a HARD state.
Now the C drive also goes critical. The output of the check changes, but no log entry is created and no service notification is sent, because the state is still critical. This is creating a problem for us in terms of ticketing.
We have a ticket because the D drive went critical and created an event. Now that the C drive is also critical, a notification or event should be created so that a ticket is opened for the C drive.
Is there a specific setting for this service that needs to be enabled to make this work?
Happy to be on a call to discuss this. I think our support has 5 calls. Let me know the number and I will send a meeting invite to discuss and understand this better. This is a showstopper for us in using Nagios.
Re: Alerting
The way to get this working is to set up an individual disk check for each mounted drive. Then the alerting will work independently per drive, rather than as a group.
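As a rough sketch of what that could look like, the snippet below prints one check_nrpe command per drive, reusing the thresholds from the command posted earlier in this thread. The Drive= argument (in place of CheckAll), the drive list, and the per_drive_commands helper are assumptions for illustration; check them against your NSClient++ version before relying on them. Each printed command would then back its own Nagios service.

```shell
#!/bin/sh
# Placeholder values: adjust the host and drive list for your server.
HOST="xx.xx.xx.xx"
DRIVES="C: D:"

# per_drive_commands (hypothetical helper): print one check_nrpe
# command line per drive instead of a single CheckAll command.
# Assumption: the checkdrivesize command accepts Drive=<letter>.
per_drive_commands() {
    for drive in $DRIVES; do
        echo "./check_nrpe -H $HOST -t 30 -c checkdrivesize -a Drive=$drive MinWarn=20% MinCrit=15%"
    done
}

# Dry run: only prints the commands, does not execute them.
per_drive_commands
```

With one service per drive, C: going critical while D: is already critical is a state change on its own service, so it gets its own log entry and notification.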
If you'd like to use one of your calls, feel free to call in and one of us will help you out. There is no need to schedule it ahead of time. We are here 9-5 Mon - Thurs, and 9-2 Fri, CST.
Former Nagios Employee
Re: Alerting
Thank you. Please close this. We send alerts to a tool called Evanios. We have scripts in Evanios that separate them into individual alerts and create incidents in ServiceNow.