View details during the period a service was above threshold
Posted: Tue Apr 10, 2012 11:03 am
Hi,
I am trying out Nagios Core 3.2.3 and I have a question.
I read somewhere that:
"Instead of monitoring values, Nagios only uses four states to describe status: OK, WARNING, CRITICAL, and UNKNOWN."
Now this looks like a problem to me.
Say that, for a host, I have configured a service to check the CPU load average and send me a mail if the load average stays above 2 for three minutes. I left the office at 9:00 PM. In the morning I saw a mail alert regarding high load average. I went to the Nagios web interface and found that the load average was above the threshold all the way from 10:00 PM to 4:00 AM. Now I would like to know how the load average behaved during the whole night. I would like to see the system's load average values at intervals of, say, 5 minutes over this entire period. It would be great if I could find out that:
- load avg was around 15 from 10:00 PM to 11:00 PM
- then load avg was around 20 from 11:00 PM to 1:00 AM
- then load avg was around 50 from 1:00 AM to 1:30 AM
- then load avg dropped and was around 10 from 1:30 AM to 2:00 AM
- load avg was around 5 from 2:00 AM to 4:00 AM.
Having this information would help a lot in the investigation, but I don't see how I can get it from Nagios.
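To make the kind of report I am after more concrete, here is a rough Python sketch. It assumes the 5-minute load average samples were somehow available as (timestamp, value) pairs; the `summarize` function and the sample data are made up purely for illustration and do not come from Nagios.

```python
from datetime import datetime, timedelta
from itertools import groupby


def summarize(samples, bucket_minutes=60):
    """Average (timestamp, load) samples over fixed-width time buckets.

    Returns a list of (bucket_start, bucket_end, average_load) tuples,
    ordered by time. bucket_minutes should divide 1440 evenly.
    """
    def bucket_start(ts):
        # Round the timestamp down to the start of its bucket.
        minutes = (ts.hour * 60 + ts.minute) // bucket_minutes * bucket_minutes
        return ts.replace(hour=minutes // 60, minute=minutes % 60,
                          second=0, microsecond=0)

    rows = []
    ordered = sorted(samples)  # sort by timestamp so groupby sees runs
    for start, group in groupby(ordered, key=lambda s: bucket_start(s[0])):
        values = [value for _, value in group]
        end = start + timedelta(minutes=bucket_minutes)
        rows.append((start, end, sum(values) / len(values)))
    return rows


# Hypothetical data: one sample every 5 minutes for two hours.
first = datetime(2012, 4, 9, 22, 0)
samples = [(first + timedelta(minutes=5 * i), 15.0 if i < 12 else 20.0)
           for i in range(24)]

for start, end, avg in summarize(samples):
    print(f"load avg was around {avg:.1f} "
          f"from {start:%I:%M %p} to {end:%I:%M %p}")
```

What I am missing is the input to something like this, i.e. the actual values recorded by each check rather than only the OK/WARNING/CRITICAL/UNKNOWN state.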