Getting info about services state from nagios
Posted: Mon Dec 07, 2015 12:04 pm
Hello everybody.
Could one of the experienced users point me in the right direction?
The task I want to solve is described below.
I'd like to get information from Nagios about hosts and services that are in a given state (critical or warning) and have remained in that state for a long time.
A second task is to find all hosts and services that have no notifications configured but are in a critical or warning state.
I think I will write a dedicated program (script) that computes the relevant time span, checks the other conditions, and sends a special notification accordingly.
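To make the two conditions concrete, here is a rough sketch in Python of the checks such a script might apply. Everything in it is my own assumption for illustration: the `ServiceStatus` record, the function names, and the one-day default threshold are hypothetical. It only covers services (0 = OK, 1 = WARNING, 2 = CRITICAL); hosts use different state codes and would need their own check.

```python
from dataclasses import dataclass

# Nagios service state codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
WARNING, CRITICAL = 1, 2
DEFAULT_MAX_AGE = 24 * 3600  # hypothetical threshold: one day, in seconds


@dataclass
class ServiceStatus:
    """Hypothetical record holding the fields the script would need."""
    host_name: str
    description: str
    state: int
    last_state_change: int       # Unix timestamp of the last state change
    notifications_enabled: bool


def is_problem(svc: ServiceStatus) -> bool:
    """True if the service is currently in WARNING or CRITICAL."""
    return svc.state in (WARNING, CRITICAL)


def is_stale_problem(svc: ServiceStatus, now: int,
                     max_age: int = DEFAULT_MAX_AGE) -> bool:
    """Task 1: a problem state that has not changed for at least max_age seconds."""
    return is_problem(svc) and (now - svc.last_state_change) >= max_age


def is_silent_problem(svc: ServiceStatus) -> bool:
    """Task 2: a problem state for which notifications are switched off."""
    return is_problem(svc) and not svc.notifications_enabled
```

The script would then loop over all services, whatever the data source turns out to be, and send its own notification for every record where one of the two functions returns true.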
We need this because we have a lot of services monitored by Nagios, and sometimes one of our admins forgets to fix an issue for a long time. Meanwhile, these services have a scheduled downtime, so we do not receive any notifications. We only notice that something is wrong by accident, while browsing the Nagios web pages.
Another reason is that developers or other owners of services monitored by Nagios sometimes ask us to set a downtime for their services for a long period, and then forget about it entirely.
So I am asking for your advice: what is the best way to solve these tasks?
I think there are two methods that could help me.
1. One is to use "NDOUtils" ( https://exchange.nagios.org/directory/A ... ls/details ). With this approach, my script would read from a MySQL database that contains everything about the current state of all Nagios checks.
2. The other is to use "MK Livestatus" ( http://mathias-kettner.de/checkmk_livestatus.html ). With this approach, my script would get the required information using "LQL - The Livestatus Query Language".
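For the second method, this is roughly what I imagine the script would look like in Python. It is only a sketch under assumptions: the function names are made up, the socket path depends on how Livestatus is installed, and I am assuming the `services` table provides the columns `host_name`, `description`, `state`, `last_state_change`, `notifications_enabled` and `scheduled_downtime_depth`.

```python
import json
import socket


def build_stale_services_query(now: int, max_age: int) -> str:
    """Build an LQL query for WARNING/CRITICAL services unchanged for max_age seconds."""
    cutoff = int(now - max_age)
    lines = [
        "GET services",
        "Columns: host_name description state last_state_change "
        "notifications_enabled scheduled_downtime_depth",
        "Filter: state = 1",                       # WARNING
        "Filter: state = 2",                       # CRITICAL
        "Or: 2",                                   # combine the two filters above with OR
        f"Filter: last_state_change < {cutoff}",   # ANDed with the OR result
        "OutputFormat: json",
    ]
    # A Livestatus request is terminated by an empty line.
    return "\n".join(lines) + "\n\n"


def query_livestatus(socket_path: str, query: str) -> list:
    """Send a query to the Livestatus UNIX socket and return the decoded JSON rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

It would be called as `query_livestatus(path, build_stale_services_query(int(time.time()), 86400))`, with `path` pointing at the Livestatus socket of the local installation. Rows with `notifications_enabled` equal to 0 or `scheduled_downtime_depth` greater than 0 would be the ones nobody is being notified about.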
So, what do you say?
With best regards,
Sergey.