Getting wrong alerts for check_rabbitmq_queue in Nagios
Posted: Mon Aug 07, 2017 4:57 am
Hi All,
I'm new to Nagios. In my server.cfg, I have a check_rabbitmq_queue service check that monitors the queues of some services in RabbitMQ. Lately I have been getting false alerts from it: everything is reported as CRITICAL, although I have checked manually and everything is fine. My inbox is flooded with these alerts (around 5 per hour for a single service). Please let me know how to fix this, and also how to change the alert intervals so that I'm not flooded with notifications.
This was the service definition in queue.cfg:
define service {
use generic-service
host_name server
service_description ServiceQueue
check_command check_rabbitmq_queue!15672!username!password!ServiceQueue!10,-1,-1,-1!15,-1,-1,-1
contact_groups nagiosadmin
}
This is the service definition in queue.cfg now:
define service {
use generic-service
host_name server
service_description ServiceQueue
check_command check_rabbitmq_queue!15672!username!password!ServiceQueue!10,5,5,1!15,10,10,2
contact_groups nagiosadmin
}
Even after this change, the issue persists.
This is the alert I'm getting:
>***** Nagios *****
>
>Notification Type: PROBLEM
>
>Service: ServiceQueue
>Host: server
>Address: x.x.x.x
>State: CRITICAL
>
>Date/Time: Mon Aug 07 15:42:02 UTC 2017
>
>Additional Info:
>
>RABBITMQ_QUEUE CRITICAL - consumers CRITICAL (3), messages OK (0)
>messages_ready OK (0) messages_unacknowledged OK (0)
Thanks in advance for all the help and suggestions.
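One possible reading of the alert above, offered as a sketch rather than a confirmed diagnosis: in the standard Nagios plugin range syntax, a bare number N means "alert if the value is below 0 or above N". Assuming the fourth field of each threshold list is the consumers threshold and that the last two command arguments are the warning and critical lists (neither is shown in the post), a critical value of 2 would flag 3 consumers as CRITICAL. The Python below only models that range syntax; the function names `outside_range` and `state` are made up for illustration and are not part of the plugin.

```python
def outside_range(value, spec):
    """Return True if `value` should alert under a Nagios-style range spec.

    Supported forms: "N" (alert outside 0..N), "N:" (alert below N),
    "~:N" (alert above N), "N:M" (alert outside N..M), and a leading
    "@" to invert the test (alert inside the range).
    """
    inverted = spec.startswith("@")
    if inverted:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        # A bare number is the upper bound; the lower bound defaults to 0.
        low_s, high_s = "0", spec
    low = float("-inf") if low_s == "~" else float(low_s or 0)
    high = float("inf") if high_s == "" else float(high_s)
    inside = low <= value <= high
    return inside if inverted else not inside


def state(value, warn, crit):
    """Map a value to OK/WARNING/CRITICAL; critical is checked first."""
    if outside_range(value, crit):
        return "CRITICAL"
    if outside_range(value, warn):
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Consumers = 3 with the thresholds from the post (warn 1, crit 2).
    print(state(3, "1", "2"))
    # Hypothetical alternative: alert only when fewer than 1 consumer.
    print(state(3, "1:", "1:"))
```

If that reading is right, a consumers threshold written as `1:` (alert only when fewer than one consumer is attached) might be closer to the intent, though this should be checked against the plugin's own help output. Separately, how often Nagios re-sends a notification for a persisting problem is governed by service directives such as `notification_interval` and `max_check_attempts`, which here would be inherited from the generic-service template.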