No notification after an error

Locked
arenist
Posts: 27
Joined: Fri Nov 29, 2013 9:29 am

Post by arenist »

Hi supporters,

Nagios detected an error in a log file but did not send me a notification.

I'm using Nagios 3.5.0. The service definition is:

Code:

define service{
        use                             generic-service
        hostgroup_name                  iAS
        service_description             ElsaMarke errors
        contact_groups                  admins,ias_log
        max_check_attempts              1
        notification_options            w,u,c
#       normal_check_interval           60
#       notification_interval           240
        check_command                   check_nrpe!check_elsaMarke
}
On the remote machine, the NRPE command is defined as:

Code:

command[check_elsaMarke]=/usr/local/nagios/libexec/check_iaslog -F /opt/jboss/jboss-eap-5.1/jboss-as/server/elsaMarke/log/marke.debug.log -O /usr/local/nagios/libexec/elsaMarke.error.log -q "error"
The script check_iaslog is a modified variant of check_log.

I found these messages in an archived Nagios log:

Code:

[1423060750] SERVICE ALERT: majb-vp-11;ElsaMarke errors;CRITICAL;HARD;1;< at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded < 2015-02-04 15:34:50,394 [-0.0.0.0-8009-1] [elheller(m=312) ] ERROR interceptor.TxInterceptor - Serverfehler : app_inst bin boot dev etc home lib lib64 lost+found media misc mnt net opt proc root sbin selinux shared_data srv sys test_pdf tftpboot tmp usr var java.lang.OutOfMemoryError - GC overhead limit exceeded < at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded < 2015-02-04 15:34:50,395 [0.0.0.0-8009-18] [anblumst ] ERROR s.common.CommonWebService - Systemfehler : null < at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < 2015-02-04 15:34:50,417 [orkManager(2)-7] [ankammac(m=7635) ] ERROR e.CommonStatelessBeanBase - java.io.IOException: Invalid HTTP server response [408] - Request Time-out < 2015-02-04 15:34:50,418 [orkManager(2
[1423061350] SERVICE ALERT: majb-vp-11;ElsaMarke errors;OK;HARD;1;Log check ok - 0 pattern matches found
I don't understand why there was no notification about an OutOfMemoryError in a JBoss log file.

Can you help me please?

Regards,
arenist
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: No notification after an error

Post by tgriep »

Are the contacts in the contact groups "admins" and "ias_log" set up to receive notifications?
Could you post the settings for those groups and the users assigned to them?
Also, is the server set up with notifications enabled?
arenist
Posts: 27
Joined: Fri Nov 29, 2013 9:29 am

Re: No notification after an error

Post by arenist »

Hi tgriep,

My contacts.cfg is configured correctly. I think I found out myself why Nagios didn't send a notification for the incident. The service "ElsaMarke errors" on server majb-vp-11 was flapping at the time:

Code:

[1422946150] SERVICE FLAPPING ALERT: majb-vp-11;ElsaMarke errors;STARTED; Service appears to have started flapping (22.6% change >= 20.0% threshold)
...
[1423060750] SERVICE ALERT: majb-vp-11;ElsaMarke errors;CRITICAL;HARD;1;< at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded < 2015-02-04 15:34:50,394 [-0.0.0.0-8009-1] [elheller(m=312) ] ERROR interceptor.TxInterceptor - Serverfehler : app_inst bin boot dev etc home lib lib64 lost+found media misc mnt net opt proc root sbin selinux shared_data srv sys test_pdf tftpboot tmp usr var java.lang.OutOfMemoryError - GC overhead limit exceeded < at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded < 2015-02-04 15:34:50,395 [0.0.0.0-8009-18] [anblumst ] ERROR s.common.CommonWebService - Systemfehler : null < at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) < 2015-02-04 15:34:50,417 [orkManager(2)-7] [ankammac(m=7635) ] ERROR e.CommonStatelessBeanBase - java.io.IOException: Invalid HTTP server response [408] - Request Time-out < 2015-02-04 15:34:50,418 [orkManager(2
...
[1423079350] SERVICE FLAPPING ALERT: majb-vp-11;ElsaMarke errors;STOPPED; Service appears to have stopped flapping (3.8% change < 5.0% threshold)
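The timeline can be cross-checked with a short script. This is only a sketch: it assumes the log line layout shown in this thread, the function name alerts_during_flapping is made up for illustration, and the long plugin output is shortened to "..." by hand. The OK line is the one from my first post.

```python
import re

# Log lines copied from this thread; the long plugin output is shortened to "...".
LOG = """\
[1422946150] SERVICE FLAPPING ALERT: majb-vp-11;ElsaMarke errors;STARTED; Service appears to have started flapping (22.6% change >= 20.0% threshold)
[1423060750] SERVICE ALERT: majb-vp-11;ElsaMarke errors;CRITICAL;HARD;1;...
[1423061350] SERVICE ALERT: majb-vp-11;ElsaMarke errors;OK;HARD;1;Log check ok - 0 pattern matches found
[1423079350] SERVICE FLAPPING ALERT: majb-vp-11;ElsaMarke errors;STOPPED; Service appears to have stopped flapping (3.8% change < 5.0% threshold)
"""

# Captures: timestamp, entry type, host, service, and the first field after the service
# (STARTED/STOPPED for flapping entries, the state for service alerts).
LINE_RE = re.compile(
    r"^\[(\d+)\] (SERVICE FLAPPING ALERT|SERVICE ALERT): ([^;]*);([^;]*);([^;]*);"
)


def alerts_during_flapping(log_text):
    """Return (timestamp, host, service, state) for alerts logged while flapping."""
    flapping = set()  # (host, service) pairs currently marked as flapping
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, kind, host, service, field = match.groups()
        key = (host, service)
        if kind == "SERVICE FLAPPING ALERT":
            if field == "STARTED":
                flapping.add(key)
            elif field == "STOPPED":
                flapping.discard(key)
        elif key in flapping:
            hits.append((int(timestamp), host, service, field))
    return hits


if __name__ == "__main__":
    for hit in alerts_during_flapping(LOG):
        print(hit)
```

Both the CRITICAL alert at 1423060750 and the following OK at 1423061350 fall between the STARTED and STOPPED entries.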
If you agree with my reasoning, you can close this thread.

Thanks for your help,
arenist
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: No notification after an error

Post by tmcdonald »

Yes, that would make sense if you have flap detection enabled: while a service is flapping, Nagios suppresses its problem and recovery notifications.
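For anyone wondering where the percentages in the FLAPPING ALERT lines come from, here is a rough sketch of the idea. It is a simplification and not the Nagios implementation: Nagios looks at the most recent 21 check results and weights newer state changes more heavily, while this version counts every change equally. The function names are hypothetical; the 20.0 and 5.0 thresholds and the comparison operators are taken from the log lines posted above.

```python
def percent_state_change(states):
    """Unweighted share of consecutive check results that differ, in percent.

    Assumption: every state change counts equally. Nagios itself weights
    newer changes more, so real values will differ somewhat.
    """
    if len(states) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return 100.0 * changes / (len(states) - 1)


def update_flapping(is_flapping, pct, low=5.0, high=20.0):
    """Hysteresis: start flapping at >= high, stop at < low, else keep the state."""
    if not is_flapping and pct >= high:
        return True
    if is_flapping and pct < low:
        return False
    return is_flapping


if __name__ == "__main__":
    # Hypothetical history of 21 results for a log check that keeps alternating.
    history = ["OK", "CRITICAL"] * 3 + ["OK"] * 15
    pct = percent_state_change(history)
    print(pct, update_flapping(False, pct))
```

A log check with max_check_attempts 1 tends to alternate between OK and CRITICAL, so it can cross the high threshold easily. If that is not what you want, setting flap_detection_enabled 0 in the service definition turns flap detection off for that service, and adding f to notification_options makes Nagios notify when flapping starts and stops.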
Former Nagios employee