check_logfiles and duration of --sticky option

Locked
starless
Posts: 4
Joined: Mon Oct 05, 2015 10:09 am

check_logfiles and duration of --sticky option

Post by starless »

Hi, I'm using the check_logfiles v3.4.2 plugin to check for errors in syslog messages on Linux machines, and I cannot make the --sticky option work as I expect.

Upon detecting an error in the logs, it keeps the alert status for a few minutes, but then goes back to OK even without matching an okpattern.

I tried just "--sticky", but also "--sticky=0" and "--sticky=90000". The result always appears to be the same.

How can the sticky duration be set? Am I missing anything?
Thanks.
Marco
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: check_logfiles and duration of --sticky option

Post by hsmith »

Have you tried updating to the latest version of the plugin? Looking at their page, there have been quite a few releases since the version you are using.
Former Nagios Employee.
me.
starless
Posts: 4
Joined: Mon Oct 05, 2015 10:09 am

Re: check_logfiles and duration of --sticky option

Post by starless »

Thank you, but I've read the release notes and I see no changes to the --sticky option after v3.4.2, so I doubt I'll get a different behaviour... I can try anyway on a test machine, but I'm not sure I'll be able to deploy the new version on the production machine.
starless
Posts: 4
Joined: Mon Oct 05, 2015 10:09 am

Re: check_logfiles and duration of --sticky option

Post by starless »

I downloaded the latest version 3.7.3 and tested it on a test machine, but the result is still the same: the alarm seems to be reset after a few minutes, whatever option I use.

Any clues?
Thanks.
Marco
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: check_logfiles and duration of --sticky option

Post by hsmith »

Since this is not our plugin and we have no direct connection with the author, it is pretty difficult to troubleshoot. Can you try something like 4000 seconds? Maybe there is a limit I am not aware of.
starless
Posts: 4
Joined: Mon Oct 05, 2015 10:09 am

Re: check_logfiles and duration of --sticky option

Post by starless »

Problem solved!
My fault... I didn't realize that another machine was launching check_logfiles via NRPE on my test machine without the --sticky option, and that check was resetting the sticky state I was setting when I ran the command locally.
I stopped the automated check and now my local tests give the expected results, even with version 3.4.2.

For the record, I was helped in debugging by the -v option (only available in the newer version) and by the debug output the plugin creates in /tmp/check_logfiles.trace if you create the file beforehand.
Now I can also see that the default behaviour with just "--sticky", with no numeric value, is to keep alarms sticky for 10 years (!!).
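
For anyone following the same debugging steps, here is a minimal sketch of a local test run. The plugin path, log file, tag, and pattern are assumptions for illustration and need adjusting to your installation; only the trace file location, the -v option, and the --sticky value come from this thread.

```shell
#!/bin/sh
# Trace file named in this thread; per the post above, the plugin only
# writes debug output to it if the file already exists.
TRACE=/tmp/check_logfiles.trace
touch "$TRACE"

# Hypothetical values: adjust the plugin path, log file, tag, and pattern.
PLUGIN=/usr/local/nagios/libexec/check_logfiles

if [ -x "$PLUGIN" ]; then
    # -v is reported above as available only in the newer plugin version.
    "$PLUGIN" --tag=syslog --logfile=/var/log/messages \
        --criticalpattern='error' --sticky=90000 -v
    echo "plugin exit code: $?"
else
    echo "check_logfiles not found at $PLUGIN"
fi

# Show the most recent debug output, if any was written.
tail -n 20 "$TRACE"
```

Before running this, make sure no other check (for example one triggered via NRPE) is running check_logfiles against the same log without --sticky, since that is what was resetting the state in my case.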

Thank you for pushing me to keep testing ;).
Marco
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: check_logfiles and duration of --sticky option

Post by hsmith »

Awesome! I'm glad to hear it is working. I was a little bit concerned because people seem to have very few issues with that plugin. I'll go ahead and close this thread. Please let us know if you need help with anything else.

Thanks.