I have a customer who wants us to monitor logs for their Linux server application using NSCA.
The aim of this effort is to monitor log files. I found an article from Nagios suggesting that events like log file monitoring are better handled with passive checks, since they are asynchronous in nature.
Here is the article : http://nagios.manubulon.com/traduction/ ... hecks.html
I found a Java library that can be integrated into our application to send passive check results to Nagios. Is this something that you can accommodate?
The documentation can be found on page 3 of the URL http://jsendnsca.googlecode.com/svn/jse ... 0Guide.pdf
Example code is below
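// Where the Nagios server's NSCA daemon is listening
NagiosSettings settings = new NagiosSettingsBuilder()
    .withNagiosHost("localhost")
    .withPort(5667)
    .withEncryption(Encryption.XOR)
    .create();
// The passive check result to submit
MessagePayload payload = new MessagePayloadBuilder()
    .withHostname("hostname of machine sending check")
    .withLevel(Level.OK)
    .withServiceName("jsendnsca")
    .withMessage("It works!")
    .create();
NagiosPassiveCheckSender sender = new NagiosPassiveCheckSender(settings);
// The snippet as pasted never sends anything; send() submits the result
// (it can throw, so real code should wrap it in try/catch)
sender.send(payload);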
At this time we don't have NSCA configured on the Nagios XI server to accept connections on TCP 5667, and we would need to start making firewall changes, among other things.
The doc seems to be somewhat outdated (like 2001). Are there better methods to do this, preferably with active checks?
Are they looking for a specific string of text in their log files? What is the ultimate goal for monitoring these log files? You should be able to do this fairly easily with active checks.
I was finally able to get a comprehensive list and did some research on checks for this.
Found a plugin called check_log3.pl
Not sure if these are the most appropriate, but I created commands in NRPE for the following:
Check the log file kpi.log for the pattern SQLException, using a seek file called /var/log/seek_files/check_log3_kpi_sql.seek. One instance of this pattern in the log makes the result critical.
24x7 checks and alerts, check interval of 3 minutes, retry interval of 1440 minutes, max check attempts 1.
The log is written daily from 1:00 AM to 2:00 AM (this has shifted because their application doesn't account for daylight saving time; the window is now 2:00-3:30).
Check whether kpi.log has been written to since the last scan, using the seek file /var/log/seek_files/check_log3_kpi_after.seek. One instance of this is critical.
Check period only from 1:00-2:00 AM, check interval of 30 minutes, retry interval of 1 minute, max check attempts 2. Same daylight saving issue as above.
Check the log file kpi.log for the patterns Processing, End and Start, using the seek file /var/log/seek_files/check_log3_kpi_completion.seek. One instance of this is critical.
Check period only from 1:00-2:00 AM, check interval of 30 minutes, retry interval of 1 minute, max check attempts 1. Same daylight saving issue as above.
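For anyone following along, here is a rough sketch of what the seek file is doing in the checks above, written in Java since that is what the application side uses. This is only an illustration of the general technique (remember the byte offset reached on the last scan, read only what was appended since, count pattern hits, and note whether the log grew at all). It is not how check_log3.pl is actually implemented, and the class and method names (SeekFileLogCheck, scan, Result) are made up for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical illustration of a seek-file based log check; not check_log3.pl itself.
public class SeekFileLogCheck {

    /** Outcome of one scan: matching new lines, and whether the log grew since the last scan. */
    public static final class Result {
        public final int matches;
        public final boolean grew;

        Result(int matches, boolean grew) {
            this.matches = matches;
            this.grew = grew;
        }
    }

    public static Result scan(Path log, Path seekFile, Pattern pattern) throws IOException {
        // Load the offset saved by the previous run; start at 0 if there is none.
        long offset = 0L;
        if (Files.exists(seekFile)) {
            String saved = new String(Files.readAllBytes(seekFile), StandardCharsets.UTF_8).trim();
            if (!saved.isEmpty()) {
                offset = Long.parseLong(saved);
            }
        }

        int matches = 0;
        long end;
        try (RandomAccessFile raf = new RandomAccessFile(log.toFile(), "r")) {
            // A log shorter than the saved offset was rotated or truncated: rescan from the top.
            if (offset > raf.length()) {
                offset = 0L;
            }
            raf.seek(offset);
            String line;
            // readLine() maps each byte to one char, which is fine for ASCII patterns.
            while ((line = raf.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    matches++;
                }
            }
            end = raf.getFilePointer();
        }

        // Remember where this scan stopped so the next run only sees new lines.
        Files.write(seekFile, Long.toString(end).getBytes(StandardCharsets.UTF_8));
        return new Result(matches, end > offset);
    }

    // Usage: java SeekFileLogCheck <logfile> <seekfile> <regex>
    public static void main(String[] args) throws IOException {
        Result r = scan(Paths.get(args[0]), Paths.get(args[1]), Pattern.compile(args[2]));
        boolean critical = r.matches > 0;
        System.out.println((critical ? "CRITICAL" : "OK") + " - " + r.matches + " new matching line(s)");
        // Nagios plugin convention: exit code 0 is OK, 2 is CRITICAL.
        System.exit(critical ? 2 : 0);
    }
}
```

The first check would alert on matches > 0, and the "not written to since last scan" check would alert when grew is false. Each check needs its own seek file, as in the setup above, because a shared one would let one check consume lines before the others see them.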
Just curious if using Nagios Log Server (even the free version) might be easier, save you a lot of time, and give you more functionality?
The error you're seeing is indicative of not being able to make an SSL connection. This is often caused by the client NRPE server not being compiled on a system with the SSL libraries installed. Are you using other NRPE checks? Are they working?
Try changing your "check_nrpe" command to execute "$USER1$/check_nrpe -n" instead of "$USER1$/check_nrpe"; that will disable the SSL connection attempt. You can test from the command line with: /usr/local/nagios/libexec/check_nrpe -H <hostname> -n
We are not using Nagios Log Server. How is it easier? Might be interested in this....
Screenshot attached. Yes I am using other checks besides the log checks and those are fine. The log checks are flapping between states and only those are experiencing this.
Well, this is a Solaris 10 server. It shouldn't be, but it seems like it might be a problem for syslog on the server to send the logs. The dev guys currently have syslog configured on the server, and I don't want to overwrite anything they already have.