Hi,
We are trying to monitor a folder of logs 24/7 for a particular pattern. The logs are rotated after every 10 MB of data and are basically execution logs of batch jobs from a WebSphere system. We want to catch the string "Exception caught" in the log files, wherever it appears.
So we are using the check_log3.pl script. We pass the log folder with the -l option and the filename pattern POSync.log.* with the -m option, but the results are not accurate.
For one job, the pattern was written to POSync.log itself, but it still wasn't detected.
Here is an example.
iptposnc 2018-04-27 11:31:06,832 [main] INFO org.springframework.beans.factory.xml.XmlBeanDefinitionReader - Loading XML bean definitions from class path resource [org/springframework/jdbc/support/sql-error-codes.xml]
iptposnc 2018-04-27 11:31:06,931 [main] INFO org.springframework.jdbc.support.SQLErrorCodesFactory - SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase, Hana]
iptposnc 2018-04-27 11:31:06,948 [main] INFO PaymentHostDao.class - Exception caught while getting PO data T7756_IPT_PO tablePreparedStatementCallback; SQL [UPDATE LOWES.T7756_IPT_PO set RR_NBR = ? , FNL_DTN_LCT_NBR = ?, T7624_PO_STS_CD = CASE WHEN (T7624_PO_STS_CD > ?) THEN T7624_PO_STS_CD ELSE ? END , E213_PO_STS_CD = CASE WHEN (E213_PO_STS_CD > ?) THEN E213_PO_STS_CD ELSE ? END , CDK_FCY_LCN_NBR = ? , SHP_DT = ?, ARV_DT = ?, INI_RCP_DT = CASE WHEN (INI_RCP_DT IS NOT NULL) THEN INI_RCP_DT ELSE ? END , SHP_FRO_VBU_NBR = ?, CDN_VBU_NBR = ?, BYR_USE_NME = ? , MER_CST_AMT = ?, TOT_PO_ORD_QTY = ?, TOT_GRS_WGT_MSR = ?, TOT_GRS_WGT_UOM_TXT = 'LBS', TOT_PO_VLM_MSR = ?, TOT_PO_VLM_UOM_TXT = 'CFT', TL_NBR = ?, FIN_IMT_NBR = ?, T4830_CNR_ORD_SIZ_CD = ?, T4830_CNR_SHP_SIZ_CD = ?, T5104_LAD_PNT_ECG_LCT_CD = ?, T5104_PNT_DCH_ECG_LCT_CD = ?, T5104_PNT_OF_RSB_LCT_CD = ?, PO_SNC_IDC = 'N',PO_LST_SNC_DM = current_timestamp, UPD_DM = current_timestamp,UPD_ID = ? WHERE T7756_PO_ID = ? AND (PO_SNC_IDC = 'Y' OR PO_SNC_IDC = 'H' OR PO_SNC_IDC = 'A' ) ]; [jcc][t4][102][10040][4.18.60] Batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch.
Use getNextException() to retrieve the exceptions for specific batched elements. ERRORCODE=-4229, SQLSTATE=null; nested exception is com.ibm.db2.jcc.am.BatchUpdateException: [jcc][t4][102][10040][4.18.60] Batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch.
Use getNextException() to retrieve the exceptions for specific batched elements. ERRORCODE=-4229, SQLSTATE=null
Logs:
-rw-r--r-- 1 iptrbolt staff 1048658 Feb 20 09:59 POSync.log.10
-rw-r--r-- 1 iptrbolt staff 1048681 Feb 21 14:14 POSync.log.9
-rw-r--r-- 1 iptrbolt staff 1048755 Feb 21 14:14 POSync.log.8
-rw-r--r-- 1 iptrbolt staff 1048711 Feb 21 14:14 POSync.log.7
-rw-r--r-- 1 iptrbolt staff 1048934 Feb 21 14:14 POSync.log.6
-rw-r--r-- 1 iptrbolt staff 1048631 Apr 27 10:26 POSync.log.5
-rw-r--r-- 1 iptrbolt staff 1048958 Apr 27 11:19 POSync.log.4
-rw-r--r-- 1 iptrbolt staff 1048671 Apr 27 11:19 POSync.log.3
-rw-r--r-- 1 iptrbolt staff 1048752 Apr 27 11:19 POSync.log.2
-rw-r--r-- 1 iptrbolt staff 1049103 Apr 27 11:19 POSync.log.1
-rw-r--r-- 1 iptrbolt staff 666687 Apr 27 12:17 POSync.log
Location:
/usr/iptbatch/PaymentsPOSyncBatch/log
Command:
"/home/batman/nagios/check_log3.pl -l '/usr/iptbatch/PaymentsPOSyncBatch/log/' -m 'POSync.log.*' -p 'Exception caught' -p 'exception caught' -c 1"
Please let us know how we can achieve this.
Let me know if you need any clarification.
Logfile monitoring for rotated logs
Re: Logfile monitoring for rotated logs
Just a hunch (I could be totally wrong), but check_log3 may be remembering that it has already scanned those files, since the rotated files keep the same names and are all roughly the same length. By default, check_log3 stores a seek file in /tmp that records the position it reached in the log on the last scan. You might take a look at that seek file to confirm or rule out this hunch. I've not spent any time looking at how the seek file works.
If my hunch is right, you might alter your logrotate rule for these logs to append the date or timestamp rather than a simple series number and see if that simple change has the desired effect.
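To make the hunch a bit more concrete, here is a minimal Python sketch of how a seek-position scanner could miss new content. This is not check_log3's actual code; the function name scan_new_lines and the seek file layout (a single byte offset) are made up for illustration.

```python
import os

def scan_new_lines(log_path, seek_path, pattern):
    """Return lines containing `pattern` that were added since the last scan."""
    # Load the byte offset saved by the previous scan (0 if there is none).
    try:
        with open(seek_path) as fh:
            offset = int(fh.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0

    # If the file is now shorter than the saved offset, assume it was
    # rotated and start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    hits = []
    with open(log_path, "rb") as fh:
        fh.seek(offset)
        for raw in fh:
            line = raw.decode("utf-8", errors="replace")
            if pattern in line:
                hits.append(line.rstrip("\n"))
        offset = fh.tell()

    # Remember where this scan stopped.
    with open(seek_path, "w") as fh:
        fh.write(str(offset))
    return hits
```

With a scheme like this, a rotated file that reuses the same name and is at least as long as the old one never triggers the reset, so everything before the saved offset is skipped. That would fit files that all end up at roughly the same size.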
Also, a plug for Nagios Log Server which is superior to check_log3 in many ways.
Code:
-s, --seekfile=
The temporary file to store the seek position of the last scan. If not
specified, it will be automatically generated in /tmp, based on the
log file's base name. If this is a directory, the seek file will be auto-
generated there instead of in /tmp.
If you specify the system's null device (/dev/null), the entire log file
will be read every time.
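For comparison, here is a rough Python sketch of the stateless behaviour the help text describes for /dev/null: every matching file is read from the start on each run. It is only an illustration, not how check_log3 is implemented, and count_matches is a hypothetical helper. It also assumes the -m pattern is interpreted as a shell-style glob, which I have not verified against the script.

```python
import fnmatch
import os

def count_matches(log_dir, file_glob, needle):
    """Scan every file in log_dir whose name matches file_glob, from the start."""
    total = 0
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path) or not fnmatch.fnmatchcase(name, file_glob):
            continue
        with open(path, errors="replace") as fh:
            # Case-insensitive, so 'Exception caught' and 'exception caught' both count.
            total += sum(1 for line in fh if needle.lower() in line.lower())
    return total

# Under glob rules, 'POSync.log.*' needs a literal dot after 'log',
# so it matches the rotated files but not the active one.
print(fnmatch.fnmatchcase("POSync.log.1", "POSync.log.*"))  # True
print(fnmatch.fnmatchcase("POSync.log", "POSync.log.*"))    # False
print(fnmatch.fnmatchcase("POSync.log", "POSync.log*"))     # True
```

If the glob assumption holds, it may be worth trying 'POSync.log*' as the pattern so that the active POSync.log is included as well.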
Former Nagios employee
https://www.mcapra.com/