Hi all,
I had Nagios working fine on Voyage (Debian) Linux. I then tried to follow this guide to set up SMS notifications:
http://www.jonathangazeley.com/2009/08/ ... th-nagios/
But I think I broke Nagios while trying to get that running. I had to update CPAN and install gcc, make and build-essential to get the Perl module to install.
I think Apache's configuration was altered while I was installing the various packages, and at first I suspected a permissions issue, but I can't even change the permissions of the files mentioned in the log below; I get Input/output errors. I also created test files in those directories to see whether it was disk corruption.
Here is my /var/log/nagios3/nagios.log file:
[1343244722] Warning: Could not stat() check result file '/var/lib/nagios3/spool/checkresults/ch5zlMa'.
[1343244722] Error: Unable to rename file '/var/cache/nagios3/nagios.tmp4H62m8' to '/var/cache/nagios3/status.dat': Input/output error
[1343244722] Error: Unable to update status data file '/var/cache/nagios3/status.dat': Input/output error
[1343244726] Caught SIGTERM, shutting down...
[1343244726] Successfully shutdown... (PID=6452)
[1343244726] Nagios 3.2.1 starting... (PID=6572)
[1343244726] Local time is Wed Jul 25 20:32:06 IST 2012
[1343244726] LOG VERSION: 2.0
[1343244726] Finished daemonizing... (New PID=6573)
[1343244726] Error: Unable to rename file '/var/cache/nagios3/nagios.tmprlT48L' to '/var/cache/nagios3/status.dat': Input/output error
[1343244726] Error: Unable to update status data file '/var/cache/nagios3/status.dat': Input/output error
[1343244736] Warning: Could not stat() check result file '/var/lib/nagios3/spool/checkresults/ch5zlMa'.
[1343244736] Error: Unable to rename file '/var/cache/nagios3/nagios.tmp9fVLF3' to '/var/cache/nagios3/status.dat': Input/output error
[1343244736] Error: Unable to update status data file '/var/cache/nagios3/status.dat': Input/output error
[1343244746] Warning: Could not stat() check result file '/var/lib/nagios3/spool/checkresults/ch5zlMa'.
[1343244746] Error: Unable to rename file '/var/cache/nagios3/nagios.tmpMKIzO9' to '/var/cache/nagios3/status.dat': Input/output error
[1343244746] Error: Unable to update status data file '/var/cache/nagios3/status.dat': Input/output error
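A side note for anyone reading the log above: the bracketed number at the start of each line is a Unix epoch timestamp. The sketch below rewrites those fields as UTC dates so the entries are easier to line up with other logs. The function name nagios_log_utc is made up for illustration, and the code assumes GNU date (the -d @SECONDS form); other date implementations may need a different invocation.

```sh
#!/bin/sh
# Rewrite the leading [epoch] field of Nagios log lines (read from stdin)
# as a UTC timestamp. Lines without such a field are passed through unchanged.
# Assumption: GNU date, which accepts "-d @SECONDS".
nagios_log_utc() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]\ *)
                ts=${line#\[}       # drop the leading "["
                ts=${ts%%\]*}       # keep only what precedes the first "]"
                rest=${line#*\] }   # message text after "] "
                case "$ts" in
                    *[!0-9]*)       # not purely numeric: leave the line alone
                        printf '%s\n' "$line" ;;
                    *)
                        printf '[%s] %s\n' \
                            "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$rest" ;;
                esac
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Typical use:
#   nagios_log_utc < /var/log/nagios3/nagios.log
```

As a cross-check, the log itself reports "Local time is Wed Jul 25 20:32:06 IST 2012" for timestamp 1343244726, which corresponds to 19:32:06 UTC on the same day.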
Any help would be much appreciated!
Thanks
Think I broke Nagios already
-
brien.crean
- Posts: 6
- Joined: Mon Jul 16, 2012 11:30 am
Last edited by brien.crean on Thu Jul 26, 2012 9:39 am, edited 1 time in total.
Re: Think I broke Nagios already
Hi!
brien.crean wrote: but I can't even change the permissions of the files mentioned in the below log file. I am getting Input/Output errors.
Also take a look at the dmesg output after the Input/output error.
Maybe you can try fsck.<yourFStype> to see if there's a filesystem error, and badblocks to test the disk for bad sectors.
Have a nice day
Debe
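The checks suggested above could be sketched roughly as follows. The helper name kernel_io_errors and its pattern list are assumptions made for illustration (real kernel messages vary by driver and kernel version), and /dev/hda1 is simply the device that appears elsewhere in this thread; substitute your own.

```sh
#!/bin/sh
# Filter kernel messages (read from stdin) for lines that hint at disk or
# filesystem trouble. The pattern list is illustrative, not exhaustive.
kernel_io_errors() {
    grep -iE 'I/O error|EXT[234]-fs (error|warning)'
}

# Typical use, as root, on the affected machine:
#   dmesg | kernel_io_errors
#   fsck.ext2 -n /dev/hda1     # -n: open read-only and answer "no" to every prompt
#   badblocks -sv /dev/hda1    # default read-only scan, with progress and verbose output
```

All three commands only read from the disk, so they should be reasonably safe as a first look, although a read-only fsck on a mounted filesystem can report inconsistencies that are only in-flight changes.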
-
brien.crean
Re: Think I broke Nagios already
gdeber wrote: Maybe you can try fsck.<yourFStype> to see if there's some fs error, and badblocks to test disk for bad sectors.
Thanks Debe. I am running Linux on an Alix with a CF card. I ran fsck.ext2 on the filesystem and I got:
root@voyage:~# fsck.ext2 -n /dev/hda1
e2fsck 1.41.12 (17-May-2010)
Warning! /dev/hda1 is mounted.
ROOT_FS contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'ch5zlMa.ok' in /var/lib/nagios3/spool/checkresults (82096) has deleted/unused inode 82243. Clear? no
Entry 'ch5zlMa' in /var/lib/nagios3/spool/checkresults (82096) has deleted/unused inode 82170. Clear? no
Entry 'cB9mGcg.ok' in /var/lib/nagios3/spool/checkresults (82096) has deleted/unused inode 82262. Clear? no
Entry 'cB9mGcg' in /var/lib/nagios3/spool/checkresults (82096) has deleted/unused inode 82261. Clear? no
Entry 'status.dat' in /var/cache/nagios3 (82097) has deleted/unused inode 82181. Clear? no
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -338869 -(341382--341385) -(341998--341999) -(343125--343127)
Fix? no
Free blocks count wrong (352235, counted=352242).
Fix? no
Inode bitmap differences: -82170 -82181 -(82243--82244) -(82247--82249)
Fix? no
Free inodes count wrong (100603, counted=100604).
Fix? no
ROOT_FS: ********** WARNING: Filesystem still has errors **********
ROOT_FS: 22277/122880 files (0.5% non-contiguous), 138835/491070 blocks
Looks like a filesystem or bad-block issue, although badblocks didn't return any errors.
Can I run fsck.ext2 while the filesystem is mounted, or would I be better off removing the CF card, connecting it via USB to another Linux server, and running fsck.ext2 there?
Thanks
Brien
Re: Think I broke Nagios already
Do not run fsck on a mounted volume. Definitely unmount it and run fsck on it elsewhere, unmounted. Only tears and sadness will come from running fsck on a mounted volume.
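One way to guard against this by accident is a small wrapper that refuses to run a repair while the device is still listed as mounted. This is only a sketch: the names is_mounted and safe_fsck are hypothetical, and it assumes the device appears under its own name in /proc/mounts (on some systems the root filesystem is listed as /dev/root instead, which this check would miss).

```sh
#!/bin/sh
# Succeed if device $1 is listed as a mount source in the mounts table $2
# (default: /proc/mounts). The second argument exists only to ease testing.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Hypothetical wrapper: refuse to repair a device that is still mounted.
safe_fsck() {
    if is_mounted "$1"; then
        echo "refusing: $1 is mounted; unmount it first" >&2
        return 1
    fi
    fsck.ext2 -f "$1"   # -f: force a full check even if the filesystem looks clean
}

# Typical use, with the CF card attached to another machine (device name is an example):
#   safe_fsck /dev/sdb1
```

For a root filesystem like the one in this thread, the card cannot be unmounted while the system is running from it, which is why checking it from a second machine (or a rescue boot) is the practical route.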
Nicholas Scott
Former Nagios employee
-
brien.crean
Re: Think I broke Nagios already
nscott wrote: Do not run fsck on mounted volume, definitely unmount it and run fsck on it elsewhere, unmounted. Only tears and sadness will come from running fsck on a mounted volume.
Thanks nscott! I did unmount it and then I ran fsck on the filesystem. All is working fine now, until the next time! Thanks everyone for your help.
Re: Think I broke Nagios already
nscott wrote: Only tears and sadness will come from running fsck on a mounted volume.