Re: HDD Monitoring on Bare metal server
Posted: Mon Mar 02, 2020 3:01 pm
by MOHANREDDY
The script didn't pass; it's failing with the error below.
./check_smarton -d /dev/sda1
CRITICAL: device does not pass health status
But the disk is healthy, and I'm not sure what to do next.
Thanks,
Re: HDD Monitoring on Bare metal server
Posted: Mon Mar 02, 2020 3:15 pm
by lmiltchev
Can you run the following command and show the output?
Re: HDD Monitoring on Bare metal server
Posted: Mon Mar 02, 2020 4:18 pm
by MOHANREDDY
Please see PM
Re: HDD Monitoring on Bare metal server
Posted: Mon Mar 02, 2020 4:25 pm
by lmiltchev
It seems that smartctl is not able to monitor the drive, as SMART support is not available:
SMART support is: Unavailable - device lacks SMART capability.
You would need to contact the hardware vendor to find out whether your device supports SMART and how it can be monitored. Obviously, this cannot be done via the smartctl package.
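To make that check repeatable across machines, here is a minimal sketch in Python that reads the "SMART support is:" lines from saved `smartctl -i` output and reports what they say. The function name `smart_support_status` and the returned labels are made up for illustration; it only looks at text you pass in and does not run smartctl itself.

```python
import re


def smart_support_status(info_text):
    """Summarise the 'SMART support is:' lines found in `smartctl -i` output.

    Returns one of: 'unavailable', 'enabled', 'disabled', 'available', 'unknown'.
    These labels are illustrative, not part of smartctl.
    """
    # Collect the text after every "SMART support is:" line.
    values = re.findall(r"^SMART support is:\s*(.+)$", info_text, flags=re.MULTILINE)
    if not values:
        return "unknown"  # no such line in the output at all
    if any(v.startswith("Unavailable") for v in values):
        return "unavailable"
    if any(v.startswith("Enabled") for v in values):
        return "enabled"
    if any(v.startswith("Disabled") for v in values):
        return "disabled"
    return "available"


sample = "SMART support is: Unavailable - device lacks SMART capability.\n"
print(smart_support_status(sample))
```

A result of "unavailable" matches the situation above: the plugin cannot report health for that device, so the CRITICAL result does not say anything about the disk itself.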
Re: HDD Monitoring on Bare metal server
Posted: Wed Mar 18, 2020 10:34 am
by MOHANREDDY
I wrote a script to monitor multiple hard drives. I have 6 servers with the same configuration, and the script works fine when run locally on each server. When run from Nagios, I get an OK state for one server, but the remaining servers return an UNKNOWN state.
# /usr/local/nagios/libexec/check_nrpe -H server1 -t 90 -c check_smart_multidrive
OK: [megaraid,0] - Device is clean --- [megaraid,1] - Device is clean --- [megaraid,2] - Device is clean --- [megaraid,3] - Device is clean --- [megaraid,4] - Device is clean --- [megaraid,5] - Device is clean --- [megaraid,6] - Device is clean --- [megaraid,7] - Device is clean --- [megaraid,8] - Device is clean --- [megaraid,9] - Device is clean --- [megaraid,10] - Device is clean --- [megaraid,11] - Device is clean|
# /usr/local/nagios/libexec/check_nrpe -H server2 -t 90 -c check_smart_multidrive
UNKNOWN: [megaraid,0] - No health status line found --- [megaraid,1] - No health status line found --- [megaraid,2] - No health status line found --- [megaraid,3] - No health status line found --- [megaraid,4] - No health status line found --- [megaraid,5] - No health status line found --- [megaraid,6] - No health status line found --- [megaraid,7] - No health status line found --- [megaraid,8] - No health status line found --- [megaraid,9] - No health status line found --- [megaraid,10] - No health status line found --- [megaraid,11] - No health status line found|
Do I need to check anything?
Re: HDD Monitoring on Bare metal server
Posted: Wed Mar 18, 2020 12:22 pm
by lmiltchev
Troubleshooting custom scripts is outside the scope of Nagios support. Having said that, one thing you should check is whether you can run your script locally as the nagios user. It is possible that check_nrpe is not working for some of the machines because the nagios user doesn't have sufficient permissions to obtain the data. Check whether nagios has been added to the sudoers file on the remote machines.
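To illustrate why a permissions problem would show up as UNKNOWN rather than CRITICAL, here is a rough sketch in Python of how a multi-drive check might classify `smartctl -H` output. This is not the poster's script, which isn't shown in the thread; the function name `classify` is hypothetical, and the messages are simply reused from the output above. It assumes the health line is either the ATA form ("SMART overall-health self-assessment test result: ...") or the SCSI/SAS form ("SMART Health Status: ...").

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Health lines printed by `smartctl -H`: first form for ATA drives,
# second form for SCSI/SAS drives.
HEALTH_RE = re.compile(
    r"^(?:SMART overall-health self-assessment test result|SMART Health Status):\s*(\S+)",
    re.MULTILINE,
)


def classify(smartctl_output):
    """Map the text of one `smartctl -H` run to (exit_code, message)."""
    match = HEALTH_RE.search(smartctl_output)
    if match is None:
        # smartctl printed something else entirely, e.g. an error because
        # the calling user could not open the device.
        return UNKNOWN, "No health status line found"
    if match.group(1) in ("PASSED", "OK"):
        return OK, "Device is clean"
    return CRITICAL, "Health status: " + match.group(1)


print(classify("SMART Health Status: OK\n"))
print(classify("some error text instead of a health line\n"))
```

If the script behaves roughly like this, then any run where smartctl cannot read the drive (for example when invoked by the nagios user through NRPE without sudo rights) falls into the UNKNOWN branch for every drive, which is what the output from server2 looks like.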