I followed a tutorial (oracle.com) for installing Nagios 4.0.2 on Solaris 11 and found a few directory differences, which I put down to the fact that I was installing 4.1.1.
Generally all appeared good, but it's not good at all!
I'll try to keep this as concise as possible:
The web interface opens fine (http://localhost/nagios/) and Nagios reports that it's running with PID 1878. Services for localhost show OK, but services on the remote host (Security Server, Windows Server 2008 R2) show "Critical": Connection refused.
Now a few checks. COTESS-SYSMON is the Nagios host; the Security server's IP is 192.168.0.115.
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_ssh -H 192.168.0.115
Connection refused
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_ssh -H 127.0.0.1
Server answer:
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.115
I (0,4,1,73 2012-12-17) seem to be doing fine...
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
CHECK_NRPE: Error - Could not complete SSL handshake.
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_nrpe -H localhost
connect to address ::1 port 5666: Connection refused
CHECK_NRPE: Error - Could not complete SSL handshake.
root@COTESS-SYSMON:~# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.1.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-19-2015
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 15 services.
Checked 2 hosts.
Checked 2 host groups.
Checked 0 service groups.
Checked 1 contacts.
Checked 1 contact groups.
Checked 24 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 2 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
root@COTESS-SYSMON:~# /etc/rc.d/init.d/nagios start
bash: /etc/rc.d/init.d/nagios: No such file or directory
root@COTESS-SYSMON:~# /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root@COTESS-SYSMON:~# svcs -xv
svc:/application/nagios:default (?)
State: maintenance since April 18, 2016 07:39:36 AM AEST
Reason: Start method failed repeatedly, last died on Killed (9).
See: http://support.oracle.com/msg/SMF-8000-KS
See: /var/svc/log/application-nagios:default.log
Impact: This service is not running.
Solaris gives no indication of a service running with PID 1878, which Nagios claims to be running under. There is no such process shown with the "top" command.
root@COTESS-SYSMON:~# kill 1878
bash: kill: (1878) - No such process
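The kill output above suggests the PID that Nagios reports is stale. One way to confirm this is to compare the PID recorded in the lock file with the process table. This is only a minimal sketch: it assumes the lock file is at /usr/local/nagios/var/nagios.lock (the real location is whatever lock_file is set to in nagios.cfg), and pid_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether the PID recorded in a lock file is a live process.
# ASSUMPTION: this path is a guess; check the lock_file setting in nagios.cfg.
LOCK_FILE="${LOCK_FILE:-/usr/local/nagios/var/nagios.lock}"

# pid_status is a hypothetical helper, not something shipped with Nagios.
pid_status() {
    # $1 = path to a file that holds a single PID
    if [ ! -r "$1" ]; then
        echo "no lock file"
        return 0
    fi
    read -r pid < "$1" || true
    case "$pid" in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the process exists.
    # Run as root, otherwise a process owned by another user also fails the test.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

pid_status "$LOCK_FILE"
```

A result of "stale" would mean the lock file was left behind by a process that has since died, which would fit the web UI showing a PID that Solaris no longer knows about.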
I'm at a real loss here guys, any advice will be greatly appreciated.
Thanks in advance,
Andrew.
Nagios 4.1.1 on Solaris 11.3 major headaches
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Can you provide us with a link to the tutorial that you followed?
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Sure can. This one here:
http://www.oracle.com/technetwork/artic ... 79071.html
I used the GNU compiler method. As I said above, a few files that needed to be edited were in different directories from those listed, and occasionally didn't need to be edited at all. I assumed that was because I was compiling 4.1.1, not 4.0.2.
Thanks for the reply.
Please note: I'm a bit of a Unix newbie.
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
It looks like there are two issues here: the first with the plugins, and the second with the PID not being killed. Just to clarify, is your Nagios running smoothly, with just the service checks failing?
You mentioned 192.168.0.115 being the Nagios host, but your check_nrpe output indicates it's actually a Windows machine running NSClient++ 0.4.1. This is why your SSH check is getting connection refused. Please run nmap 192.168.0.115.
The second part, check_ssh -H 127.0.0.1, should be working fine. What is the output of sshd -h from the Nagios host?
Did you compile the Nagios plugins with SSL support? What is the output of running /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -n? The -n flag will run without SSL.
The second check, against localhost, suggests it's failing because it's routing over IPv6. Can you take a look at your /etc/hosts and make sure IPv4 is resolvable?
Former Nagios Employee
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Sorry, my wording was misleading. COTESS-SYSMON is the Nagios host (IP=192.168.0.33) and the Security server is the remote client (IP=192.168.0.115), so yes, you're right, rkennedy.
I must apologize, I'm not at work today.
I'll be in front of the server from 7:30am tomorrow (22 hours from now, in case we're in different time zones) and I hope you guys can continue with the help.
Greatly appreciate your replies,
Andrew.
I shall run "nmap 192.168.0.115", "/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -n" and "sshd -h" before posting tomorrow
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Sounds good - we will watch for your update.
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Hopefully there is some helpful info here.
rkennedy wrote: It looks like there are two issues here. The first being with plugins, and the second with the PID not being killed. Just to clarify, is your Nagios running smoothly, and just the service checks failing?
Nagios appears to be running OK. EDIT: Not really. It displays as though it's running, but I've found it doesn't respond to any commands from the web UI. I believe it may load long enough to display data and then in fact shuts down. It's very strange: the web UI says it's running with PID xxxx, while "svcs -xv" says the service isn't running and is in a state of "maintenance".
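For the "maintenance" state, the SMF log named in the earlier svcs -xv output is the place to look for why the start method keeps dying. The following is a rough sketch that only applies on Solaris; the FMRI and log path are copied from that svcs -xv output, and smf_report is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch for Solaris SMF only. FMRI and log path are taken from the
# svcs -xv output earlier in this thread.
FMRI="svc:/application/nagios:default"
SMF_LOG="/var/svc/log/application-nagios:default.log"

# smf_report is a hypothetical helper name.
smf_report() {
    if ! command -v svcs >/dev/null 2>&1; then
        echo "svcs not found: not a Solaris SMF system"
        return 0
    fi
    # Long listing: state, dependencies and the log file SMF uses.
    svcs -l "$FMRI"
    # Last lines of the start method's log, if readable.
    if [ -r "$SMF_LOG" ]; then
        tail -20 "$SMF_LOG"
    fi
}

smf_report

# Once the cause shown in the log is fixed, the maintenance state
# can be cleared so SMF tries the start method again:
#   svcadm clear "$FMRI"
```

Clearing the state without fixing the underlying failure would most likely just put the service back into maintenance, so the log is worth reading first.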
rkennedy wrote: You mentioned 192.168.0.115 being the Nagios host, but your check indicates it's actually a Windows machine running NSClient++ 0.4.1. This is why your SSH is getting connection refused. Please run nmap 192.168.0.115.
root@COTESS-SYSMON:~# nmap 192.168.0.115
Starting Nmap 6.25 ( http://nmap.org ) at 2016-04-20 07:46 AEST
Nmap scan report for security.cote.local (192.168.0.115)
Host is up (0.00034s latency).
Not shown: 986 closed ports
PORT STATE SERVICE
80/tcp open http
135/tcp open msrpc
139/tcp open netbios-ssn
443/tcp open https
445/tcp open microsoft-ds
1311/tcp open rxmon
1433/tcp open ms-sql-s
2179/tcp open vmrdp
2383/tcp open ms-olap4
3389/tcp open ms-wbt-server
5666/tcp open nrpe
49152/tcp open unknown
49153/tcp open unknown
49154/tcp open unknown
MAC Address: 00:19:B9:EF:1E:C8 (Dell)
rkennedy wrote: The second part, check_ssh -H 127.0.0.1 should be working fine. What is the output of sshd -h from the Nagios host?
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_ssh -H 127.0.0.1
Server answer:
root@COTESS-SYSMON:~# sshd -h
bash: sshd: command not found
Do I need to change directories for this command to work?
I tried this:
root@COTESS-SYSMON:~# svcs ssh
STATE STIME FMRI
online 7:36:53 svc:/network/ssh:default
rkennedy wrote: Did you compile Nagios plugins with SSL support? What is the output of running /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -n? The -n flag will run without SSL.
root@COTESS-SYSMON:~# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -n
CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
Reading back over the install tutorial, I can't see SSL support being configured, but again, I am a Unix newbie.
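One rough way to see whether the check_nrpe binary was built against an SSL library is to look at its shared-library dependencies with ldd. This is a sketch under assumptions: has_libssl is a hypothetical helper name, it only inspects the client plugin (not the NRPE daemon), and a statically linked library would not show up this way.

```shell
#!/bin/sh
# Sketch: look for libssl among a plugin's shared-library dependencies.
# The plugin path is the one used in the commands in this thread.
PLUGIN="${PLUGIN:-/usr/local/nagios/libexec/check_nrpe}"

# has_libssl is a hypothetical helper name.
has_libssl() {
    # Reads ldd-style output on stdin; succeeds if a libssl line is present.
    grep 'libssl' >/dev/null
}

if [ -x "$PLUGIN" ] && command -v ldd >/dev/null 2>&1; then
    if ldd "$PLUGIN" 2>/dev/null | has_libssl; then
        echo "libssl is listed as a dependency of $PLUGIN"
    else
        echo "no libssl dependency listed for $PLUGIN"
    fi
fi
```

If no libssl dependency is listed, that would be consistent with a build without SSL, though it is not proof of what causes the handshake error.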
rkennedy wrote: The second check, against localhost suggests it's failing because it's routing over IPv6. Can you take a look at your /etc/hosts and make sure IPv4 is resolvable?
"/etc/inet/hosts" exists; "/etc/hosts" does not. Its contents:
#
# Copyright 2009 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
# Internet host table
#
127.0.0.1 COTESS-SYSMON localhost loghost
::1 COTESS-SYSMON localhost
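Going by the table above, localhost appears to map to both an IPv4 and an IPv6 address, which would fit the earlier check_nrpe output that tried ::1 first. A small sketch for listing every address a name maps to in a hosts-format file follows; addrs_for is a hypothetical helper name, and the file path is the one mentioned in this post.

```shell
#!/bin/sh
# Sketch: list every address that a hosts-format file maps to a given name.
# addrs_for is a hypothetical helper name.
addrs_for() {
    # $1 = hostname to look up, $2 = hosts-format file
    while read -r addr names; do
        case "$addr" in
            ''|'#'*) continue ;;           # skip blank and comment lines
        esac
        for n in $names; do
            case "$n" in
                '#'*) break ;;             # stop at a trailing comment
            esac
            if [ "$n" = "$1" ]; then
                echo "$addr"
                break
            fi
        done
    done < "$2"
}

# Path taken from this post; adjust if the file lives elsewhere.
HOSTS_FILE="${HOSTS_FILE:-/etc/inet/hosts}"
if [ -r "$HOSTS_FILE" ]; then
    addrs_for localhost "$HOSTS_FILE"
fi
```

Run against the file shown above, this would print 127.0.0.1 and ::1, so IPv4 resolution for localhost looks to be in place and the ::1 attempt is simply the first address tried.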
Thanks,
Andrew
Last edited by ruffy01 on Wed Apr 20, 2016 5:00 am, edited 2 times in total.
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
After 6.5 hours of totally unproductive work today trying to get this setup functioning, I'm thinking (and my boss is thinking!) that a different OS may be the way to go.
Solaris 11.3 appears to be a little flaky on our hardware, and that's combined with the Nagios issues.
Any advice either way will be greatly appreciated. Bottom line is I can't afford too much more time on this configuration.
We have a Dell R300 which will be solely configured to host Nagios on a Unix/Linux platform to monitor several Windows physical servers & several Hyper-V servers (Windows also).
It must be rock solid & reliable. The monitoring is required for compliance regulations relevant to the industry my client operates in.
Cheers,
Andrew.
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
I'd recommend CentOS 7. I wrote this guide, which should work for you without issues.
https://assets.nagios.com/downloads/nag ... entos7.pdf
As for the deal with your plugins, I suspect NRPE wasn't working because the plugins were compiled without SSL. This should work with ease once things are set up properly.
Another option is to use our enterprise product (https://www.nagios.com/products/nagios-xi/) and deploy the OVA file straight to your VM infrastructure. You could also do a source install on your R300 if you wanted to stick with a bare metal system.
Re: Nagios 4.1.1 on Solaris 11.3 major headaches
Thanks rkennedy.
Do you have a link to instructions for recompiling the plugins with SSL? I'd like to give that a quick look.
If not, then CentOS it is.
Cheers,
Andrew.