Manually Installing Log Server Issue

swilsongresh · Post by **swilsongresh** » Wed Jul 22, 2015 4:00 am

I am attempting to install Nagios Log Server on CentOS 7 for trial purposes. I have followed the instructions found in the PDF here:

https://library.nagios.com/library/prod ... structions

Once the fullinstall script completes I see the following:

Nagios Log Server Installation Success!

You can finish the final setup steps for Nagios Log Server by visiting:
http:///nagioslogserver/

So I believe the install has been successful.

I then browse to the url (in my case http://myip/nagioslogserver) where I get the screen to complete the install. I choose new instance, 60 day free trial and enter my details as requested. I then complete the installation.

At this stage I am left with a message in the browser which says "Waiting for database startup. It looks like your local elasticsearch service is starting. Why am I getting this error? Elasticsearch can take a little while to start up because of it's indexing. This may take a few seconds. The page will refresh automatically after 5 seconds..."

This never seems to complete. If I run sudo service elasticsearch status on the server itself, the service is reported as running. Whether the indexing is taking place (and just taking a ridiculously long time) or has stopped, I can't be sure. Either way I cannot actually start the system and I am not sure where to go for proper logging information. I have checked /var/log and can see some areas of potential interest, but they have shed no light for me.

To confirm, I have installed this on a completely vanilla version of CentOS 7.

Can anybody advise as to what I may be doing wrong/point me in the direction of some meaningful logs?

swilsongresh · Post by **swilsongresh** » Wed Jul 22, 2015 6:21 am

Update to this, if I tail the poller.log file in /usr/local/nagioslogserver/var/ I see the following error:

Updating Cluster Hosts File
ERROR: Connection to elasticsearch cannot be made
Updating Elasticsearch with instance...
ERROR: Connection to elasticsearch cannot be made
Updating Cluster Hosts File
ERROR: Connection to elasticsearch cannot be made
Updating Elasticsearch with instance...
ERROR: Connection to elasticsearch cannot be made
Finished Polling.

To confirm, if I run sudo service elasticsearch status on the (same) server, I see the following:

elasticsearch.service - LSB: This service manages the elasticsearch daemon
Loaded: loaded (/etc/rc.d/init.d/elasticsearch)
Active: active (exited) since Wed 2015-07-22 11:52:32 BST; 26min ago

Jul 22 11:52:32 nagioslog runuser[3690]: pam_unix(runuser:session): session opened for user nagios by (uid=0)
Jul 22 11:52:32 nagioslog elasticsearch[3646]: Starting elasticsearch: [ OK ]
Jul 22 11:52:32 nagioslog systemd[1]: Started LSB: This service manages the elasticsearch daemon.

It looks to me as though the service is running and I have now tried with both firewalld running and stopped.
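A note on the status output above: for an LSB init script, "active (exited)" only tells you that the start script returned successfully and that systemd is not tracking a live process for the unit. On its own it does not confirm that the Elasticsearch daemon is still up. Here is a minimal sketch of how the two questions could be separated. The function names are hypothetical, and port 9200 is assumed from the curl command used later in this thread.

```shell
# Sketch only: helper names are made up for illustration.

# Reads `service <name> status` output on stdin and prints a one-word verdict
# based on the "Active:" line.
classify_active_line() {
    local line
    line="$(grep -m1 'Active:' || true)"
    case "$line" in
        *"active (running)"*)    echo "running" ;;
        *"active (exited)"*)     echo "exited" ;;
        *"inactive"*|*"failed"*) echo "stopped" ;;
        *)                       echo "unknown" ;;
    esac
}

# Returns 0 if something answers HTTP on host:port within 5 seconds.
# Defaults (localhost, 9200) are assumptions.
es_reachable() {
    curl -s -o /dev/null --max-time 5 "http://${1:-localhost}:${2:-9200}/"
}

# Usage on the Log Server host:
#   service elasticsearch status | classify_active_line
#   es_reachable localhost 9200 && echo "elasticsearch answers" || echo "no answer on 9200"
```

If the first check prints "exited" and the second gets no answer, the daemon most likely died right after the init script returned, which would match the poller.log errors.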

jolson · Post by **jolson** » Wed Jul 22, 2015 9:56 am

There are a few files that we should look at here.

First, the elasticsearch log is likely to be the most telling:

cat /var/log/elasticsearch/*.log

Httpd logs:

cat /var/log/httpd/error_log
cat /var/log/httpd/access_log

Is SELinux disabled?

sestatus

Let's also check out your sudoers files:

cat /etc/sudoers
cat /etc/sudoers.d/*
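For anyone following along, the checks above can be gathered in one pass. This is only a sketch: describe_file and collect_diagnostics are hypothetical helper names, and the 20-line tail is an arbitrary choice. The point of describe_file is to report "empty" or "missing" explicitly, since a bare cat of an empty log prints nothing and is easy to misread.

```shell
# Sketch only: helper names and the 20-line limit are illustrative choices.

# Print whether a file is missing, empty, or show its last 20 lines.
describe_file() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "== $path: missing"
    elif [ ! -s "$path" ]; then
        echo "== $path: empty"
    else
        echo "== $path: last 20 lines"
        tail -n 20 "$path"
    fi
}

# Walk the same files and commands listed in the post above.
# Run as root so the sudoers files are readable.
collect_diagnostics() {
    local f
    for f in /var/log/elasticsearch/*.log \
             /var/log/httpd/error_log /var/log/httpd/access_log \
             /etc/sudoers /etc/sudoers.d/*; do
        describe_file "$f"
    done
    echo "== sestatus"
    if command -v sestatus >/dev/null 2>&1; then
        sestatus
    else
        echo "sestatus not available"
    fi
}

# Usage:
#   collect_diagnostics > /tmp/nls-diagnostics.txt 2>&1
```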

swilsongresh · Post by **swilsongresh** » Wed Jul 22, 2015 10:39 am

Thank you for the reply jolson.

The results of the commands are as follows:

cat /var/log/elasticsearch/*.log - No results. If I look in that directory I have empty log files.

cat /var/log/httpd/error_log - in the attached logserver.zip, nothing jumping out at me other than the FQDN message, possibly the problem?
cat /var/log/httpd/access_log - in the attached logserver.zip. The IP you see is my local IP.

sestatus - SELinux status: disabled

Sudoers file - in the attached logserver.zip

One thing to confirm, I assume Logserver is supported on CentOS7?

Also worth mentioning that I have now attempted to uninstall and reinstall, as well as build on a different server. Each time I see the same issue.

jolson · Post by **jolson** » Wed Jul 22, 2015 11:57 am

swilsongresh wrote: cat /var/log/elasticsearch/*.log - No results. If I look in that directory I have empty log files.

This is a definite red flag to me. Can you start elasticsearch as root?

Log in as root and run the following:

systemctl restart elasticsearch
/etc/init.d/elasticsearch restart
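Since the status command alone can be misleading here, it may help to confirm after a restart that Elasticsearch really answers on its HTTP port. A minimal sketch follows. wait_for_http is a hypothetical helper, and the URL http://localhost:9200/ is an assumption based on the curl command used later in this thread.

```shell
# Sketch only: wait_for_http is an illustrative helper, not part of Nagios Log Server.

# Poll a URL about once per second until it answers or the attempt limit is hit.
# $1 = URL, $2 = maximum number of attempts (default 60).
wait_for_http() {
    local url="$1" limit="${2:-60}" tries=0
    until curl -s -o /dev/null --max-time 2 "$url"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Usage, as root on the Log Server host:
#   systemctl restart elasticsearch
#   wait_for_http http://localhost:9200/ 60 \
#       && echo "elasticsearch is answering" \
#       || echo "still no answer, check /var/log/elasticsearch/"
```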

swilsongresh · Post by **swilsongresh** » Thu Jul 23, 2015 5:10 am

Thanks again jolson.

I am now completely confused. After seeing your post this morning I did exactly as you suggested, logged in as root (which is what I had been doing previously) and ran those 2 commands. Both completed successfully and then I started to see entries in the /var/log/elasticsearch/*.log files.

At this point I attempted to hit the URL and bingo, I could see the login UI.

I then attempted to log in as nagiosadmin but saw the error:

The username specified does not exist.

I searched through the forums and found this post:

https://support.nagios.com/forum/viewto ... in#p123695

I ran through these steps (so created the user "someuser") and was able to log in.

I really do not know what has gone on here. Previously I had been running the command service elasticsearch start (as root) and the service was starting (as also confirmed by the message I was seeing in the UI as well as service elasticsearch status was returning the service as active). I guess this must be my misunderstanding of the difference between the 2 commands?

The good news is that the nagioslogserver does look to be up and running so I really appreciate your assistance with this as I can't believe I would have ever got to the bottom of that!

swilsongresh · Post by **swilsongresh** » Thu Jul 23, 2015 6:30 am

Perhaps I spoke too soon.

After attempting to add a log source, all appears to go well. I copy the command from within Nagios Logserver so:

curl -s -O http://192.168.150.209/nagioslogserver/ ... p-linux.sh
bash setup-linux.sh -s 192.168.150.209 -p 5544 -f "/path/to/file /path/to/another/file/*.log" -t FILE_TAG

I change the path to match the path of my application log directory, run it (on the application server), and I can see all the logs displayed with a final message of:

SELinux is disabled.
rsyslog configuration check passed.
Restarting rsyslog service with 'service'...
Redirecting to /bin/systemctl restart rsyslog.service
Okay.
rsyslog is running with the new configuration.
Visit your Nagios Log Server dashboard to verify that logs are being received.

I log in to the dashboard and can see nothing; it still only sees the one host (itself). To confirm, the firewall is disabled on the application server.
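One quick thing that can be checked from the application server at this point is whether anything is accepting connections on the Log Server port at all. The sketch below assumes the input on port 5544 is TCP, which this thread does not confirm. tcp_port_open is a hypothetical helper that relies on bash's /dev/tcp redirection and the coreutils timeout command.

```shell
# Sketch only: assumes bash with /dev/tcp support and the `timeout` utility.

# Returns 0 if a TCP connection to host:port can be opened within 5 seconds.
tcp_port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage from the application server, with the address and port from the setup command:
#   tcp_port_open 192.168.150.209 5544 \
#       && echo "port 5544 accepts connections" \
#       || echo "nothing reachable on port 5544"
```

A refused connection would point at the Log Server side (nothing listening, or a firewall on that host) rather than at the rsyslog configuration on the application server.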

The /var/log/elasticsearch/*.log (Attached) is showing the following:

[2015-07-23 12:22:34,765][DEBUG][action.search.type ] [b80d81b4-6d79-4a0c-ba18-9fe18da640e8] All shards failed for phase: [query_fetch]
org.elasticsearch.index.IndexShardMissingException: [nagioslogserver][0] missing

I am really not convinced of the validity of my installation here.

jolson · Post by **jolson** » Thu Jul 23, 2015 9:13 am

I have a hunch regarding what might be going on here, please try running the following commands:

cat /usr/local/nagioslogserver/var/cluster_uuid
cat /usr/local/nagioslogserver/logstash/etc/conf.d/*
curl 'localhost:9200/_cluster/health?level=indices&pretty'

It's possible that the 'nagioslogserver' or similar index went bad.
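To read the health output at a glance, it can be reduced to one line per index. This is a rough sketch: index_statuses is a hypothetical helper that depends on the one-key-per-line layout produced by &pretty, and it is not a general JSON parser.

```shell
# Sketch only: depends on the pretty-printed layout of _cluster/health?level=indices.

# Reads the health JSON on stdin and prints "<index> <status>" for each index.
index_statuses() {
    awk '
        # Everything after the "indices" key belongs to individual indices.
        /"indices"[ ]*:[ ]*[{]/ { in_indices = 1; next }
        # A key that opens an object is an index name.
        in_indices && /"[^"]+"[ ]*:[ ]*[{]/ { split($0, q, "\""); name = q[2]; next }
        # The status line inside that object carries the colour.
        in_indices && /"status"/ { split($0, q, "\""); print name, q[4] }
    '
}

# Usage on the Log Server host:
#   curl -s 'localhost:9200/_cluster/health?level=indices&pretty' | index_statuses
```

As a reading aid: "red" means at least one primary shard is unavailable, while "yellow" means all primaries are allocated but some replicas are not. On a single-node cluster with number_of_replicas set to 1, yellow is expected, because a replica is never placed on the same node as its primary.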

swilsongresh · Post by **swilsongresh** » Thu Jul 23, 2015 9:26 am

Results are as follows:

cat /usr/local/nagioslogserver/var/cluster_uuid

3b70f83b-d36f-48ae-ba17-9f94d1d65244

cat /usr/local/nagioslogserver/logstash/etc/conf.d/*

#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 22 Jul 2015 11:53:49 +0100
#

#
# Global inputs
#

#
# Local inputs
#

#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 22 Jul 2015 11:53:49 +0100
#

#
# Global filters
#

#
# Local filters
#

#
# Logstash Configuration File
# Dynamically created by Nagios Log Server
#
# DO NOT EDIT THIS FILE. IT WILL BE OVERWRITTEN.
#
# Created Wed, 22 Jul 2015 11:53:49 +0100
#

#
# Required output for Nagios Log Server
#

output {
    elasticsearch {
        cluster => '3b70f83b-d36f-48ae-ba17-9f94d1d65244'
        host => 'localhost'
        document_type => '%{type}'
        node_name => ''
        protocol => 'transport'
        workers => 4
    }
}

#
# Global outputs
#

#
# Local outputs
#

curl 'localhost:9200/_cluster/health?level=indices&pretty'

{
  "cluster_name" : "3b70f83b-d36f-48ae-ba17-9f94d1d65244",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 11,
  "active_shards" : 11,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 11,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "indices" : {
    "kibana-int" : {
      "status" : "yellow",
      "number_of_shards" : 5,
      "number_of_replicas" : 1,
      "active_primary_shards" : 5,
      "active_shards" : 5,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 5
    },
    "nagioslogserver" : {
      "status" : "yellow",
      "number_of_shards" : 1,
      "number_of_replicas" : 1,
      "active_primary_shards" : 1,
      "active_shards" : 1,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 1
    },
    "nagioslogserver_log" : {
      "status" : "yellow",
      "number_of_shards" : 5,
      "number_of_replicas" : 1,
      "active_primary_shards" : 5,
      "active_shards" : 5,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 5
    }
  }
}

Does this help?

jolson · Post by **jolson** » Thu Jul 23, 2015 9:37 am

swilsongresh wrote: Does this help?

Yes, the information that you've reported shows us that your cluster is in a healthy state and that there isn't likely any index corruption.

The bad news is that you don't have any 'data' indices being generated. What I want you to do is the following:

Get your node_UUID:

cat /usr/local/nagioslogserver/var/node_uuid

Use the node UUID as a setting in your output:

vi /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf

Change:
node_name => ''
To:
node_name => 'YournodeUUID'
Replacing YournodeUUID with the output of the first cat command we ran. Do not remove the single quotes from the above configuration.

Restart logstash:

service logstash restart

Hopefully that will help you with your problem. Any luck?
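For reference, the manual edit described above could be scripted roughly as follows. This is a sketch under a few assumptions: set_node_name is a hypothetical helper, GNU sed is available (for -i), and the backup is written outside conf.d so that logstash cannot pick it up as an extra config file. Also note that the header of the generated file says it will be overwritten, so the change may not survive the next time Nagios Log Server regenerates its configuration.

```shell
# Sketch only: set_node_name is an illustrative helper; requires GNU sed.

# Fill in an empty node_name in a logstash output file.
# $1 = config file, $2 = node UUID, $3 = backup path (optional, defaults to /tmp).
set_node_name() {
    local conf="$1" uuid="$2"
    local backup="${3:-/tmp/$(basename "$conf").bak}"
    if [ -z "$uuid" ]; then
        echo "node UUID is empty" >&2
        return 1
    fi
    # Keep a copy of the original outside the conf.d directory.
    cp -p "$conf" "$backup" || return 1
    # Only touch a node_name line whose value is the empty string ''.
    sed -i "s/^\([[:space:]]*node_name[[:space:]]*=>[[:space:]]*\)''/\1'${uuid}'/" "$conf"
}

# Usage, as root on the Log Server host:
#   set_node_name /usr/local/nagioslogserver/logstash/etc/conf.d/999_outputs.conf \
#       "$(cat /usr/local/nagioslogserver/var/node_uuid)"
#   service logstash restart
#   curl -s 'localhost:9200/_cat/indices?v'   # see whether new indices start to appear
```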

Nagios Support Forum
