Elasticsearch service failure on Nagios Logserver

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Elasticsearch service failure on Nagios Logserver

Post by pbroste »

Hello @HIINNS

Thanks for following up with the results. It appears that the command came back with a 'Last login' line from cron, which suggests the last login is being pulled from a cron job.

Circling back a bit to when you stated:
"Further looking into this revealed that the /usr/share/elasticsearch directory is set up using docker."

Since Elasticsearch is installed inside a Docker container, which is not supported within our application configuration, the following steps are offered as guidance only. Let's start by listing all the Docker containers.

Code: Select all

docker container ls
Let's find out the status:

Code: Select all

docker ps
Then we can take a peek at the logs for the 'elasticsearch' Docker Container so we can see what is going on there:

Code: Select all

docker container logs --details <containernamehere>
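Before running the Docker commands above, it may be worth confirming that the docker CLI is actually installed. The following is a minimal sketch, not part of the product tooling; the helper names have_cmd and docker_cli_report are made up for illustration.

```shell
#!/bin/sh
# Sketch: confirm the docker CLI exists before running any Docker checks.

# have_cmd NAME: succeeds when NAME resolves to a command on PATH.
# (hypothetical helper name, for illustration only)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# docker_cli_report: prints where the docker CLI lives, or that it is missing.
docker_cli_report() {
    if have_cmd docker; then
        echo "docker CLI found: $(command -v docker)"
    else
        echo "docker CLI not found on PATH"
    fi
}

docker_cli_report
```

If the report says the CLI is missing, the three docker commands above will fail with "command not found" and can be skipped.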
Thanks,
Perry
HIINNS
Posts: 172
Joined: Wed Mar 14, 2018 9:43 am

Re: Elasticsearch service failure on Nagios Logserver

Post by HIINNS »

docker container ls
-bash: docker: command not found

whereis docker
docker: /etc/docker

cd /etc/docker
ls -la
certs.d

cd /usr/share/common/Docker
ls -la

containerd.io-1.3.7-3.1.el7.x86_64.rpm
containerd.io-1.4.3-3.1.el7.x86_64.rpm
docker-ce-19.03.13-3.el7.x86_64.rpm
docker-ce-20.10.2-3.el7.x86_64.rpm
docker-ce-cli-19.03.13-3.el7.x86_64.rpm
docker-ce-cli-20.10.2-3.el7.x86_64.rpm
docker-ce-rootless-extras-20.10.2-3.el7.x86_64.rpm
nodejs-14.15.4-1nodesource.x86_64.rpm
nodesource-release-el7-1.noarch.rpm
node-v14.15.4-linux-x64.tar.gz
node-v14.15.4-linux-x64.tar.xz

Looks like Docker was installed and removed. I can't find the executable anywhere.
HIINNS
Posts: 172
Joined: Wed Mar 14, 2018 9:43 am

Re: Elasticsearch service failure on Nagios Logserver

Post by HIINNS »

I found the following in directories /lib/systemd/system and /etc/systemd/system

[Unit]
Description=Elasticsearch
Documentation=https://www.elastic.co
Wants=network-online.target
After=network-online.target

[Service]
Type=notify
RuntimeDirectory=elasticsearch
PrivateTmp=true
Environment=ES_HOME=/usr/share/elasticsearch
Environment=ES_PATH_CONF=/etc/elasticsearch
Environment=PID_DIR=/var/run/elasticsearch
Environment=ES_SD_NOTIFY=true
#EnvironmentFile=-/etc/sysconfig/elasticsearch
EnvironmentFile=/etc/sysconfig/elasticsearch

WorkingDirectory=/usr/share/elasticsearch

User=elasticsearch
Group=elasticsearch

ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet

# StandardOutput is configured to redirect to journalctl since
# some error messages may be logged in standard output before
# elasticsearch logging system is initialized. Elasticsearch
# stores its logs in /var/log/elasticsearch and does not use
# journalctl by default. If you also want to enable journalctl
# logging, you can simply remove the "quiet" option from ExecStart.
StandardOutput=journal
StandardError=inherit

# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65535

# Specifies the maximum number of processes
LimitNPROC=4096

# Specifies the maximum size of virtual memory
LimitAS=infinity

# Specifies the maximum file size
LimitFSIZE=infinity

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0

# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM

# Send the signal only to the JVM rather than its control group
KillMode=process

# Java process is never killed
SendSIGKILL=no

# When a JVM receives a SIGTERM signal it exits with code 143
SuccessExitStatus=143

# Allow

I found that this command, "ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet", is used to stand up Elasticsearch using Docker. I looked at an earlier version that works, and the above was not found in either directory.

The earlier version contains a startup script in /etc/rc3.d/S80elasticsearch and /etc/rc5.d/S80elasticsearch. It is also found in /etc/init.d.
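To compare the two copies of the unit file, one option is to pull out just the ExecStart line from each. This is a rough sketch: the file name elasticsearch.service is an assumption (the post names the directories but not the file), and unit_execstart is a hypothetical helper.

```shell
#!/bin/sh
# unit_execstart FILE: prints the command from every ExecStart= line in FILE.
# (hypothetical helper name, for illustration only)
unit_execstart() {
    sed -n 's/^ExecStart=//p' "$1"
}

# Assumed file name; the post gives the directories but not the unit file itself.
for unit in /lib/systemd/system/elasticsearch.service \
            /etc/systemd/system/elasticsearch.service; do
    if [ -r "$unit" ]; then
        echo "$unit: $(unit_execstart "$unit")"
    fi
done
```

When both copies exist, systemd prefers the one under /etc/systemd/system. Running 'systemctl cat elasticsearch' shows the file that is actually loaded.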
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Elasticsearch service failure on Nagios Logserver

Post by pbroste »

Hello @HIINNS

Circling back, it appears that the Logstash and Elasticsearch services are running; the following API checks should confirm that:

Code: Select all

curl -X GET "localhost:9200/_cat/nodes?v=true&pretty"
curl -XGET http://localhost:9200/_status?pretty
curl -XGET 'localhost:9200/_cluster/health?pretty'
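If only the cluster status is of interest, the health output can be reduced to a single word. The sketch below assumes the JSON layout returned by the _cluster/health call above; health_status is a hypothetical helper name.

```shell
#!/bin/sh
# health_status: reads _cluster/health JSON on stdin and prints the value
# of its "status" field (green, yellow, or red).
# (hypothetical helper name, for illustration only)
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Example use against a local node:
#   curl -s 'localhost:9200/_cluster/health?pretty' | health_status
```

This is plain text matching rather than a real JSON parser, so treat it as a quick check only.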
If that is the case, then we want to verify the Apache service and its configuration. First, is the Apache service running?

Code: Select all

systemctl status httpd
(or 'systemctl status apache2', depending on the distro)

Please send over the Apache logs:

Code: Select all

tar -czvf /tmp/apachelogs.tar.gz /var/log/httpd/
Thanks,
Perry
HIINNS
Posts: 172
Joined: Wed Mar 14, 2018 9:43 am

Re: Elasticsearch service failure on Nagios Logserver

Post by HIINNS »

curl -X GET "localhost:9200/_cat/nodes?v=true&pretty"
host ip heap.percent ram.percent load node.role master name
myserver 127.0.1.1 1 64 0.00 d * bec4d3fc-0bab-49f5-88cf-fb1094c85cfd

curl -XGET http://localhost:9200/_status?pretty
{
"_shards" : {
"total" : 0,
"successful" : 0,
"failed" : 0
},
"indices" : { }
}

curl -XGET 'localhost:9200/_cluster/health?pretty'
{
"cluster_name" : "c0d6d20a-0a3c-4d14-9cda-1f2f4fcb6b55",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}

systemctl status httpd
Last login: Thu Dec 2 05:30:04 EST 2021
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-11-08 12:19:57 EST; 3 weeks 2 days ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 9890 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 18084 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
CGroup: /system.slice/httpd.service
├─ 3895 /usr/sbin/httpd -DFOREGROUND
├─ 3954 /usr/sbin/httpd -DFOREGROUND
├─ 3962 /usr/sbin/httpd -DFOREGROUND
├─ 9895 /usr/sbin/httpd -DFOREGROUND
├─ 9896 /usr/sbin/httpd -DFOREGROUND
├─ 9897 /usr/sbin/httpd -DFOREGROUND
├─ 9898 /usr/sbin/httpd -DFOREGROUND
├─ 9899 /usr/sbin/httpd -DFOREGROUND
└─18084 /usr/sbin/httpd -DFOREGROUND

Nov 22 03:34:01 myserver systemd[1]: Reloading The Apache HTTP Server.
Nov 22 03:34:02 myserver systemd[1]: Reloaded The Apache HTTP Server.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailabl
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Elasticsearch service failure on Nagios Logserver

Post by pbroste »

Hello @HIINNS

The API status checks via curl all pass, and Apache is running. Next, I want to find out what you see when you curl the Log Server login page:

Code: Select all

curl -k http://yourhostaddresshere/nagioslogserver/login&pretty --verbose
Add a redirect to a file so you can send the results over: > /tmp/curlresults.txt

Thanks,
Perry
HIINNS
Posts: 172
Joined: Wed Mar 14, 2018 9:43 am

Re: Elasticsearch service failure on Nagios Logserver

Post by HIINNS »

curl -k http://192.168.23.207/nagioslogserver/login&pretty --verbose > /tmp/curlresults.txt
[1] 4081
-bash: pretty: command not found
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Elasticsearch service failure on Nagios Logserver

Post by pbroste »

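You can remove the '&pretty' from the command. Unquoted, the '&' tells the shell to run curl in the background and then treat 'pretty' as a separate command, which is why you saw 'pretty: command not found':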

Code: Select all

curl -k --verbose http://yourhostaddresshere/nagioslogserver/login
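If the full verbose trace is more than needed, curl can also print just the HTTP status code. This is a small sketch using curl's --write-out option; the helper names login_status and is_server_error are hypothetical.

```shell
#!/bin/sh
# login_status URL: prints only the HTTP status code returned for URL.
# --silent hides the progress meter, --output /dev/null discards the body.
# (hypothetical helper name, for illustration only)
login_status() {
    curl -k --silent --output /dev/null --write-out '%{http_code}\n' "$1"
}

# is_server_error CODE: succeeds when CODE is a three-digit 5xx status.
# (hypothetical helper name, for illustration only)
is_server_error() {
    case "$1" in
        5[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (keep the URL quoted so the shell never interprets it):
#   code=$(login_status 'http://yourhostaddresshere/nagioslogserver/login')
#   is_server_error "$code" && echo "server-side error: $code"
```

A 5xx code points at the web server or PHP side rather than at the client, so the Apache error log is the next place to look.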
Perry
HIINNS
Posts: 172
Joined: Wed Mar 14, 2018 9:43 am

Re: Elasticsearch service failure on Nagios Logserver

Post by HIINNS »

curl -k --verbose http://nnagsr10/nagioslogserver/login > /tmp/curlresults.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to myserver port 80 (#0)
* Trying 127.0.1.1...
* Connected to myserver (127.0.1.1) port 80 (#0)
> GET /nagioslogserver/login HTTP/1.1
> User-Agent: curl/7.29.0
> Host: myserver
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 500 Internal Server Error
< Date: Mon, 06 Dec 2021 10:37:15 GMT
< Server: Apache/2.4.6 (Red Hat Enterprise Linux) PHP/5.4.16
< X-Powered-By: PHP/5.4.16
< Content-Length: 0
< Connection: close
< Content-Type: text/html; charset=UTF-8
<
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Closing connection 0
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Elasticsearch service failure on Nagios Logserver

Post by pbroste »

Hello @HIINNS

Thanks for following up. Even though Elasticsearch appears to be running in every respect, we cannot work out how it relates to an actual Docker container, or why the whole backend is running and the API status is green. Is this a test environment or a proof of concept?

We see several issues. The first is that the Logstash config files are missing:
:message=>"No config files found: /usr/local/nagioslogserver/logstash/etc/conf.d/*\nCan you make sure this path is a logstash config file?"
This indicates that the install was incomplete or that permissions are incorrect. Look at the file structure within the /usr/local/nagioslogserver/... directory and verify that the configs are present and that the permissions look correct.
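One quick way to check the directory named in that error is to count the files in it. The following is a minimal sketch; count_configs is a hypothetical helper, and the path is taken from the Logstash message above.

```shell
#!/bin/sh
# count_configs DIR: prints how many regular files sit directly inside DIR.
# Prints 0 when DIR is missing or unreadable.
# (hypothetical helper name, for illustration only)
count_configs() {
    find "$1" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}

# Path taken from the Logstash error message.
conf_dir=/usr/local/nagioslogserver/logstash/etc/conf.d
echo "$conf_dir: $(count_configs "$conf_dir") config file(s)"
```

A count of 0 matches the "No config files found" message. If files are present, 'ls -la' on the same directory shows the owner, group, and mode of each one.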

Here are the Nagios Log Server installation instructions for a fresh install without Docker:
https://assets.nagios.com/downloads/nag ... Server.pdf

Attached is an example of the directory structure with the owner, group, and permissions listed.

Thanks,
Perry