Re: Elasticsearch service failure on Nagios Logserver

Posted: Tue Nov 30, 2021 3:04 pm
by pbroste
Hello @HIINNS

Thanks for following up with the results. It appears the command returned a 'last login' line attributed to cron, so it seems to be reporting the last login made by a cron job.

Circling back a bit to when you stated:
'Further looking into this revealed that the /usr/share/elasticsearch directory is set up using docker.'

Since 'elasticsearch' is installed inside a Docker container, which is basically not supported within our application configuration, the following is offered as general guidance. Let's start by listing all the Docker containers.

Code:

docker container ls

Let's find out the status, including any stopped containers:

Code:

docker ps -a

Then we can take a peek at the logs for the 'elasticsearch' Docker container so we can see what is going on there:

Code:

docker container logs --details <containernamehere>
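Before running the Docker commands above, it may help to confirm that the Docker CLI is on the PATH at all. The following is only a minimal sketch; the function name 'check_docker_cli' is made up for illustration.

```shell
# Hypothetical helper: report whether the docker CLI can be found on the PATH.
check_docker_cli() {
    if command -v docker >/dev/null 2>&1; then
        echo "docker found: $(command -v docker)"
    else
        echo "docker not found in PATH"
        return 1
    fi
}
```

It could be chained with the status check, for example: check_docker_cli && docker ps -a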
Thanks,
Perry

Re: Elasticsearch service failure on Nagios Logserver

Posted: Wed Dec 01, 2021 5:36 am
by HIINNS
docker container ls
-bash: docker: command not found

whereis docker
docker: /etc/docker

cd /etc/docker
ls -la
certs.d

cd /usr/share/common/Docker
ls -la

containerd.io-1.3.7-3.1.el7.x86_64.rpm
containerd.io-1.4.3-3.1.el7.x86_64.rpm
docker-ce-19.03.13-3.el7.x86_64.rpm
docker-ce-20.10.2-3.el7.x86_64.rpm
docker-ce-cli-19.03.13-3.el7.x86_64.rpm
docker-ce-cli-20.10.2-3.el7.x86_64.rpm
docker-ce-rootless-extras-20.10.2-3.el7.x86_64.rpm
nodejs-14.15.4-1nodesource.x86_64.rpm
nodesource-release-el7-1.noarch.rpm
node-v14.15.4-linux-x64.tar.gz
node-v14.15.4-linux-x64.tar.xz

Looks like Docker was installed and removed. I can't find the executable anywhere.
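The .rpm files in that directory are only installer packages, so on their own they do not show whether Docker is currently installed. One way to check, sketched here under the assumption that the package names match the file names listed above, is to ask the RPM database. The function name 'docker_rpm_status' is hypothetical.

```shell
# Hypothetical helper: ask the RPM database whether the Docker packages are installed.
# Package names are inferred from the .rpm file names in the listing above.
docker_rpm_status() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not available on this host"
        return 2
    fi
    # rpm -q prints "package X is not installed" for each package that is missing.
    rpm -q docker-ce docker-ce-cli containerd.io
}
```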

Re: Elasticsearch service failure on Nagios Logserver

Posted: Wed Dec 01, 2021 12:55 pm
by HIINNS
I found the following in directories /lib/systemd/system and /etc/systemd/system

[Unit]
Description=Elasticsearch
Documentation=https://www.elastic.co
Wants=network-online.target
After=network-online.target

[Service]
Type=notify
RuntimeDirectory=elasticsearch
PrivateTmp=true
Environment=ES_HOME=/usr/share/elasticsearch
Environment=ES_PATH_CONF=/etc/elasticsearch
Environment=PID_DIR=/var/run/elasticsearch
Environment=ES_SD_NOTIFY=true
#EnvironmentFile=-/etc/sysconfig/elasticsearch
EnvironmentFile=/etc/sysconfig/elasticsearch

WorkingDirectory=/usr/share/elasticsearch

User=elasticsearch
Group=elasticsearch

ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet

# StandardOutput is configured to redirect to journalctl since
# some error messages may be logged in standard output before
# elasticsearch logging system is initialized. Elasticsearch
# stores its logs in /var/log/elasticsearch and does not use
# journalctl by default. If you also want to enable journalctl
# logging, you can simply remove the "quiet" option from ExecStart.
StandardOutput=journal
StandardError=inherit

# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65535

# Specifies the maximum number of processes
LimitNPROC=4096

# Specifies the maximum size of virtual memory
LimitAS=infinity

# Specifies the maximum file size
LimitFSIZE=infinity

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0

# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM

# Send the signal only to the JVM rather than its control group
KillMode=process

# Java process is never killed
SendSIGKILL=no

# When a JVM receives a SIGTERM signal it exits with code 143
SuccessExitStatus=143

# Allow

I found that this command, "ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet", is used to stand up elasticsearch using Docker. I looked at the code in an earlier version that works, and the above was not found in either directory.

The earlier version contains a startup script in /etc/rc3.d/S80elasticsearch and /etc/rc5.d/S80elasticsearch. It is also found in /etc/init.d.
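A quick way to see whether a unit like the one above can start at all is to check that the program named on its ExecStart line exists and is executable. This is only a sketch: the helper names are hypothetical, and the unit file name used in the usage note is an assumption, since only the directories were given.

```shell
# Hypothetical helper: print the program named by the first ExecStart= line of a unit file.
unit_exec_path() {
    # Drop systemd's optional prefix characters (-, @, +, !, :) and any arguments.
    sed -n 's/^ExecStart=[-@+!:]*\([^ ]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: report whether that program exists and is executable.
check_unit_exec() {
    path=$(unit_exec_path "$1")
    if [ -x "$path" ]; then
        echo "ok: $path"
    else
        echo "missing or not executable: $path"
        return 1
    fi
}
```

Example usage, assuming the unit file is named elasticsearch.service: check_unit_exec /etc/systemd/system/elasticsearch.service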

Re: Elasticsearch service failure on Nagios Logserver

Posted: Wed Dec 01, 2021 4:30 pm
by pbroste
Hello @HIINNS

Circling back, it appears that the logstash and elasticsearch services are running; let's confirm that the API checks verify:

Code:

curl -X GET "localhost:9200/_cat/nodes?v=true&pretty"
curl -XGET http://localhost:9200/_status?pretty
curl -XGET 'localhost:9200/_cluster/health?pretty'
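If only the cluster colour is of interest, the "status" field can be pulled out of the health response. The sketch below uses sed for a quick check rather than a real JSON parser, and the function name 'cluster_status' is made up for illustration.

```shell
# Hypothetical helper: read _cluster/health JSON on stdin and print the "status" value.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}
```

Example usage: curl -s 'localhost:9200/_cluster/health?pretty' | cluster_status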
If that is the case, then we want to check that the apache service and its configs are in order. First, is the apache service running?

Code:

systemctl status httpd

or 'systemctl status apache2', depending on distro.

Let's send over the apache logs:

Code:

tar -czvf /tmp/apachelogs.tar.gz /var/log/httpd/
Thanks,
Perry

Re: Elasticsearch service failure on Nagios Logserver

Posted: Thu Dec 02, 2021 11:46 am
by HIINNS
curl -X GET "localhost:9200/_cat/nodes?v=true&pretty"
host     ip        heap.percent ram.percent load node.role master name
myserver 127.0.1.1            1          64 0.00 d         *      bec4d3fc-0bab-49f5-88cf-fb1094c85cfd

curl -XGET http://localhost:9200/_status?pretty
{
"_shards" : {
"total" : 0,
"successful" : 0,
"failed" : 0
},
"indices" : { }
}

curl -XGET 'localhost:9200/_cluster/health?pretty'
{
"cluster_name" : "c0d6d20a-0a3c-4d14-9cda-1f2f4fcb6b55",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0

systemctl status httpd
Last login: Thu Dec 2 05:30:04 EST 2021
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-11-08 12:19:57 EST; 3 weeks 2 days ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 9890 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 18084 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
CGroup: /system.slice/httpd.service
├─ 3895 /usr/sbin/httpd -DFOREGROUND
├─ 3954 /usr/sbin/httpd -DFOREGROUND
├─ 3962 /usr/sbin/httpd -DFOREGROUND
├─ 9895 /usr/sbin/httpd -DFOREGROUND
├─ 9896 /usr/sbin/httpd -DFOREGROUND
├─ 9897 /usr/sbin/httpd -DFOREGROUND
├─ 9898 /usr/sbin/httpd -DFOREGROUND
├─ 9899 /usr/sbin/httpd -DFOREGROUND
└─18084 /usr/sbin/httpd -DFOREGROUND

Nov 22 03:34:01 myserver systemd[1]: Reloading The Apache HTTP Server.
Nov 22 03:34:02 myserver systemd[1]: Reloaded The Apache HTTP Server.
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

Re: Elasticsearch service failure on Nagios Logserver

Posted: Fri Dec 03, 2021 10:55 am
by pbroste
Hello @HIINNS

The API status curls all pass and apache is running, so we want to find out what you see when you curl the log server login page:

Code:

curl -k http://yourhostaddresshere/nagioslogserver/login&pretty --verbose

Append a redirect to a file so you can send the results over: > /tmp/curlresults.txt

Thanks,
Perry

Re: Elasticsearch service failure on Nagios Logserver

Posted: Fri Dec 03, 2021 12:03 pm
by HIINNS
curl -k http://192.168.23.207/nagioslogserver/login&pretty --verbose > /tmp/curlresults.txt
[1] 4081
-bash: pretty: command not found

Re: Elasticsearch service failure on Nagios Logserver

Posted: Fri Dec 03, 2021 3:08 pm
by pbroste
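You can remove the '&pretty' from the command. The shell treats the unquoted '&' as its background operator, so it started curl in the background and then tried to run 'pretty' as a separate command:

Code: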

curl -k --verbose http://yourhostaddresshere/nagioslogserver/login
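Note that curl writes its --verbose diagnostics to stderr, so a plain '>' redirect only saves the response body. A minimal sketch for saving both streams to one file follows; the function name 'capture_all' is hypothetical.

```shell
# Hypothetical helper: run a command and save both stdout and stderr to one file.
capture_all() {
    out_file=$1
    shift
    # 2>&1 sends stderr to the same place as stdout, which is the output file.
    "$@" > "$out_file" 2>&1
}
```

Example usage: capture_all /tmp/curlresults.txt curl -k --verbose "http://yourhostaddresshere/nagioslogserver/login"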
Perry

Re: Elasticsearch service failure on Nagios Logserver

Posted: Mon Dec 06, 2021 5:39 am
by HIINNS
curl -k --verbose http://nnagsr10/nagioslogserver/login > /tmp/curlresults.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to myserver port 80 (#0)
* Trying 127.0.1.1...
* Connected to myserver (127.0.1.1) port 80 (#0)
> GET /nagioslogserver/login HTTP/1.1
> User-Agent: curl/7.29.0
> Host: myserver
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 500 Internal Server Error
< Date: Mon, 06 Dec 2021 10:37:15 GMT
< Server: Apache/2.4.6 (Red Hat Enterprise Linux) PHP/5.4.16
< X-Powered-By: PHP/5.4.16
< Content-Length: 0
< Connection: close
< Content-Type: text/html; charset=UTF-8
<
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Closing connection 0

Re: Elasticsearch service failure on Nagios Logserver

Posted: Mon Dec 06, 2021 4:26 pm
by pbroste
Hello @HIINNS

Thanks for following up. Even though elasticsearch appears to be running by every measure, we cannot work out how it is associated with an actual Docker container, or why everything on the backend is running and the API status is green. Is this a test environment or a proof of concept?

We see several issues. One: the logstash config files are missing:
:message=>"No config files found: /usr/local/nagioslogserver/logstash/etc/conf.d/*\nCan you make sure this path is a logstash config file?"
This indicates that the install was incomplete or that permissions are incorrect. Take a look at the file structure within the /usr/local/nagioslogserver/... directory and verify that the configs are present and that the permissions look correct.
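The check described above could be sketched roughly as follows. The function name 'check_conf_dir' is made up for illustration, and the sketch only lists owner, group, and mode; which values are correct is not stated here, so compare the listing against a known-good install.

```shell
# Hypothetical helper: confirm a config directory exists and holds at least one file,
# then list it so owner, group, and mode can be compared by eye.
check_conf_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    # Count regular files directly inside the directory (not in subdirectories).
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    if [ "$count" -eq 0 ]; then
        echo "no config files in: $dir"
        return 1
    fi
    ls -l "$dir"
}
```

Example usage, with the path taken from the logstash message: check_conf_dir /usr/local/nagioslogserver/logstash/etc/conf.d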

Here are the Nagios Log Server installer instructions for setting up a fresh install without Docker:
https://assets.nagios.com/downloads/nag ... Server.pdf

Attached is an example of the directory structure with owner, group, and permissions listed.

Thanks,
Perry