Every few days all our systems stop sending logs

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Frouldeste
Posts: 1
Joined: Wed Mar 27, 2024 3:05 am

Post by Frouldeste »

Every few days, all our systems stop sending logs (or so it appears). I can get the logs flowing again by restarting Logstash (via "service logstash restart"), but I assume I shouldn't need to keep restarting it. What are some possible causes, and which OS or application logs can I look at to troubleshoot the issue?
jmichaelson
Posts: 123
Joined: Wed Aug 23, 2023 1:02 pm

Re: Every few days all our systems stop sending logs

Post by jmichaelson »

You can check the Logstash service output by running journalctl -xeu logstash in a terminal window.

The Logstash log files themselves can be found in /usr/local/nagioslogserver/logstash/logs.

Look for anything relating to an unhandled exception. Feel free to post snippets here (sanitized, if necessary, to remove private data) and we can provide further help.
Please let us know if you have any other questions or concerns.
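
If the log directory is large, a short script can narrow down which lines to read. The following is only a minimal Python sketch, not a Nagios tool: the function names `find_exception_lines` and `scan_log_dir` are hypothetical, and the regular expression is an assumption about what an unhandled exception or JRuby stack frame looks like in Logstash output. Adjust it to whatever your logs actually contain.

```python
import re
from pathlib import Path

# Assumed pattern: Ruby errno classes, anything named "...Exception",
# or a stack frame of the form "at some/path/file.rb:123".
EXCEPTION_PATTERN = re.compile(r"Errno::\w+|Exception|\bat \S+\.(?:rb|java):\d+")


def find_exception_lines(lines):
    """Return (line_number, text) pairs for lines that look like exceptions."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if EXCEPTION_PATTERN.search(line)
    ]


def scan_log_dir(log_dir):
    """Map each file name in log_dir to its matching lines (files without matches are omitted)."""
    results = {}
    for path in sorted(Path(log_dir).iterdir()):
        if path.is_file():
            with path.open(errors="replace") as handle:
                hits = find_exception_lines(handle)
            if hits:
                results[path.name] = hits
    return results


if __name__ == "__main__":
    log_dir = Path("/usr/local/nagioslogserver/logstash/logs")
    if log_dir.is_dir():
        for name, hits in scan_log_dir(log_dir).items():
            for number, text in hits:
                print(f"{name}:{number}: {text}")
```

The script only reads plain-text files; rotated logs that have been compressed would need to be unpacked first.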

-Jason
dscrimpsher
Posts: 10
Joined: Wed Jan 22, 2014 4:24 pm

Re: Every few days all our systems stop sending logs

Post by dscrimpsher »

Hi, I am seeing this also.
I am running NLS 2024R1.0.1 on a 2-node cluster.

I have 38 unique hosts sending logs. After a couple of days the number of unique hosts drops to ZERO. I know there is nothing wrong with all those hosts, because when I reboot the Nagios Log Servers (2-node cluster) I get tons of logs from the last couple of days suddenly showing up, including entries that should have been displayed in the log server history from those past few days. Then a couple of days later, the same thing happens all over again: nothing shows up, no new logs from my hosts. I reboot the log server and see log data again. (Repeats...)
The NLS GUI shows both nodes green, with no errors indicated. Both instances are up with green check marks. No indication that anything is wrong.

The command journalctl -xeu logstash shows this:
-- Logs begin at Thu 2024-05-02 15:30:01 MDT, end at Mon 2024-05-06 07:28:16 MDT. --
May 02 15:30:24 logserver1.csi.edu systemd[1]: Starting LSB: Logstash...
-- Subject: Unit logstash.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit logstash.service has begun starting up.
May 02 15:30:24 logserver1 runuser[1301]: pam_unix(runuser:session): session opened for user nagios by (uid=0)
May 02 15:30:24 logserver1 logstash[1207]: Starting Logstash Daemon: [ OK ]
May 02 15:30:24 logserver1 systemd[1]: Started LSB: Logstash.
-- Subject: Unit logstash.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit logstash.service has finished starting up.
--
-- The start-up result is done.
May 03 21:00:06 logserver1 logstash[1207]: Errno::EBADF: Bad file descriptor - Bad file descriptor
May 03 21:00:06 logserver1u logstash[1207]: each at org/jruby/RubyIO.java:3565
May 03 21:00:06 logserver1 logstash[1207]: tcp_receiver at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-syslog-2.0.5/lib/logstash/inputs/syslog.rb:173
May 03 21:00:06 logserver1 logstash[1207]: tcp_listener at /usr/local/nagioslogserver/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-syslog-2.0.5/lib/logstash/inputs/syslog.rb:159
May 03 21:00:06 logserver1 runuser[1301]: pam_unix(runuser:session): session closed for user nagios
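
The trace above points at tcp_listener and tcp_receiver in the syslog input, so one thing that could be checked the next time logs stop arriving is whether the syslog TCP port still accepts connections. The Python sketch below is only an illustration under stated assumptions: `tcp_port_accepts` is a hypothetical helper, and the host and port in the usage section are placeholders (5544 is assumed here; use whatever port your syslog input is actually configured with).

```python
import socket


def tcp_port_accepts(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder values, not taken from the thread: replace with your own.
    HOST = "127.0.0.1"
    SYSLOG_PORT = 5544
    state = "accepting" if tcp_port_accepts(HOST, SYSLOG_PORT) else "NOT accepting"
    print(f"{HOST}:{SYSLOG_PORT} is {state} TCP connections")
```

A refused connection while the GUI still shows the node as green would suggest that the input listener has died even though the Logstash process is still running. A successful connection does not prove that events are being processed, only that something is listening.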


After a reboot that command gives me:
-- Logs begin at Mon 2024-05-06 07:39:01 MDT, end at Mon 2024-05-06 07:41:07 MDT. --
May 06 07:39:24 logserver1 systemd[1]: Starting LSB: Logstash...
-- Subject: Unit logstash.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit logstash.service has begun starting up.
May 06 07:39:24 logserver1 runuser[1289]: pam_unix(runuser:session): session opened for user nagios by (uid=0)
May 06 07:39:25 logserver1 logstash[1197]: Starting Logstash Daemon: [ OK ]
May 06 07:39:25 logserver1 systemd[1]: Started LSB: Logstash.
-- Subject: Unit logstash.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/li ... temd-devel
--
-- Unit logstash.service has finished starting up.
--
-- The start-up result is done.
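
To see whether the failures follow a regular interval, it can help to work out how long Logstash ran between its start-up line and the first error line in the journal. This is a small Python sketch with hypothetical helper names (`parse_journal_time`, `time_between`). It assumes the short journalctl timestamp format shown above, English month abbreviations, and that the year is supplied separately because the short format omits it.

```python
from datetime import datetime


def parse_journal_time(stamp, year):
    """Parse a short journalctl timestamp such as 'May 02 15:30:24'."""
    # The year is passed in explicitly because the journal line does not include it.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def time_between(start_stamp, end_stamp, year):
    """Return the timedelta from start_stamp to end_stamp within the same year."""
    return parse_journal_time(end_stamp, year) - parse_journal_time(start_stamp, year)


if __name__ == "__main__":
    # Timestamps copied from the first journal excerpt in this post.
    uptime = time_between("May 02 15:30:24", "May 03 21:00:06", 2024)
    print(uptime)  # 1 day, 5:29:42
```

Running the same calculation after each failure would show whether the listener dies after a roughly constant uptime or at a particular time of day.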