Automatic Source Restart

Sarg0n · Post by **Sarg0n** » Mon Aug 14, 2017 9:00 am

We have more than 200 sources added to our Nagios NA implementation. Every time we reboot the system for anything, we have to manually start every single source all over again. This takes way too much time out of the work day. Is there a configuration setting that we can adjust so that the sources will start on system boot?

Also, is there a way to start all of the sources at once instead of going down the line to start them individually? If we had fewer than 20 sources, the individual startup might be all right, but since we have over 200, it's rather painful.

Thank you for your help!

scottwilkerson · Post by **scottwilkerson** » Mon Aug 14, 2017 12:04 pm

Can you verify which version of Network Analyzer you are running?

I just rebooted one of our test servers on 2.2.3 and all started as expected.

Also, can you send the results of the following:

Code: Select all

chkconfig | grep nagiosna

Sarg0n · Post by **Sarg0n** » Mon Aug 14, 2017 12:43 pm

We are using 2.2.3 as well. I might add that we are using a CentOS 7.3.1611 OS.

When I ran "chkconfig |grep nagiosna" as requested, it returned: "Note: This output shows SysV services only and does not include native systemd services. SysV configuration data might be overridden by native systemd configuration. If you want to list systemd services use 'systemctl list-unit-files'".

So I ran

Code: Select all

systemctl list-unit-files |grep nagiosna

and here are the results:

Code: Select all

nagiosna.service                 enabled

Is there perhaps a configuration that can be adjusted to start all sources upon system boot?

scottwilkerson · Post by **scottwilkerson** » Mon Aug 14, 2017 1:12 pm

Let's see the output of

Code: Select all

systemctl status nagios

Sarg0n · Post by **Sarg0n** » Mon Aug 14, 2017 1:56 pm

Code: Select all

systemctl status nagios

This is what I get:

Code: Select all

Unit nagios.service could not be found.

Did you mean nagiosna? Here is the output for that...I modified the IP and domain for confidentiality reasons:

Code: Select all

● nagiosna.service - NagiosNA Daemon
   Loaded: loaded (/usr/lib/systemd/system/nagiosna.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-08-07 14:37:08 EDT; 1 weeks 0 days ago
  Process: 2772 ExecStart=/etc/rc.d/init.d/nagiosna start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/nagiosna.service
           ├─2819 /usr/local/bin/nfcapd -I 1 -l /usr/local/nagiosna/var/10.X.X.X/flows -p 3601 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/10.X.X.X/3601.pid -D -e -w -z
           ├─2820 /usr/local/bin/nfcapd -I 1 -l /usr/local/nagiosna/var/10.X.X.X/flows -p 3601 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/10.X.X.X/3601.pid -D -e -w -z
           ├─2844 /usr/local/bin/nfcapd -I 2 -l /usr/local/nagiosna/var/10.X.X.X/flows -p 3602 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/10.X.X.X/3602.pid -D -e -w -z
           └─2845 /usr/local/bin/nfcapd -I 2 -l /usr/local/nagiosna/var/10.X.X.X/flows -p 3602 -x /usr/local/nagiosna/bin/reap_files.py %d %f %i -P /usr/local/nagiosna/var/10.X.X.X/3602.pid -D -e -w -z

Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: Launcher: fork child.
Aug 14 15:50:10 sensor-syslog.umis.iwan.local nfcapd[2820]: Launcher: child exec done.
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: Run expire on '/usr/local/nagiosna/var/10.X.X.X/flows'
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: Limits: Filesize <none>, Lifetime 86400 = 1.0 days, Watermark: 95%
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: Current size: 1130496 = 1.1 MB, Current lifetime: 82500 = 22.9 hours, Number of files: 276
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: expire completed - nothing to expire.
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: laucher child exit 1 childs.
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: launcher child 24892 exit status: 0
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: laucher waiting childs done. 0 childs
Aug 14 15:50:10 sensor-syslog.domain.domain.local nfcapd[2820]: Launcher: Sleeping

Post by **tgriep** » Mon Aug 14, 2017 4:14 pm

When the nagiosna init script, "/etc/init.d/nagiosna", is started, it runs a Python script that starts all of the sources listed in the MySQL table nagiosna_Sources.

Can you run the following to get the list of sources and post it here?

Code: Select all

echo 'select * from nagiosna_Sources;' | mysql -unagiosna -pnagiosna nagiosna -t
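Once you have that list, one way to spot sources that did not come up is to compare the ports in nagiosna_Sources with the "-p" ports of the nfcapd processes that are actually running. The following is only a rough sketch, not part of Network Analyzer: the function name missing_ports is made up for illustration, and it assumes each running source shows up as an nfcapd command line containing "-p <port>", as in the systemctl status output earlier in this thread.

```shell
# Hypothetical helper (not shipped with Network Analyzer).
# Prints every expected port that has no matching nfcapd process.
#   $1 = newline-separated list of expected ports
#   $2 = newline-separated nfcapd command lines
missing_ports() {
    # Pull the value that follows each lowercase "-p" flag.
    # awk compares fields exactly, so the "-P <pidfile>" flag is ignored.
    _running=$(printf '%s\n' "$2" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "-p") print $(i + 1) }' |
        sort -u)

    # Walk the expected ports and report the ones that are not running.
    printf '%s\n' "$1" | sort -u | while read -r _port; do
        [ -n "$_port" ] || continue
        printf '%s\n' "$_running" | grep -qx -- "$_port" || printf '%s\n' "$_port"
    done
}

# Example usage on a live system (same credentials as the command above):
# expected=$(mysql -unagiosna -pnagiosna nagiosna -N -e 'select port from nagiosna_Sources;')
# missing_ports "$expected" "$(pgrep -a nfcapd)"
```

If this prints nothing, every port in the table has a collector running. Any port it does print belongs to a source that the init script did not start, or that exited afterwards.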

Sarg0n · Post by **Sarg0n** » Tue Aug 15, 2017 10:14 am

I'm afraid I can't show the actual sources, as they are on a different enclave. The commands I ran for the prior post were from a test system that is mirroring the actual server. I can show the following example of what I am seeing on the server when I run that command:

Code: Select all

echo 'select * from nagiosna_Sources;' | mysql -unagiosna -pnagiosna nagiosna -t

produces

Code: Select all

+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+
| sid | port | addresses | name        | flowtype | directory                           | lifetime | disable_abnormal |
+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+
|   1 | 3601 |           | 10.x.x.x    | netflow  | /usr/local/nagiosna/var/10.x.x.x    | 0H       |                0 |
|   2 | 3602 |           | 10.x.x.x    | netflow  | /usr/local/nagiosna/var/10.x.x.x    | 0H       |                0 |
+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+

scottwilkerson · Post by **scottwilkerson** » Tue Aug 15, 2017 10:37 am

Sarg0n wrote: I'm afraid I can't show the actual sources, as they are on a different enclave. The commands I ran for the prior post were from a test system that is mirroring the actual server. I can show the following example of what I am seeing on the server when I run that command:

Code: Select all

echo 'select * from nagiosna_Sources;' | mysql -unagiosna -pnagiosna nagiosna -t

produces

Code: Select all

+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+
| sid | port | addresses | name        | flowtype | directory                           | lifetime | disable_abnormal |
+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+
|   1 | 3601 |           | 10.x.x.x    | netflow  | /usr/local/nagiosna/var/10.x.x.x    | 0H       |                0 |
|   2 | 3602 |           | 10.x.x.x    | netflow  | /usr/local/nagiosna/var/10.x.x.x    | 0H       |                0 |
+-----+------+-----------+-------------+----------+-------------------------------------+----------+------------------+

This only shows 2 sources on the system (not 200), both of which appear to have started with the service according to your previous post.

I'm not sure where the problem is, but if we are going to come to a solution, we will need info from the actual server having the problem.

Sarg0n · Post by **Sarg0n** » Tue Aug 15, 2017 11:16 am

When I run it on the actual server, it shows the same kind of output, only for 200 sources with different hostnames and IPs.

The re-initialization failure may have something to do with the python script on our end.

scottwilkerson · Post by **scottwilkerson** » Tue Aug 15, 2017 11:20 am

Sarg0n wrote: The re-initialization failure may have something to do with the python script on our end.

Can you investigate this and get back to us once you have your findings?

You may be able to get the information you are looking for in

Code: Select all

tail -f /usr/local/nagiosna/var/backend.log
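If the log is long, it may be quicker to filter it than to watch it scroll. This is only a sketch: the helper name recent_errors is made up, and the keywords it searches for (error, exception, traceback) are a guess at what a failing Python script would write, since the format of backend.log is not shown in this thread.

```shell
# Hypothetical helper: print the last N lines of a log file that mention
# an error, exception, or Python traceback (case-insensitive match).
# Each line is prefixed with its line number in the file.
#   $1 = path to the log file
#   $2 = number of matching lines to show (defaults to 20)
recent_errors() {
    grep -inE 'error|exception|traceback' "$1" | tail -n "${2:-20}"
}

# Example usage, run right after a reboot:
# recent_errors /usr/local/nagiosna/var/backend.log 50
```

The line numbers make it easier to open the log at the right spot and read the surrounding entries for the source that failed.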
