
verify config / apply config sometimes not returning

Posted: Tue Nov 22, 2016 6:48 am
by _asp_
Hi,

we have a single instance.
When saving a new configuration (new / changed filters), the functions "verify" and "apply config" do not return.
I waited several minutes.

It works slightly better when logstash service has been stopped before.

But here my questions:
1. Why is it so slow?
2. Can I run both from the command line, so that I can follow the progress and cancel it if needed?

In pure elk it is much easier / faster, when the config is just stored in files instead of elasticsearch. Sure the cost is the lack of a gui ;-)

Thanks, Andreas

Re: verify config / apply config sometimes not returning

Posted: Tue Nov 22, 2016 10:50 am
by rkennedy
What amount of time does it actually take you? I just tested on a machine I have here locally, and it only took 12 seconds.

1. This does not sound normal.
2. Nope, the apply configuration will need to be done through the GUI due to everything that happens. We are not just writing to a file, and restarting - there is more of a process which involves writing to a DB, and then writing out to files.

A few questions for you -
1. What modifications have been made to the system / environment?
2. What OS / version of NLS are you running?
3. What amount of resources do you have allocated to it, and how much data worth of open indexes? This could just be a resource issue.

Re: verify config / apply config sometimes not returning

Posted: Wed Nov 23, 2016 4:31 am
by _asp_
problem occurs on production:
24GB RAM, 9 CPU, RedHat 6.5, 150-200GB open indexes, version 1.4.1

But I can reproduce the problem in the VM you are offering for download:
- changed to 3 CPU and 5 GB RAM (on 4CPU + Hyperthreading, 8GB Host). Switched to HostOnlyNetwork. Version 1.4.4.
- new installation.

Verify is working. It takes about 1-2 Minutes to return.

When hitting apply, I can see the logstash process consuming up to 230% CPU. Then its load goes down to zero.
The web frontend is hanging in status "Running". I have been waiting for 10 minutes now.

The configuration of logstash has been changed in /usr/local/nagioslogserver/logstash/etc/conf.d (checked the timestamp)

So is the problem only GUI related?

In the httpd log I can still see continuous requests:

Code:

192.168.56.1 - - [23/Nov/2016:10:29:31 +0100] "POST /nagioslogserver/api/system/status HTTP/1.1" 200 87 "http://192.168.56.101/nagioslogserver/configure/apply" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"


Re: verify config / apply config sometimes not returning

Posted: Wed Nov 23, 2016 10:15 am
by mcapra
_asp_ wrote: So is the problem only GUI related?
It could be a timeout issue with PHP/Ajax calls. I would be interested in seeing what the actual job is doing.

Code:

tail -f /usr/local/nagioslogserver/var/jobs.log
And do an apply configuration. You should only see one instance of "Running command apply_config with args". If you see multiples, something is not working right which would explain the prolonged "Running" status. If it just takes a long time between "Running command apply_config with args" and "SUCCESS", then Logstash is likely just processing a very complex configuration. This is pretty common with configurations that contain several unique plugins or lots of branching logic.
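
The check described above can be done with a quick count instead of reading the log by eye. This is only a sketch: the function name is made up for illustration, and it assumes the log still contains the lines from the apply run you are interested in (if the log gets cleared in between, capture the tail output to a file of your own first and count in that copy).

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios Log Server): count how many times
# an apply_config job was started according to a jobs.log file.
# Usage: count_apply_config_runs [path-to-jobs.log]
count_apply_config_runs() {
    log_file="${1:-/usr/local/nagioslogserver/var/jobs.log}"
    # grep -c prints the number of matching lines (prints 0 when none match).
    grep -c 'Running command apply_config with args' "$log_file"
}
```

A result of 1 after a single apply matches the expected behaviour; anything higher points to the job being started more than once.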

Re: verify config / apply config sometimes not returning

Posted: Thu Nov 24, 2016 1:47 pm
by _asp_
Here is what you wanted:

Code:

# tail -f /usr/local/nagioslogserver/var/jobs.log
Running command delete_snapshot with args 'a:1:{s:4:"path";s:75:"/usr/local/nagioslogserver/snapshots/applyconfig.snapshot.1480012681.tar.gz";}' for job id: AViXpmzBX8Pxd4A2tYTo
SUCCESS
Running command apply_config with args 'a:2:{s:5:"sh_id";s:20:"AViXpmzRX8Pxd4A2tYTp";s:10:"sh_created";i:1480013016;}' for job id: AViXpmzhX8Pxd4A2tYTq
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
SUCCESS
Processed 2 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
Processed 0 node jobs.
Processed 0 global jobs.
tail: /usr/local/nagioslogserver/var/jobs.log: file truncated
The files were updated when the second SUCCESS was written.
The GUI stays in status "Running".

Re: verify config / apply config sometimes not returning

Posted: Mon Nov 28, 2016 11:14 am
by mcapra
If the apply job is taking longer than 30 seconds, you might consider editing your /etc/php.ini and increasing the max_execution_time and max_input_time values. Be sure to restart httpd after doing this.
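
Before changing anything, it may help to see what the two values are currently set to. A minimal sketch, assuming the active settings live in /etc/php.ini as mentioned above (the function name is hypothetical, and it only reads the file):

```shell
#!/bin/sh
# Hypothetical helper: print the active max_execution_time and max_input_time
# directives from a php.ini file. Lines commented out with ';' are skipped.
# Usage: show_php_time_limits [path-to-php.ini]
show_php_time_limits() {
    ini_file="${1:-/etc/php.ini}"
    grep -E '^[[:space:]]*(max_execution_time|max_input_time)[[:space:]]*=' "$ini_file"
}
```

After raising the values in the file, run the function again to confirm the change and then restart httpd (on RHEL 6 that would be "service httpd restart").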

Another place to check would be your browser's Javascript console to see if there are any errors in there while applying configuration.

And last but not least, we can query the API directly for the status of the job. This will tell us if the job is actually stuck in a "running" state or if the JavaScript simply isn't updating correctly. The process is a bit of a pain though.

You'll need to tail -f /usr/local/nagioslogserver/var/jobs.log while running an apply config, then pull out the apply_config command id and feed that into a CURL call. From your previous output, you can get the command ID from:

Running command apply_config with args 'a:2:{s:5:"sh_id";s:20:"AViXpmzRX8Pxd4A2tYTp";s:10:"sh_created";i:1480013016;}' for job id: AViXpmzhX8Pxd4A2tYTq

And plug it into a CURL call to the back-end API endpoint that checks the status of jobs:

Code:

curl -XPOST 'http://<nagios_log_server_host>/nagioslogserver/api/system/get_cmd_info?token=<api_key>&cmd_id=AViXpmzhX8Pxd4A2tYTq'
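
To avoid copying the ID by hand, the two steps can be scripted roughly like this. It is a sketch under assumptions: both function names are invented for illustration, the log line format is taken from the jobs.log output earlier in this thread, and the host and API key remain placeholders you have to supply.

```shell
#!/bin/sh
# Hypothetical helper: read jobs.log lines on stdin and print the job id of
# the most recent apply_config command (empty output if none is found).
extract_apply_job_id() {
    sed -n 's/^Running command apply_config with args .* for job id: \([A-Za-z0-9_-]*\).*$/\1/p' | tail -n 1
}

# Hypothetical wrapper around the curl call shown above.
# Usage: get_cmd_info <nagios_log_server_host> <api_key> <cmd_id>
get_cmd_info() {
    host="$1"
    api_key="$2"
    cmd_id="$3"
    curl -s -XPOST "http://${host}/nagioslogserver/api/system/get_cmd_info?token=${api_key}&cmd_id=${cmd_id}"
}
```

For example, save the tail output to a file while applying, then run: cmd_id=$(extract_apply_job_id < saved_jobs.log) followed by get_cmd_info <nagios_log_server_host> <api_key> "$cmd_id".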