Gateway Timeout - after LS periodic upgrade

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

Post by dariusz.nalazek »

Hello.

A few days ago I upgraded Nagios LS to the latest version and also updated RHEL (a regular periodic update, but it included the minor release bump 8.3 -> 8.4).

Since then, I get a "Gateway Timeout" a short time after every server reboot. Right after the restart, the web page is accessible.

Elasticsearch and Logstash seem to be working, and the cluster reports a healthy status.

curl -XGET 'http://localhost:9200/_cat/nodes?v=true'
host                 ip            heap.percent ram.percent  load node.role master name                                 
wa-ls1- xxxx yyyy6           41          68  8.78 d         *      903343ea-ad24-4c48-8f25-7b265f59YYYY 
wa-ls2-xxxx zzzzz7           18          53 14.09 d         m      ddc19c62-9fba-4bfa-ab6e-8b849146ZZZZ 

curl -XGET 'http://localhost:9200/_cat/health?v=true'
epoch      timestamp cluster                              status node.total node.data shards pri relo init unassign pending_tasks 
1625140436 13:53:56  4d61897c-50b7-4d74-b24e-587b2023XXXX green           2         2   1074 537    0    0        0             0
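For repeated checks, the status column can be pulled out of the `_cat/health` output with a small helper. This is only a rough sketch: `health_status` is a name made up for illustration, and it assumes the header row contains a `status` column, as in the output above.

```shell
#!/bin/sh
# Print the cluster status (e.g. green/yellow/red) from `_cat/health?v=true` output.
# Reads the header line plus one data line on stdin.
# health_status is a hypothetical helper name, not part of Nagios Log Server.
health_status() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
         col     { print $col; exit }'
}

# Usage (assumes Elasticsearch listens on localhost:9200, as in the commands above):
#   curl -s -XGET 'http://localhost:9200/_cat/health?v=true' | health_status
```

Looking the column up by its header name, rather than by a fixed position, should keep the helper working if the column order differs between versions.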
I increased some PHP/Apache parameters (more memory, longer timeouts, more open files), based on other posts on this forum and on recommendations from a Nagios XI article about PHP.
None of these attempts has helped. The system is unusable, since I can’t access the GUI.


Both cluster nodes should have enough resources to serve httpd without problems.
iostat shows avg-cpu %idle ~70%, %iowait below 0.1, and disk subsystem utilization of ~20-30%.
I ingest ~100GB of logs daily, and until the upgrade the cluster was performing OK.

I temporarily disabled the firewall and uninstalled the AV software, but that wasn’t the cause. SELinux is disabled.

I'm not sure whether this looks normal (part of the pstree output): there are a lot of sleeping processes spawned by cron:

MASTER NODE:
        ├─crond─┬─834*[crond───sh───php───curator.sh───curator]
        │       └─7*[crond───sh───php]

2nd NODE:
        ├─crond─┬─132*[crond───sh───php───curator.sh───curator]
        │       └─7*[crond───sh───php]
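To put numbers on the pile-up shown above, something like the following could list the `curator` processes that have been running longer than a given threshold. It is a sketch under assumptions: `long_running` is a hypothetical helper name, the process name is taken to be `curator` as pstree reports it, and the `etimes` column (elapsed seconds) assumes a procps-ng `ps`.

```shell
#!/bin/sh
# List processes with a given command name that have run longer than a threshold.
# stdin: lines of "PID ELAPSED_SECONDS COMMAND" (as from `ps -eo pid=,etimes=,comm=`).
# $1: command name to match exactly, $2: minimum age in seconds.
# long_running is a hypothetical helper name used only for this illustration.
long_running() {
    name=$1
    min_age=$2
    awk -v name="$name" -v min="$min_age" \
        '$3 == name && $2 + 0 > min + 0 { print $1, $2 }'
}

# Usage: curator processes older than one hour, then just their count.
#   ps -eo pid=,etimes=,comm= | long_running curator 3600
#   ps -eo pid=,etimes=,comm= | long_running curator 3600 | wc -l
```

If the count keeps growing between checks, the cron-launched jobs are probably not finishing, which would be worth mentioning when reporting the problem.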
D.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Gateway Timeout - after LS periodic upgrade

Post by cdienger »

We have received a ticket for this issue and will continue troubleshooting through that.