A few days ago I upgraded Nagios Log Server (Nagios LS) to the latest version and also updated RHEL (a regular periodic update, but one that included the minor release bump 8.3 -> 8.4).
Since then, I get a "Gateway Timeout" error a short time after each server reboot. Immediately after the restart, the web page is accessible.
Elasticsearch and Logstash seem to work, and the cluster reports a proper status.
Code:
curl -XGET 'http://localhost:9200/_cat/nodes?v=true'
host        ip     heap.percent ram.percent  load node.role master name
wa-ls1-xxxx yyyy6            41          68  8.78 d         *      903343ea-ad24-4c48-8f25-7b265f59YYYY
wa-ls2-xxxx zzzzz7           18          53 14.09 d         m      ddc19c62-9fba-4bfa-ab6e-8b849146ZZZZ
curl -XGET 'http://localhost:9200/_cat/health?v=true'
epoch      timestamp cluster                              status node.total node.data shards pri relo init unassign pending_tasks
1625140436 13:53:56  4d61897c-50b7-4d74-b24e-587b2023XXXX green           2         2   1074 537    0    0        0             0
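For anyone who wants to repeat this check from a script, here is a minimal sketch. It assumes the `_cat/health?v=true` output keeps the header row shown above; the function name `cat_health_status` is made up for illustration and is not part of Nagios LS or Elasticsearch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `_cat/health?v=true` output on stdin and print
# the value of the "status" column, located by its name in the header row.
cat_health_status() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i }
         NR == 2 && col { print $col }'
}

# Usage against the local node (same endpoint as in the post):
#   curl -s 'http://localhost:9200/_cat/health?v=true' | cat_health_status
```

Locating the column by header name rather than by position is only a precaution in case the column order differs between versions.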
Despite the healthy cluster status, nothing I have tried so far has helped. The system is unusable because I can’t access the GUI.
Both cluster nodes should have enough resources to serve httpd without problems.
iostat shows avg-cpu %idle at ~70%, %iowait below 0.1, and disk subsystem utilization at ~20-30%.
I ingest ~100 GB of logs daily, and until the upgrade the cluster was performing fine.
I temporarily disabled the firewall and uninstalled the AV software, but that was not the cause. SELinux is disabled.
I'm not sure whether this looks normal (part of the pstree output): there are a lot of sleeping processes spawned by cron:
Code:
MASTER NODE:
├─crond─┬─834*[crond───sh───php───curator.sh───curator]
│ └─7*[crond───sh───php]
2nd NODE:
├─crond─┬─132*[crond───sh───php───curator.sh───curator]
│ └─7*[crond───sh───php]
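To put a number on this without reading the tree by eye, here is a rough sketch. pstree prints `N*[...]` for N identical branches, so summing those prefixes on the lines that mention curator gives the number of collapsed curator branches. The helper name `sum_collapsed` is hypothetical, and the commented commands assume the `pgrep` and `ps` from procps-ng as shipped with RHEL 8.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read pstree output on stdin and sum the "N*[...]"
# multipliers on lines containing the given fixed string. Branches that
# pstree prints without a multiplier (a single instance) are not counted.
sum_collapsed() {
    local pattern=$1
    grep -F -- "$pattern" \
        | grep -oE '[0-9]+\*\[' \
        | awk -F'*' '{ total += $1 } END { print total + 0 }'
}

# Usage on a live system:
#   pstree | sum_collapsed curator
#
# Count matching processes directly, then list them with their age
# (etimes = seconds elapsed since the process started):
#   pgrep -fc curator
#   ps -eo pid,etimes,stat,args | grep '[c]urator'
```

If the elapsed times span many days, the curator jobs started by cron are probably never finishing rather than merely overlapping for a short while.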