NFS tuning or LogServer tuning to help CPU usage

chud
Posts: 36
Joined: Thu Jul 18, 2019 5:51 pm

NFS tuning or LogServer tuning to help CPU usage

Post by chud »

Hello folks.

My Nagios Log Server is storing logs in an NFS share on our Isilon. I know local storage is preferable, but this was what was best for us in terms of available space.

Anyway, our Log Server (which has 4 CPUs) keeps alerting in Nagios XI, and after some troubleshooting it looks like accessing the NFS share may be one reason for the high CPU usage.

One additional piece of info: the CPU alerts became more of a problem after we added our Cisco Firepower as a log source. We have dialed this device back to sending logs only for errors, but the CPU alerts on the Log Server continue.

Any advice on tuning to help with CPU usage?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: NFS tuning or LogServer tuning to help CPU usage

Post by mbellerue »

NFS storage tuning is outside of what we can help with. I can point you to a helpful doc on the subject, but everyone's storage setup is unique to their environment.

The Linux Documentation Project: Optimizing NFS Performance
https://www.tldp.org/HOWTO/NFS-HOWTO/performance.html
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
chud
Posts: 36
Joined: Thu Jul 18, 2019 5:51 pm

Re: NFS tuning or LogServer tuning to help CPU usage

Post by chud »

How about Log Server or ELK stack tuning?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: NFS tuning or LogServer tuning to help CPU usage

Post by mbellerue »

We do have a doc on general performance tuning. I apologize, I should have posted this earlier.
https://assets.nagios.com/downloads/nag ... hrough.pdf

For ELK stack, that's a little more complicated. You can certainly search and find a number of articles related to performance tuning, but do be careful before implementing any changes.
chud
Posts: 36
Joined: Thu Jul 18, 2019 5:51 pm

Re: NFS tuning or LogServer tuning to help CPU usage

Post by chud »

My main question regarding the ELK stack is: if I increase the Log Server's RAM or CPU, what should I adjust in either Log Server or ELK to take advantage of the additional resources?
For example: Logstash workers, Elasticsearch Java heap size, etc.?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NFS tuning or LogServer tuning to help CPU usage

Post by scottwilkerson »

chud wrote: My main question regarding the ELK stack is: if I increase the Log Server's RAM or CPU, what should I adjust in either Log Server or ELK to take advantage of the additional resources?
For example: Logstash workers, Elasticsearch Java heap size, etc.?
You just need to restart Elasticsearch.

We have a script that calculates the best heap size and sets it when Elasticsearch starts.
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
chud
Posts: 36
Joined: Thu Jul 18, 2019 5:51 pm

Re: NFS tuning or LogServer tuning to help CPU usage

Post by chud »

scottwilkerson wrote: We have a script that calculates the best heap size and sets it when Elasticsearch starts.
Can you provide the location of that script so that I can take a look at it?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NFS tuning or LogServer tuning to help CPU usage

Post by scottwilkerson »

On a CentOS/RHEL system it is:

/etc/sysconfig/elasticsearch
On Ubuntu/Debian:

/etc/default/elasticsearch
The line you are looking for is:

ES_HEAP_SIZE=$(expr $(free -m|awk '/^Mem:/{print $2}') / 2 )m
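
To illustrate what that line does: it reads the total memory in MB from free -m and sets the heap to half of it. Below is a minimal sketch of the same arithmetic. The helper name half_ram_heap is hypothetical and used only for illustration; it is not part of Nagios Log Server.

```shell
# Hypothetical helper (not part of Nagios Log Server): half of the given
# memory total in MB, formatted the way ES_HEAP_SIZE expects (e.g. "4096m").
half_ram_heap() {
    echo "$(( $1 / 2 ))m"
}

# Read total RAM in MB the same way the quoted config line does.
total_mb=$(free -m 2>/dev/null | awk '/^Mem:/{print $2}')

# Only print when free(1) is available and returned a value.
if [ -n "$total_mb" ]; then
    echo "Total RAM: ${total_mb} MB, heap would be: $(half_ram_heap "$total_mb")"
fi
```

For example, half_ram_heap 8192 prints 4096m. Because the value is computed when Elasticsearch starts, adding RAM and then restarting Elasticsearch should be enough for the new size to be picked up.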
chud
Posts: 36
Joined: Thu Jul 18, 2019 5:51 pm

Re: NFS tuning or LogServer tuning to help CPU usage

Post by chud »

As mentioned previously, our Nagios Log Server is alerting in Nagios XI because of all the traffic from our Cisco Firepower and various servers that are sending logs.
However, at this point we don't even have all of our servers, routers, and switches sending to Log Server yet.

So the question is, what do you do for Log Server to help it handle the traffic?

Is it just a matter of adding more CPU and increasing the NIC capacity on the server itself?
Or can you balance the traffic if you have a cluster?
My understanding is that if you have a 2-node cluster, the second log server is just a mirror of the first - so it is a cluster from a data redundancy standpoint, but not from a performance standpoint - there is no added performance benefit.
If we go to a 3-node (or more) cluster, is there a performance advantage (sort of like putting multiple web servers behind a load balancer)?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NFS tuning or LogServer tuning to help CPU usage

Post by scottwilkerson »

chud wrote: Is it just a matter of adding more CPU and increasing the NIC capacity on the server itself?
This will help a little.
chud wrote:Or can you balance the traffic if you have a cluster?
Yes, this would be the preferred method. You can send logs to any of the instances in the cluster, and this will spread out the load caused by log ingestion.
chud wrote:My understanding is that if you have a 2-node cluster, the second log server is just a mirror of the first - so it is a cluster from a data redundancy standpoint, but not from a performance standpoint - there is no added performance benefit.
This is incorrect; you can use all instances to spread the ingestion load. Additionally, while you are correct that a 2-node cluster gives you a replica, it is not true that it doesn't help performance.

Each node can be used for log ingestion, and all nodes also share the load when log data is queried, no matter which node you are logged into.
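
As a rough illustration of spreading ingestion across instances, log sources can be split between the nodes round-robin. The node and source names below are hypothetical placeholders, and the sketch only prints an assignment plan; pointing each device's log output at its assigned node still has to be done on the device itself.

```shell
# Hypothetical cluster instances and log sources; replace with real names.
NODES="nls-node1 nls-node2 nls-node3"
SOURCES="firepower web01 web02 router01 switch01"

# Print the node for a zero-based source index, cycling through NODES.
pick_node() {
    idx=$1
    set -- $NODES                 # load node names into $1..$N
    shift $(( $idx % $# ))        # skip to the chosen node
    echo "$1"
}

# Walk the sources and print which node each one would send logs to.
i=0
for src in $SOURCES; do
    echo "$src -> $(pick_node "$i")"
    i=$(( i + 1 ))
done
```

With three nodes, the fourth source wraps around to the first node again, so no single instance receives all of the ingestion traffic.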