CPU, Memory and Disk Space resource utilization

kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi Support Team,

I am using Nagios Core 4.4.5 on CentOS 7.7 and am currently monitoring 25 servers. Is there a way to find out the current resource utilization of all 25 servers? For example: Server 1 -> CPU, memory and disk space utilization, and likewise for the remaining servers. This is a capacity planning exercise for our organisation, to find out whether the current infrastructure (25 servers) is sufficient for the next 6 months to a year.

Thanks in Advance and I look forward to hearing from you.

Best Regards,

Kaushal
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: CPU, Memory and Disk Space resource utilization

Post by Box293 »

Nagios XI comes with a lot of this built in, as it comprises a lot of different technologies.

Most likely you'll extrapolate that data from performance data. I would suggest you look at implementing InfluxDB/Grafana. Here's some documentation on how to do that:

https://support.nagios.com/kb/article/n ... u-802.html
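
To illustrate what "extrapolating from performance data" can look like, here is a minimal sketch that parses a plugin performance data string in the standard `'label'=value[UOM];[warn];[crit];[min];[max]` form and derives a utilization percentage. The sample string and the function names `parse_perfdata` and `percent_used` are hypothetical and only for illustration; real plugins differ in which fields they report.

```python
import re

# One perfdata item: 'label'=value[UOM];[warn];[crit];[min];[max]
_PERF_RE = re.compile(
    r"(?:'(?P<qlabel>[^']+)'|(?P<label>[^\s=']+))="
    r"(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;\s]*)"
    r"(?:;(?P<warn>[^;\s]*))?(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?(?:;(?P<max>[^;\s]*))?"
)


def parse_perfdata(text):
    """Return a list of dicts, one per metric found in a perfdata string."""
    metrics = []
    for m in _PERF_RE.finditer(text):
        metrics.append({
            "label": m.group("qlabel") or m.group("label"),
            "value": float(m.group("value")),
            "uom": m.group("uom"),
            "max": float(m.group("max")) if m.group("max") else None,
        })
    return metrics


def percent_used(metric):
    """Utilization as a percentage, or None if it cannot be derived."""
    if metric["uom"] == "%":
        return metric["value"]
    if metric["max"]:  # also skips a max of 0 to avoid dividing by zero
        return 100.0 * metric["value"] / metric["max"]
    return None


# Hypothetical sample: a disk metric with a max, and a load metric without one.
sample = "/=2643MB;5948;5958;0;5968 load1=0.42;5;10;0"
for metric in parse_perfdata(sample):
    print(metric["label"], percent_used(metric))
```

Metrics that report neither a percentage nor a maximum (such as load averages) yield `None` here, so a capacity report would need a separately chosen ceiling for them.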
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Thanks, Troy Lea, for the reply. I will keep you posted as it progresses. Your help is much appreciated.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: CPU, Memory and Disk Space resource utilization

Post by cdienger »

Sounds good!
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi

Functionally it is working, and I am attaching a screenshot. I am running:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.4.5
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2019-08-20
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
Checked 311 services.
Checked 34 hosts.
Checked 1 host groups.
Checked 0 service groups.
Checked 28 contacts.
Checked 9 contact groups.
Checked 39 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 34 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check

Versions: influxdb-1.7.9-1.x86_64, grafana-6.5.3-1.x86_64, histou v0.4.3, Nagflux v0.4.1, CentOS Linux release 7.7.1908 (Core), Nagios Core 4.4.5

Please let me know if you need any additional information.

Best Regards,

Kaushal
Attachments
nagiosperformancedatagraph.png
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi Troy,

I see breaks in the graph as per the screenshot attached.

Best Regards,

Kaushal
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi Troy,

I am attaching the screenshot again for your reference.
Screenshot 2020-01-30 at 11.43.47 AM.png
Best Regards,

Kaushal
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: CPU, Memory and Disk Space resource utilization

Post by Box293 »

The breaks in your performance data generally mean you are not receiving valid data back from the plugin during these intervals, or perhaps the performance data is being deleted before it is processed.

You may want to look at the InfluxDB logs to see if they report any errors.
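
To match graph breaks against log entries, it can help to know exactly when the gaps occur. The following is a minimal sketch, assuming you have exported the sample timestamps for one metric as epoch seconds and know its check interval; the function name `find_gaps`, the `tolerance` factor, and the sample values are hypothetical.

```python
def find_gaps(timestamps, interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are further
    apart than interval * tolerance seconds."""
    ordered = sorted(timestamps)
    limit = interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical epoch-second timestamps for a metric checked every 60 s.
samples = [0, 60, 120, 420, 480]
print(find_gaps(samples, interval=60))  # [(120, 420)]
```

Each reported pair is the last sample before a break and the first sample after it, which gives a time window to search for in the Nagflux and InfluxDB logs.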
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi Troy,

Please find the details below, gathered after investigating further.

Code:

systemctl status nagflux.service
● nagflux.service - A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/nagflux.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-01-31 04:10:26 UTC; 12h ago
     Docs: https://github.com/Griesbacher/nagflux
 Main PID: 28895 (nagflux)
   CGroup: /system.slice/nagflux.service
           └─28895 /opt/nagflux/nagflux -configPath /opt/nagflux/config.gcfg
Jan 31 17:05:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:05:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:28 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:28 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:29 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:29 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input:
Jan 31 17:06:59 ip-172-31-0-145.ap-south-1.compute.internal nagflux[28895]: 2020-01-31 17:06:59 Critical: Connection type is unknown, options are: tcp, file. Input :

cat /opt/nagflux/config.gcfg

Code:

[main]
	NagiosSpoolfileFolder = "/usr/local/nagios/var/spool/nagfluxperfdata"
	NagiosSpoolfileWorker = 1
	InfluxWorker = 2
	MaxInfluxWorker = 5
	DumpFile = "nagflux.dump"
	NagfluxSpoolfileFolder = "/usr/local/nagios/var/nagflux"
	FieldSeparator = "&"
	BufferSize = 10000
	FileBufferSize = 65536
	DefaultTarget = "all"

[Log]
	LogFile = ""
	MinSeverity = "INFO"

[Livestatus]
#        # tcp or file
         Type = "file"
#        # tcp: 127.0.0.1:6557 or file /var/run/live
        file /usr/local/nagios/var/live.sock
#        #Address = "127.0.0.1:6557"
#        # The amount to minutes to wait for livestatus to come up, if set to 0 the detection is disabled
        MinutesToWait = 2
#        # Set the Version of Livestatus. Allowed are Nagios, Icinga2, Naemon.
#        # If left empty Nagflux will try to detect it on it's own, which will not always work.
       Version = ""

[InfluxDBGlobal]
	CreateDatabaseIfNotExists = true
	NastyString = ""
	NastyStringToReplace = ""
	HostcheckAlias = "hostcheck"

[InfluxDB "nagflux"]
	Enabled = true
	Version = 1.0
	Address = "http://127.0.0.1:8086"
	Arguments = "precision=ms&u=root&p=root&db=nagflux"
	StopPullingDataIfDown = true

[InfluxDB "fast"]
	Enabled = false
	Version = 1.0
	Address = "http://127.0.0.1:8086"
	Arguments = "precision=ms&u=root&p=root&db=fast"
	StopPullingDataIfDown = false

The Livestatus socket file is /usr/local/nagios/var/live.sock:
srw-rw----. 1 nagios nagios 0 Jan 29 07:20 /usr/local/nagios/var/live.sock

After I enabled the settings below in /opt/nagflux/config.gcfg, the nagflux service does not start at all.

Code:

[Livestatus]
#        # tcp or file
        Type = "file"
#        # tcp: 127.0.0.1:6557 or file /var/run/live
        file /usr/local/nagios/var/live.sock
#        #Address = "127.0.0.1:6557"
#        # The amount to minutes to wait for livestatus to come up, if set to 0 the detection is disabled
        MinutesToWait = 2
#        # Set the Version of Livestatus. Allowed are Nagios, Icinga2, Naemon.
#        # If left empty Nagflux will try to detect it on it's own, which will not always work.
        Version = ""
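
In the [Livestatus] section above, the line `file /usr/local/nagios/var/live.sock` is the only setting not written as `Key = value`, whereas the commented template line next to it uses `Address = ...`. Whether that line is what stops the service is not confirmed in this thread; it is only one plausible thing to check. The sketch below is a deliberately simplified check (not the real gcfg grammar, and `lint_gcfg` is a hypothetical helper) that lists lines which are neither comments, section headers, nor assignments.

```python
import re

SECTION_RE = re.compile(r'^\[[^\]]+\]$')              # e.g. [InfluxDB "nagflux"]
ASSIGN_RE = re.compile(r'^[A-Za-z][A-Za-z0-9-]*\s*=')  # e.g. Type = "file"


def lint_gcfg(text):
    """Return (line_number, line) pairs that are neither blank, a comment,
    a section header, nor a 'Key = value' assignment."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        if SECTION_RE.match(line) or ASSIGN_RE.match(line):
            continue
        problems.append((number, line))
    return problems


# Excerpt of the section as posted in this thread.
snippet = '''[Livestatus]
    Type = "file"
    file /usr/local/nagios/var/live.sock
    #Address = "127.0.0.1:6557"
    MinutesToWait = 2
'''
print(lint_gcfg(snippet))  # [(3, 'file /usr/local/nagios/var/live.sock')]
```

If the socket path was meant to be the connection address, writing it as `Address = "/usr/local/nagios/var/live.sock"` would follow the pattern of the commented example, but that is an assumption to verify against the Nagflux documentation.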

Relevant settings from the Nagios configuration file (nagios.cfg):

Code:

process_performance_data=1
host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-nagflux
#
service_perfdata_file=/usr/local/nagios/var/host-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-nagflux
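
Note that in the settings above, `host_perfdata_file` and `service_perfdata_file` both point at /usr/local/nagios/var/host-perfdata. The thread does not say whether this is intentional. If the two processing commands each move or consume that file, sharing it could be one reason for missing data points, though that is an assumption rather than a confirmed diagnosis. A minimal sketch (the helper name `shared_perfdata_files` is hypothetical) for spotting such a collision:

```python
def shared_perfdata_files(cfg_text):
    """Map each perfdata file path to the directives that point at it,
    keeping only paths used by more than one directive."""
    paths = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        # Only the two file-path directives are compared, not the templates.
        if key in ("host_perfdata_file", "service_perfdata_file"):
            paths.setdefault(value.strip(), []).append(key)
    return {p: keys for p, keys in paths.items() if len(keys) > 1}


# The two directives as posted in this thread.
cfg = """
host_perfdata_file=/usr/local/nagios/var/host-perfdata
service_perfdata_file=/usr/local/nagios/var/host-perfdata
"""
print(shared_perfdata_files(cfg))
```

An empty result means the two directives use different files.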


#systemctl status nagflux.service
● nagflux.service - A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/nagflux.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2020-01-31 17:10:48 UTC; 3s ago
Docs: https://github.com/Griesbacher/nagflux
Process: 10845 ExecStart=/opt/nagflux/nagflux -configPath /opt/nagflux/config.gcfg (code=exited, status=2)
Main PID: 10845 (code=exited, status=2)

Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal nagflux[10845]: main.main()
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal nagflux[10845]: /root/gorepo/src/github.com/griesbacher/nagflux/main.go:68 +0x22e
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Unit nagflux.service entered failed state.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service failed.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service holdoff time over, scheduling restart.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Stopped A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: start request repeated too quickly for nagflux.service
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Failed to start A connector which transforms performancedata from Nagios/Icinga(2)/Naemon to InfluxDB/Elasticsearch.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: Unit nagflux.service entered failed state.
Jan 31 17:10:48 ip-172-31-0-145.ap-south-1.compute.internal systemd[1]: nagflux.service failed.

Please advise on how to proceed, and correct me if I am missing anything. I look forward to hearing from you. Thanks in advance.

Best Regards,

Kaushal
kaushalshriyan
Posts: 124
Joined: Fri May 22, 2015 7:12 am

Re: CPU, Memory and Disk Space resource utilization

Post by kaushalshriyan »

Hi Troy,

Checking in again: have you had a chance to look at my latest post in this thread?

Best Regards,

Kaushal