
Re: Memory reports showing no data for few Linux servers

Posted: Mon Feb 15, 2021 6:22 pm
by benjaminsmith
Hi,

You'll want to modify the output string, adding the pipe character and perf data. For example (change to be made for all exit codes):

Code:

echo "OK - Memory Utilization is $mem_used % | memory_used=$mem_used%"
exit 0
Optionally, you can add the warning and critical thresholds after the value, separated by semicolons.

Code:

'label'=value[UOM];[warn];[crit];[min];[max]
Once the changes are made, it will take about 15 minutes for the graph to generate. If you change the output later, delete the graph and let it start over.
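
As a rough sketch of how the status text, perf data, and exit code could be produced in one place: this assumes the script already has an integer memory percentage available. The function name report_memory, the fixed min/max of 0 and 100, and the thresholds used in the usage note are hypothetical, not taken from the original script.

```shell
#!/bin/bash
# Hypothetical helper: prints a Nagios-style status line with perf data
# and returns the matching plugin exit code (0 OK, 1 WARNING, 2 CRITICAL).
report_memory() {
    local used="$1"
    local warn="$2"
    local crit="$3"
    local state="OK"
    local code=0
    if [ "$used" -ge "$crit" ]; then
        state="CRITICAL"
        code=2
    elif [ "$used" -ge "$warn" ]; then
        state="WARNING"
        code=1
    fi
    # Perf data format: 'label'=value[UOM];[warn];[crit];[min];[max]
    echo "$state - Memory Utilization is ${used}% | memory_used=${used}%;${warn};${crit};0;100"
    return "$code"
}
```

In a plugin, the last two lines would then be something like report_memory "$mem_used" 80 90 followed by exit $?, so every exit code carries the same perf data.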

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 12, 2021 1:39 am
by pratikmehta003
Hi Benjamin,

I tried it like this. Is it correct?

if [ "$mem_used" -ge "$critical_val" ]
then
    echo "CRITICAL - Memory Utilization is $mem_used % | 'Memory Used'=$mem_used%;$warning_val;$critical_val;;"
    exit 2
elif [ "$mem_used" -ge "$warning_val" ]
then
    echo "WARNING - Memory Utilization is $mem_used % | 'Memory Used'=$mem_used%;$warning_val;$critical_val;;"
    exit 1
else
    echo "OK - Memory Utilization is $mem_used % | 'Memory Used'=$mem_used%;$warning_val;$critical_val;;"
    exit 0
fi

After doing this, the value does show under the advanced settings for the service, but the graph is still not visible.
Attachment: screenshot for service monitor.PNG

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 12, 2021 5:47 pm
by benjaminsmith
Hi,

It does take some time to generate the graph. Let it run for a while, then let me know whether it is graphing after 15-20 minutes.

The other possibility is that, after several changes to the output, there may be an issue with the current template structure. Try deleting the old files, and then let me know if it starts to graph.

You'll find the performance data in the following folder. Navigate to the host, find the service, and remove its old RRD and XML files.

Code:

/usr/local/nagios/share/perfdata
If that does not resolve the issue, please follow the guide below to increase the logging verbosity for the performance graphs, and upload the logs. Thanks, Benjamin

Nagios XI - Performance Graph Problems
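
For the cleanup step above, it can help to list the files for one service before removing anything. A minimal sketch, assuming each service is stored as <service>.rrd and <service>.xml inside a per-host directory, with spaces in the service name replaced by underscores. The function name list_perfdata_files and the host and service names in the usage note are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list the RRD and XML files for one service so they
# can be reviewed before removal. Assumes files are named <service>.rrd and
# <service>.xml in a per-host directory, with spaces replaced by underscores.
list_perfdata_files() {
    local base="$1"
    local host="$2"
    local service="$3"
    local name
    local ext
    name=$(printf '%s' "$service" | tr ' ' '_')
    for ext in rrd xml; do
        if [ -f "$base/$host/$name.$ext" ]; then
            printf '%s\n' "$base/$host/$name.$ext"
        fi
    done
}
```

For example, list_perfdata_files /usr/local/nagios/share/perfdata web01 "Memory Usage" would print the matching files, if any. If nothing is printed, check the actual file names in the host directory, since the naming assumption may not hold on your system.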

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 1:51 am
by pratikmehta003
Hi Benjamin,

I made a copy of the service, and the graphs started showing up for the copy. Strange!

However, I came across another issue yesterday. Data stopped being collected for all hosts and services, and the monitoring engine status was red.

In the event log I could see this entry. Is it the cause?

WARNING: RLIMIT_NPROC is 7224, total max estimated processes is 8134! You should increase your limits (ulimit -u, or limits.conf)

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 12:56 pm
by lmiltchev
If you are dealing with resource limits, and need to tweak the limits.conf settings, please review the KB article below:

https://support.nagios.com/kb/article/n ... ng-19.html
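
To see how far the current limit is from what the engine estimates it needs, the two numbers can be pulled out of the warning. A rough sketch, assuming the message keeps the exact wording of the event log entry quoted earlier in this thread. The function name nproc_shortfall is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read an RLIMIT_NPROC warning line and print how many
# more processes the estimate needs than the current limit allows.
nproc_shortfall() {
    local line="$1"
    local limit
    local needed
    limit=$(printf '%s\n' "$line" | sed -n 's/.*RLIMIT_NPROC is \([0-9][0-9]*\),.*/\1/p')
    needed=$(printf '%s\n' "$line" | sed -n 's/.*estimated processes is \([0-9][0-9]*\).*/\1/p')
    if [ -z "$limit" ] || [ -z "$needed" ]; then
        echo "could not parse line" >&2
        return 1
    fi
    echo "limit=$limit needed=$needed shortfall=$((needed - limit))"
}

nproc_shortfall "WARNING: RLIMIT_NPROC is 7224, total max estimated processes is 8134!"
# prints: limit=7224 needed=8134 shortfall=910
```

The shortfall only gives a lower bound for the new value. Running ulimit -u as the nagios user shows the limit currently in effect, and the KB article covers where to raise it.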

Hope this helps.

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 12:59 pm
by pratikmehta003
In addition to that, I am also seeing the errors below in the event log today. The graphs of all services are affected too, showing a lot of breaks.

Can you let me know what changes are required?
Attachment: error2.PNG

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 1:21 pm
by pratikmehta003
Hi,

For limits.conf, the document describes 2-4 different scenarios. Which one needs to be checked?

Can you also check my previous message, with the screenshot of what I am seeing in the event log?

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 1:34 pm
by lmiltchev
pratikmehta003 wrote: "For limits.conf, the document describes 2-4 different scenarios. Which one needs to be checked?"
I was talking about this section of the article:
Attachment: example-01.jpg
pratikmehta003 wrote: "In addition to that, I am also seeing the errors below in the event log today. The graphs of all services are affected too, showing a lot of breaks."
It's possible that you have too many perf data files piled up in some of the directories. Can you check how many files you have in the xidpe, perfdata, and checkresults directories?

Code:

ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
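
The same check can be wrapped in a small loop. This is only a sketch: it counts with find instead of ls, which is a substitution and not what the commands above do, so it counts regular files only (including hidden ones) and skips subdirectories. The function name count_spool_files is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the number of regular files directly inside
# each directory given as an argument.
count_spool_files() {
    local dir
    local n
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # tr strips padding that some wc implementations add
            n=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
            echo "$dir: $n"
        else
            echo "$dir: missing"
        fi
    done
}
```

Usage would be count_spool_files followed by the three spool paths shown above, giving one line per directory.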
If you have thousands of files in one of them (the xidpe directory, for example), you will need to remove the directory and then recreate it, since you most likely won't be able to simply delete that many files with a wildcard.

Example:

Code:

cd /usr/local/nagios/var/spool
rm -rf xidpe
mkdir xidpe
chown nagios:nagios xidpe
chmod 755 xidpe
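
A possible alternative to removing and recreating the directory is to empty it in place with find, which keeps the directory's existing owner and permissions. This is a sketch of a different technique from the commands above, assuming GNU find (for -mindepth and -delete). The function name empty_dir is hypothetical.

```shell
#!/bin/bash
# Hypothetical alternative: empty a directory in place with find, which does
# not build one long argument list the way "rm dir/*" does.
empty_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # -mindepth 1 keeps the directory itself; -delete removes everything under it
    find "$dir" -mindepth 1 -delete
}
```

It would be called as empty_dir /usr/local/nagios/var/spool/xidpe by a user with write access to that directory.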
If you are still having issues, I would recommend that you open a support ticket via our support center here:

https://support.nagios.com/tickets/

and send your latest profile (Admin > System Config > System Profile > Download Profile). Our support techs will need to review your configs and logs in order to further troubleshoot the issue.

Re: Memory reports showing no data for few Linux servers

Posted: Fri Mar 19, 2021 9:02 pm
by pratikmehta003
Hi,

I ran the 3 commands but don't see a large number of files. Two of them show '0'. See the screenshot below for reference:
Attachment: screenshot of cmd.PNG

Re: Memory reports showing no data for few Linux servers

Posted: Mon Mar 22, 2021 11:09 am
by benjaminsmith
Hi,

Thanks for posting the results. They look normal, but it's still likely that a resource issue is causing the graphs to stop processing.

Please send over a system profile and we can take a closer look at the logs and settings.

To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.

Or, as suggested earlier, please open a support ticket at:

https://support.nagios.com/tickets/

Thanks!
Benjamin