
Was getting swap critical yesterday

Posted: Tue Mar 11, 2014 8:50 am
by billperrotta
***** Nagios *****

Notification Type: PROBLEM

Service: Swap Usage
Host: localhost
Address: 127.0.0.1
State: CRITICAL

Date/Time: Mon Mar 10 21:09:45 EDT 2014

Additional Info:

SWAP CRITICAL - 0% free (0 MB out of 1496 MB)

Seems to have recovered.

May not need to do anything unless it comes back.


***** Nagios *****

Notification Type: RECOVERY

Service: Current Load
Host: localhost
Address: 127.0.0.1
State: OK

Date/Time: Tue Mar 11 01:02:41 EDT 2014

Additional Info:

OK - load average: 0.00, 0.05, 2.79

Re: Was getting swap critical yesterday

Posted: Tue Mar 11, 2014 11:07 am
by sreinhardt
How much memory do you have allocated to the system? Also, did you see memory usage increase around the time this happened?

Re: Was getting swap critical yesterday

Posted: Tue Mar 11, 2014 12:09 pm
by billperrotta
Stupid question, but I haven't checked memory in Linux for a while.

What is the command to check it?

And when it happens again, how do I monitor it while it is happening?

Re: Was getting swap critical yesterday

Posted: Tue Mar 11, 2014 12:50 pm
by abrist

Code:

free -m
You can use "check_mem" with NRPE, but it is a non-standard check, so you may have to get it from the Exchange.
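
If you just want to see how a swap alert like the one in the first post is derived, here is a minimal Python sketch. It is not the actual check_swap or check_mem plugin. It only reads the SwapTotal and SwapFree fields in the format used by /proc/meminfo and builds a similar status line. The function name swap_status and the 20%/10% thresholds are assumptions for illustration, not values taken from this thread.

```python
import re

# Assumed thresholds (percent of swap still free); not taken from the thread.
WARN_PCT = 20
CRIT_PCT = 10


def swap_status(meminfo_text, warn_pct=WARN_PCT, crit_pct=CRIT_PCT):
    """Return (state, message) computed from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        # Lines look like "SwapTotal:  1531904 kB"; other shapes are skipped.
        match = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))

    total_mb = values.get("SwapTotal", 0) // 1024
    free_mb = values.get("SwapFree", 0) // 1024
    # Treat a host with no swap configured as 100% free to avoid dividing by zero.
    pct_free = 100 if total_mb == 0 else free_mb * 100 // total_mb

    if pct_free <= crit_pct:
        state = "CRITICAL"
    elif pct_free <= warn_pct:
        state = "WARNING"
    else:
        state = "OK"
    message = f"SWAP {state} - {pct_free}% free ({free_mb} MB out of {total_mb} MB)"
    return state, message


# Sample input sized to match the 1496 MB of swap reported in the alert.
sample = "SwapTotal:  1531904 kB\nSwapFree:  0 kB\n"
print(swap_status(sample)[1])
```

On a live Linux box you could pass it the contents of /proc/meminfo instead of the sample string, for example swap_status(open("/proc/meminfo").read()), and run that from cron to log what swap looks like when the alert fires.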

Re: Was getting swap critical yesterday

Posted: Wed Mar 12, 2014 7:42 am
by billperrotta
How does this look?

ahgmonitor:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           994        767        227          0        198        418
-/+ buffers/cache:        150        844
Swap:         1496         61       1435


This means I have about 1 GB of RAM, correct?

Is that OK for a monitoring server?

Is 767 MB used too much? Should I add RAM if possible? Not a big deal if I'm not getting errors. I think two days ago, when there was a lot of activity, it was saying out of memory, but not now.
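
For anyone reading the numbers above: in this older free layout, the "used" figure on the Mem: row includes buffers and page cache, which the kernel can normally give back when programs need memory. A rough Python sketch of the arithmetic follows. The helper names parse_free and summarize are made up for illustration, and it assumes the six-column output shown in this post.

```python
def parse_free(output):
    """Parse the 'Mem:' row of older-style `free -m` output into MB values."""
    columns = ["total", "used", "free", "shared", "buffers", "cached"]
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            return dict(zip(columns, map(int, fields[1:7])))
    raise ValueError("no 'Mem:' row found")


def summarize(mem):
    """Separate memory held by programs from reclaimable buffers and cache."""
    reclaimable = mem["buffers"] + mem["cached"]
    return {
        "used_by_programs": mem["used"] - reclaimable,
        "available_to_programs": mem["free"] + reclaimable,
    }


# The output posted above, pasted in as-is.
sample = """\
total used free shared buffers cached
Mem: 994 767 227 0 198 418
-/+ buffers/cache: 150 844
Swap: 1496 61 1435
"""
print(summarize(parse_free(sample)))
```

That works out to 767 - 198 - 418 = 151 MB actually held by programs and 227 + 198 + 418 = 843 MB available to them, which lines up with the "-/+ buffers/cache" row (150 and 844); the 1 MB differences are presumably rounding in free itself.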

Re: Was getting swap critical yesterday

Posted: Wed Mar 12, 2014 11:00 am
by slansing
That is quite a bit below what you would want on a monitoring server. Generally we recommend at least 8 GB dedicated to the server. You could get away with less depending on what you are checking, how you are checking (active/passive), and how often you are checking (check/retry intervals).