How to keep nagios from killing drives?

Postby t3dus » Sat Jan 13, 2018 6:16 pm

How do I keep Nagios's insane amount of reads/writes from killing hard drives and SD cards?

Let me explain.
Within a year and a half I have set up 4 Nagios servers, and each time Nagios has entirely killed the drive it lived on.

The first 3 Nagios servers were each on a Raspberry Pi with a 32GB SanDisk Ultra SD card. Nagios killed the SD cards with all the reads/writes it does for logging and whatever else it is doing. Those SD cards are useless now.

The 4th server, which just died today, was on an old Dell with a SATA HD. This was a brand new hard drive that I put in that server about 6 months ago. Nagios ended up killing it off too. Now the drive has bad sectors and is failing.

What is it with Nagios that causes it to kill off my drives? It's frustrating. For now I've given up on having any Nagios server on my home network because I simply can't afford to replace drives every 3-6 months.

My current Nagios server is on a VPS, so if it kills the VPS drives at least it's somebody else's problem to replace them.
But I can no longer monitor my home network like I want, which annoys me.
t3dus
 
Posts: 109
Joined: Thu Feb 04, 2016 3:46 pm
Location: IA

Re: How to keep nagios from killing drives?

Postby dwhitfield » Sat Jan 13, 2018 9:44 pm

There are several things you could do, but a ramdisk would be the most direct: https://assets.nagios.com/downloads/nag ... giosXI.pdf

While the document is written for XI, ramdisks are definitely not just for XI; I use one for my browser cache. They are pretty amazing for transitory files. I'm not sure the Pi has enough RAM, though. The Pi is not really built for intense workloads.
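As a rough sketch of the ramdisk idea for Core (not taken from the linked XI document): the function below prints a copy of nagios.cfg with the most frequently rewritten files pointed at a ramdisk directory. The mount point /var/nagiosramdisk, the 64m size, and the file names under it are assumptions for illustration; check the directive names against your own nagios.cfg before relying on it.

```shell
# Sketch: print a copy of nagios.cfg with frequently rewritten files moved to a ramdisk.
# Assumes a tmpfs is already mounted at the directory given as $2, for example:
#   mount -t tmpfs -o size=64m tmpfs /var/nagiosramdisk   (hypothetical path and size)
ramdisk_cfg() {
  cfg="$1"   # path to the existing nagios.cfg
  dir="$2"   # ramdisk mount point
  sed -e "s|^status_file=.*|status_file=$dir/status.dat|" \
      -e "s|^temp_file=.*|temp_file=$dir/nagios.tmp|" \
      -e "s|^temp_path=.*|temp_path=$dir/tmp|" \
      -e "s|^check_result_path=.*|check_result_path=$dir/checkresults|" \
      "$cfg"
}
```

Something like `ramdisk_cfg /usr/local/nagios/etc/nagios.cfg /var/nagiosramdisk > /tmp/nagios.cfg.new` lets you diff the result before replacing anything. A tmpfs is empty after every reboot, so the tmp and checkresults subdirectories would need to be recreated (and made writable by the Nagios user) at boot.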

Aside from the ramdisk, you could also check less often.

Are you collecting perfdata? If so, turning that off would be another great way to reduce I/O.

What class of SD cards are you using? I wouldn't try to do anything substantial on less than Class 10.
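To see what a given install is actually doing for perfdata and check frequency, a minimal sketch along these lines may help. It assumes the usual nagios.cfg directives process_performance_data and interval_length, and check_interval lines in the object definitions; the function names are made up for this example.

```shell
# Print the perfdata and interval-length settings from a Nagios main config file.
# Usage: io_settings /usr/local/nagios/etc/nagios.cfg
io_settings() {
  grep -E '^[[:space:]]*(process_performance_data|interval_length)[[:space:]]*=' "$1"
}

# Count how many object definitions use each check_interval value
# in all files under a config directory.
# Usage: interval_summary /usr/local/nagios/etc
interval_summary() {
  grep -rhE '^[[:space:]]*check_interval[[:space:]]' "$1" | awk '{ print $2 }' | sort -n | uniq -c
}
```

If process_performance_data is 1, perfdata is being processed; the check_interval values are multiplied by interval_length (in seconds) to get the real time between checks.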
dwhitfield
Former Nagios Staff
 
Posts: 4569
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: How to keep nagios from killing drives?

Postby t3dus » Sat Jan 13, 2018 10:31 pm

dwhitfield wrote:There are several things you could do, but a ramdisk would be the most direct thing to do: https://assets.nagios.com/downloads/nag ... giosXI.pdf

While the document is written for XI, I use a ramdisk for my browser cache. They are definitely not just for XI. They are pretty amazing for transitory files. I'm not sure the pi has enough RAM. The pi is not really built for intense workloads.

Might have to try the ramdisk suggestion.

Nagios barely tapped the CPU and memory of the Pi. It just ate SD cards for breakfast, killing one roughly every 3 months.

dwhitfield wrote:Aside from the ramdisk, you could also check less often.

I had it set to the default check intervals. What would you suggest for the interval, if not the default?

dwhitfield wrote:Are you collecting perfdata? If so, turning that off would be another great way to reduce i/o.

No.

dwhitfield wrote:What class of SD cards? I wouldn't try to do anything substantial on less than Class 10.

Class 10, always. The SD cards weren't especially cheap.

Re: How to keep nagios from killing drives?

Postby dwhitfield » Mon Jan 15, 2018 1:24 pm

We had a bit of a discussion around here before the Twitter response went out. I don't know for certain where your configs are, but if you point this command at your config directory and then PM me the resulting archive, I can take a look: tar -zcvf /tmp/supporttar.tar.gz /usr/local/nagios/etc

We were thinking perhaps you have debugging turned up. You'd want to look in nagios.cfg, but also in ndo2db.cfg (although I guess you may not be running this) and npcd.cfg (if you aren't collecting perfdata, you may not be running this either, but really you should probably just uninstall it in that case). You could also probably turn down general logging at the system level. Are you running any non-Nagios Pis that would give us a baseline?
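A quick way to check the debug settings without sending configs anywhere could look like the sketch below. The directive names in the pattern (debug_level, debug_verbosity, debug_file, max_debug_file_size) are the ones used by nagios.cfg; whether ndo2db.cfg and npcd.cfg use the same names on your install is an assumption, so adjust the pattern if nothing shows up for them.

```shell
# Print debug-related directives from each config file that exists.
# Usage: debug_settings /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/ndo2db.cfg
debug_settings() {
  for f in "$@"; do
    [ -r "$f" ] || continue   # skip files that are not installed or not readable
    grep -HE '^[[:space:]]*(debug_level|debug_verbosity|debug_file|max_debug_file_size)[[:space:]]*=' "$f"
  done
  return 0
}
```

In nagios.cfg, debug_level=0 means debugging is off; any other value means something is being written to the debug file.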

Re: How to keep nagios from killing drives?

Postby t3dus » Mon Jan 15, 2018 1:33 pm

Well, my Nagios Pi servers have all since died. I have put a brand new Class 10 SanDisk Ultra SD card in my Pi with a fresh install of Ubuntu Server, but I haven't done anything further with it, so I'd have to rebuild Nagios on it before I can follow your steps.

I had been running Nagios on an old Dell server with a new drive before that drive died, so currently I have zero local Nagios servers running.

I did rebuild my Dell server as a KVM host this time, with a new 2 TB drive. It's currently hosting two VM servers which have nothing on them atm. I thought about using one for Nagios and trying my luck one last time. The benefit of having Nagios set up as a VM is that if it dies again I can at least restore the VM without having to rebuild the whole server.

I do have one Nagios server running on a VPS, though it only monitors 22 hosts or so.

Re: How to keep nagios from killing drives?

Postby dwhitfield » Mon Jan 15, 2018 1:52 pm

Is your setup on the VPS mostly the same as what you were running locally? I'm just wondering if packaging up the configs from the VPS would be useful in tracking things down. It might also be worth checking with your VPS provider to see if they notice anything strange about the amount of disk access your setup is using.
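If the provider can't tell you, a rough number can be had on Linux from /proc/<pid>/io. This is only a sketch with made-up function names: it counts writes by the one process you point it at (not by check plugins it spawns), and reading another user's /proc/<pid>/io normally needs root.

```shell
# Print the cumulative bytes a process has caused to be written to storage,
# read from a Linux /proc/<pid>/io file passed as $1.
written_bytes() {
  awk '$1 == "write_bytes:" { print $2 }' "$1"
}

# Report bytes written by a PID over an interval (default 60 seconds).
# Usage: write_rate <pid> [seconds]
write_rate() {
  pid="$1"
  secs="${2:-60}"
  before=$(written_bytes "/proc/$pid/io")
  sleep "$secs"
  after=$(written_bytes "/proc/$pid/io")
  echo "$((after - before)) bytes written in ${secs}s"
}
```

For example, `write_rate "$(pgrep -o -x nagios)" 60` would sample the oldest process named nagios for a minute. Running the same thing on the VPS and on a local box would give a comparison between the two setups.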

I didn't address the check intervals directly before because if I got the configs I could just see those. As for what to use instead of the defaults, it really just depends.

I guess one way to get at the setup other than the configs would be the instructions you are using for setup. Do you use our compile instructions, install from a repo, or use some other instructions? Our default instructions wouldn't have you set up npcd or ndo2db, but other instructions may have you do that. Both of those would be *potential* disk hogs.