1) Use of Ramdisk
2) MySQL offload
3) RRDcached use for npcd/rrdtool
Everything is working fine except for a few things I noticed:
A) A few add-ons, such as BPA, are not working because they cannot connect to MySQL. Here are the logs from when I try to use BPA. Other add-ons may be hitting the same issue after offloading MySQL.
[Sun Jul 01 12:56:29 2012] [error] [client 172.24.41.233] [Sun Jul 1 12:56:29 2012] nagios-bp.cgi: Error: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2), referer: http://pnagios02lxv/nagiosxi/
[Sun Jul 01 12:56:29 2012] [error] [client 172.24.41.233] Premature end of script headers: nagios-bp.cgi, referer: http://pnagios02lxv/nagiosxi/
[Sun Jul 01 12:56:31 2012] [error] [client 172.24.41.233] [Sun Jul 1 12:56:31 2012] nagios-bp.cgi: DBI connect('nagios:localhost:3306','ndoutils',...) failed: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) at /usr/local/nagiosbp/lib/ndodb.pm line 68, referer: http://pnagios02lxv/nagiosxi/
B) rrdtool is using rrdcached, but every update first tries the RRDs Perl module and only then falls back to the rrdtool system call. Is there a way to stop it from checking for the module?
2012-07-01 10:43:46 [15844] [2] RRDs Perl Modules are not installed. Falling back to rrdtool system call.
2012-07-01 10:43:46 [15844] [2] /usr/bin/rrdtool update --daemon=unix:/var/rrdtool/rrdcached/rrdcached.sock /usr/local/nagios/share/perfdata/UWEB248NTV/Disk_Usage.rrd 1341164613:76:50
2012-07-01 10:43:46 [15844] [1] rrdtool update returns 0
2012-07-01 10:43:46 [15844] [1] 6 lines processed
2012-07-01 10:43:46 [15844] [1] /var/nagiosramdisk/spool/perfdata//1341164616.perfdata.service-PID-15844 deleted
2012-07-01 10:43:46 [15844] [1] PNP exiting (runtime 0.03363s) ...
C) I am seeing some errors in syslog from rrdtool update:
Jul 1 13:21:07 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UAPP520NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UAPP520NTV/Disk_Usage.rrd: found extra data on update argument: 21)
Jul 1 13:21:37 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USQL03NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USQL03NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:21:37 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/OPSPCPMSIS01/_HOST_.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/OPSPCPMSIS01/_HOST_.rrd: expected 2 data source readings (got 1) from 1341173207)
Jul 1 13:22:23 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USACTX12NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USACTX12NTV/CPU_Load.rrd: found extra data on update argument: 0:0)
Jul 1 13:23:24 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USAWEB19NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USAWEB19NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:23:24 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USQL07NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USQL07NTV/Disk_Usage.rrd: found extra data on update argument: 18)
Jul 1 13:27:28 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/USACTXW14NTV/CPU_Load.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/USACTXW14NTV/CPU_Load.rrd: found extra data on update argument: 0)
Jul 1 13:31:17 pnagios02lxv nagios: Auto-save of retention data completed successfully.
Jul 1 13:34:06 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UIMAGE5NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UIMAGE5NTV/Disk_Usage.rrd: expected 9 data source readings (got 3) from 1341173621)
Jul 1 13:36:55 pnagios02lxv rrdcached[26019]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/UAPP10NTV/Disk_Usage.rrd) failed with status -1. (/usr/local/nagios/share/perfdata/UAPP10NTV/Disk_Usage.rrd: found extra data on update argument: 25)
D) I reduced the npcd and process_perfdata logging to the minimum, as those were the other top disk users. By default the log level is set to 1.
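To get a feel for how widespread the rrdcached update failures from syslog are, a small script can tally them per RRD file and per kind of error. This is only a rough sketch: the regular expression is written against the message format seen in my syslog lines, and the two labels are my own reading of the messages ("found extra data on update argument" looks like more values than the RRD has data sources, "expected N data source readings (got M)" looks like fewer). The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the rrdcached failure lines as they appear in the syslog excerpt.
# The RRD path shows up twice in each message, so it is back-referenced.
FAILURE_RE = re.compile(
    r"rrd_update_r \((?P<rrd>[^)]+)\) failed with status (?P<status>-?\d+)\. "
    r"\((?P=rrd): (?P<reason>.*)\)\s*$"
)


def classify(reason):
    """Map an rrdcached error text to a coarse label (assumed meaning)."""
    if reason.startswith("found extra data on update argument"):
        return "too many values"
    if reason.startswith("expected "):
        return "too few values"
    return "other"


def tally_failures(lines):
    """Count failed updates per RRD file and per error kind."""
    per_file = Counter()
    per_kind = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue  # not an rrdcached update failure
        per_file[match.group("rrd")] += 1
        per_kind[classify(match.group("reason"))] += 1
    return per_file, per_kind
```

Feeding it the lines of the syslog file (for example `tally_failures(open(path))`, where `path` is wherever syslog lands on the box) should show whether the failures are limited to a handful of RRD files, which would point at checks whose perfdata no longer matches the data sources the RRD was created with.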
E) I used iotop to see the top I/O users and can see postgres continuously writing to disk. Would it help to offload postgres as well?
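Coming back to A): the BPA error shows the add-on connecting to `localhost`, and with MySQL clients that means the local Unix socket rather than TCP port 3306, so it cannot reach a database that now lives on another machine. A quick way to confirm this from the Nagios XI server is sketched below. It only checks whether the socket file exists and whether a TCP port answers; it does not log in to MySQL. The function names are made up, and the remote host you pass in is whatever your offloaded database server is called.

```python
import os
import socket
import stat


def unix_socket_present(path):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False  # missing file, permission problem, and so on


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or the name did not resolve
```

If `unix_socket_present("/var/lib/mysql/mysql.sock")` is false while `tcp_port_open(<offloaded DB host>, 3306)` is true, then the add-on's own database settings (for BPA, whatever `/usr/local/nagiosbp/lib/ndodb.pm` reads them from) presumably still point at `localhost` and need the remote host instead.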