Hello,
We have been trying to get some Postgres server monitoring completed.
However, when we setup the monitor, the only options available in the Postgres monitor are:
Connection Status
Backend Connections
WAL Files
Where do I see DB Locks, commits, hits, fetches etc?
My assumption is that we should be able to get a similar set of stats from Postgres that we get from the MSSQL monitor.
Monitoring Postgres Server options - Can't find them
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Postgres Server options - Can't find them
The 3 postgres wizards have a handful of common options but not all. You can setup commands throught the CCM to have whatever checks you want
From the help for the plugin many of these can be achieved outside of the wizard, depending on what you are looking for.
From the help for the plugin many of these can be achieved outside of the wizard, depending on what you are looking for.
Code: Select all
# /usr/local/nagios/libexec/check_postgres.pl -h
Usage: check_postgres.pl <options>
Run various tests against one or more Postgres databases.
Returns with an exit code of 0 (success), 1 (warning), 2 (critical), or 3 (unknown)
This is version 2.21.0.
Common connection options:
-H, --host=NAME hostname(s) to connect to; defaults to none (Unix socket)
-p, --port=NUM port(s) to connect to; defaults to 5432.
-db, --dbname=NAME database name(s) to connect to; defaults to 'postgres' or 'template1'
-u --dbuser=NAME database user(s) to connect as; defaults to 'postgres'
--dbpass=PASS database password(s); use a .pgpass file instead when possible
--dbservice=NAME service name to use inside of pg_service.conf
Connection options can be grouped: --host=a,b --host=c --port=1234 --port=3344
would connect to a-1234, b-1234, and c-3344
Limit options:
-w value, --warning=value the warning threshold, range depends on the action
-c value, --critical=value the critical threshold, range depends on the action
--include=name(s) items to specifically include (e.g. tables), depends on the action
--exclude=name(s) items to specifically exclude (e.g. tables), depends on the action
--includeuser=include objects owned by certain users
--excludeuser=exclude objects owned by certain users
Other options:
--assume-standby-mode assume that server in continious WAL recovery mode
--assume-prod assume that server in production mode
--PGBINDIR=PATH path of the postgresql binaries; avoid using if possible
--PSQL=FILE (deprecated) location of the psql executable; avoid using if possible
-v, --verbose verbosity level; can be used more than once to increase the level
-h, --help display this help information
--man display the full manual
-t X, --timeout=X how long in seconds before we timeout. Defaults to 30 seconds.
--symlinks create named symlinks to the main program for each action
Actions:
Which test is determined by the --action option, or by the name of the program
archive_ready - Check the number of WAL files ready in the pg_xlog/archive_status
autovac_freeze - Checks how close databases are to autovacuum_freeze_max_age.
backends - Number of connections, compared to max_connections.
bloat - Check for table and index bloat.
checkpoint - Checks how long since the last checkpoint
cluster_id - Checks the Database System Identifier
commitratio - Report if the commit ratio of a database is too low.
connection - Simple connection check.
custom_query - Run a custom query.
database_size - Report if a database is too big.
dbstats - Returns stats from pg_stat_database: Cacti output only
disabled_triggers - Check if any triggers are disabled
disk_space - Checks space of local disks Postgres is using.
fsm_pages - Checks percentage of pages used in free space map.
fsm_relations - Checks percentage of relations used in free space map.
hitratio - Report if the hit ratio of a database is too low.
hot_standby_delay - Check the replication delay in hot standby setup
index_size - Checks the size of indexes only.
last_analyze - Check the maximum time in seconds since any one table has been analyzed.
last_autoanalyze - Check the maximum time in seconds since any one table has been autoanalyzed.
last_autovacuum - Check the maximum time in seconds since any one table has been autovacuumed.
last_vacuum - Check the maximum time in seconds since any one table has been vacuumed.
listener - Checks for specific listeners.
locks - Checks the number of locks.
logfile - Checks that the logfile is being written to correctly.
new_version_bc - Checks if a newer version of Bucardo is available.
new_version_box - Checks if a newer version of boxinfo is available.
new_version_cp - Checks if a newer version of check_postgres.pl is available.
new_version_pg - Checks if a newer version of Postgres is available.
new_version_tnm - Checks if a newer version of tail_n_mail is available.
pgagent_jobs - Check for no failed pgAgent jobs within a specified period of time.
pgb_pool_cl_active - Check the number of active clients in each pgbouncer pool.
pgb_pool_cl_waiting - Check the number of waiting clients in each pgbouncer pool.
pgb_pool_maxwait - Check the current maximum wait time for client connections in pgbouncer pools.
pgb_pool_sv_active - Check the number of active server connections in each pgbouncer pool.
pgb_pool_sv_idle - Check the number of idle server connections in each pgbouncer pool.
pgb_pool_sv_login - Check the number of login server connections in each pgbouncer pool.
pgb_pool_sv_tested - Check the number of tested server connections in each pgbouncer pool.
pgb_pool_sv_used - Check the number of used server connections in each pgbouncer pool.
pgbouncer_backends - Check how many clients are connected to pgbouncer compared to max_client_conn.
pgbouncer_checksum - Check that no pgbouncer settings have changed since the last check.
prepared_txns - Checks number and age of prepared transactions.
query_runtime - Check how long a specific query takes to run.
query_time - Checks the maximum running time of current queries.
relation_size - Checks the size of tables and indexes.
replicate_row - Verify a simple update gets replicated to another server.
same_schema - Verify that two databases have the exact same tables, columns, etc.
sequence - Checks remaining calls left in sequences.
settings_checksum - Check that no settings have changed since the last check.
slony_status - Ensure Slony is up to date via sl_status.
table_size - Checks the size of tables only.
timesync - Compare database time to local system time.
txn_idle - Checks the maximum "idle in transaction" time.
txn_time - Checks the maximum open transaction time.
txn_wraparound - See how close databases are getting to transaction ID wraparound.
version - Check for proper Postgres version.
wal_files - Check the number of WAL files in the pg_xlog directory
Re: Monitoring Postgres Server options - Can't find them
Thank you. So I worked on this again this morning, I was able to successfully configure a service for DBSTATS. However, the results came back in one line.
backends:1 commits:198509 rollbacks:544 read:2167 hit:7329299 idxscan:0 idxtupread:0 idxtupfetch:0 idxblksread:0 idxblkshit:0 seqscan:0 seqtupread:0 ret:138023970 fetch:1207958 ins:0 upd:0 del:0 dbname:postgres
Do I need to somehow create a new service for each of the stats in order to be able to create alerts? If so, how to I break each of these items out?
Thank you for your assistance! - Chris
backends:1 commits:198509 rollbacks:544 read:2167 hit:7329299 idxscan:0 idxtupread:0 idxtupfetch:0 idxblksread:0 idxblkshit:0 seqscan:0 seqtupread:0 ret:138023970 fetch:1207958 ins:0 upd:0 del:0 dbname:postgres
Do I need to somehow create a new service for each of the stats in order to be able to create alerts? If so, how to I break each of these items out?
Thank you for your assistance! - Chris
Re: Monitoring Postgres Server options - Can't find them
Am I able to re-use the commands for different servers once they are created? Where would they show up to be added?
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Postgres Server options - Can't find them
Unfortunately dbstats is not something that can be alerted upon (see full docs with --man flag)caw001sbm wrote: Do I need to somehow create a new service for each of the stats in order to be able to create alerts? If so, how to I break each of these items out?
But you can add an action such as locks and setup a service for each item.
Yes, you actually could create a service with all the setup you want and then add multiple hosts to it (if the user/password are the same) OR, you could copy the CCM service and then change it for the other host.caw001sbm wrote:Am I able to re-use the commands for different servers once they are created? Where would they show up to be added?
There is a lot of detailed information in the man
Code: Select all
/usr/local/nagios/libexec/check_postgres.pl --man
Re: Monitoring Postgres Server options - Can't find them
Thank you for the assist. I have created a few more services, and added additional hosts, but I dont see the services showing up under the host within the console.
They only seem to show up if I match the config name to the host (ie. sbmhqpgsql.sbmgroup.local).
I added the hosts to the service under the Managed Hosts box.
They only seem to show up if I match the config name to the host (ie. sbmhqpgsql.sbmgroup.local).
I added the hosts to the service under the Managed Hosts box.
Re: Monitoring Postgres Server options - Can't find them
Nevermind, I got it. Thank you.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Postgres Server options - Can't find them
great!caw001sbm wrote:Nevermind, I got it. Thank you.
Locking thread