pg_statviz
is a minimalist extension and utility pair for time series analysis and visualization
of PostgreSQL internal statistics.
Created for capturing PostgreSQL's cumulative and dynamic statistics, pg_statviz
enables deeper
time series analysis than the standard PostgreSQL statistics views. The included utility generates
visualizations for selected time ranges from the stored statistic snapshots, helping users track
PostgreSQL performance over time and potentially aiding in performance tuning and troubleshooting.
Designed with the K.I.S.S. and
UNIX philosophies in mind, pg_statviz
aims to be
a modular, minimal and unobtrusive tool that does only what it's meant for: create snapshots
of PostgreSQL statistics for visualization and analysis. To this end, a utility is provided for
retrieving and creating simple visualizations with the stored snapshots, by using
pandas and Matplotlib.
-
Configure the PostgreSQL Deb repository for your Linux distribution, as explained here for Debian and here for Ubuntu.
-
Use
apt
to install the extension for your PostgreSQL version:sudo apt install postgresql-<pg_version>-statviz
-
Configure the PostgreSQL Yum repository for your Linux distribution, as explained here.
-
Use
dnf
oryum
to install the extension for your PostgreSQL version:sudo dnf install pg_statviz_extension-<pg_version> OR sudo yum install pg_statviz_extension-<pg_version>
The extension is available on PGXN.
To install from PGXN, either download the zip file and install manually or use the PGXN Client to install:
pgxn install pg_statviz
To install manually, clone this repository locally:
git clone https://github.com/vyruss/pg_statviz.git
This will install the extension in the appropriate location for your system ($SHAREDIR/extension
):
cd pg_statviz
sudo make install
The extension can now be enabled inside the appropriate database like this, e.g. from psql
:
\c mydatabase
CREATE EXTENSION pg_statviz;
This will create the needed tables and functions under schema pgstatviz
(note the lack of
underscore in the schema name).
The visualization utility can be installed from PyPi:
pip install pg_statviz
The utility is also available in the Debian and
Ubuntu PostgreSQL Deb Repositories, and can be installed
using apt
:
sudo apt install pg-statviz
Finally, it can be found in the
PostgreSQL Yum Repository and can be installed
using dnf
or yum
:
sudo dnf install pg_statviz
OR
sudo yum install pg_statviz
Python 3.9+ is required for the visualization utility. Any recent PostgreSQL version up to and including 17 is supported.
The extension can be used by superusers or any user that has pg_monitor
role privileges. To take
a snapshot, e.g. from psql
:
SELECT pgstatviz.snapshot();
NOTICE: created pg_statviz snapshot
snapshot
-------------------------------
2024-06-27 11:04:58.055453+00
(1 row)
Older snapshots and their associated data can be removed using any time expression. For example, to remove data more than 90 days old:
DELETE FROM pgstatviz.snapshots
WHERE snapshot_tstamp < CURRENT_DATE - 90;
Or all snapshots can be removed like this:
SELECT pgstatviz.delete_snapshots();
NOTICE: truncating table "snapshots"
NOTICE: truncate cascades to table "buf"
NOTICE: truncate cascades to table "conf"
NOTICE: truncate cascades to table "conn"
NOTICE: truncate cascades to table "lock"
NOTICE: truncate cascades to table "io"
NOTICE: truncate cascades to table "wait"
NOTICE: truncate cascades to table "wal"
NOTICE: truncate cascades to table "db"
delete_snapshots
------------------
(1 row)
The pg_monitor
role can be assigned to any user:
GRANT pg_monitor TO myuser;
Periodic snapshots can be set up with any job scheduler. For example with cron
:
crontab -e -u postgres
Inside the postgres
user's crontab, add this line to take a snapshot every 15 minutes:
*/15 * * * * psql -c -d mydatabase "SELECT pgstatviz.snapshot()" >/dev/null 2>&1
Potentially very large numbers of data points can be visualized with the aid of pandas resampling, displaying the mean value over 100 plot points as a default.
The visualization utility can be called like a PostgreSQL command line tool:
pg_statviz --help
usage: pg_statviz [--help] [--version] [--dbname DBNAME] [-h HOSTNAME] [--port PORT]
[-u USERNAME] [--password] [--daterange FROM TO] [-o OUTPUTDIR]
{analyze,buf,cache,checkp,conn, io,lock,tuple,wait,wal,xact} ...
run all analysis modules
positional arguments:
{analyze,buf,cache,checkp,conn,io,lock,tuple,wait,wal,xact}
analyze run all analysis modules
buf run buffers written analysis module
cache run cache hit ratio analysis module
checkp run checkpoint analysis module
conn run connection count analysis module
io run I/O analysis module
lock run locks analysis module
tuple run tuple count analysis module
wait run wait events analysis module
wal run WAL generation analysis module
xact run transaction count analysis module
options:
--help
--version show program's version number and exit
-d DBNAME, --dbname DBNAME
database name to analyze (default: 'myuser')
-h HOSTNAME, --host HOSTNAME
database server host or socket directory (default: '/var/run/postgresql')
-p PORT, --port PORT database server port (default: '5432')
-U USERNAME, --username USERNAME
database user name (default: 'myuser')
-W, --password force password prompt (should happen automatically) (default: False)
-D FROM TO, --daterange FROM TO
date range to be analyzed in ISO 8601 format e.g. 2024-01-01T00:00
2024-01-01T23:59 (default: [])
-O OUTPUTDIR, --outputdir OUTPUTDIR
output directory (default: -)
pg_statviz conn --help
usage: pg_statviz conn [-h] [-d DBNAME] [--host HOSTNAME] [-p PORT] [-U USERNAME] [-W]
[-D FROM TO] [-O OUTPUTDIR] [-u [USERS ...]]
run connection count analysis module
options:
-h, --help show this help message and exit
-d DBNAME, --dbname DBNAME
database name to analyze (default: 'myuser')
--host HOSTNAME database server host or socket directory (default: '/var/run/postgresql')
-p PORT, --port PORT database server port (default: '5432')
-U USERNAME, --username USERNAME
database user name (default: 'myuser')
-W, --password force password prompt (should happen automatically) (default: False)
-D FROM TO, --daterange FROM TO
date range to be analyzed in ISO 8601 format e.g. 2024-01-01T00:00
2024-01-01T23:59 (default: [])
-O OUTPUTDIR, --outputdir OUTPUTDIR
output directory (default: -)
-u [USERS ...], --users [USERS ...]
user name(s) to plot in analysis (default: [])
pg_statviz buf --host localhost -d postgres -U postgres -D 2024-06-24T23:00 2024-06-26
The pg_statviz
extension stores its data in the following tables:
Table | Description |
---|---|
pgstatviz.snapshots |
Timestamped snapshots |
pgstatviz.buf |
Buffer, checkpointer and background writer data |
pgstatviz.conf |
PostgreSQL server configuration data |
pgstatviz.conn |
Connection data |
pgstatviz.db |
PostgreSQL server and database statistics |
pgstatviz.io |
I/O stats data |
pgstatviz.lock |
Locks data |
pgstatviz.wait |
Wait events data |
pgstatviz.wal |
WAL generation data |
To dump the captured data, e.g. for analysis on a different machine, run:
pg_dump -d <dbname> -a -O -t pgstatviz.* > pg_statviz_data.dump
Load it like this on the target database (which should have pg_statviz
installed) :
psql -d <other_dbname> -f pg_statviz_data.dump
Alternatively, pg_statviz
internal tables can also be exported to a tab separated values (TSV) file
for use by other tools:
psql -d <dbname> -c "COPY pgstatviz.conn TO STDOUT CSV HEADER DELIMITER E'\t'" > conn.tsv
These can then be loaded into another database like this, provided the tables exist (installing the extension will create them):
psql -d <other_dbname> -c "COPY pgstatviz.conn FROM STDIN CSV HEADER DELIMITER E'\t'" < conn.tsv