Many DBAs agree that one of the most useful extension in their arsenal is pg_repack. Pg_repack addresses a serious shortcoming in Postgres: doing VACUUM FULL online. Due to the way Postgres handles the MVCC, tables and indexes become bloated. Dead tuples are addressed by AUTOVACUUM and space will be marked free. In many situations a VACUUM FULL becomes unavoidable because AUTOVACUUM just leaves scattered and fragmented free space in the table as it is. DBA may have to do VACUMM FULL to release such free space back to disk.

Unfortunately VACUUM FULL requires an exclusive lock on the table during an operation. This can’t be performed while the table is being used. This is where pg_repack comes to the rescue of the DBA — to perform a VACUUM FULL almost fully online (there could be a momentary lock).

The popularity of this extension among DBAs led the Postgres by BigSQL project to add pg_repack as a ready to install package.

Installation

Installing pg_repack is quite easy with the pgcli command line.

$ ./pgc list Displays all installable versions of extensions. Let’s install repack13-pg96 (the package version depends on the postgres version we have installed). Installing this is pretty straightforward:

$ ./pgc install repack13-pg96

Installation of pg_repack doesn’t require either a restart or reload. As is commonly known, the pg_repack installation has 2 components: the actual extension and a client tool to invoke the pg_repack functionality. We can create the extension in the desired database as follows from the psql command interface:

postgres=# \c db1
postgres=# CREATE EXTENSION pg_repack;
CREATE EXTENSION

Test Environment

To create a test environment, run pgbench from the command line:

pgbench -U postgres -i -s 10 db1

This produced a pgbench_accounts table of 128 MB. To create a bloat, I ran the update twice.

db1=# update pgbench_accounts set abalance=abalance;

This caused the table to grow up to 384 MB. In a couple of minutes, AUTOVACUUM kicked in and cleans up all dead tuples as expected.

db1=# select n_live_tup,n_dead_tup from pg_stat_user_tables where relname='pgbench_accounts';
n_live_tup | n_dead_tup
------------+------------
997705 | 0

However the tablesize remained at 384 MB

db1=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+---------+-------------
public | pgbench_accounts | table | postgres | 384 MB |

VACUUM FULL using pg_repack

pg_repack is invoked using the included command line utility. We can repack every table in a database like so:

$ pg_repack db1
INFO: repacking table "pgbench_tellers"
INFO: repacking table "pgbench_accounts"
INFO: repacking table "pgbench_branches"

Where db1 is the name of the database.

Or we can repack individual tables:

$ pg_repack --no-order --table pgbench_accounts --table pgbench_branches db1
INFO: repacking table "pgbench_accounts"
INFO: repacking table "pgbench_branches"

After running pg_repack, the space consumption is back to 128MB

db1=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+------------+-------------
public | pgbench_accounts | table | postgres | 128 MB |