pgBackRest with PostgreSQL Sandbox on Debian / Ubuntu

pgBackRest is one of the most powerful backup solutions available for PostgreSQL. It has enterprise level features like compression, multiple channels (threads) of backup execution, incremental and differential backups etc.
The official documentation is Debian-centric in its focus. I wanted to test it out with the PostgreSQL sandbox from the BigSQL project.

Setting up PostgreSQL Sandbox and Installing pgBackRest

The BigSQL Project makes it easy to install and setup PostgreSQL and its associated components across different operating systems. In this document, we are going to look at how to set it up on Ubuntu 14.04. Linux binaries of the sandbox can be downloaded from the BigSQL download page

The sandbox installation requires only the unpacking of the downloaded file.

tar -xvf bigsql-9.5.3-5-linux64.tar.bz2
cd bigsql/

Using the command line utility (pgc) supplied with the sandbox, its very easy to initialize and start a PostgreSQL instance.

./pgc init pg95
./pgc start pg95

A PostgreSQL instance should now be up and running.
The same pgc utility can be used to install pgBackRest.

./pgc install backrest

Install Perl Dependencies

An important aspect to keep in mind is that pgBackrest is written in Perl and has many dependencies on different perl libraries and modules.
An easy way to install all the dependencies in one shot is to instruct the apt-get utility to install one of the leaf components in the dependency chain.

sudo apt-get install libdbd-pg-perl

This command should fetch all the perl dependencies of pgBackRest.

Setting Up a Backup Repository Directory

Set up a backup repository directory for pgBackRest with the following commands.

sudo mkdir /var/log/pgbackrest
sudo chmod 750 /var/log/pgbackrest

IMPORTANT for this test:

  1. pgbackrest and the postgres server process should run as the same OS user.
  2. The backup repository directory should be owned by the same OS user.

Change the ownership of the repository directory to the user under which the postgres process is running. If the user is “postgres” and the group is “postgres” then:

sudo chown -R postgres:postgres /var/log/pgbackrest

pgBackRest configuration

sudo vi /etc/pgbackrest.conf

Append the following entries to this file.

[demo]
db-path=/home/postgres/bigsql/data/pg95

[global]
repo-path=/var/log/pgbackrest

Note: if the entries already exist, modify them accordingly.

Change the ownership of this configuration file to the OS user that owns the postgres and pgbackrest process

sudo chown -R postgres:postgres /etc/pgbackrest.conf
chmod +x pgbackrest.conf

Modification Database Parameters

The archive_command needs to be modified to use pgbackrest. If the pgbackrest executable doesn’t exist in the path, please make sure that the full path is mentioned

alter system set archive_command = '/home/postgres/bigsql/backrest/bin/pgbackrest --stanza=demo archive-push %p';

A few other parameters that are also important for the proper working of pgBackRest:

alter system set archive_mode=on;
alter system set listen_addresses = '*';
alter system set max_wal_senders=3;
alter system set wal_level = 'hot_standby';

Modification of all these parameters requires a restart of the PostgreSQL instance.

./pgc restart pg95

In the event that our operating system user doesn’t exist as a superuser in our database, we need to create the user and assign superuser privileges

postgres=# create user vagrant with password 'vagrant';
postgres=# alter user vagrant with superuser;

Backing up database using pgBackRest

pgBackRest uses .pgpass file for authentication.
Add a line to .pgpass with the password of the superuser in the following format:

*:*:*:*:vagrant

once this is done, we are ready to backup the PostgreSQL instance.

backrest/bin/pgbackrest --stanza=demo --db-socket-path=/tmp --log-level-console=info backup

Restoring from backup

Imagine a scenario where the files in your data directory are corrupt or lost and you want to restore it from backup.
The first step is to bring down the PostgreSQL instance. This should release all file descriptors pointing to the current data directory.

Clean up the Data directory:
Before restoring the backup make sure that the data directory is clean and is stored on a reliable medium. The full path to the new data directory should be the same as the previous one (we can override this default, but for the sake of simplicity lets assume that the location remains the same).

Run the pgBackRest “restore” command to restore the data directory from the latest backup.

backrest/bin/pgbackrest --stanza=demo --db-socket-path=/tmp --log-level-console=info restore

Now we should be able to start up the PostgreSQL instance with the restored data directory.

./pgc start pg95

Our PostgreSQL cluster is now back online from the backup we restored.

My PL/pgSQL code is stuck, but where?

In the short time that I am enhancing the PL profiler (https://bitbucket.org/openscg/plprofiler) I have been asked multiple times if it can be abused as a debugging aid. Not directly. The conversation goes something like this:

Q: When my PL code is stuck somewhere, can I turn on profiling and see where?
A: No.
Q: Will this feature be added?
A: No.

Of course would that be a useful feature. I don't argue that. And it seems to be that this is precisely how Oracle users find out where their PL/SQL code gets stuck or moves like molasses. However, the reason why I am against adding this is because a profiler, or even parts of it, should not be enabled 24x7 on a production database server for the purpose of eventually using it to extract some debugging information some day. There are other ways to get that information and I will show you one of them.

If a program is stuck somewhere, one uses a debugger to get a backtrace. This works with C programs, like the database server, provided that symbol information is available. In that backtrace (and some variables) we also find the PL/pgSQL backtrace. Having symbol information available is also useful in case of a program crash, to find out why it crashed by loading the core dump into the debugger. 

Every line of PL code, that is executed, goes through the PL/pgSQL executor function exec_stmt(). At that place we find the current PL code's line in a variable. Likewise every function execution goes through either plpgsql_exec_function() or plpgsql_exec_trigger(). In those stack frames we find the OID of the function as well as its signature (name and call argument types).

Doing this eventually several times for a deeply nested PL/pgSQL program is tedious and no fun. So here is a little script called plbacktrace (https://github.com/wieck/plbacktrace.git) that does all of that. It is invoked with the PID of the PostgreSQL backend and will output information like



[postgres@db1 tmp]$ ./plbacktrace.py 13239
fn_oid=105672 lineno=5 func="life_get_cell_pl(life_board,integer,integer)"
fn_oid=105673 lineno=12 func="life_num_neighbors_pl(life_board,integer,integer)"
fn_oid=105675 lineno=11 func="life_tick_pl(life_board)"
fn_oid=105674 lineno=22 func="life_pl(integer,integer,boolean)"
 I will explain another time why I have a database server playing Conway's Game of Life written in PL/pgSQL. 


Regards, Jan

Compiling new Postgres (9.6.x) on newest LTS Ubuntu (16.04)

It is great to confirm that the latest version of Postgres cleanly builds on the latest Ubuntu without any errors or warnings.   Hmmm, I wonder now if GGC 5 makes anything measurably run faster??

Pre-Requisites:

$ sudo apt-get install build-essential  libreadline-dev  zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev

With this complete, you can then do an industrial strength build of PostgreSQL as per below:


$ tar -xvf postgresql-9.6.x.tar.bz2

$ cd postgresql-9.6.x

$ ./configure --with-libxml --with-libxslt --with-openssl

$ make

MySQL Foreign Data Wrapper : A quick tour

Data centers are no longer dominated by a single DBMS. Many companies have heterogeneous environments and may want their Postgres database to talk to other database systems. Foreign Data Wrappers can be the right solution for many scenarios. The BigSQL Project provides a well tested, ready to use MySQL FDW with Postgres. This makes life easy for a DevOps or DataCenter person.

Here is a quick tour on how to configure Foreign Data Wrappers for MySQL, so that Postgres can query a MySQL table. For this quick guide, I use a CentOS Linux machine. This, or a similar setup, should work fine on all other operating systems.

Setting up a test MySQL server for the test

In this demo I’m going to create a table in MySQL  which should be available to Postgres though the FDW.
The FDW can talk to any MySQL distribution including Oracle’s MySQL, Percona Server or MariaDB. I’m going to use MariaDB, which is more community friendly.

Install MariaDB Server and Start the service

$ sudo yum install mariadb-server.x86_64
$ sudo systemctl start mariadb

Connect as root user of mariadb and create a database

$ mysql -uroot
MariaDB [(none)]> create database postgres;

Connect to Database and create a table

MariaDB [(none)]> use postgres;
MariaDB [postgres]> create table t1m(id int,name varchar(30));

Insert some data in the table:

MariaDB [postgres]> insert into t1m values (1,'abc');
Query OK, 1 row affected (0.04 sec)

MariaDB [postgres]> insert into t1m values (2,'def');
Query OK, 1 row affected (0.00 sec)

MariaDB [postgres]> insert into t1m values (3,'hij');
Query OK, 1 row affected (0.03 sec)

Setting up Postgres Database

Install Postgres

For this test, I’m going to use the Postgres DevOps Sandbox from the BigSQL project.
Download the Sandbox from BigSQL
Since this is a sandbox, you just need to unpack it

$ tar -xvf bigsql-9.5.3-5-linux64.tar.bz2

Install MySQL FDW

Go to the unpacked directory and invoke the bigsql command line tool to install MySQL FDW

$ cd bigsql
$ ./pgc list
Category | Component | Version | Status | Port | Updates
PostgreSQL pg92 9.2.17-5 NotInstalled
PostgreSQL pg93 9.3.13-5 NotInstalled
PostgreSQL pg94 9.4.8-5 NotInstalled
PostgreSQL pg95 9.5.3-5 NotInitialized
Extensions cassandra_fdw3-pg95 3.0.0-1 NotInstalled
Extensions hadoop_fdw2-pg95 2.5.0-1 NotInstalled
Extensions mysql_fdw2-pg95 2.1.2-1 NotInstalled
Extensions oracle_fdw1-pg95 1.4.0-1 NotInstalled
Extensions orafce3-pg95 3.3.0-1 NotInstalled
Extensions pgtsql9-pg95 9.5-1 NotInstalled
Extensions pljava15-pg95 1.5.0-1 NotInstalled
Extensions plv814-pg95 1.4.8-1 NotInstalled
Extensions postgis22-pg95 2.2.2-2 NotInstalled
Extensions slony22-pg95 2.2.5-2 NotInstalled
Extensions tds_fdw1-pg95 1.0.7-1 NotInstalled
Servers bam2 1.5.0 NotInstalled
Servers cassandra30 3.0.6 NotInstalled
Servers hadoop26 2.6.4 NotInstalled
Servers hive2 2.0.1 NotInstalled
Servers pgbouncer17 1.7.2-1 NotInstalled
Servers pgha2 2.1b NotInstalled
Servers pgstudio2 2.0.1-2 NotInstalled
Servers spark16 1.6.1 NotInstalled
Servers tomcat8 8.0.35 NotInstalled
Servers zookeeper34 3.4.8 NotInstalled
Applications backrest 1.02 NotInstalled
Applications birt 4.5.0 NotInstalled
Applications ora2pg 17.4 NotInstalled
Applications pgbadger 8.1 NotInstalled
Frameworks java8 8u92 NotInstalled
$ ./pgc install mysql_fdw2-pg95
['mysql_fdw2-pg95']
Get:1 http://s3.amazonaws.com/pgcentral mysql_fdw2-pg95-2.1.2-1-linux64
Unpacking mysql_fdw2-pg95-2.1.2-1-linux64.tar.bz2

Note:- We can use the same command line tool to initalize a new postgres cluster

$ ./pgc init pg95

## Initializing pg95 #######################

Superuser Password [password]:
Confirm Password:
Giving current user permission to data dir

Initializing Postgres DB at:
-D "/home/vagrant/bigsql/data/pg95"

Using PostgreSQL Port 5432

Password securely remembered in the file: /home/vagrant/.pgpass

to load this postgres into your environment, source the env file:
/home/vagrant/bigsql/pg95/pg95.env

Create the extension in the postgres database

create extension mysql_fdw;

Create foreign server

postgres=# CREATE SERVER mysql_svr
FOREIGN DATA WRAPPER mysql_fdw
OPTIONS (host 'localhost', port '3306');
CREATE SERVER
postgres=#

Create foreign table

postgres=# CREATE FOREIGN TABLE mysql_tab (
postgres(# id int,
postgres(# name varchar(30)
postgres(# )
postgres-# SERVER mysql_svr
postgres-# OPTIONS (dbname 'postgres', table_name 't1m');
CREATE FOREIGN TABLE
postgres=#

Create user mapping

postgres=# CREATE USER MAPPING FOR PUBLIC
postgres-# SERVER mysql_svr
postgres-# OPTIONS (username 'root');

(if your user is having password authentication to mysql, you have to pass that also in the format (username ‘username’, password ‘password’))

Now everything is set, You can test by querying the table.

postgres=# select * from mysql_tab;
id | name
----+-------
1 | abc
2 | def
3 | hij
(3 rows)


Note:- MySQL FDW for Postgres requires MySQL Client Libraries. Please make sure that libmysqlclient.so is there in the LD_LIBARY_PATH. if this file name is something different like “libmysqlclient.so.18.0.0″, you may have to create a softlink with name “libmysqlclient.so”

Welcome Affan

Affan Salman joins the team
East Brunswick, NJ, March 1, 2016

OpenSCG, a leading provider of subscriptions and services for PostgreSQL, announced today that Affan Salman has joined its team of top-tier PostgreSQL talent as Senior Database Architect. Affan is the primary original author of EnterpriseDB’s Oracle compatibility from 10 years ago and has spent much of the last six years rounding out his database internals credentials by becoming a Cassandra, Spark & Hadoop rock star.

“We are very excited that Affan has re-joined the family,” said Denis Lussier, CEO of OpenSCG. “Affan put EnterpriseDB on the map for Oracle compatibility back in the day. Having him as a key part of the OpenSCG team will enable us to move faster, and further enhance our strategic plans for the future.” As part of his new responsibilities, Salman is expected to contribute to BigSQL, a new PostgreSQL Distribution that OpenSCG is sponsoring.

“I was thrilled to join the OpenSCG/BigSQL team,” said Salman. “I have worked with the management team for over a decade, and was looking for a balance between new product development and working closely with customers on data-driven business problems. My new role offers the chance to do both.”

About OpenSCG
Open Source Consulting Group, Inc. (OpenSCG) is a leading provider of subscriptions and services for PostgreSQL, the world’s most advanced open source database. The OpenSCG PostgreSQL team is one of the largest and most experienced in the world, and the management team is respected and trusted across the global Postgres community. OpenSCG offers PostgreSQL consulting and migration services, as well as subscription services that provide 24×7 global support and remote DBA Services.

Helpful PostgreSQL Logging and Defaults

I use PostgreSQL every single day.  I develop against multiple versions of the database (4 versions running in parallel as I write), I have apps that use it and, my daily DBA-ing.  The biggest question I get from newbies and veterans a like is: “What are your PostgreSQL defaults?”

If you follow postgres, you already know that the default configuration (postgresql.conf) is very conservative from a resource (CPU, IOPS, Disk Space) perspective.  Over the last 10 years, we [the community] have developed some straightforward and easy to understand formulas that will help you tune… shared_buffers for example.  The item that always gets left out though is logging.  As a developer, I’m always looking for ways to see “How the database is answering the questions I ask”.  When I get a new postgres instance set up, I have a personal (somewhere in the cobwebs) checklist that I run.  Some are based on the purpose of the deployment (shared_buffers, work_mem, etc…), some are things that I always set.  Aside from memory, the biggest set of “standard” items I set are all related to logging.  I’m big on monitoring (my pg_stat_activity patch was accepted back for 9.2) and having the right detail presented to me is important.

TL;DR

 logging_collector = on
 log_filename = ‘postgresql-%a.log’
log_truncate_on_rotation=on
 log_line_prefix = ‘%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ‘
 log_checkpoints = on
 log_autovacuum_min_duration = 0
 log_temp_files = 0
 log_lock_waits = on
the goal here is to have postgres ‘bubble-up’ details about what’s going on to us so

Read more

Jan Wieck joins the team!

East Brunswick, NJ, April 21, 2016

OpenSCG, a leading provider of subscriptions and services for PostgreSQL, announced today that Jan Wieck has joined its team of top-tier PostgreSQL talent as Senior Database Architect. Wieck is a long time contributor to the PostgreSQL Global Development Group, and has authored many key Postgres technologies over the years, including Stored Procedures (PL/pgSQL), Foreign Keys, TOAST (unlimited row size) and the Slony Replication System. He has also focused for years on stabilization and performance enhancements for PostgreSQL and related data technologies.

“We are very excited that Jan has joined us,” said Denis Lussier, CEO of OpenSCG. “My original vision for Oracle compatibility was based in large part on Jan’s elegant stored procedure design. Having him as a key part of the OpenSCG team will enable us to move faster, and further enhance our strategic plans for the future.” As part of his new responsibilities, Wieck is expected to contribute to BigSQL, a new PostgreSQL Distribution that OpenSCG is sponsoring.

“I was thrilled to join OpenSCG,” said Wieck. “I have known and respected the management team for over a decade, and was looking for a balance between new product development and working closely with customers on data-driven business problems. My new role offers the chance to do both.”

About OpenSCG

Open Source Consulting Group, Inc. (OpenSCG) is a leading provider of subscriptions and services for PostgreSQL, the world’s most advanced open source database. The OpenSCG PostgreSQL team is one of the largest and most experienced in the world, and the management team is respected and trusted across the global Postgres community. OpenSCG offers PostgreSQL consulting and migration services, as well as subscription services that provide 24×7 global support and remote DBA Services.

 

How to install and configure the Oracle Foreign Data Wrapper

How to install and configure the Oracle Foreign Data Wrapper

Preinstall Checklist:

   – If you have not already, please download and install “Microsoft Visual C++ 2015 Redistributal” and Java Runtime Environment (version 7 or 8)

Oracle Instant Client install and config:

          01. Download and unzip into the same directory:

                   instantclient-basic-windows

                   instantclient-odbc-windows

                   instantclient-sqlplus-windows

          02. Create “Oracle Instant Client” in Program Files and copy instantclient_12_1 into it.

          03. Double click on “odbc_install.exe”

          04. Create system variables:

                   name: TNS_ADMIN

                   value: C:\Program Files\Oracle Instant Client\instantclient_12_1

                   name: ORACLE_HOME

                   value: C:\Program Files\Oracle Instant Client\instantclient_12_1

          05. Edit a system variable:

                   name: PATH

                   value: C:\Program Files\Oracle Instant Client\instantclient_12_1

          06. (optional). Attempt to run “sqlplus” from the command line. If all is properly configured, you should be prompted to enter a username/password. You can exit “sqlplus” at this point if you                  wish and continue with this guide.

NOTE 1: For Windows 7, to edit system variables, right click on “My Computer”, then click on “Advanced System Settings” in the left of the screen. A popup window titled “System Properties” will appear. Click on the “Advanced” tab and look to the bottom of the window. Above “OK” and “Cancel”, there will be an option titled “Environment Variables”. Click on it and find the pane “System Variables” in the newly created popup window. Here you can create and edit your system variables.

NOTE 2: If you recieve an error concerning “MSvcr100.dll” being missing, it may be necesarry to copy this dll out of your /Java/VERSION/bin folder and into your oracle instant client directory

Assuming PostgreSQL has previously been installed and properly configured,

Configure the Oracle FDW

          01. Open up a PSQL shell, and enter:

                   “CREATE EXTENSION oracle_fdw;”

          Expected Output: CREATE EXTENSION

Example configuration

This instructions below are an example of how to setup PostgreSQL to connect to an oracle database through the Oracle Foreign Data Wrapper. Your specific settings will be different, but the syntax should remain the same. For our purposes, I will show you how to connect to an oracle database with the “SCOTT” schema. This installed. This part of the how-to assumes your oracle database is already set up and configured to accept remote database connections.

          01. CREATE SCHEMA scott;

                   Expected Output: CREATE SCHEMA

          02. CREATE SERVER oracle FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver ‘//192.168.1.254/YOURSERVERNAME’ );

                   Expected Output: CREATE SERVER

          03. CREATE USER MAPPING FOR postgres SERVER oracle OPTIONS (user ‘SCOTT’, password ‘tiger’);

                   Expected Output: CREATE USER MAPPING

          04. IMPORT FOREIGN SCHEMA “SCOTT” from SERVER oracle INTO scott;

                   Expected Output: IMPORT FOREIGN SCHEMA

          05. set search_path=’scott';

                   Expected Output: SET

Now we can explore the imported data

Win-task-manager

Building PostgreSQL on Windows for Performance

Before PostgreSQL 9.1, adding additional features like data types was non-standard and awkward, but 9.1 brought extensions. By adding extensions as well as the plumbing for things like foreign data wrappers, the use of extensions have exploded over the last few years. This is great for people who are using PostgreSQL on Linux or similar type environments, but the people in Windows have been pretty much out of luck. Most people need to hope the extension they want to use is popular enough that someone skilled with Windows build environments decided to release the extension on Windows. It just was not as simple as Linux where you just run a “make install”.

For years, there has been the ability to replicate the Linux build experience on Windows with the Mingw project, but historically the resulting binaries have been significantly slower than the binaries produced by the Microsoft compiler. Since I last played with compliers on Windows, I have been pleasantly surprised that the open source build tools have caught up with the commercial tools and in some ways surpassed them.

To check the performance of the resulting binaries from the compilers, I tried a couple of CPU intensive tests. The first one was just a SELECT only pgbench test. The second was running the square roots of random numbers and then sorting the results. For the Windows binaries built with Microsoft Visual Studio, I used the PostgreSQL Windows installer from the PostgreSQL community site at http://www.postgresql.org/download/windows/. The Mingw-w64 build came from the bigsql.org site at http://www.bigsql.org/se/postgresql/packages.jsp

The version strings of the 2 binaries are below.

test=# select version();
 version
-------------------------------------------------------------
 PostgreSQL 9.5.1, compiled by Visual C++ build 1800, 64-bit
(1 row)

test=# select version();
 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.1 on x86_64-pc-mingw64, compiled by gcc.exe (Rev5, Built by MSYS2 project) 4.9.2, 64-bit
(1 row)

For these tests, the only tuning of the postgresql.conf file was to set shared_buffers to 2GB. All other settings were left at the defaults.

The SELECT only pgbench test was conducted with the following command:

pgbench -n -S -c 10 -t 100000 -U postgres test
MSVC 34246 TPS
GCC 34706 TPS

That is a minor 1.3% performance improvement. That is not a ground breaking gain, but the gcc built binaries were consistently faster.

The second test was a bit more CPU intensive running the following query:

select sqrt((random() * g)::numeric) from generate_series(1, 1000) g order by 1;

This was put in a text file and run through pgbench with the following command:

pgbench.exe -n -f test.sql -c 8 -t 10000 -U postgres test
MSVC 610 TPS
GCC 634 TPS

This resulted in a larger performance gain of 3.8%.

I started this testing with trying to find an easier way to build extensions on Windows with hopes of not taking too much of a performance hit, but ended up finding that the easier way was actually a littler faster. I am thinking its time to ditch those proprietary build tools and stick with open source.

Video on FDW by Jim

Video – Behold! The Power of Foreign Data Wrappers!

If you’ve never heard of PostgreSQL’s “foreign data wrappers” — you’ve been missing out on a very powerful tool to connect and integrate heterogenous databases. By leveraging FDW’s, you can put PostgreSQL at the center of a integrated database federation and read and write data from other PostgreSQL servers, other relational databases (Oracle, MySQL), many NoSQL systems (MongoDB, Hadoop) and even Twitter!

Heck – FDW’s are so powerful that MongoDB just released their BI connector as a PostgreSQL FDW!! (excellent write up by John De Goes of SlamData)

At this year’s PostgresOpen, Jim Mlodgenski from OpenSCG gave a great talk about Federated PostgreSQL using FDW’s. Watch this video to learn the basics of what FDW’s are, how they work, and how you can easily combine data from PostgreSQL, other DBMS’s, and NoSQL db’s all in a single SQL query.

Thanks to Daily Tech Video for highlighting this talk in their post [Video 378] Jim Mlodgenski: PostgreSQL Federation.