In December 2011, I blogged about the increasing need for parallelism in the Postgres backend. (Client applications have always been able to do parallelism with subprocesses, and with threads since 2003.)

Thirteen months later, I have added two parallel code paths (1, 2) to pg_upgrade. I have learned a few things in the process:

  • parallelism can produce dramatic speed improvements (on the order of 4x, not 4%)
  • adding parallelism isn’t difficult to code, even for MS Windows
  • only certain tasks can benefit from parallelism

Using pg_upgrade as an example, parallelism can yield a 10x performance improvement, and it took me only a few weeks to implement. However, to get a 10x improvement, you need multiple large databases and multiple tablespaces; others will see more moderate gains. Fortunately, I have also improved pg_upgrade performance by 5x even without parallelism.
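To see why the gain depends so much on the workload, here is a rough Python sketch. It is not pg_upgrade's code: the function names and the per-database timings are invented for illustration, and it assumes each database is an independent unit of work that one worker handles from start to finish.

```python
import heapq


def estimated_wall_time(task_seconds, jobs):
    """Estimate wall-clock time when tasks are spread over `jobs` workers.

    Greedy scheduling: the longest remaining task always goes to the
    worker that becomes free first.
    """
    workers = [0.0] * jobs  # time at which each worker becomes free
    heapq.heapify(workers)
    for seconds in sorted(task_seconds, reverse=True):
        free_at = heapq.heappop(workers)
        heapq.heappush(workers, free_at + seconds)
    return max(workers)


def speedup(task_seconds, jobs):
    """Serial time divided by the estimated parallel wall-clock time."""
    return sum(task_seconds) / estimated_wall_time(task_seconds, jobs)


if __name__ == "__main__":
    # Hypothetical per-database times in seconds, all totaling 1000.
    ten_equal = [100] * 10          # ten similarly sized databases
    one_big = [1000]                # a single large database
    skewed = [910] + [10] * 9       # one large database, nine tiny ones

    for name, tasks in [("ten equal", ten_equal),
                        ("one big", one_big),
                        ("skewed", skewed)]:
        print(f"{name}: {speedup(tasks, jobs=10):.2f}x")
```

With ten equal databases and ten workers the sketch reports the full 10x. A single database gets no benefit at all, and a cluster dominated by one large database barely improves, because the largest unit of work sets the floor on wall-clock time.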
