Tuesday, November 15, 2011

xlogdump 0.5.1 released

The latest xlogdump has been released.

This version allows users to lookup object names from object ids for built-in database objects by reading the lookup table file instead of the system catalog.
----------------------------------------------------------------------
2011-11-15  Satoshi Nagayasu <satoshi.nagayasu@gmail.com>
        * Version 0.5.1
        * Allows to lookup the object names for the build-in database objects
          by reading `oid2name.txt' file, instead of reading the system
          catalogs through a database connection,
        * Added '-f, --file' option to read an oid2name cache file,
        * Added '-g, --gen_oid2name' option to generate an oid2name file
          to cache oids and object names in the system catalogs.
----------------------------------------------------------------------

Friday, October 28, 2011

Performance impact of the pg_stat_statements module (Updated: Nov-04-2011)

When I tweeted that I thought pg_stat_statements should be integrated to the core a few weeks ago, Josh responded me that it would be up to submitting a patch and figuring out the performace overhead.

So, before starting a patch work, I've decided to figure out the performance impact of the pg_stat_statements module.

The right pic is the results from pgbench, averages from 5 times runs each, with and without pg_stat_statements module.

Saturday, October 22, 2011

Loading data into UNLOGGED tables, and considering ALTER TABLE

UNLOGGED table has been introduced in 9.1, and it would improve INSERT/UPDATE/DELETE performance in several situations. At this time, I'm curious how it would work well on data loading.

To load large amount of rows into PostgreSQL, WAL logging would be a performance penalty, especially under the situation that have several tables on different spindles, because WAL logging forces data loading being serialized.

I have created a small function, called "enable_logging()", to switch table persistence mode for existing tables between (regular) logging and UNLOGGED.

bool enable_logging(Oid relid, bool mode);

You can see the source at github, and an example at the bottom of this entry.

Friday, October 21, 2011

Index-only scans and heap block reads

As you may know, the Index-only scans feature has been committed to the PostgreSQL repository and it's available to all the developers.

I'm very curious about such performance feature, and always want to understand what would change the things better and how it works. So, I just tried the index-only scans to determine what would be changed.

The right figure shows PostgreSQL index scans and determining tuple visibilities. It tells us why each index scan needs to access heap tuples, and how the Index-only scans solve the problem.

Because of the PostgreSQL storage design, tuple visibility information is only stored in the tuple header itself. So, to determine an index item visibility, the corresponding heap tuple header needs to be examined even it is an index scan.

Friday, October 07, 2011

pgbench on UNLOGGED table(s), Round 2

As one pointed out in a comment in the last post, I have tried additional pgbench runs for "synchronous_commit = off" and "fsync = off".

The results are shown in the right pic.

It shows that "sync_commit = off" and "fsync = off" are almost the same, and these are faster as much as "UNLOGGED", except a tiny behind.

I think the tiny behind means the difference between "sync_commit = off (or fsync = off)" and "UNLOGGED" that the WAL records would be eventually written, or not.

Tuesday, October 04, 2011

pgbench on UNLOGGED table(s)

To satisfy my curiosity, I just tried a quick run for pgbench with UNLOGGED tables.

Under the pgbench transaction model, a kind of tpc-b, changing the history table attribute to "UNLOGGED" did not contribute to the performance.

However, changing all tables to "UNLOGGED" improved the performance *substantially*. :)

The results shown in the right image are averages of running pgbench 10 times each.

Here is the steps to make this test:

Monday, September 26, 2011

xlogdump 0.5.0 released

Hi, pg folks,

xlogdump is a tool for extracting data from WAL segment files. I'm pleased to announce the latest version of xlogdump, 0.5.0, after a few weeks from the previous release 0.4.0.

I think xlogdump would be helpful to understand PostgreSQL behaviors, and you can enjoy brand-new features in this version, so please try it if you're interested in it.

In this version 0.5.0, xlogdump is able to show not only xlog records itself, but also the statistics of xlog records as the right pic.

Sunday, September 04, 2011

xlogdump 0.4.0

I'm pleased to announce the latest release of xlogdump. xlogdump is a tool for extracting data from WAL segment files.

Here is xlogdump README:
https://github.com/snaga/xlogdump/blob/master/README.xlogdump

xlogdump was originally developed by Tom Lane and Diogo Biazus around five years ago, but not be maintained these few years.

Tuesday, March 01, 2011

PostgreSQL Query Cache - "pqc"

I would like to introduce a new open source software, PostgreSQL Query Cache, which enables to improve query performance extremely (10x~100x) by caching query results in front of backends.

PostgreSQL Query Cache:
  • waits connections on the different port from the clients.
  • delegates queries in front of the backends, like a proxy.
  • intercepts and caches SELECT query results.
  • also manages lifecycle of the query cache.