Friday, October 28, 2011

Performance impact of the pg_stat_statements module (Updated: Nov-04-2011)

When I tweeted that I thought pg_stat_statements should be integrated to the core a few weeks ago, Josh responded me that it would be up to submitting a patch and figuring out the performace overhead.

So, before starting a patch work, I've decided to figure out the performance impact of the pg_stat_statements module.

The right pic is the results from pgbench, averages from 5 times runs each, with and without pg_stat_statements module.

Saturday, October 22, 2011

Loading data into UNLOGGED tables, and considering ALTER TABLE

UNLOGGED table has been introduced in 9.1, and it would improve INSERT/UPDATE/DELETE performance in several situations. At this time, I'm curious how it would work well on data loading.

To load large amount of rows into PostgreSQL, WAL logging would be a performance penalty, especially under the situation that have several tables on different spindles, because WAL logging forces data loading being serialized.

I have created a small function, called "enable_logging()", to switch table persistence mode for existing tables between (regular) logging and UNLOGGED.

bool enable_logging(Oid relid, bool mode);

You can see the source at github, and an example at the bottom of this entry.

Friday, October 21, 2011

Index-only scans and heap block reads

As you may know, the Index-only scans feature has been committed to the PostgreSQL repository and it's available to all the developers.

I'm very curious about such performance feature, and always want to understand what would change the things better and how it works. So, I just tried the index-only scans to determine what would be changed.

The right figure shows PostgreSQL index scans and determining tuple visibilities. It tells us why each index scan needs to access heap tuples, and how the Index-only scans solve the problem.

Because of the PostgreSQL storage design, tuple visibility information is only stored in the tuple header itself. So, to determine an index item visibility, the corresponding heap tuple header needs to be examined even it is an index scan.

Friday, October 07, 2011

pgbench on UNLOGGED table(s), Round 2

As one pointed out in a comment in the last post, I have tried additional pgbench runs for "synchronous_commit = off" and "fsync = off".

The results are shown in the right pic.

It shows that "sync_commit = off" and "fsync = off" are almost the same, and these are faster as much as "UNLOGGED", except a tiny behind.

I think the tiny behind means the difference between "sync_commit = off (or fsync = off)" and "UNLOGGED" that the WAL records would be eventually written, or not.

Tuesday, October 04, 2011

pgbench on UNLOGGED table(s)

To satisfy my curiosity, I just tried a quick run for pgbench with UNLOGGED tables.

Under the pgbench transaction model, a kind of tpc-b, changing the history table attribute to "UNLOGGED" did not contribute to the performance.

However, changing all tables to "UNLOGGED" improved the performance *substantially*. :)

The results shown in the right image are averages of running pgbench 10 times each.

Here is the steps to make this test: