Friday, October 28, 2011

Performance impact of the pg_stat_statements module (Updated: Nov-04-2011)

When I tweeted that I thought pg_stat_statements should be integrated to the core a few weeks ago, Josh responded me that it would be up to submitting a patch and figuring out the performace overhead.

So, before starting a patch work, I've decided to figure out the performance impact of the pg_stat_statements module.

The right pic is the results from pgbench, averages from 5 times runs each, with and without pg_stat_statements module.

With enabling pg_stat_statements module, the pgbench score was 468.2tps. And without (loading) the pg_stat_statements module, pgbench score was 470.3tps. The difference (overhead) was just 0.5%.

Is this impact possible (or acceptable) to integrate into the core? Is it time to think of integrating the pg_stat_statements to the core?

In addition, this test was executed on the following server:
  • NEC Express5800/GT110b
  • Dual Core Celeron @ 2.27GHz
  • Physical Memory 12GB
  • WDC WD1602ABYS-19B7A0 (SATA)
  • Red Hat Enterprise Linux 6.0
and with the following PostgreSQL configuration:
  • Shared Buffers : 2048MB
  • Checkpoint Segments : 32
  • WAL Buffers : 1024kB
  • Pgbench: Scale factor 10, 1,000 Transactions * 32 Clients
  • custom_variable_classes = 'pg_stat_statements'
  • pg_stat_statements.max = 10000
  • pg_stat_statements.track = all
 when enabling the pg_stat_statements.

Updated: Nov-04-2011

As Maris and Tomáš mentioned to eliminate the I/O bound situation in the comments, I tried additional benchmarks with the pgbench "Select only" mode.

As a result, I observed that pg_stat_statements has almost 10% performance impact (9680.8 tps v.s. 10612.1 tps) in "Select only" transaction processing that ran without writing WAL and locking to update page blocks in the shared buffers.

So, now is the time again to consider integration of pg_stat_statements to the core. What do you think of it?


    Maris said...

    Was synchronous commit on? If the system was IO bound, that may not reflect very well what would happen on more serious servers that have disk controllers with batteries.

    Satoshi Nagayasu said...

    Thanks for the comment.

    Yes, sync commit was on, and I think I should have an additional try on UNLOGGED tables again to eliminate I/O bound situation.

    Tomáš Vondra said...

    If you want to eliminate the IO bound, use "pgbench -S" i.e. a read only test with small database (so that it fits into the memory).

    And you have to run that long enough to warm up the cache (try to run it repeatedly, until the results are stable).

    Satoshi Nagayasu said...

    Hi Tomáš. Exactly. You remind me the easiest way to run pgbench without the IO bound. Thanks!

    Satoshi Nagayasu said...

    Thanks you all for the comments.

    I just added the results from pgbench "select only" runs.