Saturday, October 22, 2011

Loading data into UNLOGGED tables, and considering ALTER TABLE

UNLOGGED tables were introduced in 9.1, and they can improve INSERT/UPDATE/DELETE performance in several situations. This time, I'm curious how well they work for data loading.

When loading a large number of rows into PostgreSQL, WAL logging can be a performance penalty, especially when several tables sit on different spindles, because WAL logging forces the data loading to be serialized.

I have created a small function, called "enable_logging()", to switch the persistence mode of an existing table between (regular) logged and UNLOGGED.

bool enable_logging(Oid relid, bool mode);

You can see the source on GitHub, and an example at the bottom of this entry.

Using this function, a DBA can disable WAL logging while loading data into a table to improve loading performance, and switch it back afterwards to restore the table's durability.
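For illustration, a loading session using this function might look like the following sketch (the table name and file path are hypothetical; enable_logging() is the function described above):

```sql
-- Switch the target table to UNLOGGED mode before loading.
SELECT enable_logging('mytable'::regclass::oid, false);

-- Bulk-load without WAL logging.
COPY mytable FROM '/path/to/mytable.csv' WITH (FORMAT csv);

-- Switch it back to a regular (logged) table to restore durability.
SELECT enable_logging('mytable'::regclass::oid, true);
```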

According to my experiments (see the chart on the right), loading 10,000,000 rows into the pgbench_accounts table with the COPY command took 109 seconds with WAL logging, and 53 seconds in UNLOGGED mode. COPY performance was doubled.

So providing this persistence-switching feature, as one of the ALTER TABLE options, would help DBAs load data faster, especially when multiple tablespaces sit on different spindles.

Now, my next question is:

"Should we bring it into the core? If so, which syntax should we use to bring it into ALTER TABLE?".
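As a starting point for discussion, one possible syntax that follows the existing ALTER TABLE style might look like this (this is only a proposal sketch, not implemented syntax):

```sql
-- Hypothetical ALTER TABLE syntax for switching persistence:
ALTER TABLE pgbench_accounts SET UNLOGGED;
-- ... bulk-load with COPY ...
ALTER TABLE pgbench_accounts SET LOGGED;
```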

Any comments?


Example:

snaga=# SELECT oid,relname,relpersistence FROM pg_class WHERE relname LIKE 'pgbench%';
  oid  |        relname        | relpersistence
-------+-----------------------+----------------
 16685 | pgbench_accounts      | p
 16696 | pgbench_accounts_pkey | p
 16679 | pgbench_branches      | p
 16692 | pgbench_branches_pkey | p
 16688 | pgbench_history       | p
 16682 | pgbench_tellers       | p
 16694 | pgbench_tellers_pkey  | p
(7 rows)

Time: 0.646 ms
snaga=# SELECT enable_logging( (SELECT oid FROM pg_class WHERE relname='pgbench_accounts'), false);
 enable_logging
----------------
 t
(1 row)

Time: 20.560 ms
snaga=# SELECT enable_logging( (SELECT oid FROM pg_class WHERE relname='pgbench_branches'), false);
 enable_logging
----------------
 t
(1 row)

Time: 6.292 ms
snaga=# SELECT enable_logging( (SELECT oid FROM pg_class WHERE relname='pgbench_history'), false);
 enable_logging
----------------
 t
(1 row)

Time: 4.900 ms
snaga=# SELECT enable_logging( (SELECT oid FROM pg_class WHERE relname='pgbench_tellers'), false);
 enable_logging
----------------
 t
(1 row)

Time: 4.479 ms
snaga=#  SELECT oid,relname,relpersistence FROM pg_class WHERE relname LIKE 'pgbench%';
  oid  |        relname        | relpersistence
-------+-----------------------+----------------
 16679 | pgbench_branches      | u
 16692 | pgbench_branches_pkey | u
 16688 | pgbench_history       | u
 16682 | pgbench_tellers       | u
 16694 | pgbench_tellers_pkey  | u
 16685 | pgbench_accounts      | u
 16696 | pgbench_accounts_pkey | u
(7 rows)

Time: 0.860 ms
snaga=#

6 comments:

infofarmer said...

I'd rather see special options for COPY and other functions to make them faster at a cost of reliability.

Robert Haas said...

This is unsafe. The BM_PERMANENT bit needs to be set or cleared on each buffer header in shared buffers, and on an unlogged-to-logged conversion, you must write and fsync any dirty pages in shared buffers. Even with those changes, it will only work with wal_level=minimal; this approach fails completely if archiving or hot standby is in use.

Satoshi Nagayasu said...

Thanks for the comments. I just realized that I need to dive deep into the WAL implementation...

Satoshi Nagayasu said...

BTW, do you have any ideas or suggestions on how to implement this feature? I'm very interested in switching logging on and off to improve data loading performance.

Matt said...

When logging is enabled, the contents of the table should be appended to the log (just as if you had renamed the table, created a logged table, and done an INSERT INTO) so that all clusters participating in replication have the same data. Otherwise, the data in your converted table does not exist on the slaves.
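In SQL terms, the conversion described above would be roughly equivalent to the following sketch (table name hypothetical), in which the INSERT is fully WAL-logged so standbys receive the rows:

```sql
BEGIN;
-- Move the unlogged data aside.
ALTER TABLE mytable RENAME TO mytable_old;
-- Recreate the table as a regular (logged) table with the same definition.
CREATE TABLE mytable (LIKE mytable_old INCLUDING ALL);
-- Copy the rows back; this INSERT generates WAL records for every row.
INSERT INTO mytable SELECT * FROM mytable_old;
DROP TABLE mytable_old;
COMMIT;
```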

Satoshi Nagayasu said...

Finally, I got what you all pointed out. Thanks for the comments.

Yeah, a period that lacks XLOG records would be a problem for recovery from archive logs.

It would also be a problem in a replication environment, because not writing XLOG means the table cannot be replicated.

However, I think the trade-off between unlogged loading and archive logs must be a choice made by the DBA. DBAs can have several options for how to load data and take backups, depending on their system and situation.

BTW, I guess demand for replicating unlogged tables will come from users soon, because "unlogging" and "replicating" are basically different things (from the user's viewpoint). The reason they are coupled is just the current implementation of PostgreSQL.

So I'm still interested in how to make "unlogging" more useful for users in the future, even though my experiment lacked many things that must be considered. :)