Saturday, May 15, 2010

5 steps to implement a PostgreSQL replication system

Yesterday, I tried to build a PostgreSQL master-slave replication system with new feature "Streaming Replication (SR)", which is coming in the next PostgreSQL major release 9.0 as a built-in replication solution, and it worked well on my servers.

I feel it is very easy to configure, so I have decided to make some memos here for building a PostgreSQL master-slave replication system.

Here are 5 steps to implement a PostgreSQL replication system.
  1. Build and install binaries.
  2. Initialize a database cluster and duplicate it.
  3. Configure the master node.
  4. Configure the slave node(s).
  5. Start the master and slave node(s).
Ok, let's start.

(This photo was taken at the PostgreSQL 9 TestFest Japan.)

How it works

Taking a minute to understand how SR works before building it could be very helpful to you.
  • SR enables to have a single master (read-write) node, and multiple slave (read-only) nodes.
  • The master node sends transactional log (WAL) records generated on the master to the slave node(s).
  • The slave is waiting for the log records and applying them continuously. (hot standby mode)
  • During the hot standby mode, the slave can also serve to client applications to process read-only queries.
So you have to configure one master node, and one or more slave nodes to implement the SR.

Step 1. Build and install binaries

At first, you have to install PostgreSQL binaries on each node. To install them, you can use build commands as usual.
$ ./configure –-prefix=/usr/local/pgsql90b1
$ make ; make check
$ su
# make install
Step 2. Initialize a database cluster and duplicate it

In the 2nd step, you have to initialize a database cluster (database directory) on the master node, and duplicate it on the slave node(s).

After initializing a database cluster on the master node, you have to make a base backup to duplicate it on the slave(s). "A base backup" means a whole dump of the database cluster which is taken in the PostgreSQL manner. If you are not familiar with the PostgreSQL base backup, see "24.3.2. Making a Base Backup" in the PostgreSQL official manual for more details.
So the protocol is shown in below. ($PGDATA is a database cluster directory you can specify as you like.)
master$ initdb –D $PGDATA –-no-locale –-encoding=UTF8
master$ pg_ctl –D $PGDATA start
master$ psql –c "SELECT pg_start_backup('initial backup for SR')" template1
master$ tar cvf pg_base_backup.tar $PGDATA
master$ psql –c "SELECT pg_stop_backup()" template1
After taking the base backup on the master node, you have to duplicate (copy and extract) it on the slave node. In addition, you have to remove "postmaster.pid" file on the slave.
slave$ tar xvf pg_base_backup.tar
slave$ rm –f $PGDATA/postmaster.pid
Step 3. Configure the master node

On the master node, you have to configure two files, "postgresql.conf" and "pg_hba.conf".

postgresql.conf

Here is 6 entries in postgresql.conf to enable the master node.
listen_addresses = '*' # to accept a connection from the slave
wal_level = hot_standby     # to generate WAL records for SR purpose.
archive_mode = on           # to enable the archiving log mode.
archive_command = 'cp %p /home/snaga/pgdata90b1/pg_xlogarch/%f'
                            # to specify the log archiving command.
max_wal_senders = 5         # to specify the max number of the slave(s).
wal_keep_segments = 32      # to specify the number of the previous WAL files to hold on the master.
See the official manual for more details.

pg_hba.conf

You have to add an entry to accept a connection from the slave in pg_hba.conf. The database name must be "replication" here, and you have to specify IP addresses of the slave nodes.
host   replication    all    10.0.2.42/32        trust

Step 4. Configure the slave node(s)

You also have to modify two files on the slave node, "postgresql.conf" and "recovery.conf".

postgresql.conf

You have only one entry in postgresql.conf to modify.
hot_standby = on       # to allow read-only queries on the slave (standby) node.
recovery.conf

To enable the standby mode on the slave node, you have to make a "recovery.conf" file as below.
standby_mode = 'on'      # to enable the standby (read-only) mode.
primary_conninfo = 'host=10.0.2.41 port=5432 user=snaga'
                         # to specify a connection info to the master node.
trigger_file = '/tmp/pg_failover_trigger'
                         # to specify a trigger file to recognize a fail over.
restore_command = 'cp /home/snaga/pgdata90b/pg_xlogarch/%f "%p"'
                         # to specify a recovery command.
Finally, the configuration is finished, and you can start both master and slave nodes.

Step 5. Start the master and slave node(s)

Start your master PostgreSQL and slave PostgreSQL. After starting both, you may find a server log record on each.

On the master node, you may find a server log record as below.
LOG: replication connection authorized: user=snaga host=10.0.2.42 port=55811
And on the slave node, you can find a record as below.
LOG: streaming replication successfully connected to primary
If you can find them, congratulations! Well done, and it's time to take a break with some coffee. :-)

Enjoy your PostgreSQL!

More resources

9 comments:

Raghavendra said...

Its good presentation, but sorry to say Step 2 is wrong. You cannot take the pg_start_backup() until you set the cluster in archive log mode.

Please check once.

Satoshi Nagayasu said...

Hi, thanks for your comment.

Yes, you're right. I forgot the order of things what I did, and I should correct that point in a few days.

Thank you very much again for your feedback.

Unknown said...

Thanks for this post. Distilling the process down to a shortish article was very helpful for me setting it up at work.

Nenominal said...

Nice job! I'm not a Postgresql expert so I was looking for this kind of implementation! Thank you for taking your time and post it on your blog :)

@tutmosisII said...

Hi Satoshi!

I have problems whith my ph_hba.conf in the master (i think so),in the standby i get this error:

cp: cannot stat `/bkInicial/walMaster/00000001000000000000001F': No such file or directory
2011-06-14 14:19:16 COT FATAL: could not connect to the primary server: FATAL: no pg_hba.conf entry for replication connection from host "::1", user "detectca", SSL on
FATAL: no pg_hba.conf entry for replication connection from host "::1", user "detectca", SSL off

=============================
configuration of pg_hba.conf
==============================

host replication all 192.168.243.73/32 trust

Coudl you help me please.

Thanks

kientt said...

Uhm, I have one quesiton: How to synchronize data the folder /home/snaga/pgdata90b1/pg_xlogarch/ on master and slave?

kientt said...

Thank for the useful tool. However i have one question: I deploy on master and slave machine, so how can i synchronize the folder /home/snaga/pgdata90b1/pg_xlogarch/ on the 2 machines

kientt said...

Thank for the useful tool. However i have one question: I deploy on master and slave machine, so how can i synchronize the folder /home/snaga/pgdata90b1/pg_xlogarch/ on the 2 machines

Anonymous said...

Hi- i am having a tough time finding any documents on how to cleanly dismantle hotstandby replication between primary and standby. Any clues?