Replication in the PostgreSQL database system is a process where data is automatically copied from one server (known as the primary or master server) to one or more additional servers (known as secondary or slave servers). This mechanism is crucial for ensuring high availability, data backup, and distributing read workload across servers. PostgreSQL supports several replication methods, including logical replication and streaming replication.
Basic Steps for Setting Up Streaming Replication
Streaming replication in PostgreSQL sends changes (WAL - Write-Ahead Logging) from the primary server to the secondary server(s) in real-time. To set it up, several steps need to be taken:
1. Preparation on the Primary Server
-
Configure PostgreSQL.conf: Open the
postgresql.conf
configuration file on the primary server and set:wal_level
toreplica
orlogical
.max_wal_senders
to the number of secondary servers plus a few for backup and other purposes.wal_keep_segments
to a value that determines how many WAL files the system should retain for replication purposes.archive_mode
toon
andarchive_command
to the command for archiving WAL files (if WAL archiving is needed).
-
Configure pg_hba.conf: Add rules to the
pg_hba.conf
file on the primary server that allow secondary servers to connect for replication. A rule might look like this:host replication all secondary_server_ip/32 md5
2. Initialization of the Secondary Server
-
Before starting the secondary server, initialize its data directory using the
pg_basebackup
tool executed on the primary server. The command may look like this:pg_basebackup -h primary_server_ip -D /var/lib/postgresql/data -U replication_user -P -v -X stream
Configure the Secondary Server: On the secondary server, create a file named standby.signal
in the main data directory to indicate that the server should operate in replication mode. Also, set primary_conninfo
in the postgresql.auto.conf
or recovery.conf
file (depends on the PostgreSQL version), providing the connection details to the primary server, for example:
primary_conninfo = 'host=primary_server_ip port=5432 user=replication_user password=password'
3. Start and Monitor Replication
- After configuring both servers, start the secondary server. Replication should begin automatically.
- Monitoring the replication status can be done using
psql
commands on the primary server, such asselect * from pg_stat_replication;
.
Setting up replication in PostgreSQL requires careful preparation and configuration of both the primary and secondary servers. It's important to properly configure the configuration files and secure communication between servers. Once replication is successfully set up, you'll have a system that is more resilient to outages and provides better data availability.