PGMP Logical Replication in PostgreSQL 15
PGMP supports the real-time logical replication of PostgreSQL tables into ETS out of the box on Erlang/OTP 25. It includes the new features: row filters and column lists, which are part of the recent PostgreSQL 15 release.
Logical replication is a method of replicating data objects and their changes, based upon their replication identity. In pgmp tables with a single (primary) or multiple columns (composite) keys are supported.
Logical replication uses a publish and subscribe model with one or more subscribers subscribing to one or more publications on a publisher node. PGMP acts as a subscriber to one or publications offered by a PostgreSQL database.
The process initiated by pgmp creates a replication slot for each subscribed publication. As part of this a transactional identifier is created by the PostgreSQL database. A snapshot of the published data at that point in time is created by pgmp into a ETS replica. Once that is done, the changes on the publisher are sent to pgmp as they occur in real-time via a socket. The subscriber applies the data in the same order as the publisher so that transactional consistency is guaranteed for publications within a single subscription.
Enabling Logical Replication In PostgreSQL
If you have an existing PostgreSQL 15 database, your
postgresql.conf
must contain to enable replication:
wal_level = logical
For the remainder of this article we will be using PostgreSQL 15 running via docker:
docker run \
--rm \
--name postgres \
--detach \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=postgres \
postgres:15 \
-c wal_level=logical
You can then run a SQL shell with:
docker exec \
--interactive \
--tty \
postgres \
psql \
postgres \
postgres
You should see the following prompt:
psql (15.0 (Debian 15.0-1.pgdg110+1))
Type "help" for help.
postgres=#
Building PGMP
We will be using Erlang/OTP 25 running via docker:
docker run \
--rm \
--interactive \
--tty \
erlang:25 \
/bin/bash
You should see an interactive prompt:
root@df8e90c4cefd:/#
Where df8e90c4cefd
will be different for your container. Clone
pgmp with the following command:
git clone https://github.com/shortishly/pgmp.git
Building:
cd pgmp && make
PostgreSQL 15 New Logical Replication Features
The new logical replication features in the recent PostgreSQL 15 release, are: row filters and column lists. Their usage with pgmp are described in this section.
Column Lists
From PostgreSQL 15, each publication can specify which columns of each table are replicated to subscribers.
Create a table t1
to be used in the following example:
create table t1 (id int,
a text,
b text,
c text,
d text,
e text,
primary key(id));
Create a publication p1
, with a column list to reduce the number of
columns that will be replicated:
create publication p1
for table t1 (id, a, b, d);
Insert some test data into the newly created t1
table:
insert into t1
values
(1, 'a-1', 'b-1', 'c-1', 'd-1', 'e-1'),
(2, 'a-2', 'b-2', 'c-2', 'd-2', 'e-2'),
(3, 'a-3', 'b-3', 'c-3', 'd-3', 'e-3');
Start pgmp with replication of the p1
publication:
PGMP_REPLICATION_LOGICAL_PUBLICATION_NAMES=p1 \
PGMP_DATABASE_HOSTNAME=host.docker.internal \
PGMP_DATABASE_USER=postgres \
PGMP_DATABASE_PASSWORD=postgres \
make shell
As part of the replication process pgmp introspects the table
metadata for t1
determining that column id
is the primary
key. This is the column that is used as the primary key in
ETS replica of t1
. The contents of the replicated
table:
2> lists:sort(ets:tab2list(t1)).
[{1, <<"a-1">>, <<"b-1">>, <<"d-1">>},
{2, <<"a-2">>, <<"b-2">>, <<"d-2">>},
{3, <<"a-3">>, <<"b-3">>, <<"d-3">>}]
The rows from the initial replication snapshot are in an
ETS replica called t1
. Note that only the columns
listed in the publication are present.
Any changes on the PostgreSQL database will now be replicated in real-time to the ETS replica.
Back to PostgreSQL, inserting more data will be replicated in real-time:
insert into t1
values (4, 'a-4', 'b-4', 'c-4', 'd-4', 'e-4');
Meanwhile in pgmp:
3> lists:sort(ets:tab2list(t1)).
[{1,<<"a-1">>,<<"b-1">>,<<"d-1">>},
{2,<<"a-2">>,<<"b-2">>,<<"d-2">>},
{3,<<"a-3">>,<<"b-3">>,<<"d-3">>},
{4,<<"a-4">>,<<"b-4">>,<<"d-4">>}]
Row Filters
By default, all data from all published tables will be replicated to the appropriate subscribers. The replicated data can be reduced by using a row filter. A user might choose to use row filters for behavioral, security or performance reasons. If a published table sets a row filter, a row is replicated only if its data satisfies the row filter expression. This allows a set of tables to be partially replicated. The row filter is defined per table.
Create a table t2
to be used in the following example:
create table t2 (a int,
b int,
c text,
primary key(a, c));
Creating a publication p2
with a row filter:
create publication p2
for table t2
where (a > 5 and c = 'NSW');
With some initial data:
insert into t2
values
(2, 102, 'NSW'),
(3, 103, 'QLD'),
(4, 104, 'VIC'),
(5, 105, 'ACT'),
(6, 106, 'NSW'),
(7, 107, 'NT'),
(8, 108, 'QLD'),
(9, 109, 'NSW');
Start pgmp with replication of the p2
publication:
PGMP_REPLICATION_LOGICAL_PUBLICATION_NAMES=p2 \
PGMP_DATABASE_HOSTNAME=host.docker.internal \
PGMP_DATABASE_USER=postgres \
PGMP_DATABASE_PASSWORD=postgres \
make shell
Only those rows that match the filter are replicated where (a > 5 and
c = 'NSW')
:
1> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 109},
{{9, <<"NSW">>}, 109}]
Updates are streamed in real-time into ETS:
update t2 SET b = 999 where a = 6;
2> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 999},
{{9, <<"NSW">>}, 109}]
Changing data so that it meets the row filter will be replicated in real-time into ETS:
update t2 set a = 555 where a = 2;
3> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 999},
{{9, <<"NSW">>}, 109},
{{555, <<"NSW">>}, 102}]
Similarly changing data so that it no longer meets the row filter will be replicated in real-time into ETS:
update t2 set c = 'VIC' where a = 9;
4> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 999},
{{555, <<"NSW">>}, 102}]
This article has shown two new features of logical replication in PostgreSQL 15 being used by the pgmp in Erlang/OTP: row filters and column lists.