PostgreSQL Edge Cache
pgec is a real-time in memory database replication cache, with a memcached and REST API. It supports column lists and row filters with the latest features of replication in PostgreSQL 15.
Using a local postgres
via docker
to demonstrate:
docker run \
--rm \
--pull always \
--name postgres \
--detach \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=postgres \
postgres:15 \
-c wal_level=logical
It is important that PostgreSQL is running using
logical replication enabled (the wal_level=logical
) which is used by
pgec to remain synchronised with your data as it changes.
An interactive SQL shell so that we can create some data:
docker exec \
--interactive \
--tty \
postgres \
psql \
postgres \
postgres
PostgreSQL 15 New Logical Replication Features
The new logical replication features in the recent PostgreSQL 15 release, are: row filters and column lists.
Column Lists
From PostgreSQL 15, each publication can specify which columns of each table are replicated to subscribers.
Create a table t1
to be used in the following example:
create table t1 (id int,
a text,
b text,
c text,
d text,
e text,
primary key(id));
Create a publication p1
, with a column list to reduce the number of
columns that will be replicated:
create publication p1
for table t1 (id, a, b, d);
Insert some test data into the newly created t1
table:
insert into t1
values
(1, 'a-1', 'b-1', 'c-1', 'd-1', 'e-1'),
(2, 'a-2', 'b-2', 'c-2', 'd-2', 'e-2'),
(3, 'a-3', 'b-3', 'c-3', 'd-3', 'e-3');
Row Filters
By default, all data from all published tables will be replicated to the appropriate subscribers. The replicated data can be reduced by using a row filter. A user might choose to use row filters for behavioural, security or performance reasons. If a published table sets a row filter, a row is replicated only if its data satisfies the row filter expression. This allows a set of tables to be partially replicated. The row filter is defined per table.
Create a table t2
to be used in the following example:
create table t2 (a int,
b int,
c text,
primary key(a));
Creating a publication p2
with a row filter:
create publication p2
for table t2
where (a > 5);
With some initial data:
insert into t2
values
(2, 102, 'NSW'),
(3, 103, 'QLD'),
(4, 104, 'VIC'),
(5, 105, 'ACT'),
(6, 106, 'NSW'),
(7, 107, 'NT'),
(8, 108, 'QLD'),
(9, 109, 'NSW');
In Memory Database Replication Cache
pgec is a real-time in memory database replication cache, with a memcached and REST API.
Start pgec connecting to our database, replicating the two publications that we have just created:
docker run \
--detach \
--name pgec \
--pull always \
--publish 8080:80 \
--publish 11211:11211 \
-e PGMP_REPLICATION_LOGICAL_PUBLICATION_NAMES=p1,p2 \
-e PGMP_DATABASE_USER=postgres \
-e PGMP_DATABASE_PASSWORD=postgres \
-e PGMP_DATABASE_HOSTNAME=host.docker.internal \
ghcr.io/shortishly/pgec:develop
pgec will act as a real-time in memory database replication cache, with a memcached and REST API, for the publications we have just created.
The replication process creates a transaction checkpoint ensuring data integrity. Once the initial data has been collected, streaming replication starts, receiving changes that have been applied subsequent to the checkpoint, ensuring no loss of data. Real-time streaming replication continues keeping pgec up to date.
memcached
We can make memcached requests to get data from pgec on port 11211.
The keys used have the format: publication.table.key
. To get the key
“1” from table “t1” and publication: “p1”. We would use get p1.t1.1
as follows:
telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get p1.t1.1
VALUE p1.t1.1 0 38
{"a": "a-1", "b": "b-1", "d": "d-1", "id": 1}
END
get p2.t2.6
VALUE p2.t2.6 0 25
{"a": 6, "b": 106, "c": "NSW"}
END
REST API
Taking a look at the t1
table via the JSON API:
curl http://localhost:8080/p1/t1
{"rows": [{"a": "a-1", "b": "b-1", "d": "d-1", "id": 1},
{"a": "a-2", "b": "b-2", "d": "d-2", "id": 2},
{"a": "a-3", "b": "b-3", "d": "d-3", "id": 3}]}
Note that columns “c” and “e” are not included in either the memcached or REST response because they are not part of publication p1.
Retrieve an individual row by simply supplying the key:
curl http://localhost:8080/p1/t1/3
{"a": "a-3", "b": "b-3", "d": "d-3", "id": 3}
Real-time Streaming Replication
Data changes are reflected in pgec through real-time streaming replication.
Updated data is streamed into pgec in real-time. Row filters are applied, adding or removing rows where necessary.
The existing row (3, 103, 'QLD')
in t2
does not match the row
filter, because the primary key is not greater than 5. Lets update it
so that it is included in replication.
Before the change:
telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get p2.t2.11
END
Apply the change:
update t2 set a = 11 where a = 3;
Checking the result:
telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get p2.t2.11
VALUE p2.t2.11 0 26
{"a":11,"b":103,"c":"QLD"}
END