Replicate consistent snapshot to a second PG instance using FDWs

2022-01-05 Thread Thomas Michael Engelke



We have a setup where we run 2 locations, each containing a full HA
setup (using Patroni, etcd, Consul). Each location has 2 PG servers + 1
witness node. At certain time intervals or on demand, our customer wants
to send the data from one location to the other over the wire, either in
full (overwriting the existing data) or as selected tables/rows
(incrementally).


To simplify the technology involved, I am thinking of implementing all
of that in PostgreSQL itself; the alternative would be to write
something, probably in Python, that executes pg_dump/pg_restore with
configuration created specifically for this use case.


My idea is to execute everything in SQL, using postgres_fdw to create a
foreign table for each table that needs to be transferred (on the target
server B1):


DROP TABLE IF EXISTS tablename;
CREATE FOREIGN TABLE tablename_fdw (
    id integer, data text -- placeholder columns; must match the table on A1
) SERVER a1 OPTIONS (table_name 'tablename');
CREATE TABLE tablename (LIKE tablename_fdw INCLUDING ALL);
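
Alternatively, instead of spelling out the column lists by hand, IMPORT
FOREIGN SCHEMA could pull the definitions straight from A1; a sketch,
assuming the remote tables live in schema public and using a made-up
local schema remote_a1 to hold the foreign tables:

CREATE SCHEMA remote_a1;
IMPORT FOREIGN SCHEMA public LIMIT TO (tablename)
    FROM SERVER a1 INTO remote_a1;
CREATE TABLE tablename (LIKE remote_a1.tablename INCLUDING ALL);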

Then just pull the data over:

INSERT INTO tablename SELECT * FROM tablename_fdw;

Please ignore obvious performance optimizations for now.

However, the copy of the data needs to be consistent, so I would need to
copy a snapshot over. For a direct connection this is well understood,
but when working with remote tables through an FDW, not all is well
understood (by me).


I found a PGCon 2020 talk by 2ndQuadrant in which they state:

"Open a foreign transaction when FDW access the remote first time within 
the local transaction"
"Foreign transaction uses SERIALIZABLE when the local transaction has 
SERIALIZABLE, otherwise use REPEATABLE READ"
"This ensures that if a query performs multiple table scans on the 
remote server, it will get snapshot-consistent results for all the scans"


Is this what I am looking for? Can I reliably query one table after the
other over multiple foreign tables from the same server and get the same
snapshot consistency that a direct connection to one instance would give
me?
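
If I read that correctly, the key is to keep all the pulls inside one
local transaction, so that postgres_fdw reuses a single remote
connection and a single remote transaction (REPEATABLE READ on the
remote side, per the slides above); a sketch with made-up table names:

BEGIN ISOLATION LEVEL REPEATABLE READ;
INSERT INTO table_a SELECT * FROM table_a_fdw;
INSERT INTO table_b SELECT * FROM table_b_fdw;
COMMIT;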





ICU Collations and Collation Updates

2025-04-14 Thread Thomas Michael Engelke
Good morning,

Long-time reader, first-time writer.

Where I currently work, my colleagues used libc collations before I
arrived. While using libc collations, they stumbled upon the collation
update problem after a SLES update (15.4 to 15.5): a collation version
mismatch between the database and the operating system (paraphrased; I
don't have the English message at hand).

As an easy solution I suggested switching to ICU collations. While
documenting the problem for older systems, I realized that I did not
know enough about it to document why ICU collations would solve it.

After reading https://www.postgresql.org/docs/17/collation.html, this is
how I understand it:

When initdb creates a cluster, the collations available from the OS are
copied into the database as database objects, listable using

select * from pg_collation;
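
As far as I can tell, the catalog also records each collation's version
at creation time, so a mismatch should be detectable by comparing the
stored version with what the OS currently provides; a sketch
(collprovider 'c' = libc, 'i' = ICU):

select collname, collprovider, collversion as stored,
       pg_collation_actual_version(oid) as actual
from pg_collation
where collversion is distinct from pg_collation_actual_version(oid);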

Now, an OS collation update as part of an OS upgrade will change the
collations available at the OS level, but not the collation objects
recorded in the database.

Is my understanding correct, then, that the database collations never
change unless a manual intervention refreshes the collation versions and
reindexes the database (or the affected indexes)? How does that process
compare to other RDBMSs?
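
The manual intervention I have in mind would look roughly like this
(mydb and "de-DE-x-icu" are example names; the database-level refresh
needs PG 15+, and reindexing should happen before recording the new
versions):

reindex database mydb;
alter database mydb refresh collation version;
alter collation "de-DE-x-icu" refresh version; -- per-collation variant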

Are regular collation updates deemed unnecessary for long-running
database installations? Or do you have maintenance workflows that
incorporate regular collation updates into your databases?

Thanks,

Thomas