Dear Sawada-san,
(Sending again because the previous attempt was blocked by some rules.)
I ran the performance test independently for the 0001 patch. Overall performance
looked very good: the new function took roughly constant (O(1)) time with respect
to the total number of tables. It seems good enough.
Source code:
----------------
HEAD (4287c50f) + v4-0001 patch.
Setup:
---------
A database cluster was set up with shared_buffers = 100GB. Several tables were
defined in the public schema, and the same number of tables in schema sch1.
The total number of tables was varied over {50, 500, 5000, 50000, 500000}.
A publication included the schema sch1 plus all public tables individually.
The attached script sets up the same environment; its suffix was changed to
.txt to pass the rule.
Workload Run:
--------------------
I ran two types of SQL and measured the execution time via the \timing
meta-command. They emulate the queries a tablesync worker would run.
Case 1: old SQL
```
SELECT DISTINCT
(CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
THEN NULL ELSE gpt.attrs END)
FROM pg_publication p,
LATERAL pg_get_publication_tables(p.pubname) gpt,
pg_class c
WHERE gpt.relid = 17885 AND c.oid = gpt.relid
AND p.pubname IN ( 'pub' );
```
Case 2: new SQL
```
SELECT DISTINCT
(CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
THEN NULL ELSE gpt.attrs END)
FROM pg_publication p,
LATERAL pg_get_publication_tables(p.pubname, 16535) gpt,
pg_class c
WHERE c.oid = gpt.relid
AND p.pubname IN ( 'pub' );
```
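For reference, the measurement loop can be sketched as below. This is an assumption, not the exact harness used: the function only generates the psql input (enable \timing, then run the new SQL ten times); the port and the relid 16535 are taken from the mail above.

```shell
#!/bin/sh
# Hypothetical sketch of the timing loop (not the exact harness used).
# gen_input emits psql input: "\timing on" followed by the new SQL,
# repeated ten times, so psql reports an execution time per run.
gen_input () {
    for i in $(seq 1 10); do
        printf '%s\n' '\timing on'
        cat <<'EOF'
SELECT DISTINCT
    (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)
          THEN NULL ELSE gpt.attrs END)
FROM pg_publication p,
     LATERAL pg_get_publication_tables(p.pubname, 16535) gpt,
     pg_class c
WHERE c.oid = gpt.relid
  AND p.pubname IN ('pub');
EOF
    done
}

# To actually measure against the running publisher:
#   gen_input | psql -U postgres -p 6633
```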
Result Observations:
---------------
The attached bar graph shows the result. A logarithmic scale is used for the
execution time (y-axis) so that both the small-scale and large-scale cases are
visible. With the old SQL, the execution time grew roughly 10x for each 10x
increase in tables (500->5000 and 5000->50000). In contrast, the execution time
of the new SQL stayed roughly constant regardless of the number of tables.
Detailed Result:
--------------
Each cell is the median of 10 runs.

Total tables | Old SQL [ms] | New SQL [ms]
          50 |         5.77 |         4.19
         500 |        15.75 |         4.28
        5000 |       120.39 |         4.22
       50000 |      1741.89 |         4.60
      500000 |     73287.16 |         4.95
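For completeness, the medians were extracted from the \timing output. A minimal sketch of that post-processing step follows (an assumption, since the actual script was not attached; it expects lines of the form "Time: 4.19 ms"):

```shell
#!/bin/sh
# Sketch: pull "Time: X ms" values out of psql \timing output on stdin
# and print their median. For an even number of samples, the mean of
# the two middle values is used.
median_ms () {
    grep '^Time:' | awk '{print $2}' | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR % 2) printf "%.2f\n", v[(NR + 1) / 2];
            else        printf "%.2f\n", (v[NR / 2] + v[NR / 2 + 1]) / 2;
        }'
}

# Usage: median_ms < run.log
```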
Also, here is a small code comment: I think we can add an Assert at the
beginning of pg_get_publication_tables(), something like below.
```
@@ -1392,6 +1392,9 @@ pg_get_publication_tables(FunctionCallInfo fcinfo, ArrayType *pubnames,
 	FuncCallContext *funcctx;
 	List	   *table_infos = NIL;
+
+	Assert((pubnames && (!pubname && !OidIsValid(target_relid))) ||
+		   (!pubnames && (pubname && OidIsValid(target_relid))));
```
Best regards,
Hayato Kuroda
FUJITSU LIMITED
#!/bin/bash
####################
### Declarations ###
####################
## Publisher-related params
PORT_PUB=6633
DATA_PUB=data_pub
LOG_PUB=pub.log
## Number of runs
NUMRUN=1
## Measurement params
NUMTABLES=25000
# Setup an instance with above parameters.
function setup () {
################
### clean up ###
################
pg_ctl stop -D $DATA_PUB -w
rm -rf $DATA_PUB $LOG_PUB
#######################
### setup publisher ###
#######################
initdb -D $DATA_PUB -U postgres
cat << EOF >> $DATA_PUB/postgresql.conf
port=$PORT_PUB
autovacuum = false
shared_buffers = '100GB'
max_wal_size = 20GB
min_wal_size = 10GB
wal_level = logical
EOF
pg_ctl -D $DATA_PUB start -w -l $LOG_PUB
(
echo "CREATE SCHEMA sch1;"
echo "SELECT 'CREATE TABLE tab_' || generate_series(1, $NUMTABLES) || '(id int primary key)' \gexec"
echo "SELECT 'CREATE TABLE sch1.tab_' || generate_series(1, $NUMTABLES) || '(id int primary key)' \gexec"
echo "CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch1;"
echo "SELECT 'ALTER PUBLICATION pub ADD TABLE tab_' || generate_series(1, $NUMTABLES) \gexec"
) | psql -U postgres -p $PORT_PUB
}
setup