ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Allan Barrielle
Hello guys, I'm facing a problem. I'm currently working on a data
transformation pipeline on Postgres. The strategy is:

We select every table in a given schema (~50 tables), apply some transformations
(CASE WHEN, translations, enums), and load the result into a different, new
schema with a CREATE TABLE SCHEMA_2.table_1 AS SELECT * FROM SCHEMA_1.TABLE_1.
Then we do this again about 3 more times, and every time it's a new schema with
new tables. We only keep schema1 and never drop it.

To orchestrate the whole thing, we've got a bunch of .sql files that we run
directly with psql. That's our "strategy".

So we're copying a lot of data, but it allows us to debug and investigate
business bugs, because we can plug into schema 2 or 3 and track down where an
issue comes from.

Usually everything is fine and works great.
But sometimes, queries that used to take about 20 seconds to complete
suddenly take 5 minutes or more.
Importantly, all queries have the same shape -> CREATE TABLE ... AS SELECT *
(with a bit of transformation) FROM TABLE. No updates, nothing else; it's dead simple.

We are just trying to copy a table from schema1 to schema2, then to schema3,
and so on. That's it.
The thing to understand here is that schema2, schema3, ... are dropped at every
pipeline run, so every time we run the scripts, everything from schema2 down to
the final stage is dropped and rebuilt.

We tuned the config a little and tried just about everything
(synchronous_commit, WAL settings, vacuum).
Nothing helps; it's very random, and some queries simply never finish (even
after hours).

We use different machines, different config, and different datasets.

The only thing that makes it work every time, in 100% of cases, is to put a
sleep(10 sec) between each schema.
So we select the 50 tables and create a new schema from them, then we sleep 10
seconds, then we run the same kind of queries against the freshly created schema
to create a third schema, sleep 10 seconds again, and so on.

And that makes the whole pipeline successful each time.
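
For concreteness, one step of the pipeline with the sleep workaround looks
roughly like this (schema, table, and column names are made up; the real
scripts differ):

-- Stage 1 -> 2 (illustrative names only)
DROP SCHEMA IF EXISTS schema_2 CASCADE;
CREATE SCHEMA schema_2;

CREATE TABLE schema_2.table_1 AS
SELECT id,
       CASE WHEN status = 1 THEN 'active' ELSE 'inactive' END AS status_label
FROM schema_1.table_1;
-- ... same pattern for the other ~50 tables ...

SELECT pg_sleep(10);   -- the workaround that makes the run reliable

-- Stage 2 -> 3
DROP SCHEMA IF EXISTS schema_3 CASCADE;
CREATE SCHEMA schema_3;

CREATE TABLE schema_3.table_1 AS
SELECT * FROM schema_2.table_1;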

So it seems there is some background process inside Postgres that has to ingest
a lot of data and needs time to catch up, like the background writer or
something else?
I disabled autovacuum (autovacuum=off). Same result.
Why does the query never end, even after hours? Why is there no log showing
where the query is stuck?
To be clear, if I kill the stuck query and run it again, it works.

I don't know much about what's going on inside Postgres that randomly takes
this much time with the same code and the same data.

PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Thank you so much for your time..


Strange execution plan

2021-07-08 Thread Joel Frid
Hi, 



On my production environment (PostgreSQL 13.3), one of my queries runs very 
slowly, taking about 2 minutes.

I noticed that it does not use the execution plan I anticipated it would.



The query is :



SELECT t.*

FROM test t 

WHERE  t."existe" IS true

and  t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'

ORDER BY t."id" DESC 

LIMIT 100 OFFSET 0



I know PostgreSQL does not perform well with pagination and offsets, but all 
my queries must end with "LIMIT 100 OFFSET 0", "LIMIT 100 OFFSET 1", ...



If I display actual Execution Plan, I get this :



Limit  (cost=0.43..1164.55 rows=100 width=632) (actual 
time=7884.056..121297.756 rows=1 loops=1)

  Buffers: shared hit=5311835 read=585741 dirtied=32

  ->  Index Scan Backward using test_pk on test  (cost=0.43..141104.29 
rows=12121 width=632) (actual time=7884.053..121297.734 rows=1 loops=1)

    Filter: ((existe IS TRUE) AND (json_data @> '{"book": {"title": "In 
Search of Lost Time"}}'::jsonb))

    Rows Removed by Filter: 1215681

    Buffers: shared hit=5311835 read=585741 dirtied=32

Planning:

  Buffers: shared hit=1

Planning Time: 0.283 ms

Execution Time: 121297.878 ms



The query runs very slowly for LIMIT values from 1 to 1147. 

If I change the LIMIT value to 1148, the query runs quite fast (0.190 ms) with a 
nice execution plan:



SELECT t.*

FROM test t 

WHERE  t."existe" IS true

and  t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'

ORDER BY t."id" DESC 

LIMIT 1148 OFFSET 0



Limit  (cost=13220.53..13223.40 rows=1148 width=632) (actual time=0.138..0.140 
rows=1 loops=1)

  Buffers: shared hit=17

  ->  Sort  (cost=13220.53..13250.84 rows=12121 width=632) (actual 
time=0.137..0.138 rows=1 loops=1)

    Sort Key: id DESC

    Sort Method: quicksort  Memory: 27kB

    Buffers: shared hit=17

    ->  Bitmap Heap Scan on test  (cost=119.73..12543.88 rows=12121 
width=632) (actual time=0.125..0.127 rows=1 loops=1)

  Recheck Cond: (json_data @> '{"book": {"title": "In Search of 
Lost Time"}}'::jsonb)

  Filter: (existe IS TRUE)

  Heap Blocks: exact=1

  Buffers: shared hit=17

  ->  Bitmap Index Scan on test_json_data_idx  (cost=0.00..116.70 
rows=12187 width=0) (actual time=0.112..0.113 rows=1 loops=1)

    Index Cond: (json_data @> '{"book": {"title": "In Search of 
Lost Time"}}'::jsonb)

    Buffers: shared hit=16

Planning:

  Buffers: shared hit=1

Planning Time: 0.296 ms

Execution Time: 0.190 ms



Would you have any suggestions as to why Postgres chooses such a bad query plan?





Server :

--

CPU Model : AMD EPYC 7281 16-Core Processor

CPU Cores : 4

CPU Frequency : 2096.060 MHz

CPU Cache : 512 KB

Total Disk    : 888.1 GB (473.0 GB Used)

Total Mem : 11973 MB (4922 MB Used)

Total Swap    : 0 MB (0 MB Used)

OS    : Debian GNU/Linux 10

Arch  : x86_64 (64 Bit)

Kernel    : 5.10.28

Virtualization    : Dedicated

--

I/O Speed(1st run)    : 132 MB/s

I/O Speed(2nd run)    : 204 MB/s

I/O Speed(3rd run)    : 197 MB/s

Average I/O speed : 177.7 MB/s





Postgresql.conf :

max_connections = 100

shared_buffers = 3840MB

huge_pages = on

work_mem = 9830kB

maintenance_work_mem = 960MB

effective_io_concurrency = 200

max_worker_processes = 3

max_parallel_maintenance_workers = 2

max_parallel_workers_per_gather = 2

max_parallel_workers = 3

max_wal_size = 4GB

min_wal_size = 1GB

checkpoint_completion_target = 0.9

effective_cache_size = 11520MB

default_statistics_target = 100

shared_preload_libraries = 'pg_stat_statements'

pg_stat_statements.max = 1

pg_stat_statements.track = all





Table test: I have just over 1.2 million records in this table

CREATE TABLE test (

"source" varchar NOT NULL,

existe bool NULL,

json_data jsonb NULL,

row_updated timestamp NOT NULL DEFAULT clock_timestamp(),

row_inserted timestamp NOT NULL DEFAULT clock_timestamp(),

id uuid NOT NULL,

CONSTRAINT test_pk PRIMARY KEY (id)

);

CREATE INDEX test_existe_idx ON test USING btree (existe);

CREATE INDEX test_id_idx ON test USING btree (id);

CREATE INDEX test_json_data_idx ON test USING gin (json_data jsonb_path_ops);

CREATE INDEX test_row_inserted_idx ON test USING btree (row_inserted);

CREATE INDEX test_row_updated_idx ON production.test USING btree (row_updated);

CREATE INDEX test_source_idx ON production.test USING btree (source);





select * from pg_stat_all_tables where relname = 'test' :



relid|schemaname|relname|seq_scan|seq_tup_read|idx_scan|idx_tup_fetch|n_tup_ins|n_tup_upd|n_tup_del
16692|dev       |test   |1816    |724038305   |31413   |36863713     |1215682  |23127    |0

n_tup_hot_upd|n_live_tup|n_dead_tup|

Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Justin Pryzby
On Thu, Jul 08, 2021 at 01:00:28AM +0200, Allan Barrielle wrote:
> All is fine, and can work great.
> But sometimes, some queries that used to take about 20 secs to complete can
> suddenly end in 5mins.
> Important all queries have the same shape -> CREATE TABLE SELECT AS *(a bit
> of transform) FROM TABLE). No update, nothing, it’s dead simple.
> 
> Nothing works, it’s very random, some query won’t simply work ( even after
> hours ).

When it doesn't work, you could check SELECT * FROM pg_stat_activity, and
SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on.
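
For example, spelled out a bit (ordinary catalog queries, nothing specific to
this workload):

SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle';

SELECT pg_blocking_pids(pid) AS blocked_by, *
FROM pg_locks
WHERE NOT granted;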

> Important all queries have the same shape -> CREATE TABLE SELECT AS *(a bit
> of transform) FROM TABLE). No update, nothing, it’s dead simple.
> We are just trying to copy a table from schema1, to schema2, to schema3 and
> finally schema3. That’s it.

Is it true that the SELECTs have no joins in them ?

Did this ever work better or differently under different versions of postgres ?

> Why does the query never end even after hours ? Why there is no log about
> where the query is stuck.

Please send your nondefault config.
https://wiki.postgresql.org/wiki/Server_Configuration

Also enable logging (I just added this to the wiki).
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Enable_Logging

It'd be very useful to get "explain analyze" for a working query and for a
stuck query.  It sounds like the stuck query never finishes, so maybe the second
part is impossible (?)

But it'd be good to get at least "explain" output.  You'd have to edit your sql
script to run an "explain" before each query, and run it, logging the output,
until you capture the plan for a stuck query.  Save the output and send it here,
along with the query plan for a working query.
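
As a hypothetical illustration, inside one of the .sql scripts run by psql
(object names are placeholders, not the real scripts):

\o /tmp/schema_2_table_1.plan
EXPLAIN
CREATE TABLE schema_2.table_1 AS SELECT * FROM schema_1.table_1;
\o

CREATE TABLE schema_2.table_1 AS SELECT * FROM schema_1.table_1;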

-- 
Justin




Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Imre Samu
> We use different machines, different config, and different datasets.
> ...
>  PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Is it possible to upgrade and test with PG 12.7?

IMHO there are a lot of changes:
* https://www.postgresql.org/docs/12/release-12-5.html
* https://www.postgresql.org/docs/12/release-12-6.html
* https://www.postgresql.org/docs/12/release-12-7.html

Just to rule out the possibility that it has already been fixed.

Regards,
 Imre


Allan Barrielle wrote (on Thu, 8 Jul 2021, 11:26):

> Hello guys, I'm facing a problem. Currently I'm working on a Data
> transformation Pipeline on Postgres. The strategy is,
>
> We select every tables in a given schema ( 50 tables ), we apply some case
> when, translation, enum and load it into a different new schema with a
> CREATE TABLE SCHEMA_2.table_1 AS SELECT * FROM SCHEMA_1.TABLE_1, then we do
> it again about 3 more times and everytime it’s a new schema, new table. We
> only keep and don’t drop the schema1.
>
> To orchestrate the whole, we've got a bunch of .sql files that we run by
> using psql directly. That's our "strategy".
>
> So we're copying a lot of data, but it allows us to debug, and investigate
> business bugs, because we can plug us into schema 2,3 and search why it's
> an issue.
>
> All is fine, and can work great.
> But sometimes, some queries that used to take about 20 secs to complete
> can suddenly end in 5mins.
> Important all queries have the same shape -> CREATE TABLE SELECT AS *(a
> bit of transform) FROM TABLE). No update, nothing, it’s dead simple.
>
> We are just trying to copy a table from schema1, to schema2, to schema3
> and finally schema3. That’s it.
> The thing to understand here is schema2, schema3 are dropped at every
> pipeline transformation, so everytime we run the script, it drops
> everything from schema2 to the final stage.
>
> We tuned the config a little bit, and we tried kind of everything (
> synchronous_commit, wal, vacuum )
> Nothing works, it’s very random, some query won’t simply work ( even after
> hours ).
>
> We use different machines, different config, and different datasets.
>
> The only thing that makes it work every time, in 100% cases, is to put a
> sleep(10sec) between each schema.
> So we select 50 tables, we create a new schema with it, then we sleep 10
> sec then we do again the same query but with the freshly created schema and
> we create a third schema, sleep 10s and again..
>
> And that makes the whole pipeline successful each time.
>
> So, It seems it's a background process inside postgres, that should ingest
> a lot of data, and we have to give him time to take a rest, like a
> bg_writers  or something else ?
> I disabled autovacuum=off . Same.
> Why does the query never end even after hours ? Why there is no log about
> where the query is stuck.
> To be clear, if I kill the stuck query and run again it will work.
>
> I don't know much about what's going on inside Postgres, which randomly
> takes a lot of time, with the same code, same data.
>
> PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
>
> Thank you so much for your time..
>
>
>


Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Justin Pryzby
On Thu, Jul 08, 2021 at 03:49:12PM +0200, Allan Barrielle wrote:
> > Is it true that the SELECTs have no joins in them ?
> 
> Yes there is a lot of LEFT JOIN.
> 
> > It'd be very useful to get "explain analyze" for a working query and for a
> > stuck query.  It sound like the stuck query never finishes, so maybe the 
> > second
> > part is impossible (?)
> 
> We run an explain analysis and we see some very interesting stuff going on.
> It seems without explicitly adding a `ANALYZE`, the query has a cost of
> over billions, so the query is not stuck but took forever.
> When I run the same scripts with an ANALYZE right before running the query,
> the query is exec is 50secondes and the cost is normal

It sounds like sometimes autoanalyze processes important tables being queried,
but sometimes it doesn't.

Since there are JOINs involved, you should analyze the tables after populating
them and before querying them.  The same as if it were a temp table, or
anything else.
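
A minimal sketch of that pattern, with made-up table and column names:

CREATE TABLE schema_2.accounts AS
SELECT id,
       CASE WHEN status = 1 THEN 'active' ELSE 'inactive' END AS status_label
FROM schema_1.accounts;

ANALYZE schema_2.accounts;   -- refresh planner statistics before the next stage

CREATE TABLE schema_3.accounts AS
SELECT a.*, o.country
FROM schema_2.accounts a
LEFT JOIN schema_2.orgs o ON o.account_id = a.id;

ANALYZE schema_3.accounts;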

> The configuration is tuned by aws aurora, [...]

> fsync,off
> full_page_writes,off

really?

> vacuum_cleanup_index_scale_factor,0.1

also interesting




Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Allan Barrielle
Hello,

> Is it true that the SELECTs have no joins in them ?

Yes there is a lot of LEFT JOIN.

> When it doesn't work, you could check SELECT * FROM pg_stat_activity, and
>SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on.

I can't see any blocking queries in pg_locks or pg_blocking_pids.

> It'd be very useful to get "explain analyze" for a working query and for a
> stuck query.  It sound like the stuck query never finishes, so maybe the
second
> part is impossible (?)

We ran EXPLAIN ANALYZE and saw some very interesting things.
It seems that without an explicit `ANALYZE`, the query has an estimated cost in
the billions, so the query is not stuck, it just takes forever.
When I run the same scripts with an ANALYZE right before running the query,
the query executes in about 50 seconds and the cost is normal.

Explain analyze WITHOUT ANALYZE https://explain.depesz.com/s/RaSr
Explain analyze same query WITH ANALYZE BEFORE
https://explain.depesz.com/s/tYVl

The configuration is tuned by aws aurora, but this issue happens also with
a default config.

allow_system_table_mods,off
application_name,DataGrip 2021.1.3
archive_command,(disabled)
archive_mode,off
archive_timeout,5min
array_nulls,on
authentication_timeout,1min
autovacuum,on
autovacuum_analyze_scale_factor,0.05
autovacuum_analyze_threshold,50
autovacuum_freeze_max_age,2
autovacuum_max_workers,12
autovacuum_multixact_freeze_max_age,4
autovacuum_naptime,5s
autovacuum_vacuum_cost_delay,1ms
autovacuum_vacuum_cost_limit,1200
autovacuum_vacuum_scale_factor,0.1
autovacuum_vacuum_threshold,50
autovacuum_work_mem,-1
backend_flush_after,0
backslash_quote,safe_encoding
bgwriter_delay,200ms
bgwriter_flush_after,0
bgwriter_lru_maxpages,100
bgwriter_lru_multiplier,2
bonjour,off
bytea_output,hex
check_function_bodies,on
checkpoint_completion_target,0.9
checkpoint_flush_after,0
checkpoint_timeout,15min
checkpoint_warning,30s
client_encoding,UTF8
client_min_messages,notice
commit_delay,0
commit_siblings,5
constraint_exclusion,partition
cpu_index_tuple_cost,0.005
cpu_operator_cost,0.0025
cpu_tuple_cost,0.01
cursor_tuple_fraction,0.1
DateStyle,"ISO, MDY"
db_user_namespace,off
deadlock_timeout,1s
debug_pretty_print,on
debug_print_parse,off
debug_print_plan,off
debug_print_rewritten,off
default_statistics_target,500
default_text_search_config,pg_catalog.simple
default_transaction_deferrable,off
default_transaction_isolation,read committed
default_transaction_read_only,off
dynamic_library_path,$libdir
effective_cache_size,4GB
effective_io_concurrency,600
enable_bitmapscan,on
enable_gathermerge,on
enable_hashagg,on
enable_hashjoin,on
enable_indexonlyscan,on
enable_indexscan,on
enable_material,on
enable_mergejoin,on
enable_nestloop,on
enable_parallel_append,on
enable_parallel_hash,on
enable_partition_pruning,on
enable_partitionwise_aggregate,off
enable_partitionwise_join,off
enable_seqscan,on
enable_sort,on
enable_tidscan,on
escape_string_warning,on
event_source,PostgreSQL
exit_on_error,off
extra_float_digits,3
force_parallel_mode,off
from_collapse_limit,8
fsync,off
full_page_writes,off
geqo,on
geqo_effort,5
geqo_generations,0
geqo_pool_size,0
geqo_seed,0
geqo_selection_bias,2
geqo_threshold,12
gin_fuzzy_search_limit,0
gin_pending_list_limit,4MB
hot_standby,off
hot_standby_feedback,on
huge_pages,try
idle_in_transaction_session_timeout,25min
ignore_checksum_failure,off
ignore_system_indexes,off
IntervalStyle,postgres
jit,off
jit_above_cost,10
jit_debugging_support,off
jit_dump_bitcode,off
jit_expressions,on
jit_inline_above_cost,50
jit_optimize_above_cost,50
jit_profiling_support,off
jit_provider,llvmjit
jit_tuple_deforming,on
join_collapse_limit,8
lc_monetary,C
lc_numeric,C
lc_time,C
listen_addresses,*
lock_timeout,0
lo_compat_privileges,off
maintenance_work_mem,2GB
max_connections,100
max_files_per_process,1000
max_locks_per_transaction,256
max_logical_replication_workers,4
max_parallel_maintenance_workers,12
max_parallel_workers,12
max_parallel_workers_per_gather,6
max_pred_locks_per_page,2
max_pred_locks_per_relation,-2
max_pred_locks_per_transaction,64
max_prepared_transactions,0
max_replication_slots,10
max_stack_depth,6MB
max_standby_archive_delay,30s
max_standby_streaming_delay,14s
max_sync_workers_per_subscription,2
max_wal_senders,0
max_wal_size,8GB
max_worker_processes,12
min_parallel_index_scan_size,512kB
min_parallel_table_scan_size,8MB
min_wal_size,2GB
old_snapshot_threshold,-1
operator_precedence_warning,off
parallel_leader_participation,off
parallel_setup_cost,1000
parallel_tuple_cost,0.1
password_encryption,md5
port,5432
post_auth_delay,0
pre_auth_delay,0
quote_all_identifiers,off
random_page_cost,1.1
restart_after_crash,on
row_security,on
search_path,public
seq_page_cost,1
session_replication_role,origin
shared_buffers,1GB
standard_conforming_strings,on
statement_timeout,0
superuser_reserved_connections,3
synchronize_seqscans,on
synchronous_commit,on
syslog_facility,local0
syslog_ident,postgres
syslog_sequence_nu

Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Allan Barrielle
On a different machine, we use 12.7. Still the same issue.

On Thu, Jul 8, 2021 at 3:49 PM Allan Barrielle 
wrote:

> Hello,
>
> > Is it true that the SELECTs have no joins in them ?
>
> Yes there is a lot of LEFT JOIN.
>
> > When it doesn't work, you could check SELECT * FROM pg_stat_activity,
> and
> >SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on.
>
> I can't see any blocking queries blocking pg_locks, pg_blocking_pids.
>
> > It'd be very useful to get "explain analyze" for a working query and for
> a
> > stuck query.  It sound like the stuck query never finishes, so maybe the
> second
> > part is impossible (?)
>
> We run an explain analysis and we see some very interesting stuff going on.
> It seems without explicitly adding a `ANALYZE`, the query has a cost of
> over billions, so the query is not stuck but took forever.
> When I run the same scripts with an ANALYZE right before running the
> query, the query is exec is 50secondes and the cost is normal
>
> Explain analyze WITHOUT ANALYZE https://explain.depesz.com/s/RaSr
> Explain analyze same query WITH ANALYZE BEFORE
> https://explain.depesz.com/s/tYVl
>
> The configuration is tuned by aws aurora, but this issue happens also with
> a default config.
>
> allow_system_table_mods,off
> application_name,DataGrip 2021.1.3
> archive_command,(disabled)
> archive_mode,off
> archive_timeout,5min
> array_nulls,on
> authentication_timeout,1min
> autovacuum,on
> autovacuum_analyze_scale_factor,0.05
> autovacuum_analyze_threshold,50
> autovacuum_freeze_max_age,2
> autovacuum_max_workers,12
> autovacuum_multixact_freeze_max_age,4
> autovacuum_naptime,5s
> autovacuum_vacuum_cost_delay,1ms
> autovacuum_vacuum_cost_limit,1200
> autovacuum_vacuum_scale_factor,0.1
> autovacuum_vacuum_threshold,50
> autovacuum_work_mem,-1
> backend_flush_after,0
> backslash_quote,safe_encoding
> bgwriter_delay,200ms
> bgwriter_flush_after,0
> bgwriter_lru_maxpages,100
> bgwriter_lru_multiplier,2
> bonjour,off
> bytea_output,hex
> check_function_bodies,on
> checkpoint_completion_target,0.9
> checkpoint_flush_after,0
> checkpoint_timeout,15min
> checkpoint_warning,30s
> client_encoding,UTF8
> client_min_messages,notice
> commit_delay,0
> commit_siblings,5
> constraint_exclusion,partition
> cpu_index_tuple_cost,0.005
> cpu_operator_cost,0.0025
> cpu_tuple_cost,0.01
> cursor_tuple_fraction,0.1
> DateStyle,"ISO, MDY"
> db_user_namespace,off
> deadlock_timeout,1s
> debug_pretty_print,on
> debug_print_parse,off
> debug_print_plan,off
> debug_print_rewritten,off
> default_statistics_target,500
> default_text_search_config,pg_catalog.simple
> default_transaction_deferrable,off
> default_transaction_isolation,read committed
> default_transaction_read_only,off
> dynamic_library_path,$libdir
> effective_cache_size,4GB
> effective_io_concurrency,600
> enable_bitmapscan,on
> enable_gathermerge,on
> enable_hashagg,on
> enable_hashjoin,on
> enable_indexonlyscan,on
> enable_indexscan,on
> enable_material,on
> enable_mergejoin,on
> enable_nestloop,on
> enable_parallel_append,on
> enable_parallel_hash,on
> enable_partition_pruning,on
> enable_partitionwise_aggregate,off
> enable_partitionwise_join,off
> enable_seqscan,on
> enable_sort,on
> enable_tidscan,on
> escape_string_warning,on
> event_source,PostgreSQL
> exit_on_error,off
> extra_float_digits,3
> force_parallel_mode,off
> from_collapse_limit,8
> fsync,off
> full_page_writes,off
> geqo,on
> geqo_effort,5
> geqo_generations,0
> geqo_pool_size,0
> geqo_seed,0
> geqo_selection_bias,2
> geqo_threshold,12
> gin_fuzzy_search_limit,0
> gin_pending_list_limit,4MB
> hot_standby,off
> hot_standby_feedback,on
> huge_pages,try
> idle_in_transaction_session_timeout,25min
> ignore_checksum_failure,off
> ignore_system_indexes,off
> IntervalStyle,postgres
> jit,off
> jit_above_cost,10
> jit_debugging_support,off
> jit_dump_bitcode,off
> jit_expressions,on
> jit_inline_above_cost,50
> jit_optimize_above_cost,50
> jit_profiling_support,off
> jit_provider,llvmjit
> jit_tuple_deforming,on
> join_collapse_limit,8
> lc_monetary,C
> lc_numeric,C
> lc_time,C
> listen_addresses,*
> lock_timeout,0
> lo_compat_privileges,off
> maintenance_work_mem,2GB
> max_connections,100
> max_files_per_process,1000
> max_locks_per_transaction,256
> max_logical_replication_workers,4
> max_parallel_maintenance_workers,12
> max_parallel_workers,12
> max_parallel_workers_per_gather,6
> max_pred_locks_per_page,2
> max_pred_locks_per_relation,-2
> max_pred_locks_per_transaction,64
> max_prepared_transactions,0
> max_replication_slots,10
> max_stack_depth,6MB
> max_standby_archive_delay,30s
> max_standby_streaming_delay,14s
> max_sync_workers_per_subscription,2
> max_wal_senders,0
> max_wal_size,8GB
> max_worker_processes,12
> min_parallel_index_scan_size,512kB
> min_parallel_table_scan_size,8MB
> min_wal_size,2GB
> old_snapshot_threshold,-1
> operator_precedence_warning,off
> parallel_leader_participation,off
> parallel_setup_cost,1000
> parallel_t

Re: ETL - sql orchestrator is stuck when there is not sleep() between queries

2021-07-08 Thread Allan Barrielle
fsync is off and full_page_writes is off because the script only runs once:
we create the db, we load the data, then we dump the data and kill the db.
There is no need to handle server crashes or anything like that.

0.1 is the default value for vacuum_cleanup_index_scale_factor.

On Thu, Jul 8, 2021 at 4:06 PM Justin Pryzby  wrote:

> On Thu, Jul 08, 2021 at 03:49:12PM +0200, Allan Barrielle wrote:
> > > Is it true that the SELECTs have no joins in them ?
> >
> > Yes there is a lot of LEFT JOIN.
> >
> > > It'd be very useful to get "explain analyze" for a working query and
> for a
> > > stuck query.  It sound like the stuck query never finishes, so maybe
> the second
> > > part is impossible (?)
> >
> > We run an explain analysis and we see some very interesting stuff going
> on.
> > It seems without explicitly adding a `ANALYZE`, the query has a cost of
> > over billions, so the query is not stuck but took forever.
> > When I run the same scripts with an ANALYZE right before running the
> query,
> > the query is exec is 50secondes and the cost is normal
>
> It sounds like sometimes autoanalyze processes important tables being
> queried,
> but sometimes it doesn't.
>
> Since there are JOINs involved, you should analyze the tables after
> populating
> them and before querying them.  The same as if it were a temp table, or
> anything else.
>
> > The configuration is tuned by aws aurora, [...]
>
> > fsync,off
> > full_page_writes,off
>
> really?
>
> > vacuum_cleanup_index_scale_factor,0.1
>
> also interesting
>


ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification

2021-07-08 Thread Nagaraj Raj
Hello everyone,
I have a scenario where I want to add a PK on a partitioned table to enforce 
unique values for two columns. But PostgreSQL requires the partition column to 
be part of the PK. How can we make sure those two columns alone are unique, 
and also use 'on conflict' when inserting into the table?
create table t (id int, pid int, name name, dt date) partition by range(dt);
--create unique index on t(id,pid);
--alter table t add constraint uk unique (id);
--create unique index on t(id,pid);
alter table t add constraint uk unique (id,pid,dt);

create table t1 partition of t for values from ('2020-01-01') to ('2020-02-01');
alter table t1 add constraint uk1 unique (id,pid);
create table t2 partition of t for values from ('2020-02-01') to ('2020-03-01');
alter table t2 add constraint uk2 unique (id,pid);
create table t4 partition of t for values from ('2020-03-01') to ('2020-04-01');
alter table t4 add constraint uk3 unique (id,pid);
create table t3 partition of t for values from ('2020-04-01') to ('2020-05-01');
alter table t3 add constraint uk4 unique (id,pid);

insert into t(id,pid,name,dt) values (1,2,'raj','2020-01-01')
on conflict (id,pid) do nothing;

ERROR: there is no unique or exclusion constraint matching the ON CONFLICT 
specification

https://dbfiddle.uk/?rdbms=postgres_11&fiddle=36b3eb0d51f8bff4b5d445a77d688d88


Thanks,
Rj

Re: Strange execution plan

2021-07-08 Thread Manuel Weitzman



> On 08-07-2021, at 04:26, Joel Frid  wrote:
> 
> Hi, 
> 
> On my production environment (PostgreSQL 13.3), one of my queries runs very 
> slow, about 2 minutes.
> I noticed that it does not use an execution plan that I anticapited it would.
> 
> The query is :
> 
> SELECT t.*
> FROM test t 
> WHERE  t."existe" IS true
> and  t.json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
> ORDER BY t."id" DESC 
> LIMIT 100 OFFSET 0
> 
> I know PostgreSQL is not very good at performing well with pagination and 
> offsets but all my queries must end with "LIMIT 100 OFFSET 0", "LIMIT 100 
> OFFSET 1", ...

I don't think any database performs well with LIMIT+OFFSET.
Using OFFSET requires a linear scan that discards all rows up to the
row in position OFFSET; the scan then continues for LIMIT rows.
The greater the value of OFFSET, the slower the query will perform,
in general.
I'd recommend using cursor pagination in general (I know it may not be
possible for you; I just wanted to mention it as it could be useful).
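
For instance, keyset ("cursor") pagination for this query could look roughly
like this (the boundary uuid is made up; it would be the last id returned by
the previous page):

SELECT t.*
FROM test t
WHERE t."existe" IS true
  AND t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
  AND t."id" < 'b7e7a3d4-0000-0000-0000-000000000000'
ORDER BY t."id" DESC
LIMIT 100;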

> If I display actual Execution Plan, I get this :
> 
> Limit  (cost=0.43..1164.55 rows=100 width=632) (actual 
> time=7884.056..121297.756 rows=1 loops=1)
>   Buffers: shared hit=5311835 read=585741 dirtied=32
>   ->  Index Scan Backward using test_pk on test  (cost=0.43..141104.29 
> rows=12121 width=632) (actual time=7884.053..121297.734 rows=1 loops=1)
> Filter: ((existe IS TRUE) AND (json_data @> '{"book": {"title": "In 
> Search of Lost Time"}}'::jsonb))
> Rows Removed by Filter: 1215681
> Buffers: shared hit=5311835 read=585741 dirtied=32
> Planning:
>   Buffers: shared hit=1
> Planning Time: 0.283 ms
> Execution Time: 121297.878 ms
> 
> The query runs very slow from limit 1 to 1147. 
> If I change limit value to 1148, this query runs quite fast ( 0.190 ms) with 
> a nice execution plan :
> 
> SELECT t.*
> FROM test t 
> WHERE  t."existe" IS true
> and  t.json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
> ORDER BY t."id" DESC 
> LIMIT 1148 OFFSET 0
> 
> Limit  (cost=13220.53..13223.40 rows=1148 width=632) (actual 
> time=0.138..0.140 rows=1 loops=1)
>   Buffers: shared hit=17
>   ->  Sort  (cost=13220.53..13250.84 rows=12121 width=632) (actual 
> time=0.137..0.138 rows=1 loops=1)
> Sort Key: id DESC
> Sort Method: quicksort  Memory: 27kB
> Buffers: shared hit=17
> ->  Bitmap Heap Scan on test  (cost=119.73..12543.88 rows=12121 
> width=632) (actual time=0.125..0.127 rows=1 loops=1)
>   Recheck Cond: (json_data @> '{"book": {"title": "In Search of 
> Lost Time"}}'::jsonb)
>   Filter: (existe IS TRUE)
>   Heap Blocks: exact=1
>   Buffers: shared hit=17
>   ->  Bitmap Index Scan on test_json_data_idx  (cost=0.00..116.70 
> rows=12187 width=0) (actual time=0.112..0.113 rows=1 loops=1)
> Index Cond: (json_data @> '{"book": {"title": "In Search 
> of Lost Time"}}'::jsonb)
> Buffers: shared hit=16
> Planning:
>   Buffers: shared hit=1
> Planning Time: 0.296 ms
> Execution Time: 0.190 ms
> 
> Would you have any suggestions why Postgres chooses a so bad query plan ?

I can guess a bit about this,

One of the quirks of the statistics system is that it doesn't
collect enough statistics to properly estimate the number of rows
that will match the "@>" operator, as that is quite hard to do. In
general it over-estimates how many rows will match.
As you can see in both plans, it estimates that 12121 rows will match
the "@>" clause, even though only 1 actually matches.

This means that the planner is estimating that the cost of using
test_json_data_idx and then executing top-k heapsort (to get the
latest 100 rows ordered by id) is far greater than iterating over the
test_pk index where the first 100 rows that match will be the ones you
need and already sorted.
In practice, iterating the test_pk index has to read over a lot of
rows that didn't match the "@>" operator, as the actual number of rows
that match isn't as large as initially expected.

Once you increase the LIMIT to a high enough value, the estimated cost of
iterating over either index becomes similar to the planner; past that
threshold it chooses to switch the plan.


So, some suggestions to improve the execution of that query:

Option A: Use a common table expression to "force" the usage of
test_json_data_idx

WITH json_matching_rows AS (
    SELECT t.*
    FROM test t
    WHERE t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
)
SELECT t.*
FROM json_matching_rows t
WHERE t."existe" IS true
ORDER BY t."id" DESC
LIMIT 100 OFFSET 0;


Option B: Use the extension pg_hint_plan to hint the usage of
test_json_data_idx


Option C: Create a functional index for the book title and tweak the
query to use it.
This can also be a composite index (to have the values sorted by id
already) and partial (to only include rows where "existe" i
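
As a sketch only (assuming the JSON path from the thread, not necessarily the
exact index intended here), such an index and a query that can use it might
look like:

CREATE INDEX test_book_title_idx
ON test ((json_data #>> '{book,title}'), id DESC)
WHERE existe IS TRUE;

SELECT t.*
FROM test t
WHERE t."existe" IS true
  AND (t.json_data #>> '{book,title}') = 'In Search of Lost Time'
ORDER BY t."id" DESC
LIMIT 100 OFFSET 0;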

Re: Strange execution plan

2021-07-08 Thread Manuel Weitzman



> On 08-07-2021, at 17:13, Manuel Weitzman  wrote:
> 
> I'd recommend you using cursors for pagination in general (I know it
> may not be possible for you, just wanted to explain as it could be
> useful).

By the way, I mean cursor pagination as the general concept.
I'm not talking about Postgres cursors.

Best regards,
Manuel Weitzman




temporary file log lines

2021-07-08 Thread MichaelDBA

Hi all,

I got a question about PG log lines with temporary file info like this:

case 1: log line with no contextual info
2021-07-07 20:28:15 UTC:10.100.11.95(50274):myapp@mydb:[35200]:LOG: 
temporary file: path "base/pgsql_tmp/pgsql_tmp35200.0", size 389390336


case 2: log line with contextual info
2021-07-07 20:56:18 UTC:172.16.193.118(56080):myapp@mydb:[22418]:LOG: 
temporary file: path "base/pgsql_tmp/pgsql_tmp22418.0", size 1048576000
2021-07-07 20:56:18 
UTC:172.16.193.118(56080):myapp@mydb:[22418]:CONTEXT:  PL/pgSQL function 
memory.f_memory_usage(boolean) line 13 at RETURN QUERY


There are at least 2 cases where stuff can spill over to disk:
* queries that don't fit in work_mem, and
* temporary tables that don't fit in temp_buffers

Question: if log_temp_files is turned on (= 0), how can you tell 
where a temporary-file log line comes from?
I see a pattern where work_mem spill-overs have a CONTEXT line that 
immediately follows the LOG line with the keyword "temporary". See case 2 above.


For other LOG lines with the keyword "temporary" there is no such pattern. 
Could those be the ones caused by temp_buffers spill-overs to disk? See case 
1 above.


I really want to tune temp_buffers, but I would like to be able to 
detect when temporary tables are spilling over to disk, so that I can 
increase temp_buffers.
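
One possible way (not from this thread, just a suggestion) to observe
temp-table spill directly: the "local" buffer counters in EXPLAIN (ANALYZE,
BUFFERS) refer to the local buffers that temp_buffers controls, so "local
read"/"local written" on a temp-table scan suggests the table no longer fits:

SET temp_buffers = '8MB';   -- must be set before the session first touches temp tables

CREATE TEMP TABLE tmp_demo AS
SELECT g AS id, repeat('x', 200) AS payload
FROM generate_series(1, 200000) g;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM tmp_demo;   -- look for "Buffers: local read=..." in the output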


Any help would be appreciated.

Regards,
Michael Vitale





Re: Partition column should be part of PK

2021-07-08 Thread Justin Pryzby
If I'm not wrong, this is the same thing you asked 2 weeks ago.

If so, why not continue the conversation on the same thread, and why not
reference the old thread ?

I went to the effort to find the old conversation.
https://www.postgresql.org/message-id/[email protected]

If declaratively partitioned tables and partitioned indexes don't do what you
wanted, then you should consider not using them for this.

On Fri, Jun 25, 2021 at 03:10:07AM +, Nagaraj Raj wrote:
> we have some partitioned tables using inheritance and are planning to migrate 
> them to declarative partitioning.
> Table DDL:
> CREATE TABLE c_account_p
> (
>     billing_account_guid character varying(40)  NOT NULL,
>     ingestion_process_id bigint NOT NULL DEFAULT '-1'::integer,
>     load_dttm timestamp(6) without time zone NOT NULL,
>     ban integer NOT NULL,
>     CONSTRAINT billing_account_pkey PRIMARY KEY (billing_account_guid, ban)
> ) PARTITION by RANGE(load_dttm);
> When I try to create the table, it throws the error below:
> ERROR:  insufficient columns in the PRIMARY KEY constraint definition
> DETAIL:  PRIMARY KEY constraint on table "l_billing_account_p" lacks column 
> "load_dttm" which is part of the partition key.
> SQL state: 0A000
> Is it mandatory/necessary that the partition column be part of the primary key? 
> Because if I include load_dttm in the PK then it works fine. (db<>fiddle)
> If the partition column is supposed to be part of the PK, it's challenging to 
> partition by range on the date column, because load_dttm is likely to contain 
> duplicates when data is loaded with COPY or INSERT:
> INSERT INTO c_account_p SELECT * FROM c_account_p_bkp ON CONFLICT 
> (billing_account_guid, ban, load_dttm) DO UPDATE SET 'some stuff..'
> If I receive a billing_account_guid, ban combination with a different load_dttm, 
> it will end up as a duplicate key.
> Could someone please help me understand this scenario?
> Thanks.




Re: Partition column should be part of PK

2021-07-08 Thread Alvaro Herrera
On 2021-Jul-08, Justin Pryzby wrote:

> If I'm not wrong, this is the same thing you asked 2 week ago.
> 
> If so, why not continue the conversation on the same thread, and why not
> reference the old thread ?
> 
> I went to the effort to find the old conversation.
> https://www.postgresql.org/message-id/[email protected]

Actually, it looks like you're replying to the same email you replied to
two weeks ago.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




RE: Partition column should be part of PK

2021-07-08 Thread Mike Sofen
I believe this thread qualifies for the funniest thread of 2021 (so far).  And 
yes, this is a top post.  :-)

Mike Sofen

-Original Message-
From: Alvaro Herrera  
Sent: Thursday, July 08, 2021 3:29 PM
To: Justin Pryzby 
Cc: Nagaraj Raj ; [email protected]
Subject: Re: Partition column should be part of PK

On 2021-Jul-08, Justin Pryzby wrote:

> If I'm not wrong, this is the same thing you asked 2 week ago.
> 
> If so, why not continue the conversation on the same thread, and why 
> not reference the old thread ?
> 
> I went to the effort to find the old conversation.
> https://www.postgresql.org/message-id/20210625042228.GJ29179@telsasoft
> .com

Actually, it looks like you're replying to the same email you replied to two 
weeks ago.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/







Re: Strange execution plan

2021-07-08 Thread Manuel Weitzman



> On 08-07-2021, at 17:13, Manuel Weitzman  wrote:
> 
> Option A: Use a common table expression to "force" the usage of
> test_json_data_idx
> 
>WITH json_matching_rows AS (
>SELECT t.*
>FROM test ti
>WHERE t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
>)
>SELECT t.*
>FROM json_matching_rows t
>WHERE  t."existe" IS true
>ORDER BY t."id" DESC
>LIMIT 100 OFFSET 0;
> 

The first query line should be

WITH json_matching_rows AS MATERIALIZED (

I had forgotten that Postgres 12 removed the optimization barrier on
common table expressions.
To reintroduce it, the MATERIALIZED clause is needed.
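
For reference, the corrected query in full (restating Option A from the
previous email):

WITH json_matching_rows AS MATERIALIZED (
    SELECT t.*
    FROM test t
    WHERE t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
)
SELECT t.*
FROM json_matching_rows t
WHERE t."existe" IS true
ORDER BY t."id" DESC
LIMIT 100 OFFSET 0;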

Apparently I need to work on reviewing my emails properly before
sending them.


Best regards,
Manuel Weitzman




Re: Partition column should be part of PK

2021-07-08 Thread Nagaraj Raj
My apologies for the confusion caused by the new thread. Yes, it's the same 
issue as the earlier post.
I was trying to figure out how to ensure unique values for the columns 
(billing_account_guid, ban). If I add the partition key to the constraint, it 
won't do what I'm looking for.
My use case is as below:

INSERT INTO t1 SELECT * from t2 ON CONFLICT (billing_account_guid, ban) DO 
UPDATE SET something…

Or
INSERT INTO t1 SELECT * from t2 ON CONFLICT constraint (pk or 
uk)(billing_account_guid,ban) DO UPDATE SET something…

Thanks

Sent from Yahoo Mail for iPhone


On Thursday, July 8, 2021, 7:12 PM, Mike Sofen  wrote:

I believe this thread qualifies for the funniest thread of 2021 (so far).  And 
yes, this is a top post.  :-)

Mike Sofen

-Original Message-
From: Alvaro Herrera  
Sent: Thursday, July 08, 2021 3:29 PM
To: Justin Pryzby 
Cc: Nagaraj Raj ; [email protected]
Subject: Re: Partition column should be part of PK

On 2021-Jul-08, Justin Pryzby wrote:

> If I'm not wrong, this is the same thing you asked 2 week ago.
> 
> If so, why not continue the conversation on the same thread, and why 
> not reference the old thread ?
> 
> I went to the effort to find the old conversation.
> https://www.postgresql.org/message-id/20210625042228.GJ29179@telsasoft
> .com

Actually, it looks like you're replying to the same email you replied to two 
weeks ago.

-- 
Álvaro Herrera              Valdivia, Chile  —  https://www.EnterpriseDB.com/










Re: Partition column should be part of PK

2021-07-08 Thread Christophe Pettus



> On Jul 8, 2021, at 20:32, Nagaraj Raj  wrote:
> 
> My apologies for making confusion with new thread. Yes its same issue related 
> to earlier post.
> 
> I was trying to figure out  how to ensure unique values for columns 
> (billing_account_guid, ban). If i add partition key to constraint , it wont 
> be possible what im looking for.
> 
> My use case as below 
> 
> INSERT INTO t1 SELECT * from t2 ON CONFLICT (billing_account_guid,ban) DO 
> UPDATE SET something…
> 
> Or
> 
> INSERT INTO t1 SELECT * from t2 ON CONFLICT constraint (pk or 
> uk)(billing_account_guid,ban) DO UPDATE SET something…

Right now, PostgreSQL does not support unique indexes on partitioned tables 
(that operate across all partitions) unless the partition key is included in 
the index definition.  If it didn't have that requirement, it would have to 
independently (and in a concurrency-supported way) scan every partition 
individually to see if there is a duplicate key violation in any of the 
partitions, and the machinery to do that does not exist right now.

If the goal is to make sure there is only one (billing_account_guid, ban, date) 
combination across the entire partition set, you can create a unique index on 
the partitioned set as (billing_account_guid, ban, date), and INSERT ... ON 
CONFLICT DO NOTHING then works properly.

If the goal is to make sure there is only one (billing_account_uid, ban) in any 
partition regardless of date, you'll need to do something more sophisticated to 
make sure that two sessions don't insert an (billing_account_uid, ban) value 
into two different partitions.  This isn't a great fit for table partitioning, 
and you might want to reconsider if partitioning the table is the right answer 
here.  If you *must* have table partitioning, a possible algorithm is:

-- Start a transaction
-- Hash the (billing_account_uid, ban) key into a 64 bit value.
-- Use that 64 bit value as a key to a call to pg_advisory_xact_lock() [1] to, 
in essence, create a signal to any other transaction attempting to insert that 
pair that it is being modified.
-- SELECT on that pair to make sure one does not already exist.
-- If one does not, do the INSERT.
-- Commit, which releases the advisory lock.

This doesn't provide quite the same level of uniqueness that a cross-partition 
index would, but if this is the only code path that does the INSERT, it should 
keep duplicates from showing up in different partitions.

[1] 
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
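
A minimal SQL sketch of that algorithm (the hash helper, the literal values,
and the assumption that a partition exists for the target load_dttm are all
illustrative, not from the thread):

BEGIN;

-- Serialize concurrent inserts of the same (billing_account_guid, ban) pair.
SELECT pg_advisory_xact_lock(
           hashtextextended('some-guid' || '|' || '1234', 0));

-- Only insert if the pair does not already exist in any partition.
INSERT INTO c_account_p (billing_account_guid, ingestion_process_id, load_dttm, ban)
SELECT 'some-guid', -1, now(), 1234
WHERE NOT EXISTS (
    SELECT 1 FROM c_account_p
    WHERE billing_account_guid = 'some-guid' AND ban = 1234
);

COMMIT;   -- releases the advisory lock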