ETL - sql orchestrator is stuck when there is not sleep() between queries
Hello, I'm facing a problem. I'm currently working on a data transformation pipeline on Postgres. The strategy is: we select every table in a given schema (about 50 tables), apply some transformations (CASE WHEN, translations, enums), and load the result into a different, new schema with a CREATE TABLE SCHEMA_2.table_1 AS SELECT * FROM SCHEMA_1.TABLE_1. Then we repeat this about 3 more times, each time into a new schema with new tables. We only keep schema1 and never drop it.
To orchestrate the whole thing, we have a bunch of .sql files that we run directly with psql. That's our "strategy". We copy a lot of data, but it lets us debug and investigate business bugs, because we can connect to schema 2 or 3 and track down where an issue comes from.
All of this works fine most of the time. But sometimes, queries that used to take about 20 seconds suddenly take 5 minutes. Importantly, all queries have the same shape -> CREATE TABLE ... AS SELECT * (with a bit of transform) FROM TABLE. No UPDATEs, nothing else; it's dead simple. We are just copying a table from schema1 to schema2, then to schema3, and so on. The thing to understand here is that schema2 and schema3 are dropped on every pipeline run, so every time we run the script, it drops everything from schema2 through the final stage.
We tuned the config a little bit, and we tried pretty much everything (synchronous_commit, WAL settings, vacuum). Nothing works; it's very random, and some queries simply won't finish (even after hours). We use different machines, different configs, and different datasets.
The only thing that makes it work every time, in 100% of cases, is to put a sleep(10 sec) between each schema. So we select the 50 tables, create a new schema from them, sleep 10 seconds, then run the same queries against the freshly created schema to create a third schema, sleep 10 seconds, and so on. That makes the whole pipeline succeed every time.
So it seems there is a background process inside Postgres that has to ingest a lot of data, and we have to give it time to catch up, like the bgwriter or something else? I disabled autovacuum (autovacuum=off); same result. Why does the query never end, even after hours? Why is there no log about where the query is stuck? To be clear, if I kill the stuck query and run it again, it works.
I don't know much about what's going on inside Postgres that randomly takes this long with the same code and same data.
PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
Thank you so much for your time.
Strange execution plan
Hi,
On my production environment (PostgreSQL 13.3), one of my queries runs very
slow, about 2 minutes.
I noticed that it does not use the execution plan that I anticipated it would.
The query is :
SELECT t.*
FROM test t
WHERE t."existe" IS true
and t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
ORDER BY t."id" DESC
LIMIT 100 OFFSET 0
I know PostgreSQL does not perform very well with pagination and
offsets, but all my queries must end with "LIMIT 100 OFFSET 0", "LIMIT 100
OFFSET 1", ...
If I display actual Execution Plan, I get this :
Limit (cost=0.43..1164.55 rows=100 width=632) (actual
time=7884.056..121297.756 rows=1 loops=1)
Buffers: shared hit=5311835 read=585741 dirtied=32
-> Index Scan Backward using test_pk on test (cost=0.43..141104.29
rows=12121 width=632) (actual time=7884.053..121297.734 rows=1 loops=1)
Filter: ((existe IS TRUE) AND (json_data @> '{"book": {"title": "In
Search of Lost Time"}}'::jsonb))
Rows Removed by Filter: 1215681
Buffers: shared hit=5311835 read=585741 dirtied=32
Planning:
Buffers: shared hit=1
Planning Time: 0.283 ms
Execution Time: 121297.878 ms
The query runs very slow for limit values from 1 to 1147.
If I change the limit value to 1148, the query runs quite fast (0.190 ms) with a
nice execution plan :
SELECT t.*
FROM test t
WHERE t."existe" IS true
and t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
ORDER BY t."id" DESC
LIMIT 1148 OFFSET 0
Limit (cost=13220.53..13223.40 rows=1148 width=632) (actual time=0.138..0.140
rows=1 loops=1)
Buffers: shared hit=17
-> Sort (cost=13220.53..13250.84 rows=12121 width=632) (actual
time=0.137..0.138 rows=1 loops=1)
Sort Key: id DESC
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=17
-> Bitmap Heap Scan on test (cost=119.73..12543.88 rows=12121
width=632) (actual time=0.125..0.127 rows=1 loops=1)
Recheck Cond: (json_data @> '{"book": {"title": "In Search of
Lost Time"}}'::jsonb)
Filter: (existe IS TRUE)
Heap Blocks: exact=1
Buffers: shared hit=17
-> Bitmap Index Scan on test_json_data_idx (cost=0.00..116.70
rows=12187 width=0) (actual time=0.112..0.113 rows=1 loops=1)
Index Cond: (json_data @> '{"book": {"title": "In Search of
Lost Time"}}'::jsonb)
Buffers: shared hit=16
Planning:
Buffers: shared hit=1
Planning Time: 0.296 ms
Execution Time: 0.190 ms
Would you have any suggestions as to why Postgres chooses such a bad query plan?
Server :
--
CPU Model : AMD EPYC 7281 16-Core Processor
CPU Cores : 4
CPU Frequency : 2096.060 MHz
CPU Cache : 512 KB
Total Disk : 888.1 GB (473.0 GB Used)
Total Mem : 11973 MB (4922 MB Used)
Total Swap : 0 MB (0 MB Used)
OS : Debian GNU/Linux 10
Arch : x86_64 (64 Bit)
Kernel : 5.10.28
Virtualization : Dedicated
--
I/O Speed(1st run) : 132 MB/s
I/O Speed(2nd run) : 204 MB/s
I/O Speed(3rd run) : 197 MB/s
Average I/O speed : 177.7 MB/s
Postgresql.conf :
max_connections = 100
shared_buffers = 3840MB
huge_pages = on
work_mem = 9830kB
maintenance_work_mem = 960MB
effective_io_concurrency = 200
max_worker_processes = 3
max_parallel_maintenance_workers = 2
max_parallel_workers_per_gather = 2
max_parallel_workers = 3
max_wal_size = 4GB
min_wal_size = 1GB
checkpoint_completion_target = 0.9
effective_cache_size = 11520MB
default_statistics_target = 100
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1
pg_stat_statements.track = all
Table test: I have just over 1.2 million records in this table.
CREATE TABLE test (
"source" varchar NOT NULL,
existe bool NULL,
json_data jsonb NULL,
row_updated timestamp NOT NULL DEFAULT clock_timestamp(),
row_inserted timestamp NOT NULL DEFAULT clock_timestamp(),
id uuid NOT NULL,
CONSTRAINT test_pk PRIMARY KEY (id)
);
CREATE INDEX test_existe_idx ON test USING btree (existe);
CREATE INDEX test_id_idx ON test USING btree (id);
CREATE INDEX test_json_data_idx ON test USING gin (json_data jsonb_path_ops);
CREATE INDEX test_row_inserted_idx ON test USING btree (row_inserted);
CREATE INDEX test_row_updated_idx ON test USING btree (row_updated);
CREATE INDEX test_source_idx ON test USING btree (source);
select * from pg_stat_all_tables where relname = 'test' :

relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+----------
16692 | dev        | test    | 1816     | 724038305    | 31413    | 36863713      | 1215682   | 23127     | 0

n_tup_hot_upd | n_live_tup | n_dead_tup
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
On Thu, Jul 08, 2021 at 01:00:28AM +0200, Allan Barrielle wrote:
> All is fine, and can work great.
> But sometimes, some queries that used to take about 20 secs to complete can
> suddenly end in 5mins.
> Important all queries have the same shape -> CREATE TABLE SELECT AS *(a bit
> of transform) FROM TABLE). No update, nothing, it’s dead simple.
>
> Nothing works, it’s very random, some query won’t simply work ( even after
> hours ).

When it doesn't work, you could check SELECT * FROM pg_stat_activity, and
SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on.

> We are just trying to copy a table from schema1, to schema2, to schema3 and
> finally schema3. That’s it.

Is it true that the SELECTs have no joins in them?
Did this ever work better or differently under different versions of postgres?

> Why does the query never end even after hours ? Why there is no log about
> where the query is stuck.

Please send your nondefault config.
https://wiki.postgresql.org/wiki/Server_Configuration
Also enable logging (I just added this to the wiki).
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Enable_Logging

It'd be very useful to get "explain analyze" for a working query and for a
stuck query. It sounds like the stuck query never finishes, so maybe the
second part is impossible (?). But it'd be good to get at least "explain"
output. You'd have to edit your sql script to run an "explain" before each
query, and run it, logging the output, until you capture the plan for a stuck
query. Save the output and send here, along with the query plan for a working
query.

--
Justin
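The diagnostics suggested above can be run from a second psql session while the pipeline appears stuck; a minimal sketch (the column selection is just an illustrative subset):

```sql
-- Run in a separate psql session while the query appears stuck.

-- What is each backend doing, and what is it waiting on?
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle';

-- Which lock requests are not granted, and which backends block them?
SELECT pg_blocking_pids(pid) AS blocked_by, pid, locktype, mode, granted
FROM pg_locks
WHERE NOT granted;
```

If the second query returns no rows, the backend is not waiting on a lock, and the wait_event columns from the first query become the interesting signal.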
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
> We use different machines, different config, and different datasets.
> ...
> PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Is it possible to upgrade and test with PG 12.7?
IMHO there have been a lot of changes:
* https://www.postgresql.org/docs/12/release-12-5.html
* https://www.postgresql.org/docs/12/release-12-6.html
* https://www.postgresql.org/docs/12/release-12-7.html
Just to rule out the possibility that it has already been fixed.

Regards,
Imre
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
On Thu, Jul 08, 2021 at 03:49:12PM +0200, Allan Barrielle wrote:
> > Is it true that the SELECTs have no joins in them ?
>
> Yes, there are a lot of LEFT JOINs.
>
> > It'd be very useful to get "explain analyze" for a working query and for a
> > stuck query. It sounds like the stuck query never finishes, so maybe the
> > second part is impossible (?)
>
> We ran an explain analyze and we see some very interesting stuff going on.
> It seems that without explicitly adding an `ANALYZE`, the query has a cost
> of over billions, so the query is not stuck but takes forever.
> When I run the same scripts with an ANALYZE right before running the query,
> the query executes in 50 seconds and the cost is normal.

It sounds like sometimes autoanalyze processes important tables being queried,
but sometimes it doesn't.

Since there are JOINs involved, you should analyze the tables after populating
them and before querying them. The same as if it were a temp table, or
anything else.

> The configuration is tuned by aws aurora, [...]

> fsync,off
> full_page_writes,off

really?

> vacuum_cleanup_index_scale_factor,0.1

also interesting
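The advice above amounts to treating each intermediate schema like a temp table; a sketch using placeholder names in the shape described in the thread, with an explicit ANALYZE between stages:

```sql
-- Stage 1 -> stage 2: copy with transformations.
CREATE TABLE schema_2.table_1 AS
SELECT *  -- plus the CASE WHEN / enum translations
FROM schema_1.table_1;

-- Refresh planner statistics on the freshly built table BEFORE the
-- next stage's LEFT JOINs read from it; otherwise the planner may
-- work from missing or stale estimates until autoanalyze catches up.
ANALYZE schema_2.table_1;

-- Stage 2 -> stage 3 now plans against accurate row counts.
CREATE TABLE schema_3.table_1 AS
SELECT *
FROM schema_2.table_1;
```

This also explains why the sleep(10s) "fixed" it: the pause gave autoanalyze time to run between stages, which an explicit ANALYZE does deterministically.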
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
Hello,

> Is it true that the SELECTs have no joins in them ?

Yes, there are a lot of LEFT JOINs.

> When it doesn't work, you could check SELECT * FROM pg_stat_activity, and
> SELECT pg_blocking_pids(pid), * FROM pg_locks, to see what's going on.

I can't see any blocking queries in pg_locks or pg_blocking_pids.

> It'd be very useful to get "explain analyze" for a working query and for a
> stuck query. It sounds like the stuck query never finishes, so maybe the
> second part is impossible (?)

We ran an explain analyze and we see some very interesting stuff going on.
It seems that without explicitly adding an `ANALYZE`, the query has a cost of
over billions, so the query is not stuck but takes forever.
When I run the same scripts with an ANALYZE right before running the query,
the query executes in 50 seconds and the cost is normal.

Explain analyze WITHOUT ANALYZE: https://explain.depesz.com/s/RaSr
Explain analyze same query WITH ANALYZE BEFORE: https://explain.depesz.com/s/tYVl

The configuration is tuned by aws aurora, but this issue also happens with a
default config.
allow_system_table_mods,off application_name,DataGrip 2021.1.3 archive_command,(disabled) archive_mode,off archive_timeout,5min array_nulls,on authentication_timeout,1min autovacuum,on autovacuum_analyze_scale_factor,0.05 autovacuum_analyze_threshold,50 autovacuum_freeze_max_age,2 autovacuum_max_workers,12 autovacuum_multixact_freeze_max_age,4 autovacuum_naptime,5s autovacuum_vacuum_cost_delay,1ms autovacuum_vacuum_cost_limit,1200 autovacuum_vacuum_scale_factor,0.1 autovacuum_vacuum_threshold,50 autovacuum_work_mem,-1 backend_flush_after,0 backslash_quote,safe_encoding bgwriter_delay,200ms bgwriter_flush_after,0 bgwriter_lru_maxpages,100 bgwriter_lru_multiplier,2 bonjour,off bytea_output,hex check_function_bodies,on checkpoint_completion_target,0.9 checkpoint_flush_after,0 checkpoint_timeout,15min checkpoint_warning,30s client_encoding,UTF8 client_min_messages,notice commit_delay,0 commit_siblings,5 constraint_exclusion,partition cpu_index_tuple_cost,0.005 cpu_operator_cost,0.0025 cpu_tuple_cost,0.01 cursor_tuple_fraction,0.1 DateStyle,"ISO, MDY" db_user_namespace,off deadlock_timeout,1s debug_pretty_print,on debug_print_parse,off debug_print_plan,off debug_print_rewritten,off default_statistics_target,500 default_text_search_config,pg_catalog.simple default_transaction_deferrable,off default_transaction_isolation,read committed default_transaction_read_only,off dynamic_library_path,$libdir effective_cache_size,4GB effective_io_concurrency,600 enable_bitmapscan,on enable_gathermerge,on enable_hashagg,on enable_hashjoin,on enable_indexonlyscan,on enable_indexscan,on enable_material,on enable_mergejoin,on enable_nestloop,on enable_parallel_append,on enable_parallel_hash,on enable_partition_pruning,on enable_partitionwise_aggregate,off enable_partitionwise_join,off enable_seqscan,on enable_sort,on enable_tidscan,on escape_string_warning,on event_source,PostgreSQL exit_on_error,off extra_float_digits,3 force_parallel_mode,off from_collapse_limit,8 fsync,off 
full_page_writes,off geqo,on geqo_effort,5 geqo_generations,0 geqo_pool_size,0 geqo_seed,0 geqo_selection_bias,2 geqo_threshold,12 gin_fuzzy_search_limit,0 gin_pending_list_limit,4MB hot_standby,off hot_standby_feedback,on huge_pages,try idle_in_transaction_session_timeout,25min ignore_checksum_failure,off ignore_system_indexes,off IntervalStyle,postgres jit,off jit_above_cost,10 jit_debugging_support,off jit_dump_bitcode,off jit_expressions,on jit_inline_above_cost,50 jit_optimize_above_cost,50 jit_profiling_support,off jit_provider,llvmjit jit_tuple_deforming,on join_collapse_limit,8 lc_monetary,C lc_numeric,C lc_time,C listen_addresses,* lock_timeout,0 lo_compat_privileges,off maintenance_work_mem,2GB max_connections,100 max_files_per_process,1000 max_locks_per_transaction,256 max_logical_replication_workers,4 max_parallel_maintenance_workers,12 max_parallel_workers,12 max_parallel_workers_per_gather,6 max_pred_locks_per_page,2 max_pred_locks_per_relation,-2 max_pred_locks_per_transaction,64 max_prepared_transactions,0 max_replication_slots,10 max_stack_depth,6MB max_standby_archive_delay,30s max_standby_streaming_delay,14s max_sync_workers_per_subscription,2 max_wal_senders,0 max_wal_size,8GB max_worker_processes,12 min_parallel_index_scan_size,512kB min_parallel_table_scan_size,8MB min_wal_size,2GB old_snapshot_threshold,-1 operator_precedence_warning,off parallel_leader_participation,off parallel_setup_cost,1000 parallel_tuple_cost,0.1 password_encryption,md5 port,5432 post_auth_delay,0 pre_auth_delay,0 quote_all_identifiers,off random_page_cost,1.1 restart_after_crash,on row_security,on search_path,public seq_page_cost,1 session_replication_role,origin shared_buffers,1GB standard_conforming_strings,on statement_timeout,0 superuser_reserved_connections,3 synchronize_seqscans,on synchronous_commit,on syslog_facility,local0 syslog_ident,postgres syslog_sequence_nu
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
On a different machine, we use 12.7. Still the same issue.
Re: ETL - sql orchestrator is stuck when there is not sleep() between queries
fsync is off and full_page_writes is off because the pipeline only runs once:
we create the db, we load the data, then we dump the data and kill the db. No
need to handle server crashes or anything like that.

0.1 is the default value for vacuum_cleanup_index_scale_factor.
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification
Hello everyone,
I have a scenario where I wanted to add a PK on a partitioned table to enforce
unique values across two columns. But since PG requires the partition column
to be part of the PK, how can we make sure those two columns alone are unique?
Also, inserts into the table need to use 'on conflict'.
create table t (id int, pid int, name name, dt date) partition by range(dt);
--create unique index on t(id,pid);
--alter table t add constraint uk unique (id);
--create unique index on t(id,pid);
alter table t add constraint uk unique (id,pid,dt);

create table t1 partition of t for values from ('2020-01-01') to ('2020-02-01');
alter table t1 add constraint uk1 unique (id,pid);
create table t2 partition of t for values from ('2020-02-01') to ('2020-03-01');
alter table t2 add constraint uk2 unique (id,pid);
create table t4 partition of t for values from ('2020-03-01') to ('2020-04-01');
alter table t4 add constraint uk3 unique (id,pid);
create table t3 partition of t for values from ('2020-04-01') to ('2020-05-01');
alter table t3 add constraint uk4 unique (id,pid);

insert into t(id,pid,name,dt) values (1,2,'raj','2020-01-01')
on conflict (id,pid) do nothing;
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT
specification
https://dbfiddle.uk/?rdbms=postgres_11&fiddle=36b3eb0d51f8bff4b5d445a77d688d88
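For what it's worth, the error occurs because the conflict target (id,pid) does not match any unique constraint on the parent table t; a target that includes the partition key does match the uk constraint (a sketch, though it still only enforces id,pid uniqueness per dt value, not globally):

```sql
-- Matches constraint uk unique (id, pid, dt) on the parent table:
insert into t(id, pid, name, dt)
values (1, 2, 'raj', '2020-01-01')
on conflict (id, pid, dt) do nothing;
```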
Thanks,
Rj
Re: Strange execution plan
> On 08-07-2021, at 04:26, Joel Frid wrote:
>
> Hi,
>
> On my production environment (PostgreSQL 13.3), one of my queries runs very
> slow, about 2 minutes.
> I noticed that it does not use the execution plan that I anticipated it would.
>
> The query is :
>
> SELECT t.*
> FROM test t
> WHERE t."existe" IS true
> and t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
> ORDER BY t."id" DESC
> LIMIT 100 OFFSET 0
>
> I know PostgreSQL is not very good at performing well with pagination and
> offsets but all my queries must end with "LIMIT 100 OFFSET 0", "LIMIT 100
> OFFSET 1", ...
I don't think any database performs well with LIMIT+OFFSET.
Using OFFSET requires doing a linear scan, discarding all rows up to
the row in position OFFSET; the scan then continues for LIMIT rows.
The greater the value of OFFSET, the slower the query will perform,
in general.
I'd recommend using cursors for pagination in general (I know it
may not be possible for you, just wanted to explain as it could be
useful).
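A cursor-based version of the query above might look like this (a sketch; the cursor must stay within a transaction, and each FETCH continues where the previous one stopped instead of re-scanning past an OFFSET):

```sql
BEGIN;

DECLARE page_cur CURSOR FOR
  SELECT t.*
  FROM test t
  WHERE t."existe" IS true
    AND t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
  ORDER BY t."id" DESC;

FETCH 100 FROM page_cur;  -- page 1
FETCH 100 FROM page_cur;  -- page 2, no rows discarded

CLOSE page_cur;
COMMIT;
```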
> If I display actual Execution Plan, I get this :
>
> Limit (cost=0.43..1164.55 rows=100 width=632) (actual
> time=7884.056..121297.756 rows=1 loops=1)
> Buffers: shared hit=5311835 read=585741 dirtied=32
> -> Index Scan Backward using test_pk on test (cost=0.43..141104.29
> rows=12121 width=632) (actual time=7884.053..121297.734 rows=1 loops=1)
> Filter: ((existe IS TRUE) AND (json_data @> '{"book": {"title": "In
> Search of Lost Time"}}'::jsonb))
> Rows Removed by Filter: 1215681
> Buffers: shared hit=5311835 read=585741 dirtied=32
> Planning:
> Buffers: shared hit=1
> Planning Time: 0.283 ms
> Execution Time: 121297.878 ms
>
> The query runs very slow from LIMIT 1 to 1147.
> If I change the LIMIT value to 1148, the query runs quite fast (0.190 ms) with
> a nice execution plan:
>
> SELECT t.*
> FROM test t
> WHERE t."existe" IS true
> and t."json_data" @> '{"book":{"title":"In Search of Lost Time"}}'
> ORDER BY t."id" DESC
> LIMIT 1148 OFFSET 0
>
> Limit (cost=13220.53..13223.40 rows=1148 width=632) (actual
> time=0.138..0.140 rows=1 loops=1)
> Buffers: shared hit=17
> -> Sort (cost=13220.53..13250.84 rows=12121 width=632) (actual
> time=0.137..0.138 rows=1 loops=1)
> Sort Key: id DESC
> Sort Method: quicksort Memory: 27kB
> Buffers: shared hit=17
> -> Bitmap Heap Scan on test (cost=119.73..12543.88 rows=12121
> width=632) (actual time=0.125..0.127 rows=1 loops=1)
> Recheck Cond: (json_data @> '{"book": {"title": "In Search of
> Lost Time"}}'::jsonb)
> Filter: (existe IS TRUE)
> Heap Blocks: exact=1
> Buffers: shared hit=17
> -> Bitmap Index Scan on test_json_data_idx (cost=0.00..116.70
> rows=12187 width=0) (actual time=0.112..0.113 rows=1 loops=1)
> Index Cond: (json_data @> '{"book": {"title": "In Search
> of Lost Time"}}'::jsonb)
> Buffers: shared hit=16
> Planning:
> Buffers: shared hit=1
> Planning Time: 0.296 ms
> Execution Time: 0.190 ms
>
> Would you have any suggestions as to why Postgres chooses such a bad query plan?
I can guess a bit about this.
One of the quirks of the statistics collector is that it doesn't
collect enough statistics to properly estimate the number of rows
that will match the "@>" operator, as that is quite hard to do. In
general it over-estimates how many rows will match.
As you can see, both plans estimate that 12121 rows will match
the "@>" clause, even though only 1 actually matches.
This means that the planner is estimating that the cost of using
test_json_data_idx and then executing top-k heapsort (to get the
latest 100 rows ordered by id) is far greater than iterating over the
test_pk index where the first 100 rows that match will be the ones you
need and already sorted.
In practice, iterating the test_pk index has to read over a lot of
rows that didn't match the "@>" operator, as the actual number of rows
that match isn't as large as initially expected.
Once you increase the LIMIT value to a high enough number, the cost of
iterating over either index looks similar to the planner; past that
threshold it switches to the other plan.
So, some suggestions to improve the execution of that query:
Option A: Use a common table expression to "force" the usage of
test_json_data_idx
    WITH json_matching_rows AS (
        SELECT t.*
        FROM test t
        WHERE t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
    )
    SELECT t.*
    FROM json_matching_rows t
    WHERE t."existe" IS true
    ORDER BY t."id" DESC
    LIMIT 100 OFFSET 0;
Option B: Use the extension pg_hint_plan to hint the usage of
test_json_data_idx
Option C: Create a functional index on the book title and tweak the
query to use it.
This can also be a composite index (to have the values sorted by id
already) and partial (to only include rows where "existe" is true).
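[Editorial sketch of Option C; the index name and the use of the `#>>` extraction operator are assumptions of this sketch, not something the reply specifies:]

```sql
-- Hedged sketch: a partial, composite functional index. The expression
-- extracts the book title as text; "id DESC" keeps matching rows
-- pre-sorted for the ORDER BY; the WHERE clause makes the index partial.
CREATE INDEX test_book_title_idx
    ON test ((json_data #>> '{book,title}'), id DESC)
    WHERE existe IS TRUE;

-- The query must use the same expression for the index to qualify:
SELECT t.*
FROM test t
WHERE t."existe" IS TRUE
  AND (t.json_data #>> '{book,title}') = 'In Search of Lost Time'
ORDER BY t."id" DESC
LIMIT 100 OFFSET 0;
```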
Re: Strange execution plan
> On 08-07-2021, at 17:13, Manuel Weitzman wrote:
>
> I'd recommend you using cursors for pagination in general (I know it
> may not be possible for you, just wanted to explain as it could be
> useful).

By the way, I mean cursor pagination as the general concept. I'm not
talking about Postgres cursors.

Best regards,
Manuel Weitzman
temporary file log lines
Hi all,

I got a question about PG log lines with temporary file info like this:

case 1: log line with no contextual info
2021-07-07 20:28:15 UTC:10.100.11.95(50274):myapp@mydb:[35200]:LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp35200.0", size 389390336

case 2: log line with contextual info
2021-07-07 20:56:18 UTC:172.16.193.118(56080):myapp@mydb:[22418]:LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp22418.0", size 1048576000
2021-07-07 20:56:18 UTC:172.16.193.118(56080):myapp@mydb:[22418]:CONTEXT: PL/pgSQL function memory.f_memory_usage(boolean) line 13 at RETURN QUERY

There are at least 2 cases where stuff can spill over to disk:
* queries that don't fit in work_mem, and
* temporary tables that don't fit in temp_buffers

Question: if log_temp_files is turned on (=0), how can you tell where a
temporary file log line comes from? I see a pattern where work_mem
spill-overs have a CONTEXT line that immediately follows the LOG line
with the keyword "temporary"; see case 2 above. Other LOG lines with the
keyword "temporary" have no such pattern. Could those be the ones caused
by temp_buffers spill-overs to disk (case 1 above)?

I really want to tune temp_buffers, but I would like to be able to
detect when temporary tables are spilling over to disk, so that I can
increase temp_buffers. Any help would be appreciated.

Regards,
Michael Vitale
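[Editorial sketch: one hedged way to investigate the question above is to provoke each spill type deliberately in a scratch session and compare what gets logged. The thresholds and table sizes here are illustrative assumptions:]

```sql
-- Hedged sketch: force each spill type and compare the log output.
-- (SET log_temp_files requires superuser.)
SET log_temp_files = 0;

-- 1) work_mem spill: an oversized sort falls back to an on-disk merge.
--    EXPLAIN ANALYZE should report "Sort Method: external merge", and a
--    temporary-file LOG line should appear.
SET work_mem = '64kB';
EXPLAIN (ANALYZE) SELECT g FROM generate_series(1, 200000) g ORDER BY g;

-- 2) temp_buffers spill: a temp table larger than temp_buffers.
--    (temp_buffers can only be changed before the first temp-table use
--    in the session.) Compare whether, and in what shape, this case
--    produces a temporary-file LOG line at all.
SET temp_buffers = '800kB';
CREATE TEMP TABLE spill_demo AS SELECT g FROM generate_series(1, 200000) g;
```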
Re: Partition column should be part of PK
If I'm not wrong, this is the same thing you asked 2 weeks ago.

If so, why not continue the conversation on the same thread, and why not
reference the old thread? I went to the effort to find the old conversation.
https://www.postgresql.org/message-id/[email protected]

If declaratively partitioned tables and partitioned indexes don't do what
you wanted, then you should consider not using them for this.

On Fri, Jun 25, 2021 at 03:10:07AM +, Nagaraj Raj wrote:
> we have some partitioned tables with inheritance and are planning to migrate
> them to declarative partitioning.
> Table DDL:
> CREATE TABLE c_account_p
> (
>     billing_account_guid character varying(40) NOT NULL,
>     ingestion_process_id bigint NOT NULL DEFAULT '-1'::integer,
>     load_dttm timestamp(6) without time zone NOT NULL,
>     ban integer NOT NULL,
>     CONSTRAINT billing_account_pkey PRIMARY KEY (billing_account_guid, ban)
> ) PARTITION by RANGE(load_dttm);
> When I try to create the table, it throws the below error:
> ERROR: insufficient columns in the PRIMARY KEY constraint definition
> DETAIL: PRIMARY KEY constraint on table "l_billing_account_p" lacks column
> "load_dttm" which is part of the partition key.
> SQL state: 0A000
> Is it mandatory/necessary that the partition column be part of the primary
> key? Because if I include load_dttm in the PK then it works fine. db<>fiddle
> If the partition column is supposed to be in the PK, it's challenging to
> create a partition by range with the date column, because the load_dttm
> column may have duplicates if data is loaded via COPY or INSERT.
> INSERT INTO c_account_p SELECT * from c_account_p_bkp ON CONFLICT
> (billing_account_guid,ban,load_dttm) DO UPDATE SET 'some stuff..'
> If I receive a billing_account_guid, ban combination with a different
> load_dttm then it will end up with duplicate keys.
> Could someone please help me understand this scenario?
> Thanks.
Re: Partition column should be part of PK
On 2021-Jul-08, Justin Pryzby wrote:

> If I'm not wrong, this is the same thing you asked 2 weeks ago.
>
> If so, why not continue the conversation on the same thread, and why not
> reference the old thread?
>
> I went to the effort to find the old conversation.
> https://www.postgresql.org/message-id/[email protected]

Actually, it looks like you're replying to the same email you replied to
two weeks ago.

--
Álvaro Herrera, Valdivia, Chile — https://www.EnterpriseDB.com/
RE: Partition column should be part of PK
I believe this thread qualifies for the funniest thread of 2021 (so far).
And yes, this is a top post. :-)

Mike Sofen

-----Original Message-----
From: Alvaro Herrera
Sent: Thursday, July 08, 2021 3:29 PM
To: Justin Pryzby
Cc: Nagaraj Raj; [email protected]
Subject: Re: Partition column should be part of PK

On 2021-Jul-08, Justin Pryzby wrote:

> If I'm not wrong, this is the same thing you asked 2 weeks ago.
>
> If so, why not continue the conversation on the same thread, and why
> not reference the old thread?
>
> I went to the effort to find the old conversation.
> https://www.postgresql.org/message-id/[email protected]

Actually, it looks like you're replying to the same email you replied to
two weeks ago.

--
Álvaro Herrera, Valdivia, Chile — https://www.EnterpriseDB.com/
Re: Strange execution plan
> On 08-07-2021, at 17:13, Manuel Weitzman wrote:
>
> Option A: Use a common table expression to "force" the usage of
> test_json_data_idx
>
>WITH json_matching_rows AS (
>SELECT t.*
>FROM test t
>WHERE t.json_data @> '{"book":{"title":"In Search of Lost Time"}}'
>)
>SELECT t.*
>FROM json_matching_rows t
>WHERE t."existe" IS true
>ORDER BY t."id" DESC
>LIMIT 100 OFFSET 0;
>
The first query line should be
WITH json_matching_rows AS MATERIALIZED (
I had forgotten that Postgres 12 removed the optimization barrier on
common table expressions.
To reintroduce it, the MATERIALIZED clause is needed.
Apparently I need to work on reviewing my emails properly before
sending them.
Best regards,
Manuel Weitzman
Re: Partition column should be part of PK
My apologies for the confusion with the new thread. Yes, it's the same
issue as in the earlier post.

I was trying to figure out how to ensure unique values for the columns
(billing_account_guid, ban). If I add the partition key to the
constraint, it won't do what I'm looking for.

My use case is as below:

INSERT INTO t1 SELECT * from t2 ON CONFLICT (billing_account_guid,ban)
DO UPDATE SET something…

Or

INSERT INTO t1 SELECT * from t2 ON CONFLICT constraint (pk or uk)
(billing_account_guid,ban) DO UPDATE SET something…

Thanks

Sent from Yahoo Mail for iPhone

On Thursday, July 8, 2021, 7:12 PM, Mike Sofen wrote:

> I believe this thread qualifies for the funniest thread of 2021 (so far).
> And yes, this is a top post. :-)
>
> Mike Sofen
Re: Partition column should be part of PK
> On Jul 8, 2021, at 20:32, Nagaraj Raj wrote:
>
> My apologies for the confusion with the new thread. Yes, it's the same
> issue as in the earlier post.
>
> I was trying to figure out how to ensure unique values for the columns
> (billing_account_guid, ban). If I add the partition key to the
> constraint, it won't do what I'm looking for.
>
> My use case is as below
>
> INSERT INTO t1 SELECT * from t2 ON CONFLICT (billing_account_guid,ban) DO
> UPDATE SET something…
>
> Or
>
> INSERT INTO t1 SELECT * from t2 ON CONFLICT constraint (pk or
> uk)(billing_account_guid,ban) DO UPDATE SET something…

Right now, PostgreSQL does not support unique indexes on partitioned
tables (that operate across all partitions) unless the partition key is
included in the index definition. Without that requirement, it would have
to independently (and in a concurrency-safe way) scan every partition to
check for a duplicate key violation in any of the partitions, and the
machinery to do that does not exist right now.

If the goal is to make sure there is only one (billing_account_guid, ban,
date) combination across the entire partition set, you can create a
unique index on the partitioned set as (billing_account_guid, ban, date),
and INSERT ... ON CONFLICT DO NOTHING works properly then.

If the goal is to make sure there is only one (billing_account_uid, ban)
in any partition regardless of date, you'll need to do something more
sophisticated to make sure that two sessions don't insert a
(billing_account_uid, ban) value into two different partitions. This
isn't a great fit for table partitioning, and you might want to
reconsider whether partitioning the table is the right answer here.

If you *must* have table partitioning, a possible algorithm is:

-- Start a transaction.
-- Hash the (billing_account_uid, ban) key into a 64-bit value.
-- Use that 64-bit value as the key for a call to pg_advisory_xact_lock()
   [1] to, in essence, signal to any other transaction attempting to
   insert that pair that it is being modified.
-- SELECT on that pair to make sure one does not already exist.
-- If one does not, do the INSERT.
-- Commit, which releases the advisory lock.

This doesn't provide quite the same level of uniqueness that a
cross-partition index would, but if this is the only code path that does
the INSERT, it should keep duplicates from showing up in different
partitions.

[1] https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
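[Editorial sketch of the steps above in SQL. The table and column names come from the thread; deriving the 64-bit lock key with hashtextextended(), and the placeholder values, are assumptions of this sketch:]

```sql
-- Hedged sketch of the advisory-lock algorithm described above.
BEGIN;

-- Serialize all inserters of this (billing_account_guid, ban) pair by
-- hashing it to a 64-bit advisory lock key (hashtextextended is an
-- assumption; any stable 64-bit hash of the pair would do).
SELECT pg_advisory_xact_lock(
    hashtextextended('example-guid' || '|' || '1001', 0));

-- While holding the lock, insert only if the pair does not already
-- exist in any partition of c_account_p.
INSERT INTO c_account_p (billing_account_guid, ban, load_dttm)
SELECT 'example-guid', 1001, now()
WHERE NOT EXISTS (
    SELECT 1 FROM c_account_p
    WHERE billing_account_guid = 'example-guid' AND ban = 1001);

COMMIT;  -- releases the advisory lock
```

[Any session that skips this code path can still create duplicates, which is why it is weaker than a real cross-partition unique index.]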
