Conflict between recovery thread and client queries on a hot standby replica
Hi It seems that the recovery thread(streaming replication) and sql queries against a postgresql hot standby can produce a deadlock. If you have conflicting statements running on master and slave, you can end up with queries on the slave waiting for locks of the recovery thread and the recovery thread waiting for locks on the sql thread. The locks are never resolved or any client aborted. This happens when using max_standby_streaming_delay=-1 Which does state that the recovery thread can wait indefinitely on a SQL thread, but I would still expect dead locks to be detected and handled. How to reproduce: - Setup pgsql master/slave with streaming replication - On the slave "set max_standby_streaming_delay=-1" - Start a thread on master that does the following BEGIN; CREATE OR REPLACE VIEW va AS SELECT 1; CREATE OR REPLACE VIEW vb as SELECT 2; COMMIT; - Start multiple threads on the slave that does the following (with 5 workers running, a deadlock is produced within minutes) SELECT * FROM vb; SELECT * FROM va; Expected behaviour: The 5 threads will continue be able to query the views va, vb Observed behaviour: The 5 threads are blocked from reading the tables, after some time. They are never allowed access to the views before all queries are aborted and restarted. I have build a docker compose file to reproduce my observation. https://github.com/kimc78/postgresql-slave-deadlock Best regards Kim Carlsen
Conflict between recovery thread and client queries on a hot standby replica
Hi It seems that the recovery thread(streaming replication) and sql queries against a postgresql hot standby can produce a deadlock. If you have conflicting statements running on master and slave, you can end up with queries on the slave waiting for locks of the recovery thread and the recovery thread waiting for locks on the sql thread. The locks are never resolved or any client aborted. This happens when using max_standby_streaming_delay=-1 Which does state that the recovery thread can wait indefinitely on a SQL thread, but I would still expect dead locks to be detected and handled. How to reproduce: - Setup pgsql master/slave with streaming replication - On the slave "set max_standby_streaming_delay=-1" - Start a thread on master that does the following BEGIN; CREATE OR REPLACE VIEW va AS SELECT 1; CREATE OR REPLACE VIEW vb as SELECT 2; COMMIT; - Start multiple threads on the slave that does the following (with 5 workers running, a deadlock is produced within minutes) SELECT * FROM vb; SELECT * FROM va; Expected behaviour: The 5 threads will continue be able to query the views va, vb Observed behaviour: The 5 threads are blocked from reading the tables, after some time. They are never allowed access to the views before all queries are aborted and restarted. I have build a docker compose file to reproduce my observation. https://github.com/kimc78/postgresql-slave-deadlock Best regards Kim Carlsen
Re: survey: pg_stat_statements total_time and entry deallocation
Here is some data from our production. I hope it can be of any use to you. - System 1: hiper=> select bucket ,count(*) entries ,max(calls) max_calls ,round(sum(total_time)) total_time ,round((100*sum(total_time)/avg(total_total_time))::numeric,2) pct_time ,round(sum(rows)) "rows" ,round((100*sum(rows)/avg(total_rows))::numeric,2) pct_rows from ( select ntile(20) over (order by calls) bucket ,calls ,total_time ,sum(total_time) over () total_total_time ,rows ,sum(rows) over () total_rows from pg_stat_statements ) stmts group by rollup(bucket) order by bucket; bucket | entries | max_calls | total_time | pct_time | rows | pct_rows +-+---++--++-- 1 | 245 | 71 | 4745479 | 0.38 | 640677 | 0.04 2 | 245 | 96 | 22151762 | 1.76 | 236827 | 0.01 3 | 245 | 122 | 9028387 | 0.72 | 297861 | 0.02 4 | 244 | 167 | 4711705 | 0.38 | 328928 | 0.02 5 | 244 | 228 | 9490670 | 0.76 | 337712 | 0.02 6 | 244 | 305 | 7296024 | 0.58 | 273740 | 0.02 7 | 244 | 394 | 35828651 | 2.85 | 1140064 | 0.07 8 | 244 | 540 | 34180388 | 2.72 | 1313171 | 0.08 9 | 244 | 711 | 29748121 | 2.37 | 865894 | 0.05 10 | 244 | 989 | 12864432 | 1.02 | 1665529 | 0.10 11 | 244 | 1507 | 4009346 | 0.32 | 1295032 | 0.08 12 | 244 | 2511 | 13444734 | 1.07 | 4711699 | 0.30 13 | 244 | 4567 | 401096681 | 31.94 | 3386595 | 0.21 14 | 244 | 8086 | 4750899 | 0.38 | 8236002 | 0.52 15 | 244 | 13356 | 19875345 | 1.58 | 6040996 | 0.38 16 | 244 | 22454 | 23361859 | 1.86 | 16906926 | 1.06 17 | 244 | 59660 | 68633113 | 5.46 | 40170089 | 2.52 18 | 244 | 141667 | 59768727 | 4.76 | 76054887 | 4.77 19 | 244 | 431946 | 330488976 | 26.31 | 213238961 | 13.38 20 | 244 | 170978486 | 160486607 | 12.78 | 1216933189 | 76.34 | 4883 | 170978486 | 1255961906 | 100.00 | 1594074779 | 100.00 (21 rows) hiper=> select current_setting('pg_stat_statements.max'); current_setting - 5000 (1 row)
How to investigate what postgres is spending time on
Hi I have some simple INSERT / UPDATE queries, that takes a long time the first time they are run in out test environment, but I'm not sure what postgres is doing and what I can do to help it. Whats common is that the table contains many rows in the order of about 20 millions. Query: INSERT INTO communication.request_parameter (request_id, template_version_parameter_id, parameter_value) VALUES (1222, 1211, 122) RETURNING request_parameter_id Row from pg_stat_statements: ---+++--++--++++++++++++++++++- userid | dbid | queryid| query | calls | total_time | min_time | max_time | mean_time | stddev_time| rows | shared_blk | shared_blk | shared_blk | shared_blk | local_blks | local_blks | local_blks | local_blks | temp_blks_ | temp_blks_ | blk_read_t | blk_write_ | ---+++--++--++++++++++++++++++- 16385 | 16389 | 2064198912 | INSERT INTO communication.request_parameter (request_id, tem | 98 | 646.393451 | 0.03 | 638.712758 | 6.59585154081633 | 64.1818799227704 | 98 | 2850 | 24 | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Description of table: # \d communication.request_parameter Table "communication.request_parameter" Column | Type| Collation | Nullable | Default ---+---+---+--+--- request_parameter_id | integer | | not null | nextval('communication.request_parameter_request_parameter_id_seq'::regclass) request_id| integer | | not null | template_version_parameter_id | integer | | not null | parameter_value | character varying | | | Indexes: "request_parameter_pkey" PRIMARY KEY, btree (request_parameter_id) "request_parameter_parameter_value_idx" btree (parameter_value) "request_parameter_request_id_idx" btree (request_id) "request_parameter_template_version_parameter_id_idx" btree (template_version_parameter_id) Foreign-key constraints: "request_parameter_request_id_fkey" FOREIGN KEY (request_id) REFERENCES communication.request(request_id) "request_parameter_template_version_parameter_id_fkey" FOREIGN KEY (template_version_parameter_id) REFERENCES communication.template_version_parameter(template_version_parameter_id) This only happens in testing, and on a cold bootet database. The test database is constructed with pg_dump and restore on fresh postgres installation. Best Regards Kim Carlsen
Postgres wont remove useless joins, when the UNIQUE index is partial
Hi remove_useless_join does not prove uniqueness if the unique index is partial, and therefore wont remove the join if no columns are referenced (see example in bottom). I have been trying to look around the source code and from what I have identified the problem seems to be that "check_index_predicates(..)" happens after "remove_useless_join(..)", and therefore cannot see that the unique index is actually covered by the join condition. >From analyzejoins.c:612, rel_supports_distinctness(..) if (ind->unique && ind->immediate && (ind->indpred == NIL || ind->predOK)) return true; But the problem is ind->predOK is calculated in check_index_predicates(..) but this happens later so ind->predOK is always false when checked here. I have tried to add check_index_predicates(..) to rel_supports_distinctness(..) and this produces the expected plan, but I have no idea of the implication of doing check_index_predicates(..) earlier. This is my first time looking at the postgres source code, so I know attached "patch" is not the solution, but any pointers on where to go from here would be appreciated. Example: CREATE TABLE a ( id INTEGER PRIMARY KEY, sub_id INTEGER NOT NULL, deleted_at TIMESTAMP ); CREATE UNIQUE INDEX ON a (sub_id) WHERE (deleted_at IS NULL); ANALYZE a; EXPLAIN SELECT 1 FROM a AS a LEFT JOIN a AS b ON a.id = b.sub_id AND b.deleted_at IS NULL; Expected plan: QUERY PLAN - Seq Scan on a (cost=0.00..28.50 rows=1850 width=4) Actual plan: QUERY PLAN --- Hash Left Join (cost=14.76..48.13 rows=1850 width=4) Hash Cond: (a.id = b.sub_id) -> Seq Scan on a (cost=0.00..28.50 rows=1850 width=4) -> Hash (cost=14.65..14.65 rows=9 width=4) -> Bitmap Heap Scan on a b (cost=4.13..14.65 rows=9 width=4) Recheck Cond: (deleted_at IS NULL) -> Bitmap Index Scan on a_sub_id_idx (cost=0.00..4.13 rows=9 width=0) (7 rows) mvh Kim Carlsen Hiper A/S M: 71 99 42 00 diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c index 1593dbec21..12da689983 100644 --- a/src/backend/optimizer/plan/analyzejoins.c +++ b/src/backend/optimizer/plan/analyzejoins.c @@ -596,6 +596,7 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel) return false; if (rel->rtekind == RTE_RELATION) { + check_index_predicates(root, rel); /* * For a plain relation, we only know how to prove uniqueness by * reference to unique indexes. Make sure there's at least one