JOIN on partitions is very slow
Hi folks,
We are using a PostgreSQL database and I am hitting some limits. The
company_sale_account table is partitioned by company name. We generate a
report on accounts matched between two companies. Below is the query:
SELECT DISTINCT cpsa1.*
FROM company_sale_account cpsa1
JOIN company_sale_account cpsa2
  ON cpsa1.sale_account_id = cpsa2.sale_account_id
WHERE cpsa1.company_name = 'company_a'
  AND cpsa2.company_name = 'company_b'
We have set up BTREE indexes on the sale_account_id column of both
partitions. This worked fine until recently. Now we have 10 million rows in
the company_a partition and 7 million in the company_b partition, and the
query takes more than 10 minutes.
Below is the explain plan output for it:
Buffers: shared hit=20125996 read=47811 dirtied=75, temp read=1333427
written=1333427
I/O Timings: read=19619.322
-> Sort (cost=167950986.43..168904299.23 rows=381325118 width=132)
(actual time=517017.334..603691.048 rows=16854094 loops=1)
Sort Key: cpsa1.crm_account_id, ((cpsa1.account_name)::text),
((cpsa1.account_owner)::text), ((cpsa1.account_type)::text),
cpsa1.is_customer, ((date_part('epoch'::text,
cpsa1.created_date))::integer),
((hstore_to_json(cpsa1.custom_crm_fields))::tex (...)
Sort Method: external merge Disk: 2862656kB
Buffers: shared hit=20125996 read=47811 dirtied=75, temp
read=1333427 written=1333427
I/O Timings: read=19619.322
-> Nested Loop (cost=0.00..9331268.39 rows=381325118 width=132)
(actual time=1.680..118698.570 rows=16854094 loops=1)
Buffers: shared hit=20125977 read=47811 dirtied=75
I/O Timings: read=19619.322
-> Append (cost=0.00..100718.94 rows=2033676 width=33)
(actual time=0.014..1783.243 rows=2033675 loops=1)
Buffers: shared hit=75298 dirtied=75
-> Seq Scan on company_sale_account cpsa2
(cost=0.00..0.00 rows=1 width=516) (actual time=0.001..0.001 rows=0
loops=1)
Filter: ((company_name)::text = 'company_b'::text)
-> Seq Scan on company_sale_account_concur cpsa2_1
(cost=0.00..100718.94 rows=2033675 width=33) (actual time=0.013..938.145
rows=2033675 loops=1)
Filter: ((company_name)::text = 'company_b'::text)
Buffers: shared hit=75298 dirtied=75
-> Append (cost=0.00..1.97 rows=23 width=355) (actual
time=0.034..0.047 rows=8 loops=2033675)
Buffers: shared hit=20050679 read=47811
I/O Timings: read=19619.322
-> Seq Scan on company_sale_account cpsa1
(cost=0.00..0.00 rows=1 width=4525) (actual time=0.000..0.000 rows=0
loops=2033675)
Filter: (((company_name)::text =
'company_a'::text) AND ((cpsa2.sale_account_id)::text =
(sale_account_id)::text))
-> Index Scan using ix_csa_adp_sale_account on
company_sale_account_adp cpsa1_1 (cost=0.56..1.97 rows=22 width=165)
(actual time=0.033..0.042 rows=8 loops=2033675)
Index Cond: ((sale_account_id)::text =
(cpsa2.sale_account_id)::text)
Filter: ((company_name)::text = 'company_a'::text)
Buffers: shared hit=20050679 read=47811
I/O Timings: read=19619.322
Planning time: 30.853 ms
Execution time: 618218.321 ms
Do you have any suggestions on how to tune Postgres? Please share your
thoughts; any help would be greatly appreciated.
Re: JOIN on partitions is very slow
Are you able to tweak the query, or is it generated by an ORM? What version of Postgres? Which configs have you changed from default? How many partitions do you have? Is there an index on company_name?

Any time I see the DISTINCT keyword, I expect it to be a performance bottleneck and wonder about rewriting the query. Even just using GROUP BY can be much faster because of how it gets executed.
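One hypothetical rewrite along those lines, untested against the poster's schema (table and column names are taken from the original query): replace the join-plus-DISTINCT with a semi-join via EXISTS, so each company_a row is emitted at most once and the expensive de-duplicating sort disappears.

```sql
-- Sketch: semi-join instead of JOIN + DISTINCT.
-- EXISTS stops probing cpsa2 after the first matching row,
-- so no external merge sort is needed to remove duplicates.
SELECT cpsa1.*
FROM company_sale_account cpsa1
WHERE cpsa1.company_name = 'company_a'
  AND EXISTS (
        SELECT 1
        FROM company_sale_account cpsa2
        WHERE cpsa2.company_name = 'company_b'
          AND cpsa2.sale_account_id = cpsa1.sale_account_id
      );
```

Whether this helps depends on the plan chosen, but it sidesteps the 2.8 GB on-disk sort visible in the EXPLAIN output above.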
Partitions to improve write/update speed for tables with indexes?
I have noticed that my write/update performance starts to degrade dramatically after about 10 million rows on my hardware. The cause of the slowdown is the index updates on every write/update. Would partitioning be the solution? One of my tables will have more than 1 billion rows of data, so I would have to create about 100 partitions for that table. Is there a practical limit to the number of partitions I can have with PostgreSQL 12?
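A minimal sketch of what such a layout might look like with declarative hash partitioning in PG12 (the table and column names here are invented for illustration, not from the poster's schema):

```sql
-- Hypothetical billion-row table split into 100 hash partitions,
-- so each partition's indexes stay closer to cache-resident size.
CREATE TABLE events (
    id      bigint NOT NULL,
    payload text
) PARTITION BY HASH (id);

-- One partition per remainder; a DO block avoids writing 100 statements.
DO $$
BEGIN
    FOR i IN 0..99 LOOP
        EXECUTE format(
            'CREATE TABLE events_p%s PARTITION OF events
               FOR VALUES WITH (MODULUS 100, REMAINDER %s)', i, i);
    END LOOP;
END $$;
```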
Re: Partitions to improve write/update speed for tables with indexes?
On Sun, Mar 22, 2020 at 09:22:50PM -0400, Arya F wrote:
> I have noticed that my write/update performance starts to dramatically
> reduce after about 10 million rows on my hardware. The reason for the
> slowdown is the index updates on every write/update.

It's commonly true that the indexes need to fit entirely in shared_buffers for good write performance. I gave some suggestions here:
https://www.postgresql.org/message-id/20200223101209.GU31889%40telsasoft.com

> The solution would be partitioning? One of my tables will have more
> than 1 billion rows of data, so I would have to create about 100
> partitions for that table. Is the a practical limit to the amount of
> partitions I can have with Postgresql 12?

The recommendation since pg12 is to use at most a "few thousand" partitions, so for the moment you'd be well within the recommendation.
https://www.postgresql.org/docs/12/ddl-partitioning.html

-- Justin
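To check how a table's total index size compares to shared_buffers, something like the following can be used (these are standard catalog functions; `my_big_table` is a placeholder name):

```sql
-- Compare the table's combined index size against shared_buffers.
-- If index_size greatly exceeds shared_buffers, index updates on
-- writes will keep evicting pages and write performance suffers.
SELECT pg_size_pretty(pg_indexes_size('my_big_table')) AS index_size,
       current_setting('shared_buffers')               AS shared_buffers;
```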
Partition Pruning (Hash Partitions) Support for DELETEs in PostgreSQL 11 and 12
Hello All,
While doing some tests with hash partitioning behavior in PG11 and PG12, I
found that PG11 does not perform partition pruning for DELETEs (EXPLAIN
ANALYZE returned >2000 lines). I then ran the same test in PG12, recreating
the objects with the same DDL, and pruning worked.
Here are the tests:
1) PG11 hash partitioning, no partition pruning:
postgres=> \timing
Timing is on.
postgres=> select version();
version
-
PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-11), 64-bit
(1 row)
Time: 33.325 ms
postgres=> create table hp ( foo text ) partition by hash (foo);
CREATE TABLE
Time: 40.810 ms
postgres=> create table hp_0 partition of hp for values with (modulus 3,
remainder 0);
CREATE TABLE
Time: 43.990 ms
postgres=> create table hp_1 partition of hp for values with (modulus 3,
remainder 1);
CREATE TABLE
Time: 43.314 ms
postgres=> create table hp_2 partition of hp for values with (modulus 3,
remainder 2);
CREATE TABLE
Time: 43.447 ms
postgres=> insert into hp values ('shayon');
INSERT 0 1
Time: 42.975 ms
postgres=> select * from hp;
foo
shayon
(1 row)
Time: 40.210 ms
postgres=> select * from hp_0;
foo
shayon
(1 row)
Time: 38.898 ms
postgres=> insert into hp values ('shayon1'), ('shayon2'), ('shayon3');
INSERT 0 3
Time: 40.359 ms
postgres=> select * from hp_0;
foo
shayon
(1 row)
Time: 39.105 ms
postgres=> select * from hp_1;
foo
-
shayon2
(1 row)
Time: 37.292 ms
postgres=> select * from hp_2;
foo
-
shayon1
shayon3
(2 rows)
Time: 38.604 ms
postgres=> explain select * from hp where foo = 'shayon2';
QUERY PLAN
Append (cost=0.00..27.04 rows=7 width=32)
-> Seq Scan on hp_1 (cost=0.00..27.00 rows=7 width=32)
Filter: (foo = 'shayon2'::text)
(3 rows)
Time: 39.581 ms
postgres=> explain delete from hp where foo = 'shayon2';
                          QUERY PLAN
--------------------------------------------------------------
 Delete on hp  (cost=0.00..81.00 rows=21 width=6)
   Delete on hp_0
   Delete on hp_1
   Delete on hp_2
   ->  Seq Scan on hp_0  (cost=0.00..27.00 rows=7 width=6)
         Filter: (foo = 'shayon2'::text)
   ->  Seq Scan on hp_1  (cost=0.00..27.00 rows=7 width=6)
         Filter: (foo = 'shayon2'::text)
   ->  Seq Scan on hp_2  (cost=0.00..27.00 rows=7 width=6)
         Filter: (foo = 'shayon2'::text)
(10 rows)
Time: 38.749 ms
2) PG12 hash partitioning, pruning works:
dev=> \timing
Timing is on.
dev=> select version();
version
PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3
20140911 (Red Hat 4.8.3-9), 64-bit
(1 row)
Time: 29.786 ms
dev=> CREATE TABLE hp ( foo text ) PARTITION BY HASH (foo);
CREATE TABLE
Time: 30.680 ms
dev=> CREATE TABLE hp_0 PARTITION OF hp FOR VALUES WITH (MODULUS 3,
REMAINDER 0);
CREATE TABLE
Time: 122.791 ms
dev=> CREATE TABLE hp_1 PARTITION OF hp FOR VALUES WITH (MODULUS 3,
REMAINDER 1);
CREATE TABLE
Time: 32.053 ms
dev=> CREATE TABLE hp_2 PARTITION OF hp FOR VALUES WITH (MODULUS 3,
REMAINDER 2);
CREATE TABLE
Time: 31.839 ms
dev=> insert into hp values ('shayon1'), ('shayon2'), ('shayon3'),
('shayon');
INSERT 0 4
Time: 27.887 ms
dev=> select * from hp_1;
foo
-
shayon2
(1 row)
Time: 27.697 ms
dev=> select * from hp_2;
foo
-
shayon1
shayon3
(2 rows)
Time: 27.845 ms
dev=> select * from hp_0;
foo
shayon
(1 row)
Time: 27.679 ms
dev=> explain delete from hp where foo = 'shayon2';
                        QUERY PLAN
-----------------------------------------------------------
 Delete on hp  (cost=0.00..27.00 rows=7 width=6)
   Delete on hp_1
   ->  Seq Scan on hp_1  (cost=0.00..27.00 rows=7 width=6)
         Filter: (foo = 'shayon2'::text)
(4 rows)
Time: 30.490 ms
Is this a bug, somewhat related to MergeAppend?
https://github.com/postgres/postgres/commit/5220bb7533f9891b1e071da6461d5c387e8f7b09
If it is, does anyone know of a workaround to get DELETEs to prune hash
partitions in PG11?
Thanks,
Shayon
Re: Partition Pruning (Hash Partitions) Support for DELETEs in PostgreSQL 11 and 12
On Sun, Mar 22, 2020 at 11:45:53PM -0400, Ronnie S wrote:
> Hello All,
>
> While doing some tests with hash partitioning behavior in PG11 and 12, I
> have found that PG11 is not performing partition pruning with DELETEs
> (explain analyze returned >2000 lines). I then ran the same test in PG12
> and recreated the objects using the same DDL, and it worked
>
> Is this a bug, somewhat related to MergeAppend?
> https://github.com/postgres/postgres/commit/5220bb7533f9891b1e071da6461d5c387e8f7b09
>
> If it is, anyone know if we have a workaround for DELETEs to use hash
> partitions in PG11?

I think due to this commit to pg12:
https://commitfest.postgresql.org/22/1778/
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=428b260f87e8861ba8e58807b69d433db491c4f4
...
https://www.postgresql.org/message-id/5c83dbca-12b5-1acf-0e85-58299e464a26%40lab.ntt.co.jp
https://www.postgresql.org/message-id/4f049572-9440-3c99-afa1-f7ca7f38fe80%40lab.ntt.co.jp

-- Justin
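One possible PG11 workaround, using the table and values from the test above (a sketch, not a guaranteed fix): since SELECT pruning does work in PG11, you can first find which child partition holds the key and then issue the DELETE against that child directly, bypassing the un-pruned Append over all partitions.

```sql
-- tableoid identifies the child partition a row actually lives in;
-- SELECT pruning works in PG11, so this scans only one partition.
SELECT tableoid::regclass FROM hp WHERE foo = 'shayon2';

-- Then delete from that child directly (hp_1 in the test above),
-- avoiding the seq scans over every partition seen in the PG11 plan.
DELETE FROM hp_1 WHERE foo = 'shayon2';
```

The obvious cost is that partition names get hard-coded into application code, which breaks if the modulus or partition layout changes.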
