Partitioning Optimizer Questions and Issues
I am using Postgres on a large system (recording approximately 20million
transactions per day). We use partitioning by date to assist with both vacuum
processing time and to archive old data. At the core of the system are records
in 2 different tables detailing different types of activity for monetary
transactions (e.g. money in and money out) -> a single transaction has entries
in both tables, so to retrieve all details for a single transaction we need to
join the 2 tables.
The use of partitioning however has a significant impact on the performance of
retrieving this data. Being relatively new to Postgres I wanted to share my
findings and understand how others address them. We run postgres version 9.6 on
CentOS, but the same behaviour is apparent in postgres 10.6. The test case
outputs are from version 10.6 running on my Ubuntu machine with default
postgres configuration.
Below is an example script to populate the test data:
===drop table if exists tablea cascade;drop table if exists
tableb cascade;
CREATE TABLE tablea ( id serial, reference int not null,
created date not null) PARTITION BY RANGE (created);
CREATE TABLE tablea_part1 PARTITION OF tablea FOR VALUES FROM ('2018-01-01')
TO ('2018-01-02');CREATE TABLE tablea_part2 PARTITION OF tablea FOR VALUES
FROM ('2018-01-02') TO ('2018-01-03');CREATE TABLE tablea_part3 PARTITION OF
tablea FOR VALUES FROM ('2018-01-03') TO ('2018-01-04');CREATE TABLE
tablea_part4 PARTITION OF tablea FOR VALUES FROM ('2018-01-04') TO
('2018-01-05');CREATE TABLE tablea_part5 PARTITION OF tablea FOR VALUES FROM
('2018-01-05') TO ('2018-01-06');
CREATE INDEX tablea_id_1 ON tablea_part1 (id);CREATE INDEX tablea_id_2 ON
tablea_part2 (id);CREATE INDEX tablea_id_3 ON tablea_part3 (id);CREATE INDEX
tablea_id_4 ON tablea_part4 (id);CREATE INDEX tablea_id_5 ON tablea_part5
(id);CREATE INDEX tablea_reference_1 ON tablea_part1 (reference);CREATE INDEX
tablea_reference_2 ON tablea_part2 (reference);CREATE INDEX tablea_reference_3
ON tablea_part3 (reference);CREATE INDEX tablea_reference_4 ON tablea_part4
(reference);CREATE INDEX tablea_reference_5 ON tablea_part5 (reference);CREATE
INDEX tablea_created_1 ON tablea_part1 (created);CREATE INDEX tablea_created_2
ON tablea_part2 (created);CREATE INDEX tablea_created_3 ON tablea_part3
(created);CREATE INDEX tablea_created_4 ON tablea_part4 (created);CREATE INDEX
tablea_created_5 ON tablea_part5 (created);alter table tablea_part1 add CHECK (
created >= DATE '2018-01-01' AND created < DATE '2018-01-02');alter table
tablea_part2 add CHECK ( created >= DATE '2018-01-02' AND created < DATE
'2018-01-03');alter table tablea_part3 add CHECK ( created >= DATE '2018-01-03'
AND created < DATE '2018-01-04');alter table tablea_part4 add CHECK ( created
>= DATE '2018-01-04' AND created < DATE '2018-01-05');alter table tablea_part5
add CHECK ( created >= DATE '2018-01-05' AND created < DATE '2018-01-06');
create or replace function populate_tablea() RETURNS integer AS$BODY$
DECLARE
i integer; v_created
date;BEGIN i := 0;
WHILE (i < 5) loop i := i + 1;
IF (mod(i,5) = 1) THEN
v_created = '2018-01-01'; ELSIF (mod(i,5) = 2) THEN v_created
= '2018-01-02'; ELSIF (mod(i,5) = 3) THEN v_created =
'2018-01-03'; ELSIF (mod(i,5) = 4) THEN v_created =
'2018-01-04'; ELSIF (mod(i,5) = 0) THEN v_created =
'2018-01-05'; END IF; insert into
tablea values (i, i, v_created);
end loop;
RETURN i;END;$BODY$LANGUAGE plpgsql VOLATILE
COST 100;
CREATE TABLE tableb( id serial, reference int not null,
created date not null) PARTITION BY RANGE (created);
CREATE TABLE tableb_part1 PARTITION OF tableb FOR VALUES FROM ('2018-01-01')
TO ('2018-01-02');CREATE TABLE tableb_part2 PARTITION OF tableb FOR VALUES
FROM ('2018-01-02') TO ('2018-01-03');CREATE TABLE tableb_part3 PARTITION OF
tableb FOR VALUES FROM ('2018-01-03') TO ('2018-01-04');CREATE TABLE
tableb_part4 PARTITION OF tableb FOR VALUES FROM ('2018-01-04') TO
('2018-01-05');CREATE TABLE tableb_part5 PARTITION OF tableb FOR VALUES FROM
('2018-01-05') TO ('2018-01-06');
CREATE INDEX tableb_id_1 ON tableb_part1 (id);CREATE INDEX tableb_id_2 ON
tableb_part2 (id);CREATE INDEX tableb_id_3 ON tableb_part3 (id);CREATE INDEX
tableb_id_4 ON tableb_part4 (id);CREATE INDEX tableb_id_5 ON tableb_part5
(id);CREATE INDEX tableb_reference_1 ON tableb_part1 (reference);CREATE INDEX
Re: Partitioning Optimizer Questions and Issues
On Fri, Feb 08, 2019 at 11:13:51AM +, keith anderson wrote: > So to summarise the findings/questions from above: > - It seems like the Postgres optimizer sometimes uses the partition level > statistics, and sometimes the global table level statistics? Or is it using > something else?- With partitioning tables with unique identifier and > retrieving explicitly on those identifiers, at present the optimizer will > always understimate the selectivity and overestimate the rows returned. This > inaccuracy increases in proportion to the number of partitions.- As a result, > when joining to other tables, you are liable to hitting sequential scans. > This becomes more likely as you have more partitions or if join to more > partitioned tables (note I am aware I could try and tune random_page_cost to > try and prevent this).- To me in the examples queries described above, it > makes sense to use the partition statistics for the partition level access > strategy, but the global statistics when estimating the actual rows returned > by all the individual partition queries. Is there a reason not to do this? Or > do others believe the optimizer is doing the right thing here? > And then some general questions: > - How do other people use partitioning but without a significant performance > disadvantage on reading the data? Is there something else I should be doing > here to achieve the same thing without the overhead? At present my reads have > increased optimization cost (as it needs to optimize access to each > partition) and also execution cost (access the index on every partition). > Even without the optimizer issues described above, the cost of reading simple > data is extremely high relative to non-partitioned data (unless you use the > partition key as a filter for each table to eliminate those partitions).- Is > there any chance/plan to add global indexes to postgres? If so would that > impact significantly the cost of the partition drop e.g. to clean up the > index. > Thanks in advance for any feedback/support, An equality or IN() query will use the pg_stats most-common-values list, whereas a range query will use the histogram. The tables probably doesn't have complete MCV list. By default, that's limited to 100 entries. Since the maximum allowed by ALTER..SET STATISTICS is 10k, I don't think it'll help to change it (at least for your production use case). Each partition's rowcount appears to be estimated from its ndistinct, and not from its content, so each is estimated as having about the same rowcount. Your partitions are sharing a sequence for their ID column, which causes the DEFAULT IDs to be unique...but their global uniqueness isn't enforced nor guaranteed. Note, in postgres11, it's possible to create an index on the parent table. It's NOT a global index, but it can be unique if it includes the partition key. I don't know how closely your example describes your real use case, but I don't think that helps with your example; it doesn't seems useful to partition on a serial column. You seem to be adding unnecessary CHECK constraints that duplicate the partition bounds. Note, it's still useful to include CHECK constraints on key column if you're planning on DETACHing and re-ATTACHing the partitions, in order to avoid seqscan to verify tuples don't violate specified bounds. You might need to rethink your partitioning scheme - you should choose one that causes performance to improve, and probably naturally includes the partition key in most queries. Perhaps you'd use 2 levels of partitioning: a RANGE partition by date, which allows for archiving, and a HASH partition by ID, which allows for partition pruning. Note that it's also possible to partition on multiple keys, like RANGE(id,date) - I don't think that's useful here, though. PG11 also allows a "default" partition. Or perhaps you could partition by RANGE(date) but add CHECK constraints on ID, after the table is fully populated, to optimize queries by allowing for partition pruning. Or you could maybe change the ID column to include the timestamp (like BIND zonesfiles MMDD). You'd set a bigint sequence on each partition's ID as default to the beginning of the month. A bigint is enough to handle 5*10^4 times your volume: 2019040120111222. (I think this is trying to be unnecessarily clever, unless there's some reason the other two ideas don't work.) Justin
