PostgreSQL 11 higher Planning time on Partitioned table
Hi,
I am looking to Range Partition one of my table (i.e. TransactionLog) in
PostgreSQL 11.While evaluating query performance difference between the
un-partitioned and partitioned table I am getting huge difference in planning
time. Planning time is very high on partitioned table.Similarly when I query by
specifying partition name directly in query the planning time is much less
**0.081 ms** as compared to when I query based on partition table (parent
table) name in query, where planning time **6.231 ms** (Samples below).
Below are the details, Let me know how can I improve query performance on
partitioned table.
Following is the schema CREATE TABLE TransactionLog (
txid character varying(36) NOT NULL, txnDetails character varying(64),
loggingtime timestamp(6) without time zone DEFAULT LOCALTIMESTAMP,) PARTITION
BY RANGE(loggingtime);
CREATE TABLE IF NOT EXISTS TransactionLog_20200223 PARTITION OF TransactionLog
FOR VALUES FROM ('2020-02-23') TO ('2020-02-24');CREATE UNIQUE INDEX
TransactionLog_20200223_UnqTxId ON TransactionLog_20200223 (txnid);
Following is explain analyze result when I query Directly on partition.
Planning time ~**0.080 ms** (average of 10 execution)postgres=> EXPLAIN
(ANALYZE,VERBOSE,COSTS,BUFFERS,TIMING,SUMMARY) select txnDetails FROM
mra_part.TransactionLog_20200223 WHERE txnid =
'febd139d-1b7f-4564-a004-1b3474e51756';
QUERY
PLAN-
Index Scan using TransactionLog_20200223_UnqTxId on TransactionLog_20200223
(cost=0.57..4.61 rows=1 width=10) (actual time=0.039..0.040 rows=1 loops=1)
Output: txnDetails Index Cond: ((TransactionLog_20200223.txnid)::text =
'febd139d-1b7f-4564-a004-1b3474e51756'::text) Buffers: shared hit=5
**Planning Time: 0.081 ms** Execution Time: 0.056 ms(6 rows)
Following is explain analyze result when I query by parent-table. Planning time
**6.198 ms** (average of 10 execution)postgres=> EXPLAIN
(ANALYZE,VERBOSE,COSTS,BUFFERS,TIMING,SUMMARY) select txnDetails FROM
mtdauthlog WHERE txnid = 'febd139d-1b7f-4564-a004-1b3474e51756' AND loggingtime
>= '2020-02-23'::timestamp without time zone AND loggingtime <
'2020-02-24'::timestamp without time zone;
QUERY
PLAN
Append (cost=0.57..4.62 rows=1 width=10) (actual time=0.036..0.037 rows=1
loops=1) Buffers: shared hit=5 -> Index Scan using
TransactionLog_20200223_UnqTxId on TransactionLog_20200223 (cost=0.57..4.61
rows=1 width=10) (actual time=0.035..0.036 rows=1 loops=1) Output:
TransactionLog_20200223.txnDetails Index Cond:
((TransactionLog_20200223.txnid)::text =
'febd139d-1b7f-4564-a004-1b3474e51756'::text) Filter:
((TransactionLog_20200223.loggingtime >= '2020-02-23 00:00:00'::timestamp
without time zone) AND (TransactionLog_20200223.loggingtime < '2020-02-24
00:00:00'::timestamp without time zone)) Buffers: shared hit=5
**Planning Time: 6.231 ms** Execution Time: 0.076 ms(9 rows)
There are around ~200 child partitions. Partition pruning enabled.PostgreSQL
Version: PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
Thanks and Regards,
Ravi Garg,
Re: PostgreSQL 11 higher Planning time on Partitioned table
On Sun, Feb 23, 2020 at 09:56:30AM +, Ravi Garg wrote: > Hi, > I am looking to Range Partition one of my table (i.e. TransactionLog) in > PostgreSQL 11.While evaluating query performance difference between the > un-partitioned and partitioned table I am getting huge difference in planning > time. Planning time is very high on partitioned table.Similarly when I query > by specifying partition name directly in query the planning time is much less > **0.081 ms** as compared to when I query based on partition table (parent > table) name in query, where planning time **6.231 ms** (Samples below). That's probably to be expected under pg11: https://www.postgresql.org/docs/11/ddl-partitioning.html |Too many partitions can mean longer query planning times... |It is also important to consider the overhead of partitioning during query planning and execution. The query planner is generally able to handle partition hierarchies with up to a few hundred partitions fairly well, provided that typical queries allow the query planner to prune all but a small number of partitions. Planning times become longer and memory consumption becomes higher as more partitions are added > There are around ~200 child partitions. Partition pruning enabled.PostgreSQL > Version: PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 > 20150623 (Red Hat 4.8.5-39), 64-bit How large are the partitions and how many indexes each, and how large are they? Each partition will be stat()ed and each index will be open()ed and read() for every query. This was resolved in pg12: https://commitfest.postgresql.org/21/1778/ -- Justin
Re: PostgreSQL 11 higher Planning time on Partitioned table
On Sun, Feb 23, 2020 at 04:12:09AM -0600, Justin Pryzby wrote: > How large are the partitions and how many indexes each, and how large are > they? > Each partition will be stat()ed ... for every query. I should have said that's every 1GB "segment" is stat()ed for every query. > This was resolved in pg12: > https://commitfest.postgresql.org/21/1778/ + https://www.postgresql.org/about/featurematrix/detail/320/ -- Justin
Re: PostgreSQL 11 higher Planning time on Partitioned table
Hi Justin, Thanks for response. Unfortunately we will not be able to migrate to PG12 any time soon. - There is only one index (unique index btree) on 'txnID' (i.e. transaction ID) character varying(36). Which we are creating on each partition. - Our use case is limited to simple selects (we don't join with the other tables) however, we are expecting ~70 million records inserted per day and there would be couple of updates on each records where average record size would be ~ 1.5 KB. - Currently we are thinking to have Daily partitions and as we need to keep 6 months of data thus 180 Partitions.However we have liberty to reduce the number of partitions to weekly/fortnightly/monthly, If we get comparable performance. - We need to look current partition and previous partition for all of our use-cases/queries. Can you please suggest what sort of combinations/partition strategy we can test considering data-volume/vacuum etc. Also let me know if some of the pg_settings can help us tuning this (I have attached my pg_settings). Thanks and Regards, Ravi Garg, Mob : +91-98930-66610 On Sunday, 23 February, 2020, 03:42:13 pm IST, Justin Pryzby wrote: On Sun, Feb 23, 2020 at 09:56:30AM +, Ravi Garg wrote: > Hi, > I am looking to Range Partition one of my table (i.e. TransactionLog) in > PostgreSQL 11.While evaluating query performance difference between the > un-partitioned and partitioned table I am getting huge difference in planning > time. Planning time is very high on partitioned table.Similarly when I query > by specifying partition name directly in query the planning time is much less > **0.081 ms** as compared to when I query based on partition table (parent > table) name in query, where planning time **6.231 ms** (Samples below). That's probably to be expected under pg11: https://www.postgresql.org/docs/11/ddl-partitioning.html |Too many partitions can mean longer query planning times... |It is also important to consider the overhead of partitioning during query planning and execution. The query planner is generally able to handle partition hierarchies with up to a few hundred partitions fairly well, provided that typical queries allow the query planner to prune all but a small number of partitions. Planning times become longer and memory consumption becomes higher as more partitions are added > There are around ~200 child partitions. Partition pruning enabled.PostgreSQL > Version: PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 > 20150623 (Red Hat 4.8.5-39), 64-bit How large are the partitions and how many indexes each, and how large are they? Each partition will be stat()ed and each index will be open()ed and read() for every query. This was resolved in pg12: https://commitfest.postgresql.org/21/1778/ -- Justin postgres=# select name, setting from pg_settings order by name desc; name | setting +- zero_damaged_pages | off xmloption | content xmlbinary | base64 work_mem | 4096 wal_writer_flush_after | 128 wal_writer_delay | 200 wal_sync_method| fdatasync wal_sender_timeout | 6 wal_segment_size | 1073741824 wal_retrieve_retry_interval| 5000 wal_receiver_timeout | 6 wal_receiver_status_interval | 10 wal_log_hints | on wal_level | replica wal_keep_segments | 50 wal_consistency_checking | wal_compression| off wal_buffers| 8192 wal_block_size | 8192 vacuum_multixact_freeze_table_age | 15000 vacuum_multixact_freeze_min_age| 500 vacuum_freeze_table_age| 15000 vacuum_freeze_min_age | 5000 vacuum_defer_cleanup_age | 0 vacuum_cost_page_miss | 10 vacuum_cost_page_hit | 1 vacuum_cost_page_dirty | 20 vacuum_cost_limit | 800 vacuum_cost_delay | 0 vacuum_cleanup_index_scale_factor | 0.1 update_process_title | on unix_socket_permissions| 0777 unix_socket_group | unix_socket_directories| /tmp transform_null_equals | off transaction_read_only | off transaction_isolation | read committed transaction_deferrable | off track_io_timing| on track_functions| all track_counts | on tr
Re: PostgreSQL 11 higher Planning time on Partitioned table
On Sun, Feb 23, 2020 at 10:57:29AM +, Ravi Garg wrote: >- Currently we are thinking to have Daily partitions and as we need to > keep 6 months of data thus 180 Partitions.However we have liberty to reduce > the number of partitions to weekly/fortnightly/monthly, If we get comparable > performance. I didn't hear how large the tables and indexes are. >- We need to look current partition and previous partition for all of our > use-cases/queries. Do you mean that a given query is only going to hit 2 partitions ? Or do you mean that all but the most recent 2 partitions are "archival" and won't be needed by future queries ? > Can you please suggest what sort of combinations/partition strategy we can > test considering data-volume/vacuum etc. Also let me know if some of the > pg_settings can help us tuning this (I have attached my pg_settings). You should determine what an acceptable planning speed is, or the best balance of planning/execution time. Try to detach half your current partitions and, if that gives acceptable performance, then partition by day/2 or more. You could make a graph of (planning and total) time vs npartitions, since I think it's likely to be nonlinear. I believe others have reported improved performance under v11 with larger numbers of partitions, by using "partitions of partitions". So you could try making partitions by month themselves partitioned by day. >- Our use case is limited to simple selects (we don't join with the other >tables) however, we are expecting ~70 million records inserted per day >and there would be couple of updates on each records where average record >size would be ~ 1.5 KB. > shared_buffers | 1048576 If you care about INSERT performance, you probably need to make at least a single partition's index fit within shared_buffers (or set shared_buffers such that it fits). Use transactions around your inserts. If your speed is not limited by I/O, you could further use multiple VALUES(),() inserts, or maybe prepared statements. Maybe synchronous_commit=off. If you care about (consistent) SELECT performance, you should consider VACUUMing the tables after bulk inserts, to set hint bits (and since non-updated tuples won't be hit by autovacuum). Or maybe VACUUM FREEZE to freeze tuples (since it sounds like a typical page is unlikely to ever be updated). -- Justin
Re: PostgreSQL 11 higher Planning time on Partitioned table
> ... txid character varying(36) NOT NULL, > ... WHERE txnid = 'febd139d-1b7f-4564-a004-1b3474e51756' > There is only one index (unique index btree) on 'txnID' (i.e. transaction ID) character varying(36). Which we are creating on each partition. IF txnid is real UUID , then you can test the https://www.postgresql.org/docs/11/datatype-uuid.html performance see https://stackoverflow.com/questions/29880083/postgresql-uuid-type-performance imho: it should be better. best, Imre Ravi Garg ezt írta (időpont: 2020. febr. 23., V, 11:57): > Hi Justin, > > Thanks for response. > > Unfortunately we will not be able to migrate to PG12 any time soon. > >- There is only one index (unique index btree) on 'txnID' (i.e. >transaction ID) character varying(36). Which we are creating on each >partition. >- Our use case is limited to simple selects (we don't join with the >other tables) however, we are expecting ~70 million records inserted >per day and there would be couple of updates on each records where average >record size would be ~ 1.5 KB. >- Currently we are thinking to have Daily partitions and as we need to >keep 6 months of data thus 180 Partitions.However we have liberty to reduce >the number of partitions to weekly/fortnightly/monthly, If we get >comparable performance. >- We need to look current partition and previous partition for all of >our use-cases/queries. > > Can you please suggest what sort of combinations/partition strategy we can > test considering data-volume/vacuum etc. Also let me know if some of the > pg_settings can help us tuning this (I have attached my pg_settings). > > > Thanks and Regards, > Ravi Garg, > Mob : +91-98930-66610 > > > On Sunday, 23 February, 2020, 03:42:13 pm IST, Justin Pryzby < > [email protected]> wrote: > > > On Sun, Feb 23, 2020 at 09:56:30AM +, Ravi Garg wrote: > > Hi, > > I am looking to Range Partition one of my table (i.e. TransactionLog) in > PostgreSQL 11.While evaluating query performance difference between the > un-partitioned and partitioned table I am getting huge difference in > planning time. Planning time is very high on partitioned table.Similarly > when I query by specifying partition name directly in query the planning > time is much less **0.081 ms** as compared to when I query based on > partition table (parent table) name in query, where planning time **6.231 > ms** (Samples below). > > That's probably to be expected under pg11: > > https://www.postgresql.org/docs/11/ddl-partitioning.html > |Too many partitions can mean longer query planning times... > |It is also important to consider the overhead of partitioning during > query planning and execution. The query planner is generally able to handle > partition hierarchies with up to a few hundred partitions fairly well, > provided that typical queries allow the query planner to prune all but a > small number of partitions. Planning times become longer and memory > consumption becomes higher as more partitions are added > > > > There are around ~200 child partitions. Partition pruning > enabled.PostgreSQL Version: PostgreSQL 11.7 on x86_64-pc-linux-gnu, > compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit > > > How large are the partitions and how many indexes each, and how large are > they? > Each partition will be stat()ed and each index will be open()ed and read() > for > every query. This was resolved in pg12: > https://commitfest.postgresql.org/21/1778/ > > -- > Justin > >
