Performance degrade in Planning Time to find appropriate Partial Index
I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The table contains multiple form data differentiated by ID range. Hence a column contains more than one form data. To achieve Unique Constraint and Indexing per form, I chose PostgreSQL Partial Indexes which suits my requirement. I have created Partial Indexes with ID Range as criteria and it provides Uniqueness and Indexing per form basis as expected. But DML operations on a particular form scans all the Indexes created for the entire table instead of scanning the Indexes created for that particular form ID Range. This degrades Planner Performance and Query Time more than 10 times as below, Query Result for the table with 3000 Partial Indexes(15 Indexes per form) : explain analyse select id from form_data_copy where id between 3001 and 4000 and bigint50=789; QUERY PLAN -- Index Scan using form_data_1_bigint50_3000 on form_data_copy (cost=0.28..8.29 rows=1 width=8) (actual time=0.057..0.057 rows=0 loops=1) Index Cond: (bigint50 = 789) *Planning time: 99.287 ms* Execution time: 0.112 ms (4 rows) *Time: 103.967 ms* Query Result for the table with no Indexes(with same record count as above table) : explain analyse select id from form_data_copy1 where id between 3001 and 4000 and bigint50=789; QUERY PLAN --- Index Scan using form_data_copy1_fk1_idx on form_data_copy1 (cost=0.42..208.62 rows=1 width=8) (actual time=1.576..1.576 rows=0 loops=1) Index Cond: ((id >= 3001) AND (id <= 4000)) Filter: (bigint50 = 789) Rows Removed by Filter: 859 Planning time: 1.243 ms Execution time: 1.701 ms (6 rows) Time: *5.891 ms* To ensure that the Planning Time 99.287 ms is not the time taken for scanning 15 Indexes for the form, I have created only 15 Indexes for the table and got the result as below, explain analyse select id from form_data_copy1 where id between 3001 and 4000 and bigint50=789; QUERY PLAN --- Index Scan using form_data_copy1_bigint50_3000 on form_data_copy1 (cost=0.28..8.29 rows=1 width=8) (actual time=0.025..0.025 rows=0 loops=1) Index Cond: (bigint50 = 789) Planning time: 3.017 ms Execution time: 0.086 ms (4 rows) Time: 7.291 ms It seems PGSQL scans all 3000 Indexes even though I provided the ID Range in the query. Please clarify whether my assumption is correct or the reason for this more Planning Time. Also, suggest me the way to reduce this planning time.
Re: Performance degrade in Planning Time to find appropriate Partial Index
Meenatchi Sandanam wrote: > I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The table > contains > multiple form data differentiated by ID range. Hence a column contains more > than one form data. > To achieve Unique Constraint and Indexing per form, I chose PostgreSQL > Partial Indexes > which suits my requirement. I have created Partial Indexes with ID Range as > criteria and > it provides Uniqueness and Indexing per form basis as expected. But DML > operations on a > particular form scans all the Indexes created for the entire table instead of > scanning > the Indexes created for that particular form ID Range. This degrades Planner > Performance > and Query Time more than 10 times as below, > > Query Result for the table with 3000 Partial Indexes(15 Indexes per form) : It is crazy to create 3000 partial indexes on one table. No wonder planning and DML statements take very long, they have to consider all the indexes. > explain analyse select id from form_data_copy where id between 3001 and 4000 > and bigint50=789; Use a single index on (bigint50, id) for best performance. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Performance degrade in Planning Time to find appropriate Partial Index
On Thu, Mar 1, 2018 at 03:10 Meenatchi Sandanam wrote: > I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The > table contains multiple form data differentiated by ID range. Hence a > column contains more than one form data. To achieve Unique Constraint and > Indexing per form, I chose PostgreSQL Partial Indexes which suits my > requirement. I have created Partial Indexes with ID Range as criteria and > it provides Uniqueness and Indexing per form basis as expected. But DML > operations on a particular form scans all the Indexes created for the > entire table instead of scanning the Indexes created for that particular > form ID Range. This degrades Planner Performance and Query Time more than > 10 times as below, > > Query Result for the table with 3000 Partial Indexes(15 Indexes per form) > : > This smells like you’ve failed to normalize your data correctly. 3k indexes to ensure uniqueness ? It sounds a lot more like you need 15 tables for 15 forms ... perhaps with a view for reading or maybe 1/15th of the columns to begin with by having a form_type column...or perhaps like an index function for the unique constraintsuch that the output of the function is the normalized portion of data that’s required to be unique If you’ve really got 3k different uniqueness criteria differing by “id” ranges then it sounds like an expression index with a function spitting out the hash of uniqueness but that’d still be hairy, at least you wouldn’t eat the time on every read though. But I’d reduce that id range based problem to include a unique_type indicator column instead. > -- "Genius might be described as a supreme capacity for getting its possessors into trouble of all kinds." -- Samuel Butler
Re: Performance degrade in Planning Time to find appropriate Partial Index
Il 01/03/2018 15:16, Michael Loftis ha scritto: On Thu, Mar 1, 2018 at 03:10 Meenatchi Sandanamwrote: I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The table contains multiple form data differentiated by ID range. Hence a column contains more than one form data. To achieve Unique Constraint and Indexing per form, I chose PostgreSQL Partial Indexes which suits my requirement. I have created Partial Indexes with ID Range as criteria and it provides Uniqueness and Indexing per form basis as expected. But DML operations on a particular form scans all the Indexes created for the entire table instead of scanning the Indexes created for that particular form ID Range. This degrades Planner Performance and Query Time more than 10 times as below, Query Result for the table with 3000 Partial Indexes(15 Indexes per form) : This smells like you’ve failed to normalize your data correctly. 3k indexes to ensure uniqueness ? It sounds a lot more like you need 15 tables for 15 forms ... ... or a column that specifies, e.g., the form ID. If all form has not the same number of BIGINT and TEXT, keep the maximum value and fill only the requested ones. You can also use the EAV schema, where the Entity is the form, the Attribute is the field, and the Value... is the value. CREATE TABLE tbl( id bigint, entity integer, attribute integer, --(or string, as you need) value_int bigint, value_string text ); This way you'll get more rows, but very thin, and with not more than 3 or 4 indexes (based on the querues you need to perform) you can retrieve values quickly. My 2 cent Moreno.-
