Performance degrade in Planning Time to find appropriate Partial Index

2018-03-01 Thread Meenatchi Sandanam
I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The
table contains multiple form data differentiated by ID range. Hence a
column contains more than one form data. To achieve Unique Constraint and
Indexing per form, I chose PostgreSQL Partial Indexes which suits my
requirement. I have created Partial Indexes with ID Range as criteria and
it provides Uniqueness and Indexing per form basis as expected. But DML
operations on a particular form scans all the Indexes created for the
entire table instead of scanning the Indexes created for that particular
form ID Range. This degrades Planner Performance and Query Time more than
10 times as below,

Query Result for the table with 3000 Partial Indexes(15 Indexes per form) :

explain analyse select id from form_data_copy where id between 3001 and
4000 and bigint50=789;
QUERY PLAN


--
Index Scan using form_data_1_bigint50_3000 on form_data_copy
(cost=0.28..8.29 rows=1 width=8) (actual time=0.057..0.057 rows=0 loops=1)
Index Cond: (bigint50 = 789)
*Planning time: 99.287 ms*
Execution time: 0.112 ms
(4 rows)

*Time: 103.967 ms*

Query Result for the table with no Indexes(with same record count as above
table) :

explain analyse select id from form_data_copy1 where id between 3001 and
4000 and bigint50=789; QUERY PLAN


---
Index Scan using form_data_copy1_fk1_idx on form_data_copy1
(cost=0.42..208.62 rows=1 width=8) (actual time=1.576..1.576 rows=0
loops=1)
Index Cond: ((id >= 3001) AND (id <= 4000))
Filter: (bigint50 = 789)
Rows Removed by Filter: 859
Planning time: 1.243 ms
Execution time: 1.701 ms
(6 rows)

Time: *5.891 ms*


To ensure that the Planning Time 99.287 ms is not the time taken for
scanning 15 Indexes for the form, I have created only 15 Indexes for the
table and got the result as below,

explain analyse select id from form_data_copy1 where id between 3001 and
4000 and bigint50=789;
QUERY PLAN


---
Index Scan using form_data_copy1_bigint50_3000 on form_data_copy1
(cost=0.28..8.29 rows=1 width=8) (actual time=0.025..0.025 rows=0 loops=1)
Index Cond: (bigint50 = 789)
Planning time: 3.017 ms
Execution time: 0.086 ms
(4 rows)

Time: 7.291 ms

It seems PGSQL scans all 3000 Indexes even though I provided the ID Range
in the query. Please clarify whether my assumption is correct or the reason
for this more Planning Time. Also, suggest me the way to reduce this
planning time.


Re: Performance degrade in Planning Time to find appropriate Partial Index

2018-03-01 Thread Laurenz Albe
Meenatchi Sandanam wrote:
> I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The table 
> contains
> multiple form data differentiated by ID range. Hence a column contains more 
> than one form data.
> To achieve Unique Constraint and Indexing per form, I chose PostgreSQL 
> Partial Indexes
> which suits my requirement. I have created Partial Indexes with ID Range as 
> criteria and
> it provides Uniqueness and Indexing per form basis as expected. But DML 
> operations on a
> particular form scans all the Indexes created for the entire table instead of 
> scanning
> the Indexes created for that particular form ID Range. This degrades Planner 
> Performance
> and Query Time more than 10 times as below, 
> 
> Query Result for the table with 3000 Partial Indexes(15 Indexes per form) : 

It is crazy to create 3000 partial indexes on one table.

No wonder planning and DML statements take very long, they have to consider all 
the
indexes.

> explain analyse select id from form_data_copy where id between 3001 and 4000 
> and bigint50=789;

Use a single index on (bigint50, id) for best performance.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



Re: Performance degrade in Planning Time to find appropriate Partial Index

2018-03-01 Thread Michael Loftis
On Thu, Mar 1, 2018 at 03:10 Meenatchi Sandanam  wrote:

> I have created a table with 301 columns(ID, 150 BIGINT, 150 TEXT). The
> table contains multiple form data differentiated by ID range. Hence a
> column contains more than one form data. To achieve Unique Constraint and
> Indexing per form, I chose PostgreSQL Partial Indexes which suits my
> requirement. I have created Partial Indexes with ID Range as criteria and
> it provides Uniqueness and Indexing per form basis as expected. But DML
> operations on a particular form scans all the Indexes created for the
> entire table instead of scanning the Indexes created for that particular
> form ID Range. This degrades Planner Performance and Query Time more than
> 10 times as below,
>
> Query Result for the table with 3000 Partial Indexes(15 Indexes per form)
> :
>

This smells like you’ve failed to normalize your data correctly. 3k indexes
to ensure uniqueness ? It sounds a lot more like you need 15 tables for 15
forms ... perhaps with a view for reading or maybe 1/15th of the columns to
begin with by having a form_type column...or perhaps like an index function
for the unique constraintsuch that the output of the function is the
normalized portion of data that’s required to be unique

If you’ve really got 3k different uniqueness criteria differing by “id”
ranges then it sounds like an expression index with a function spitting out
the hash of uniqueness but that’d still be hairy, at least you wouldn’t eat
the time on every read though. But I’d reduce that id range based problem
to include a unique_type indicator column instead.



> --

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler


Re: Performance degrade in Planning Time to find appropriate Partial Index

2018-03-01 Thread Moreno Andreo

  
  
Il 01/03/2018 15:16, Michael Loftis ha
  scritto:


  

  On Thu, Mar 1, 2018 at 03:10 Meenatchi
Sandanam  wrote:
  
  
I
have created a table with 301 columns(ID, 150 BIGINT,
150 TEXT). The table contains multiple form data
differentiated by ID range. Hence a column contains more
than one form data. To achieve Unique Constraint and
Indexing per form, I chose PostgreSQL Partial Indexes
which suits my requirement. I have created Partial
Indexes with ID Range as criteria and it provides
Uniqueness and Indexing per form basis as expected. But
DML operations on a particular form scans all the
Indexes created for the entire table instead of scanning
the Indexes created for that particular form ID Range.
This degrades Planner Performance and Query Time more
than 10 times as below, 
  
  Query
Result for the table with 3000 Partial Indexes(15
Indexes per form) : 
  
  
  
  This smells like you’ve failed to normalize
your data correctly. 3k indexes to ensure uniqueness ? It
sounds a lot more like you need 15 tables for 15 forms ... 
  

  

... or a column that specifies, e.g., the form ID. If all form has
not the same number of BIGINT and TEXT, keep the maximum value and
fill only the requested ones.

You can also use the EAV schema, where the Entity is the form, the
Attribute is the field, and the Value... is the value.

CREATE TABLE tbl(
id bigint,
entity integer,
attribute integer, --(or string, as you need)
value_int bigint,
value_string text
);

This way you'll get more rows, but very thin, and with not more than
3 or 4 indexes (based on the querues you need to perform) you can
retrieve values quickly.

My 2 cent
Moreno.-