Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Trying to optimize the Elapsed Time (ET) of this query. Currently, it is hovering around 3 hrs. Running a 'vaccum analyse' had no effect on ET. Even forcing an 'indexonly' scan by disabling 'enable_seqscan', still around the 3 hrs. The table is around 4.6B rows, explain select cit_id, cl_value from reflink.citation_locators where cl_value = '1507617681' and vclf_number = 1 ; QUERY PLAN - Bitmap Heap Scan on citation_locators (cost=5066559.01..50999084.79 rows=133 width=23) Recheck Cond: (vclf_number = 1) Filter: (cl_value = '1507617681'::text) -> Bitmap Index Scan on cl_indx_fk02 (cost=0.00..5066558.97 rows=493984719 width=0) Index Cond: (vclf_number = 1) (5 rows) reflink.citation_locators Table"reflink.citation_locators" Column | Type | Modifiers | Storage | Stats target | Description --+--+---+--+--+- cl_id| bigint | notnull | plain| | cl_value | text | notnull | extended | | vclf_number | integer | notnull | plain| | cit_id | bigint | notnull | plain| | cl_date_created | timestamp with time zone | notnull | plain| | cl_date_modified | timestamp with time zone | | plain| | Indexes: "cl_pk" PRIMARY KEY, btree (cl_id) "cl_cnst_uk01" UNIQUE CONSTRAINT, btree (cit_id, vclf_number, cl_value) "cl_indx_fk01" btree (cit_id) "cl_indx_fk02" btree (vclf_number) Foreign-key constraints: "cl_cnst_fk01" FOREIGN KEY (cit_id) REFERENCES citations(cit_id) NOT VALID"cl_cnst_fk02" FOREIGN KEY (vclf_number) REFERENCES valid_cit_locator_fields(vclf_number)
Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Hello Try using index btree(vclf_number, cl_value) instead of btree (vclf_number). regards, Sergei
Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Fred Habash writes: > Indexes: > "cl_pk" PRIMARY KEY, btree (cl_id) > "cl_cnst_uk01" UNIQUE CONSTRAINT, btree (cit_id, vclf_number, cl_value) > "cl_indx_fk01" btree (cit_id) > "cl_indx_fk02" btree (vclf_number) This is pretty inefficient index design. Your query is slow because the only selective condition it has is on cl_value, but you have no index that can be searched with cl_value as the leading condition. Moreover, you have two indexes that can be searched with cit_id as the leading condition, which is just wasteful. I'd try reorganizing the cl_cnst_uk01 index as (cl_value, vclf_number, cit_id) so that it can serve for searches on cl_value, while still enforcing the same uniqueness condition. This particular column ordering would also let your query use the vclf_number constraint as a secondary search condition, which would help even more. There's relevant advice about index design in the manual, https://www.postgresql.org/docs/current/static/indexes.html (see 11.3 and 11.5 particularly) regards, tom lane
Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Probably the cardinality of "vclf_number" is really bad. So the scan on that index is returning many million or billion rows and then you get a recheck which takes semi-forever. So you need an index on cl_value or both vclf_number and cl_value. If you know some properties of the values actually stored inside of those that will help. Matthew Hall > On Jun 5, 2018, at 7:17 AM, Fred Habash wrote: > > Trying to optimize the Elapsed Time (ET) of this query. Currently, it is > hovering around 3 hrs. > > Running a 'vaccum analyse' had no effect on ET. Even forcing an 'indexonly' > scan by disabling 'enable_seqscan', still around the 3 hrs. > The table is around 4.6B rows, > explain select cit_id, cl_value from reflink.citation_locators where > cl_value = '1507617681' and vclf_number = 1 ; >QUERY PLAN > > - > Bitmap Heap Scan on citation_locators (cost=5066559.01..50999084.79 > rows=133 width=23) >Recheck Cond: (vclf_number = 1) >Filter: (cl_value = '1507617681'::text) >-> Bitmap Index Scan on cl_indx_fk02 (cost=0.00..5066558.97 > rows=493984719 width=0) > Index Cond: (vclf_number = 1) > (5 rows) > > reflink.citation_locators > Table "reflink.citation_locators" > Column | Type | Modifiers | Storage | Stats > target | Description > --+--+---+--+--+- > cl_id| bigint | not null | plain| > | > cl_value | text | not null | extended | > | > vclf_number | integer | not null | plain| > | > cit_id | bigint | not null | plain| > | > cl_date_created | timestamp with time zone | not null | plain| > | > cl_date_modified | timestamp with time zone | | plain| > | > Indexes: > "cl_pk" PRIMARY KEY, btree (cl_id) > "cl_cnst_uk01" UNIQUE CONSTRAINT, btree (cit_id, vclf_number, cl_value) > "cl_indx_fk01" btree (cit_id) > "cl_indx_fk02" btree (vclf_number) > Foreign-key constraints: > "cl_cnst_fk01" FOREIGN KEY (cit_id) REFERENCES citations(cit_id) NOT > VALID"cl_cnst_fk02" FOREIGN KEY (vclf_number) REFERENCES > valid_cit_locator_fields(vclf_number)
RE: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Indexes are being redone as per these insights. Appreciate the great support. Thank you From: Matthew Hall Sent: Tuesday, June 5, 2018 10:42 AM To: Fred Habash Cc: [email protected] Subject: Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan Probably the cardinality of "vclf_number" is really bad. So the scan on that index is returning many million or billion rows and then you get a recheck which takes semi-forever. So you need an index on cl_value or both vclf_number and cl_value. If you know some properties of the values actually stored inside of those that will help. Matthew Hall On Jun 5, 2018, at 7:17 AM, Fred Habash wrote: Trying to optimize the Elapsed Time (ET) of this query. Currently, it is hovering around 3 hrs. Running a 'vaccum analyse' had no effect on ET. Even forcing an 'indexonly' scan by disabling 'enable_seqscan', still around the 3 hrs. The table is around 4.6B rows, explain select cit_id, cl_value from reflink.citation_locators where cl_value = '1507617681' and vclf_number = 1 ; QUERY PLAN - Bitmap Heap Scan on citation_locators (cost=5066559.01..50999084.79 rows=133 width=23) Recheck Cond: (vclf_number = 1) Filter: (cl_value = '1507617681'::text) -> Bitmap Index Scan on cl_indx_fk02 (cost=0.00..5066558.97 rows=493984719 width=0) Index Cond: (vclf_number = 1) (5 rows) reflink.citation_locators Table "reflink.citation_locators" Column | Type | Modifiers | Storage | Stats target | Description --+--+---+--+--+- cl_id| bigint | not null | plain| | cl_value | text | not null | extended | | vclf_number | integer | not null | plain| | cit_id | bigint | not null | plain| | cl_date_created | timestamp with time zone | not null | plain| | cl_date_modified | timestamp with time zone | | plain| | Indexes: "cl_pk" PRIMARY KEY, btree (cl_id) "cl_cnst_uk01" UNIQUE CONSTRAINT, btree (cit_id, vclf_number, cl_value) "cl_indx_fk01" btree (cit_id) "cl_indx_fk02" btree (vclf_number) Foreign-key constraints: "cl_cnst_fk01" FOREIGN KEY (cit_id) REFERENCES citations(cit_id) NOT VALID "cl_cnst_fk02" FOREIGN KEY (vclf_number) REFERENCES valid_cit_locator_fields(vclf_number)
