Incorrect index used in few cases..

2019-06-18 Thread AminPG Jaffer
Hi

We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing
following issue that occurs for few cases.

I have tried running analyze on the table with different values from 1000 -
5000 but it doesn't seem to help the issue. There is some skew in a_id
but the combination index  i_tc_adid_tid btree (a_id, id) makes the index
unique as it includes primary key.

Is there an explanation why it is using incorrect index?

SQL:
SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))

Indexes on the table:
   i_tc_adid_tid btree (a_id, id)
   pk_id PRIMARY KEY, btree (id)
   i_agi_tc_tcn btree (ag_id, tname)  ---> index that gets used


duration: 49455.649 ms  execute S_10: SELECT count(*) FROM tc WHERE
((tc.a_id = $1)) AND ((tc.m_id = $2)) AND ((tc.ag_id is not null)) AND ((
tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))
DETAIL:  parameters: $1 = '11786959222', $2 = '6', $3 = '54460816501', $4 =
'3', $5 = '6', $6 = '103'
LOG:  duration: 49455.639 ms  plan:
Query Text: SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND
((tc.m_id = $2)) AND ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((
tc.pt in ($4, $5, $6)))
Aggregate  (cost=5009342.34..5009342.35 rows=1 width=8) (actual
time=49455.626..49455.626 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=56288997
  ->  Index Scan using i_agi_tc_tcn on b.tc  (cost=0.57..5009342.34
rows=1 width=0) (actual time=46452.555..49455.616 rows=1 loops=1)
Output: id, tname, ...
Index Cond: (tc.ag_id IS NOT NULL)
Filter: ((tc.a_id = '11786959222'::numeric) AND (tc.m_id =
'6'::numeric) AND (tc.id = '54460816501'::numeric) AND (tc.pt = ANY
('{3,6,103}'::numeric[])))
Rows Removed by Filter: 70996637
Buffers: shared hit=56288997

Thanks


Re: Incorrect index used in few cases..

2019-06-18 Thread Tom Lane
AminPG Jaffer  writes:
> Is there an explanation why it is using incorrect index?

> SQL:
> SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))

What data types are these columns?  For that matter, could we see the
whole schema for the table (psql \d+ output or equivalent)?

regards, tom lane




Re: Incorrect index used in few cases..

2019-06-18 Thread AminPG Jaffer
Here is the table structure.

Column |Type |
Modifiers
---+-+---
 id| numeric(38,0)   | not null
 tname | character varying(255)  | not null
 ag_id | numeric(38,0)   |
 tc| character varying(255)  | not null
 status| numeric(10,0)   | not null
 internal_status   | numeric(10,0)   | not null
 create_date   | timestamp(6) with time zone | not null
 version   | numeric(38,0)   | not null
 match_type| numeric(10,0)   | not null default 0
 c_id  | numeric(38,0)   | not null
 m_id  | numeric(38,0)   | not null
 a_id  | numeric(38,0)   | not null
 maxb  | numeric(18,6)   |
 b_cc  | character varying(10)   |
 ui_status | numeric(10,0)   | not null default 0
 destination_url   | character varying(2084) |
 created_by| character varying(64)   | not null
 creation_date | timestamp(0) with time zone | not null default
timezone('UTC'::text, clock_timestamp())
 last_updated_by   | character varying(64)   | not null
 last_updated_date | timestamp(0) with time zone | not null
 pr| numeric(5,0)| not null default 0
 ts| numeric(1,0)| not null default 0
 uniqueness_hash_v2| numeric(29,0)   | not null
 pt| numeric(5,0)|
 history   | bigint  |
 t_secondary   | text|

Indexes:
"pk_id" PRIMARY KEY, btree (id)
"i_agi_tc_tcn" btree (ag_id, tname)
"i_cid_agid_tcn" btree (c_id, ag_id, tname)
"i_tc_adid_tid" btree (a_id, id)
"i_tc_advertiser_id" btree (a_id)
"i_tc_campaign_id" btree (c_id)
"i_tc_lud_agi" btree (last_updated_date, ag_id)
"i_tc_uniqueness_hash_v2" btree (uniqueness_hash_v2)
Check constraints:
"tc_secondary" CHECK (length(t_secondary) <= 4500)

On Tue, Jun 18, 2019 at 6:35 AM Tom Lane  wrote:

> AminPG Jaffer  writes:
> > Is there an explanation why it is using incorrect index?
>
> > SQL:
> > SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> > ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5,
> $6)))
>
> What data types are these columns?  For that matter, could we see the
> whole schema for the table (psql \d+ output or equivalent)?
>
> regards, tom lane
>


Re: Incorrect index used in few cases..

2019-06-18 Thread Tom Lane
AminPG Jaffer  writes:
> Here is the table structure.

Hpmh.  I thought it was just barely possible that you had a datatype
mismatch between the columns and the parameters, but nope, the columns
are "numeric" just like the parameters.

I'm pretty baffled.  I tried to duplicate the problem with some dummy
data (as attached) and could not.  In my hands, it wants to use the
i_tc_adid_tid index, or if I drop that then the pkey index, and any
other possible plan is orders of magnitude more expensive than those.

Another far-fetched theory is that the theoretically-better indexes
are so badly bloated as to discourage the planner from using them.
You could eliminate that one by checking the index sizes with "\di+".

Are you perhaps running with non-default values for any planner cost
parameters?  Or it's not a stock build of Postgres?

If you could find a way to adjust the attached example so that it
produces the same misbehavior you see with live data, that would be
very interesting ...

regards, tom lane

drop table tc;

create table tc(
 id numeric(38,0)primary key,
 tname  character varying(255)   not null,
 ag_id  numeric(38,0)   ,
 tc character varying(255)   not null,
 status numeric(10,0)not null,
 internal_statusnumeric(10,0)not null,
 create_datetimestamp(6) with time zone  not null,
 versionnumeric(38,0)not null,
 match_type numeric(10,0)not null default 0,
 c_id   numeric(38,0)not null,
 m_id   numeric(38,0)not null,
 a_id   numeric(38,0)not null,
 maxb   numeric(18,6)   ,
 b_cc   character varying(10)   ,
 ui_status  numeric(10,0)not null default 0,
 destination_urlcharacter varying(2084) ,
 created_by character varying(64)not null,
 creation_date  timestamp(0) with time zone  not null default 
timezone('UTC'::text, clock_timestamp()),
 last_updated_bycharacter varying(64)not null,
 last_updated_date  timestamp(0) with time zone  not null,
 pr numeric(5,0) not null default 0,
 ts numeric(1,0) not null default 0,
 uniqueness_hash_v2 numeric(29,0)not null,
 pt numeric(5,0),
 historybigint  ,
 t_secondarytextCHECK (length(t_secondary) <= 4500)
);

insert into tc
select
  x as id,
  'tname_' || x as tname,
  case when x % 10 = 0 then null else x end as ag_id,
  'tc' as tc,
  0 as status,
  0 as internal_status,
  now() as create_date,
  42 as version,
  0 as match_type,
  (random()*1000)::int as c_id,
  x/4 as m_id,
  (random()*100)::int as a_id,
  0 as maxb,
  null as b_cc,
  0 as ui_status,
  null as destination_url,
  'c_b' as created_by,
  now() as creation_date,
  'l_u' as last_updated_by,
  now() as last_updated_date,
  0 as pr,
  0 as ts,
  x as uniqueness_hash_v2,
  x % 100 as pt,
  0 as history,
  null as t_secondary
from generate_series(1,100) x;

create index"i_agi_tc_tcn" on tc (ag_id, tname);
create index"i_cid_agid_tcn" on tc (c_id, ag_id, tname);
create index"i_tc_adid_tid" on tc (a_id, id);
create index"i_tc_advertiser_id" on tc (a_id);
create index"i_tc_campaign_id" on tc (c_id);
create index"i_tc_lud_agi" on tc (last_updated_date, ag_id);
create index"i_tc_uniqueness_hash_v2" on tc (uniqueness_hash_v2);

vacuum analyze tc;

prepare p as 
SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)));

explain verbose execute p(0, 1, 42, 4,5,6);


Re: Incorrect index used in few cases..

2019-06-18 Thread Peter Geoghegan
On Tue, Jun 18, 2019 at 2:08 PM Tom Lane  wrote:
> Are you perhaps running with non-default values for any planner cost
> parameters?  Or it's not a stock build of Postgres?
>
> If you could find a way to adjust the attached example so that it
> produces the same misbehavior you see with live data, that would be
> very interesting ...

FWIW, if you move the CREATE INDEX statements before the INSERT, and
compared earlier versions of Postgres to 12, you'll see that the size
of some of the indexes are a lot smaller on 12.

v11 (representative of 9.6):

pg@tc:5411 [1067]=# \di+ i_*
   List of relations
 Schema │  Name   │ Type  │ Owner │ Table │ Size  │ Description
┼─┼───┼───┼───┼───┼─
 public │ i_agi_tc_tcn│ index │ pg│ tc│ 74 MB │
 public │ i_cid_agid_tcn  │ index │ pg│ tc│ 82 MB │
 public │ i_tc_adid_tid   │ index │ pg│ tc│ 57 MB │
 public │ i_tc_advertiser_id  │ index │ pg│ tc│ 27 MB │
 public │ i_tc_campaign_id│ index │ pg│ tc│ 28 MB │
 public │ i_tc_lud_agi│ index │ pg│ tc│ 57 MB │
 public │ i_tc_uniqueness_hash_v2 │ index │ pg│ tc│ 21 MB │
(7 rows)

v12/master:

pg@regression:5432 [1022]=# \di+ i_*
   List of relations
 Schema │  Name   │ Type  │ Owner │ Table │ Size  │ Description
┼─┼───┼───┼───┼───┼─
 public │ i_agi_tc_tcn│ index │ pg│ tc│ 69 MB │
 public │ i_cid_agid_tcn  │ index │ pg│ tc│ 78 MB │
 public │ i_tc_adid_tid   │ index │ pg│ tc│ 36 MB │
 public │ i_tc_advertiser_id  │ index │ pg│ tc│ 20 MB │
 public │ i_tc_campaign_id│ index │ pg│ tc│ 24 MB │
 public │ i_tc_lud_agi│ index │ pg│ tc│ 30 MB │
 public │ i_tc_uniqueness_hash_v2 │ index │ pg│ tc│ 21 MB │
(7 rows)

Note that i_tc_lud_agi is 30 MB, not 57MB, and that i_tc_adid_tid is
36 MB, not 57 MB.

I can see that both i_tc_lud_agi and i_tc_adid_tid consistently use
the "split after new tuple" optimization on v12.

-- 
Peter Geoghegan


Re: Incorrect index used in few cases..

2019-06-18 Thread Andres Freund
Hi,

On 2019-06-18 06:11:54 -0700, AminPG Jaffer wrote:
> We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing
> following issue that occurs for few cases.
> 
> We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing
> following issue that occurs for few cases.
> 
> I have tried running analyze on the table with different values from 1000 -
> 5000 but it doesn't seem to help the issue. There is some skew in a_id
> but the combination index  i_tc_adid_tid btree (a_id, id) makes the index
> unique as it includes primary key.
> 
> Is there an explanation why it is using incorrect index?
> 
> SQL:
> SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))
> 
> Indexes on the table:
>i_tc_adid_tid btree (a_id, id)
>pk_id PRIMARY KEY, btree (id)
>i_agi_tc_tcn btree (ag_id, tname)  ---> index that gets used

Are those indexes used for other queries? Any chance they've been
recently created?

SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
indislive, txid_current(), txid_current_snapshot()
FROM pg_index WHERE indrelid = 'tc'::regclass;

might tell us.


On 2019-06-18 17:07:55 -0400, Tom Lane wrote:
> I'm pretty baffled.  I tried to duplicate the problem with some dummy
> data (as attached) and could not.  In my hands, it wants to use the
> i_tc_adid_tid index, or if I drop that then the pkey index, and any
> other possible plan is orders of magnitude more expensive than those.

> Another far-fetched theory is that the theoretically-better indexes
> are so badly bloated as to discourage the planner from using them.
> You could eliminate that one by checking the index sizes with "\di+".
> 
> Are you perhaps running with non-default values for any planner cost
> parameters?  Or it's not a stock build of Postgres?
> 
> If you could find a way to adjust the attached example so that it
> produces the same misbehavior you see with live data, that would be
> very interesting ...

Amin, might be worth to see what the query plan is if you disable that
index. I assume it's too big to quickly drop (based on the ?

Something like:

BEGIN;
LOCK tc;
UPDATE pg_index SET indisvalid = false WHERE indexrelid = 
'name_of_index'::regclass AND indisvalid;
EXPLAIN yourquery;
ROLLBACK;

might allow to test that without actually dropping the index. But that
of course requires superuser access.

Greetings,

Andres Freund




Re: Incorrect index used in few cases..

2019-06-18 Thread Tom Lane
Andres Freund  writes:
> Are those indexes used for other queries? Any chance they've been
> recently created?

> SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
> indislive, txid_current(), txid_current_snapshot()
> FROM pg_index WHERE indrelid = 'tc'::regclass;

> might tell us.

Oh, that's a good idea.

> Amin, might be worth to see what the query plan is if you disable that
> index. I assume it's too big to quickly drop (based on the ?

Considering that the "right" query plan would have a cost estimate in
the single digits or close to it, I have to suppose that the planner is
rejecting that index as unusable, not making a cost-based decision not
to use it.  (Well, maybe if it's bloated by three orders of magnitude
compared to the other indexes, it'd lose on cost.  Doesn't seem likely
though.)

So I think we're looking for a hard "can't use the index" reason, and
now we've eliminated datatype mismatch which'd be the most obvious
such reason.  But index-isnt-valid or index-isnt-ready might do the
trick.

regards, tom lane