Re: Postgres using the wrong index index

2021-08-12 Thread Matt Dupree
Justin,

The rowcount estimate for the time column is bad for all these plans - do
> you
> know why ?  You're using inheritence - have you analyzed the parent tables
> recently ?
>

Yes. I used ANALYZE before posting, as it's one of the "things to try"
listed in the slow queries wiki. I even ran the queries immediately after
analyzing. No difference. Can you say more about why the bad row estimate
would cause Postgres to use the bigger index? I would expect Postgres to
use the smaller index if it's over-estimating how many rows will be
returned.

Mladen,

You know that you can use pg_hint_plan extension? That way you don't have
> to disable indexes or set session parameters.
>

Thanks for the tip! I didn't know you could use pg_hint_plan to force the
use of certain indexes. For now, I'd like to avoid hinting and fix the
underlying issue.

On Wed, Aug 11, 2021 at 11:45 PM Justin Pryzby  wrote:

> The rowcount estimate for the time column is bad for all these plans - do
> you
> know why ?  You're using inheritence - have you analyzed the parent tables
> recently ?
>
> | Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on
> public.other_events_1004175222 (cost=0.28..1,648,877.92 rows=1,858,891
> width=32) (actual time=1.008..15.245 rows=23 loops=1)
> |Index Cond: ((other_events_1004175222."time" >=
> '162477720'::bigint) AND (other_events_1004175222."time" <=
> '162736920'::bigint))
>
> --
> Justin
>


-- 

K. Matt Dupree

Data Science Engineer
321.754.0526  |  [email protected]


Re: Postgres using the wrong index index

2021-08-12 Thread Justin Pryzby
On Thu, Aug 12, 2021 at 09:38:45AM -0400, Matt Dupree wrote:
> > The rowcount estimate for the time column is bad for all these plans - do 
> > you
> > know why ?  You're using inheritence - have you analyzed the parent tables 
> > recently ?
> 
> Yes. I used ANALYZE before posting, as it's one of the "things to try"
> listed in the slow queries wiki. I even ran the queries immediately after
> analyzing. No difference. Can you say more about why the bad row estimate
> would cause Postgres to use the bigger index? I would expect Postgres to
> use the smaller index if it's over-estimating how many rows will be
> returned.

The overestimate is in the table's "time" column (not index) and applies to all
the plans.  Is either half of the AND estimated correctly?  If you do a query
with only ">=", and a query with only "<=", do either of them give an accurate
rowcount estimate ?

|Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on 
public.other_events_1004175222 (cost=0.28..1,648,877.92 rows=1,858,891 
width=32) (actual time=1.008..15.245 rows=23 loops=1) 
|Index Cond: ((other_events_1004175222."time" >= '162477720'::bigint) AND 
(other_events_1004175222."time" <= '162736920'::bigint))

It seems like postgres expects the scan to return a large number of matching
rows, so tries to use the more selective index which includes the "type"
column.  But "type" is not very selective either (it has only 4 distinct
values), and "time" is not the first column, so it reads a large fraction of
the table, slowly.

Could you check pg_stat_all_tables and be sure the last_analyzed is recent for
both parent and child tables ?

Could you send the histogram bounds for "time" ?
SELECT tablename, attname, inherited, array_length(histogram_bounds,1), 
(histogram_bounds::text::text[])[1], 
(histogram_bounds::text::text[])[array_length(histogram_bounds,1)]
FROM pg_stats ... ;

-- 
Justin