Re: Inaccurate Rows estimate for "Bitmap And" causes Planner to choose wrong join

2020-05-13 Thread Steve Pritchard
Many thanks Justin & Jeff for your replies.

> Presumably the conditions are partially redundant, so loc_id => user_id


Yes you're right. I had overlooked this.

I've done some further testing and this confirms what you say: if the WHERE
columns are independent, then the Planner makes a reasonable estimate of
the number of rows, irrespective of whether it uses a single index or a
"Bitmap And" of two indexes.

I've also tested "create statistics" on Postgres 12:

   - gives a good estimate with WHERE user_id = 'USER123' and loc_id =
     'LOC12345678'
   - but Plan Rows = 5 with WHERE user_id = 'USER123' and loc_id =
     ANY('{LOC12345678}'::text[])
   - Note: if I omit the user_id condition then it gives a good estimate,
     i.e. with WHERE loc_id = ANY('{LOC12345678}'::text[])

So statistics objects don't seem to be able to handle the combination of
dependencies and arrays (at least in 12.2).
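
For reference, this is roughly what I ran (the statistics object name is
mine; table and column names as in the plan excerpt below):

create statistics obs_user_loc_dep (dependencies) on user_id, loc_id
  from observation;
analyze observation;

-- gives a good estimate:
explain select * from observation
 where user_id = 'USER123' and loc_id = 'LOC12345678';

-- still Plan Rows = 5:
explain select * from observation
 where user_id = 'USER123' and loc_id = any('{LOC12345678}'::text[]);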

Steve

On Wed, 6 May 2020 at 19:25, Jeff Janes  wrote:

> On Wed, May 6, 2020 at 12:20 PM Steve Pritchard 
> wrote:
>
>> Version: Postgres 9.6.3 production system (but also tested on Postgres 12)
>>
>> For my query the Planner is sometimes choosing an execution plan that
>> uses "Bitmap And" (depending on the parameters):
>>
>> ->  Bitmap Heap Scan on observation  (cost=484.92..488.93 rows=1
>> width=203) (actual time=233.129..330.886 rows=15636 loops=1)
>>   Recheck Cond: (((user_id)::text = 'USER123'::text) AND ((loc_id)::text
>> = ANY ('{LOC12345678}'::text[])))
>>
>
> If you change " = ANY(array_of_one)" to " = scalar", does that change
> anything?  You might be able to fix this (in v12) using CREATE STATISTICS,
> but I don't know if that mechanism can see through the ANY(array_of_one)
> wrapper.
>
>
>> Note that in cases where the Planner selects a single Index Scan for this
>> query (with different parameters), the Planner makes an accurate estimate
>> of the number of rows and then makes sensible selections of joins (i.e.
>> quick).
>> That is, the issue seems to be with the "Bitmap And".
>>
>
>
> I don't know if this nitpick matters, but I don't think that that is how
> the planner works.  The row estimates work from the top down, not the
> bottom up.  The row estimate of 1 is based on what conditions the bitmap
> heap scan implements, it is not arrived at by combining the estimates from
> the index scans below it.  If it were to change to a different type of node
> but implemented the same conditions, I think it would have the same row
> estimate.
>
>
>>
>> I don't have an index with both user_id & loc_id, as this is one of
>> several different combinations that can arise (it would require quite a few
>> indexes to cover all the possible combinations).
>>
>
> Are you actually experiencing problems with those other combinations as
> well?  If not, I wouldn't worry about solving hypothetical problems.  If
> those other combinations are actually problems and you go with CREATE
> STATISTICS, then you would have to be creating a lot of different
> statistics.  That would still be ugly, but at least the overhead for
> statistics is lower than for indexes.
>
>
>> However if I did have such an index, the planner would presumably be able
>> to use the statistics for user_id and loc_id to estimate the number of rows.
>>
>
> Indexes on physical columns do not have statistics, so making that index
> would not help with the estimation.  (Expressional indexes do have
> statistics, but I don't see that helping you here).   So while this node
> would execute faster with that index, it would still be executing the
> unshown nested loop left join 15,636 times when it thinks it will be doing it
> once, and so would still be slow.  The most robust solution might be to
> make the outer part of that nested loop left join faster, so that your
> system would be more tolerant of statistics problems.
>
>
>>
>> So why can't it make an accurate estimate of the rows with a "Bitmap And"
>> & " Bitmap Heap Scan"? (as above)
>>
>
> In the absence of custom statistics, it assumes the selectivities of user_id
> = 'USER123', of loc_id = ANY ('{LOC12345678}'::text[]), and of taxa =
> 'Birds' are all independent of each other and can be multiplied to arrive
> at the overall selectivity.  But clearly that is not the case.  Bird
> watchers mostly watch near where they live, not in random other places.
>
> Cheers,
>
> Jeff
>
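
To put rough numbers on that independence assumption (all figures below are
made up for illustration):

-- suppose observation has 10,000,000 rows, and that
--   user_id = 'USER123'            matches 20,000 rows -> selectivity 0.002
--   loc_id = ANY('{LOC12345678}')  matches 16,000 rows -> selectivity 0.0016
-- multiplying as if independent:
--   10,000,000 * 0.002 * 0.0016 = 32 estimated rows
-- whereas the plan above actually found 15,636 rows, because this user's
-- observations cluster at this location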


-- 
Steve Pritchard
Database Developer

British Trust for Ornithology, The Nunnery, Thetford, Norfolk IP24 2PU, UK
Tel: +44 (0)1842 750050, fax: +44 (0)1842 750030
Registered Charity No 216652 (England & Wales) No SC039193 (Scotland)
Company Limited by Guarantee No 357284 (England & Wales)


Re: Please help! Query jumps from 1s -> 4m

2020-05-13 Thread James Thompson
Just to follow up on this...
Tried increasing stats targets last week + re-analyzing but the query was
just as bad.
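
For reference, "increasing stats targets" was of this form (column names as
in the plans from my earlier mail; the exact target value we tried isn't the
important part):

alter table table1 alter column col20 set statistics 10000;
alter table table1 alter column col8 set statistics 10000;
analyze table1;
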
Ended up increasing the prepareThreshold to prevent server-side prepares
for now (and thus the later switch to a generic plan). This 'fixed' the
issue and had no noticeable negative effect for our workloads.
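
(With pgjdbc this is just a connection parameter, e.g.
jdbc:postgresql://dbhost/mydb?prepareThreshold=1000000, with made-up host/db
names there; any value larger than the number of executions of a given
statement has the same "never switch to a server-side prepare" effect.)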

I still don't understand why the plan being off makes the query so much
slower in this case (the plans I shared in the last email don't look too
different, and I don't understand how the filter can add two minutes of
execution time to an index-only scan). If anyone does have thoughts on what
could be happening I would be very interested to hear them, but the main
performance problem is effectively solved.

Thanks all for the valuable help getting to the bottom of what was
happening.

On Tue, 5 May 2020 at 22:42, Tom Lane  wrote:

> James Thompson  writes:
> > The slowness occurs when the prepared statement changes to a generic
> > plan.
>
> > Initial plan:
> > ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> > table1alias2  (cost=0.56..2549.70 rows=70 width=36) (actual
> > time=1.901..45.256 rows=65000 loops=1)
> > Output: table1alias2.uniqueid
> > Index Cond: ((table1alias2.col20 = '12345'::bigint) AND
> > (table1alias2.col8 = ANY ('{c5986b02-3a02-4639-8147-f286972413ba,...
> > 98ed24b1-76f5-4b0e-bb94-86cf13a4809c}'::text[])))
> > Heap Fetches: 10
> > Buffers: shared hit=5048
>
> > after 5 executions of the statement:
> > ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1
> > table1alias2  (cost=0.56..17.23 rows=1 width=36) (actual
> > time=125.344..126877.822 rows=65000 loops=1)
> > Output: table1alias2.uniqueid
> > Index Cond: (table1alias2.col20 = $1001)
> > Filter: ((table1alias2.col8)::text = ANY ((ARRAY[$1, ...,
> > $1000])::text[]))
> > Rows Removed by Filter: 2670023
> > Heap Fetches: 428
> > Buffers: shared hit=45933 read=42060 dirtied=4
>
> Yeah, this is a dynamic we've seen before.  The rowcount estimate, and
> hence the cost estimate, for the plan with explicit parameter values is
> way off; but the estimate for the generic plan is even more way off,
> causing the system to falsely decide that the latter is cheaper.
>
> I've speculated about refusing to believe generic cost estimates if they
> are more than epsilon less than the concrete cost estimate, but it's not
> quite clear how that should work or whether it'd have its own failure modes.
>
> The one thing that is totally clear is that these rowcount estimates are
> crappy.  Can you improve them by increasing the stats target for that
> table?  Maybe with less-garbage-y inputs, the system would make the right
> plan choice here.
>
> regards, tom lane
>


Plan not skipping unnecessary inner join

2020-05-13 Thread Matthew Nelson
I noticed something peculiar while optimizing complex views today. The query 
planner does not skip inner joins that, to my understanding, can have no impact 
on the result. Am I missing a situation where these joins could impact the 
result?

The following demonstrates the problem without the complex views. It also 
demonstrates how the planner simplifies a LEFT JOIN in the same situation. The 
left and right sides of an inner join could be swapped, obviously, but here I 
kept the unique constraint on the right.



CREATE TABLE foo (
    id INTEGER PRIMARY KEY
);

CREATE TABLE bar (
    foo_id INTEGER NOT NULL REFERENCES foo
);

-- This simplifies to SELECT COUNT(*) FROM bar;
EXPLAIN SELECT COUNT(*)
FROM bar
LEFT JOIN foo ON bar.foo_id = foo.id;

-- This should simplify to SELECT COUNT(*) FROM bar WHERE foo_id IS NOT NULL;
-- The presence of a NOT NULL constraint on foo_id has no effect.
EXPLAIN SELECT COUNT(*)
FROM bar
INNER JOIN foo ON bar.foo_id = foo.id;



                         QUERY PLAN
-------------------------------------------------------------
 Aggregate  (cost=38.25..38.26 rows=1 width=8)
   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=0)
(2 rows)

                            QUERY PLAN
----------------------------------------------------------------
 Aggregate  (cost=111.57..111.58 rows=1 width=8)
   ->  Hash Join  (cost=67.38..105.92 rows=2260 width=0)
         Hash Cond: (bar.foo_id = foo.id)
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)
         ->  Hash  (cost=35.50..35.50 rows=2550 width=4)
               ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

                                version
------------------------------------------------------------------------
 PostgreSQL 12.2 on x86_64-apple-darwin19.4.0, compiled by Apple clang version 
11.0.3 (clang-1103.0.32.59), 64-bit
(1 row)




Re: Plan not skipping unnecessary inner join

2020-05-13 Thread David Wheeler
> Am I missing a situation where these joins could impact the result?

Yes, it can impact the number of rows in the result. For example, if foo is
empty then Postgres is required to return no results, regardless of how many
rows are in bar. This is why it can ignore the table in the left join but
not the inner join.
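
To see that mechanism without the foreign key getting in the way, here's a
throwaway pair of tables with no REFERENCES clause:

CREATE TABLE foo2 (id INTEGER PRIMARY KEY);
CREATE TABLE bar2 (foo_id INTEGER NOT NULL);

INSERT INTO bar2 VALUES (1);

SELECT COUNT(*) FROM bar2;                                           -- 1
SELECT COUNT(*) FROM bar2 INNER JOIN foo2 ON bar2.foo_id = foo2.id;  -- 0, foo2 is empty
SELECT COUNT(*) FROM bar2 LEFT JOIN foo2 ON bar2.foo_id = foo2.id;   -- 1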

— David

> On 14 May 2020, at 1:44 pm, Matthew Nelson  wrote:
> 
> I noticed something peculiar while optimizing complex views today. The query 
> planner does not skip inner joins that, to my understanding, can have no 
> impact on the result. Am I missing a situation where these joins could impact 
> the result?
> 
> [snip]