Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread MichaelDBA
I too am a bit perplexed by why runtime partition pruning does not seem 
to work with this example.  Anybody got any ideas of this?


Regards,
Michael Vitale

Thomas Kellerer wrote on 8/2/2019 9:58 AM:

I stumbled across this question on SO: 
https://stackoverflow.com/questions/56517852

Disregarding the part about Postgres 9.3, the example for Postgres 11 looks a 
bit confusing.

There is a script to setup test data in that question:

 start of script 

 create table foo (
 foo_id integer not null,
 foo_name varchar(10),
 constraint foo_pkey primary key (foo_id)
 );

 insert into foo
   (foo_id, foo_name)
 values
   (1, 'eeny'),
   (2, 'meeny'),
   (3, 'miny'),
   (4, 'moe'),
   (5, 'tiger'),
   (6, 'toe');

 create table foo_bar_baz (
 foo_id integer not null,
 bar_id integer not null,
 bazinteger not null,
 constraint foo_bar_baz_pkey primary key (foo_id, bar_id, baz),
 constraint foo_bar_baz_fkey1 foreign key (foo_id)
 references foo (foo_id)
 ) partition by range (foo_id)
 ;

 create table if not exists foo_bar_baz_0 partition of foo_bar_baz for 
values from (0) to (1);
 create table if not exists foo_bar_baz_1 partition of foo_bar_baz for 
values from (1) to (2);
 create table if not exists foo_bar_baz_2 partition of foo_bar_baz for 
values from (2) to (3);
 create table if not exists foo_bar_baz_3 partition of foo_bar_baz for 
values from (3) to (4);
 create table if not exists foo_bar_baz_4 partition of foo_bar_baz for 
values from (4) to (5);
 create table if not exists foo_bar_baz_5 partition of foo_bar_baz for 
values from (5) to (6);

 with foos_and_bars as (
 select ((random() * 4) + 1)::int as foo_id, bar_id::int
 from generate_series(0, 1499) as t(bar_id)
 ), bazzes as (
 select baz::int
 from generate_series(1, 1500) as t(baz)
 )
 insert into foo_bar_baz (foo_id, bar_id, baz)
 select foo_id, bar_id, baz
 from bazzes as bz
   join foos_and_bars as fab on mod(bz.baz, fab.foo_id) = 0;

 end of script 

I see the some strange behaviour similar to to what is reported in the comments 
to that question:

When I run the test query immediately after populating the tables with the 
sample data:

 explain analyze
 select count(*)
 from foo_bar_baz as fbb
   join foo on fbb.foo_id = foo.foo_id
 where foo.foo_name = 'eeny'

I do see an "Index Only Scan  (never executed)" in the plan for the 
irrelevant partitions:

   https://explain.depesz.com/s/AqlE

However once I run "analyze foo_bar_baz" (or "vacuum analyze"), Postgres chooses to do a 
"Parallel Seq Scan" for each partition:

   https://explain.depesz.com/s/WwxE

Why does updating the statistics mess up (runtime) partition pruning?


I played around with random_page_cost and that didn't change anything.
I tried to create extended statistics on "foo(id, name)" so that the planner 
would no, that there is only one name per id. No change.

I saw the above behaviour when running this on Windows 10 (my Laptop) or CentOS 
7 (a test environment on a VM)

On the CentOS server default_statistics_target is set to 100, on my laptop it 
is set to 1000

In both cases the Postgres version was 11.4

Any ideas?

Thomas








Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread Andreas Kretschmer

Hi,


Am 03.08.19 um 15:16 schrieb MichaelDBA:
I too am a bit perplexed by why runtime partition pruning does not 
seem to work with this example.  Anybody got any ideas of this? 



please don't top-posting.

it's posible to rewrite the query to:


test=# explain analyse select count(*) from foo_bar_baz as fbb where 
foo_id = (select foo_id from foo where foo_name = 'eeny');

QUERY PLAN
---
 Finalize Aggregate  (cost=15880.63..15880.64 rows=1 width=8) (actual 
time=48.447..48.448 rows=1 loops=1)

   InitPlan 1 (returns $0)
 ->  Seq Scan on foo  (cost=0.00..24.50 rows=6 width=4) (actual 
time=0.243..0.246 rows=1 loops=1)

   Filter: ((foo_name)::text = 'eeny'::text)
   Rows Removed by Filter: 5
   ->  Gather  (cost=15855.92..15856.13 rows=2 width=8) (actual 
time=48.376..51.468 rows=3 loops=1)

 Workers Planned: 2
 Params Evaluated: $0
 Workers Launched: 2
 ->  Partial Aggregate  (cost=14855.92..14855.93 rows=1 
width=8) (actual time=42.600..42.600 rows=1 loops=3)
   ->  Parallel Append  (cost=0.00..13883.01 rows=389162 
width=0) (actual time=0.139..34.914 rows=83500 loops=3)
 ->  Parallel Bitmap Heap Scan on foo_bar_baz_0 
fbb  (cost=4.23..14.73 rows=6 width=0) (never executed)

   Recheck Cond: (foo_id = $0)
   ->  Bitmap Index Scan on foo_bar_baz_0_pkey  
(cost=0.00..4.23 rows=10 width=0) (never executed)

 Index Cond: (foo_id = $0)
 ->  Parallel Seq Scan on foo_bar_baz_2 fbb_2  
(cost=0.00..3865.72 rows=178218 width=0) (never executed)

   Filter: (foo_id = $0)
 ->  Parallel Seq Scan on foo_bar_baz_1 fbb_1  
(cost=0.00..3195.62 rows=147250 width=0) (actual time=0.129..24.735 
rows=83500 loops=3)

   Filter: (foo_id = $0)
 ->  Parallel Seq Scan on foo_bar_baz_3 fbb_3  
(cost=0.00..2334.49 rows=107559 width=0) (never executed)

   Filter: (foo_id = $0)
 ->  Parallel Seq Scan on foo_bar_baz_4 fbb_4  
(cost=0.00..1860.95 rows=85756 width=0) (never executed)

   Filter: (foo_id = $0)
 ->  Parallel Seq Scan on foo_bar_baz_5 fbb_5  
(cost=0.00..665.69 rows=30615 width=0) (never executed)

   Filter: (foo_id = $0)
 Planning Time: 12.648 ms
 Execution Time: 52.621 ms
(27 rows)

test=*#


I know, that's not a solution, but a workaround. :-(

(pg 12beta2 and also with PostgreSQL 11.4 (2ndQPG 11.4r1.6.7))



Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread Thomas Kellerer

it's posible to rewrite the query to:


test=# explain analyse select count(*) from foo_bar_baz as fbb where foo_id = 
(select foo_id from foo where foo_name = 'eeny');

I know, that's not a solution, but a workaround. :-(


Yes, I discovered that as well.

But I'm more confused (or concerned) by the fact that the (original) query 
works correctly *without* statistics.

Thomas







Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread Andreas Kretschmer




Am 03.08.19 um 16:06 schrieb Thomas Kellerer:

it's posible to rewrite the query to:


test=# explain analyse select count(*) from foo_bar_baz as fbb where 
foo_id = (select foo_id from foo where foo_name = 'eeny');


I know, that's not a solution, but a workaround. :-(


Yes, I discovered that as well.

But I'm more confused (or concerned) by the fact that the (original) 
query works correctly *without* statistics.


Thomas




can't reproduce that :-(  (PG 11.4 Community)

(all in a file and executed the explain immediately)


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread Tom Lane
Andreas Kretschmer  writes:
> Am 03.08.19 um 16:06 schrieb Thomas Kellerer:
>> But I'm more confused (or concerned) by the fact that the (original) 
>> query works correctly *without* statistics.

> can't reproduce that :-(  (PG 11.4 Community)

Yeah, I get the same plan with or without ANALYZE, too.  In this example,
having the ANALYZE stats barely moves the rowcount estimates for
foo_bar_baz at all, so it's not surprising that the plan doesn't change.
(I do wonder how Thomas got a different outcome...)

Given the shape of the preferred plan:

 Finalize Aggregate  (cost=15779.59..15779.60 rows=1 width=8) (actual 
time=160.329..160.330 rows=1 loops=1)
   ->  Gather  (cost=15779.38..15779.59 rows=2 width=8) (actual 
time=160.011..161.712 rows=3 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=14779.38..14779.39 rows=1 width=8) 
(actual time=154.675..154.675 rows=1 loops=3)
   ->  Hash Join  (cost=1.09..14612.90 rows=66590 width=0) (actual 
time=86.814..144.793 rows=100500 loops=3)
 Hash Cond: (fbb_1.foo_id = foo.foo_id)
 ->  Parallel Append  (cost=0.00..12822.21 rows=399537 
width=4) (actual time=0.019..95.644 rows=318950 loops=3)
   ->  Parallel Seq Scan on foo_bar_baz_1 fbb_1  
(cost=0.00..3403.53 rows=177353 width=4) (actual time=0.012..18.881 rows=100500 
loops=3)
   ->  Parallel Seq Scan on foo_bar_baz_2 fbb_2  
(cost=0.00..3115.53 rows=162353 width=4) (actual time=0.018..51.716 rows=276000 
loops=1)
   ->  Parallel Seq Scan on foo_bar_baz_3 fbb_3  
(cost=0.00..2031.82 rows=105882 width=4) (actual time=0.011..16.854 rows=9 
loops=2)
   ->  Parallel Seq Scan on foo_bar_baz_4 fbb_4  
(cost=0.00..1584.00 rows=82500 width=4) (actual time=0.011..26.950 rows=140250 
loops=1)
   ->  Parallel Seq Scan on foo_bar_baz_5 fbb_5  
(cost=0.00..667.65 rows=34765 width=4) (actual time=0.014..11.896 rows=59100 
loops=1)
   ->  Parallel Seq Scan on foo_bar_baz_0 fbb  
(cost=0.00..22.00 rows=1200 width=4) (actual time=0.001..0.001 rows=0 loops=1)
 ->  Hash  (cost=1.07..1.07 rows=1 width=4) (actual 
time=0.038..0.038 rows=1 loops=3)
   Buckets: 1024  Batches: 1  Memory Usage: 9kB
   ->  Seq Scan on foo  (cost=0.00..1.07 rows=1 
width=4) (actual time=0.021..0.023 rows=1 loops=3)
 Filter: ((foo_name)::text = 'eeny'::text)
 Rows Removed by Filter: 5

it's obvious that no pruning can happen, run-time or otherwise,
because the partitioned table is being scanned on the outside
of the join --- so the target value of foo_id isn't available.

We can force the planner to its second best choice with
set enable_hashjoin to 0;

and then we get

 Aggregate  (cost=31954.09..31954.10 rows=1 width=8) (actual 
time=420.158..420.158 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..31554.55 rows=159815 width=0) (actual 
time=0.058..389.974 rows=301500 loops=1)
 Join Filter: (fbb.foo_id = foo.foo_id)
 Rows Removed by Join Filter: 655350
 ->  Seq Scan on foo  (cost=0.00..1.07 rows=1 width=4) (actual 
time=0.025..0.028 rows=1 loops=1)
   Filter: ((foo_name)::text = 'eeny'::text)
   Rows Removed by Filter: 5
 ->  Append  (cost=0.00..19567.35 rows=958890 width=4) (actual 
time=0.026..280.510 rows=956850 loops=1)
   ->  Seq Scan on foo_bar_baz_0 fbb  (cost=0.00..30.40 rows=2040 
width=4) (actual time=0.003..0.003 rows=0 loops=1)
   ->  Seq Scan on foo_bar_baz_1 fbb_1  (cost=0.00..4645.00 
rows=301500 width=4) (actual time=0.022..57.836 rows=301500 loops=1)
   ->  Seq Scan on foo_bar_baz_2 fbb_2  (cost=0.00..4252.00 
rows=276000 width=4) (actual time=0.019..51.834 rows=276000 loops=1)
   ->  Seq Scan on foo_bar_baz_3 fbb_3  (cost=0.00..2773.00 
rows=18 width=4) (actual time=0.016..31.951 rows=18 loops=1)
   ->  Seq Scan on foo_bar_baz_4 fbb_4  (cost=0.00..2161.50 
rows=140250 width=4) (actual time=0.015..24.392 rows=140250 loops=1)
   ->  Seq Scan on foo_bar_baz_5 fbb_5  (cost=0.00..911.00 
rows=59100 width=4) (actual time=0.012..10.252 rows=59100 loops=1)

This is a good deal slower, and the planner correctly estimates that it's
a good deal slower, so that's why it didn't get picked.

But ... why didn't any run-time pruning happen?  Because the shape of the
plan is still wrong: the join condition is being applied at the nestloop
node.  If we'd pushed down the foo_id condition to the foo_bar_baz scan
then there'd be hope of pruning.

I think the reason that that isn't happening is that the planner has
not been taught that run-time pruning is a thing, so it's not giving
any cost preference to doing things in a way that would enable that.

Re: Strange runtime partition pruning behaviour with 11.4

2019-08-03 Thread MichaelDBA
I too got the same plan (non runtime partition pruning plan) with or 
without the statistics.  So it looks like the workaround until this is 
fixed is to re-arrange the query to do a subselect to force the runtime 
partition pruning as Andreas suggested, which I tested and indeed does 
work for me too!


Regards,
Michael Vitale

Thomas Kellerer wrote on 8/3/2019 10:06 AM:

it's posible to rewrite the query to:


test=# explain analyse select count(*) from foo_bar_baz as fbb where 
foo_id = (select foo_id from foo where foo_name = 'eeny');


I know, that's not a solution, but a workaround. :-(


Yes, I discovered that as well.

But I'm more confused (or concerned) by the fact that the (original) 
query works correctly *without* statistics.


Thomas