Re: force partition pruning

Vijaykumar Jain Tue, 11 May 2021 11:34:06 -0700

ok i think i just may be there is very less data , hence no index scan, no
pruning.


when i try to force seq_scan off,

postgres=# set enable_seqscan TO off;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=0.88..20.98 rows=2 width=4) (actual
time=0.031..0.042 rows=2 loops=1)
   Merge Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=0.75..28.89 rows=7 width=4) (actual time=0.024..0.032
rows=3 loops=1)
         ->  Index Only Scan using tprt1_idx on tprt_1  (cost=0.13..8.16
rows=2 width=4) (actual time=0.023..0.024 rows=2 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using tprt2_idx on tprt_2  (cost=0.12..4.14
rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using tprt3_idx on tprt_3  (cost=0.12..4.14
rows=1 width=4) (never executed)
               Heap Fetches: 0
         ->  Index Only Scan using tprt4_idx on tprt_4  (cost=0.12..4.14
rows=1 width=4) (never executed)
               Heap Fetches: 0
         ->  Index Only Scan using tprt5_idx on tprt_5  (cost=0.12..4.14
rows=1 width=4) (never executed)
               Heap Fetches: 0
         ->  Index Only Scan using tprt6_idx on tprt_6  (cost=0.12..4.14
rows=1 width=4) (never executed)
               Heap Fetches: 0
   ->  Index Scan using tbl1_col1_idx on tbl1  (cost=0.13..12.16 rows=2
width=4) (actual time=0.006..0.007 rows=1 loops=1)
         Filter: (col2 = ANY ('{1,2}'::integer[]))
         Rows Removed by Filter: 1
 Planning Time: 0.244 ms
 Execution Time: 0.067 ms
(20 rows)

postgres=# set enable_seqscan TO on;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1.05..7.20 rows=2 width=4) (actual time=0.016..0.028
rows=2 loops=1)
   Hash Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.005..0.013
rows=7 loops=1)
         ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.004..0.005 rows=2 loops=1)
         ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.002 rows=1 loops=1)
         ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
         ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
         ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
         ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.001..0.001 rows=1 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.007
rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.004..0.005 rows=1 loops=1)
               Filter: (col2 = ANY ('{1,2}'::integer[]))
               Rows Removed by Filter: 1
 Planning Time: 0.236 ms
 Execution Time: 0.048 ms
(16 rows)

It then takes a costlier option of index scan, but it helps me understand
pruning does take place.

now resetting to enable_seqscan to on and filling with lot of data.

********************************************
insert into tprt select f[(random() * 100)::int % 6 + 1] from (select
ARRAY(select col1 from tprt) as f) g, generate_series(1, 100000) x;
vacuum analyze tprt;

postgres=# show enable_seqscan;
 enable_seqscan
----------------
 on
(1 row)

postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
                                                                 QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=1.74..1217.88 rows=33336 width=4) (actual
time=0.019..8.026 rows=33394 loops=1)
   Merge Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=1.56..2369.70 rows=100007 width=4) (actual
time=0.012..4.609 rows=33395 loops=1)
         ->  Index Only Scan using tprt1_idx on tprt_1  (cost=0.29..617.20
rows=33394 width=4) (actual time=0.011..2.607 rows=33394 loops=1)
               Heap Fetches: 0
         ->  Index Only Scan using tprt2_idx on tprt_2  (cost=0.29..320.14
rows=17057 width=4) (actual time=0.009..0.009 rows=1 loops=1)
               Heap Fetches: 0
*         ->  Index Only Scan using tprt3_idx on tprt_3  (cost=0.29..320.37
rows=17072 width=4) (never executed)*
*               Heap Fetches: 0*
*         ->  Index Only Scan using tprt4_idx on tprt_4  (cost=0.29..306.20
rows=16394 width=4) (never executed)*
*               Heap Fetches: 0*
*         ->  Index Only Scan using tprt5_idx on tprt_5  (cost=0.29..301.62
rows=16089 width=4) (never executed)*
*               Heap Fetches: 0*
*         ->  Index Only Scan using tprt6_idx on tprt_6  (cost=0.12..4.14
rows=1 width=4) (never executed)*
               Heap Fetches: 0
   ->  Index Scan using tbl1_col1_idx on tbl1  (cost=0.13..12.16 rows=2
width=4) (actual time=0.006..0.008 rows=1 loops=1)
         Filter: (col2 = ANY ('{1,2}'::integer[]))
         Rows Removed by Filter: 1
 Planning Time: 0.239 ms
 Execution Time: 9.129 ms
(20 rows)

postgres=# set enable_indexonlyscan TO off;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=1.75..1372.21 rows=33336 width=4) (actual
time=0.018..9.624 rows=33394 loops=1)
   Merge Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=1.56..2833.70 rows=100007 width=4) (actual
time=0.012..6.048 rows=33395 loops=1)
         ->  Index Scan using tprt1_idx on tprt_1  (cost=0.29..768.20
rows=33394 width=4) (actual time=0.012..4.117 rows=33394 loops=1)
         ->  Index Scan using tprt2_idx on tprt_2  (cost=0.29..399.14
rows=17057 width=4) (actual time=0.007..0.007 rows=1 loops=1)
         ->  Index Scan using tprt3_idx on tprt_3  (cost=0.29..399.37
rows=17072 width=4) (never executed)
         ->  Index Scan using tprt4_idx on tprt_4  (cost=0.29..382.20
rows=16394 width=4) (never executed)
         ->  Index Scan using tprt5_idx on tprt_5  (cost=0.29..376.62
rows=16089 width=4) (never executed)
         ->  Index Scan using tprt6_idx on tprt_6  (cost=0.12..8.14 rows=1
width=4) (never executed)
   ->  Index Scan using tbl1_col1_idx on tbl1  (cost=0.13..12.16 rows=2
width=4) (actual time=0.004..0.005 rows=1 loops=1)
         Filter: (col2 = ANY ('{1,2}'::integer[]))
         Rows Removed by Filter: 1
 Planning Time: 0.237 ms
 Execution Time: 10.634 ms
(14 rows)

postgres=# set enable_indexscan TO off;
SET
postgres=# explain analyze select * from tprt where tprt.col1 in (select
tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1.05..2580.54 rows=33336 width=4) (actual
time=0.034..21.374 rows=33394 loops=1)
   Hash Cond: (tprt.col1 = tbl1.col1)
   ->  Append  (cost=0.00..1946.11 rows=100007 width=4) (actual
time=0.006..11.179 rows=100007 loops=1)
         ->  Seq Scan on tprt_1  (cost=0.00..481.94 rows=33394 width=4)
(actual time=0.006..2.342 rows=33394 loops=1)
         ->  Seq Scan on tprt_2  (cost=0.00..246.57 rows=17057 width=4)
(actual time=0.003..0.886 rows=17057 loops=1)
         ->  Seq Scan on tprt_3  (cost=0.00..246.72 rows=17072 width=4)
(actual time=0.003..0.897 rows=17072 loops=1)
         ->  Seq Scan on tprt_4  (cost=0.00..236.94 rows=16394 width=4)
(actual time=0.003..0.876 rows=16394 loops=1)
         ->  Seq Scan on tprt_5  (cost=0.00..232.89 rows=16089 width=4)
(actual time=0.003..0.797 rows=16089 loops=1)
         ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
time=0.003..0.003 rows=1 loops=1)
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006
rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.004..0.005 rows=1 loops=1)
               Filter: (col2 = ANY ('{1,2}'::integer[]))
               Rows Removed by Filter: 1
 Planning Time: 0.233 ms
 Execution Time: 22.428 ms
(16 rows)

********************************************

so i guess it works. I should not try to be smarter than the optimizer :)


On Tue, 11 May 2021 at 22:59, Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

> ok,
>
> partitioning - Can PostgreSQL 12 do partition pruning at execution time
> with subquery returning a list? - Stack Overflow
> <https://stackoverflow.com/questions/61111395/can-postgresql-12-do-partition-pruning-at-execution-time-with-subquery-returning>
>
> ok forcing hash join off, did not work as the outer table was the
> partitioned table selected.
>
>
> On Tue, 11 May 2021 at 22:42, Vijaykumar Jain <
> vijaykumarjain.git...@gmail.com> wrote:
>
>> Ok. maybe you are in a rush.
>>
>> But I would keep the thread open, to understand what I am not
>> understanding or else, it'll become a habit of converting sql to plpgsql :)
>>
>> Big Guys,
>> It seems, when the table is partitioned by range, it makes use of a
>> nested loop which helps in partition pruning.
>> if the table is list partitioned, it scans all the partitions.
>>
>> Is this expected ?
>>
>>
>>
>> LIST BASED PARTITION
>> **********************
>> postgres@go:~$ more p.sql
>> drop table tbl1;
>> drop table tprt;
>>
>> create table tbl1(col1 int);
>> insert into tbl1 values (501), (505);
>>
>> -- Basic table
>> create table tprt (col1 int) partition by list (col1);
>> create table tprt_1 partition of tprt for values in (501);
>> create table tprt_2 partition of tprt for values in (1001);
>> create table tprt_3 partition of tprt for values in  (2001);
>> create table tprt_4 partition of tprt for values in  (3001);
>> create table tprt_5 partition of tprt for values in  (4001);
>> create table tprt_6 partition of tprt for values in  (5001);
>>
>> create index tprt1_idx on tprt_1 (col1);
>> create index tprt2_idx on tprt_2 (col1);
>> create index tprt3_idx on tprt_3 (col1);
>> create index tprt4_idx on tprt_4 (col1);
>> create index tprt5_idx on tprt_5 (col1);
>> create index tprt6_idx on tprt_6 (col1);
>>
>> insert into tprt values (501), (1001), (2001), (3001), (4001), (5001),
>> (501);
>>
>> alter table tbl1 add column col2 int default 0;
>> update tbl1 set col2 =1 where col1 = 501;
>>
>> vacuum analyze tprt;
>> vacuum analyze tbl1;
>>
>> explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
>> from tbl1 where tbl1.col2 in (1, 2) );
>>
>>                                                  QUERY PLAN
>>
>> ------------------------------------------------------------------------------------------------------------
>>  Hash Semi Join  (cost=1.05..7.20 rows=2 width=4) (actual
>> time=0.028..0.034 rows=2 loops=1)
>>    Hash Cond: (tprt.col1 = tbl1.col1)
>>    ->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.003..0.008
>> rows=7 loops=1)
>>          ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
>> time=0.002..0.003 rows=2 loops=1)
>>          ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.000..0.000 rows=1 loops=1)
>>          ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>    ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.006..0.006
>> rows=1 loops=1)
>>          Buckets: 1024  Batches: 1  Memory Usage: 9kB
>>          ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
>> time=0.003..0.003 rows=1 loops=1)
>>                Filter: (col2 = ANY ('{1,2}'::integer[]))
>>                Rows Removed by Filter: 1
>>  Planning Time: 0.237 ms
>>  Execution Time: 0.060 ms
>>
>>
>> *even if i set hashjoin off*
>>
>> postgres=# set enable_hashjoin TO 0;
>> SET
>> postgres=# explain analyze select * from tprt where tprt.col1 in (select
>> tbl1.col1 from tbl1 where tbl1.col2 in (1, 2) );
>>                                                  QUERY PLAN
>>
>> ------------------------------------------------------------------------------------------------------------
>>  Nested Loop Semi Join  (cost=0.00..7.34 rows=2 width=4) (actual
>> time=0.013..0.023 rows=2 loops=1)
>>    Join Filter: (tprt.col1 = tbl1.col1)
>>    Rows Removed by Join Filter: 5
>>    ->  Append  (cost=0.00..6.10 rows=7 width=4) (actual time=0.004..0.010
>> rows=7 loops=1)
>>          ->  Seq Scan on tprt_1  (cost=0.00..1.02 rows=2 width=4) (actual
>> time=0.003..0.003 rows=2 loops=1)
>>          ->  Seq Scan on tprt_2  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_4  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_5  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>          ->  Seq Scan on tprt_6  (cost=0.00..1.01 rows=1 width=4) (actual
>> time=0.001..0.001 rows=1 loops=1)
>>    ->  Materialize  (cost=0.00..1.03 rows=2 width=4) (actual
>> time=0.001..0.001 rows=1 loops=7)
>>          ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4) (actual
>> time=0.007..0.007 rows=1 loops=1)
>>                Filter: (col2 = ANY ('{1,2}'::integer[]))
>>                Rows Removed by Filter: 1
>>  Planning Time: 0.578 ms
>>  Execution Time: 0.038 ms
>> (16 rows)
>>
>> **********************
>>
>>
>>
>> RANGE BASED PARTITION
>> **********************
>> postgres@go:~$ more q.sql
>> drop table tbl1;
>> drop table tprt;
>> create table tbl1(col1 int);
>> insert into tbl1 values (501), (505);
>>
>> -- Basic table
>> create table tprt (col1 int) partition by range(col1);
>> create table tprt_1 partition of tprt for values from (0) to (500);
>> create table tprt_2 partition of tprt for values from (500) to (1000);
>> create table tprt_3 partition of tprt for values from (1000) to (1500);
>> create table tprt_4 partition of tprt for values from (1500) to (2000);
>> create table tprt_5 partition of tprt for values from (2000) to (22500);
>>
>> create index tprt1_idx on tprt_1 (col1);
>> create index tprt2_idx on tprt_2 (col1);
>> create index tprt3_idx on tprt_3 (col1);
>> create index tprt4_idx on tprt_4 (col1);
>> create index tprt5_idx on tprt_5 (col1);
>>
>> insert into tprt values (501), (1001), (2001), (3001), (4001), (5001),
>> (501);
>>
>> vacuum analyze tbl1;
>> vacuum analyze tprt;
>>
>> alter table tbl1 add column col2 int default 0;
>> update tbl1 set col2 =1 where col1 = 501;
>>
>> explain analyze select * from tprt where tprt.col1 in (select tbl1.col1
>> from tbl1 where tbl1.col2 in (1, 2) );
>>
>>                                                    QUERY PLAN
>>
>> ----------------------------------------------------------------------------------------------------------------
>>  Nested Loop  (cost=3.29..55.37 rows=2 width=4) (actual time=0.016..0.018
>> rows=2 loops=1)
>>    ->  Unique  (cost=1.03..1.04 rows=2 width=4) (actual time=0.012..0.012
>> rows=1 loops=1)
>>          ->  Sort  (cost=1.03..1.04 rows=2 width=4) (actual
>> time=0.011..0.011 rows=1 loops=1)
>>                Sort Key: tbl1.col1
>>                Sort Method: quicksort  Memory: 25kB
>>                ->  Seq Scan on tbl1  (cost=0.00..1.02 rows=2 width=4)
>> (actual time=0.004..0.005 rows=1 loops=1)
>>                      Filter: (col2 = ANY ('{1,2}'::integer[]))
>>                      Rows Removed by Filter: 1
>>    ->  Append  (cost=2.26..26.86 rows=30 width=4) (actual
>> time=0.003..0.004 rows=2 loops=1)
>>          ->  Bitmap Heap Scan on tprt_1  (cost=2.26..11.81 rows=13
>> width=4) (never executed)
>>                Recheck Cond: (col1 = tbl1.col1)
>>                ->  Bitmap Index Scan on tprt1_idx  (cost=0.00..2.25
>> rows=13 width=0) (never executed)
>>                      Index Cond: (col1 = tbl1.col1)
>>          ->  Seq Scan on tprt_2  (cost=0.00..1.02 rows=2 width=4) (actual
>> time=0.001..0.002 rows=2 loops=1)
>>                Filter: (tbl1.col1 = col1)
>>          ->  Seq Scan on tprt_3  (cost=0.00..1.01 rows=1 width=4) (never
>> executed)
>>                Filter: (tbl1.col1 = col1)
>>          ->  Bitmap Heap Scan on tprt_4  (cost=2.26..11.81 rows=13
>> width=4) (never executed)
>>                Recheck Cond: (col1 = tbl1.col1)
>>                ->  Bitmap Index Scan on tprt4_idx  (cost=0.00..2.25
>> rows=13 width=0) (never executed)
>>                      Index Cond: (col1 = tbl1.col1)
>>          ->  Seq Scan on tprt_5  (cost=0.00..1.05 rows=1 width=4) (never
>> executed)
>>                Filter: (tbl1.col1 = col1)
>>  Planning Time: 0.214 ms
>>  Execution Time: 0.069 ms
>> (25 rows)
>>
>> **********************
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, 11 May 2021 at 17:44, Niels Jespersen <n...@dst.dk> wrote:
>>
>>> >
>>>
>>> >Sorry,
>>>
>>> >
>>>
>>> >I made a major mistake. I somehow saw the period and period_version as
>>> the same.
>>>
>>> >so, yes partitions are not pruned here. So my suggestion makes no
>>> sense.
>>>
>>>
>>>
>>> Thats quite ok. I think my plan now is to have a table returning
>>> function that executes a query dynamically. The query has a where caluse
>>> that is first constructed.
>>>
>>>
>>>
>>> Like this:
>>>
>>>
>>>
>>>
>>>
>>> return query execute format('select d.x, d.y from %1$I.%1$I d where
>>> d.period_version = any(' || quote_literal(_periode_version_array)
>>> ||'::text[])', register_in);
>>>
>>>
>>>
>>> Regards Niels
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>> Vijay
>> Mumbai, India
>>
>
>
> --
> Thanks,
> Vijay
> Mumbai, India
>


-- 
Thanks,
Vijay
Mumbai, India

Re: force partition pruning

Reply via email to