Hash Right join and seq scan

2024-07-02 Thread James Pang
 Both tables are hash partition tables ,  and we have a left out join ,
optimizer convert to Hash Right Join,  but it always try to seq scan on
tablexxx 32 paritions. there are almost 250k rows per parition for
table , so it's slow. As a workaround, I disable hashjoin the it run
much fast with index scan on table ,nestloop join.
With Hash Right Join, optimizer always use seq scan for outer table ?
PGv13.11

  ->  Hash Right Join  (cost=22.50..6760.46 rows=5961 width=78)
Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text)
->  Append  (cost=0.00..6119.48 rows=149032 width=79)
  ->  Seq Scan on table_p0 aa_2  (cost=0.00..89.71
rows=2471 width=78)
  ->  Seq Scan on table_p1 aa_3  (cost=0.00..88.23
rows=2423 width=78)
  ->  Seq Scan on table_p2 aa_4  (cost=0.00..205.26
rows=5726 width=79)
  ->  Seq Scan on table_p3 aa_5  (cost=0.00..102.92
rows=2892 width=78)
  ->  Seq Scan on table_p4 aa_6  (cost=0.00..170.27
rows=4727 width=78)
  ...
  ->  Seq Scan on table_p31 aa_33  (cost=0.00..220.59
rows=6159 width=79)
  ->  Append  (cost=0.69..187.64 rows=4034 width=78) (actual
time=0.030..0.035 rows=3 loops=3)
index  scan table_p0 b_2
index  scan .   table_p1 b_3

index scan ... table_p31 b_33

Thanks,

James


Re: Hash Right join and seq scan

2024-07-02 Thread James Pang
the query is
   select 
   from table b join table  aa
   on  b.partitionkeyid=aa.partitionkeyid
   where b.id1= $1 and b.id2=$2 and b.rtime between $3 and $4;

  looks like optimizer try to "calculate cost for nestloop  for
scanning all partitions of tablexxx (32 hash partitions) " but actually ,
it only scan only a few partitions. that make the nestloop cost more than
hashjoin with table seq scan cost.  optimizer does not the partitioney
passed in by tableyyy that got selected based on indexes on other columns.
possible to make optimizer to calculate cost with partition pruning? since
the join key is hash partition key .


Thanks,

James


James Pang  於 2024年7月3日週三 下午12:57寫道:

>  Both tables are hash partition tables ,  and we have a left out join ,
> optimizer convert to Hash Right Join,  but it always try to seq scan on
> tablexxx 32 paritions. there are almost 250k rows per parition for
> table , so it's slow. As a workaround, I disable hashjoin the it run
> much fast with index scan on table ,nestloop join.
> With Hash Right Join, optimizer always use seq scan for outer table ?
> PGv13.11
>
>   ->  Hash Right Join  (cost=22.50..6760.46 rows=5961 width=78)
> Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text)
> ->  Append  (cost=0.00..6119.48 rows=149032 width=79)
>   ->  Seq Scan on table_p0 aa_2  (cost=0.00..89.71
> rows=2471 width=78)
>   ->  Seq Scan on table_p1 aa_3  (cost=0.00..88.23
> rows=2423 width=78)
>   ->  Seq Scan on table_p2 aa_4  (cost=0.00..205.26
> rows=5726 width=79)
>   ->  Seq Scan on table_p3 aa_5  (cost=0.00..102.92
> rows=2892 width=78)
>   ->  Seq Scan on table_p4 aa_6  (cost=0.00..170.27
> rows=4727 width=78)
>   ...
>   ->  Seq Scan on table_p31 aa_33  (cost=0.00..220.59
> rows=6159 width=79)
>   ->  Append  (cost=0.69..187.64 rows=4034 width=78) (actual
> time=0.030..0.035 rows=3 loops=3)
> index  scan table_p0 b_2
> index  scan .   table_p1 b_3
> 
> index scan ... table_p31 b_33
>
> Thanks,
>
> James
>


Re: Hash Right join and seq scan

2024-07-02 Thread James Pang
   the join is  "left out join"

James Pang  於 2024年7月3日週三 下午2:51寫道:

>
> the query is
>select 
>from table b join table  aa
>on  b.partitionkeyid=aa.partitionkeyid
>where b.id1= $1 and b.id2=$2 and b.rtime between $3 and $4;
>
>   looks like optimizer try to "calculate cost for nestloop  for
> scanning all partitions of tablexxx (32 hash partitions) " but actually ,
> it only scan only a few partitions. that make the nestloop cost more than
> hashjoin with table seq scan cost.  optimizer does not the partitioney
> passed in by tableyyy that got selected based on indexes on other columns.
> possible to make optimizer to calculate cost with partition pruning? since
> the join key is hash partition key .
>
>
> Thanks,
>
> James
>
>
> James Pang  於 2024年7月3日週三 下午12:57寫道:
>
>>  Both tables are hash partition tables ,  and we have a left out join ,
>> optimizer convert to Hash Right Join,  but it always try to seq scan on
>> tablexxx 32 paritions. there are almost 250k rows per parition for
>> table , so it's slow. As a workaround, I disable hashjoin the it run
>> much fast with index scan on table ,nestloop join.
>> With Hash Right Join, optimizer always use seq scan for outer table ?
>> PGv13.11
>>
>>   ->  Hash Right Join  (cost=22.50..6760.46 rows=5961 width=78)
>> Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text)
>> ->  Append  (cost=0.00..6119.48 rows=149032 width=79)
>>   ->  Seq Scan on table_p0 aa_2  (cost=0.00..89.71
>> rows=2471 width=78)
>>   ->  Seq Scan on table_p1 aa_3  (cost=0.00..88.23
>> rows=2423 width=78)
>>   ->  Seq Scan on table_p2 aa_4  (cost=0.00..205.26
>> rows=5726 width=79)
>>   ->  Seq Scan on table_p3 aa_5  (cost=0.00..102.92
>> rows=2892 width=78)
>>   ->  Seq Scan on table_p4 aa_6  (cost=0.00..170.27
>> rows=4727 width=78)
>>   ...
>>   ->  Seq Scan on table_p31 aa_33  (cost=0.00..220.59
>> rows=6159 width=79)
>>   ->  Append  (cost=0.69..187.64 rows=4034 width=78) (actual
>> time=0.030..0.035 rows=3 loops=3)
>> index  scan table_p0 b_2
>> index  scan .   table_p1 b_3
>> 
>> index scan ... table_p31 b_33
>>
>> Thanks,
>>
>> James
>>
>