Hash Right join and seq scan
Both tables are hash partition tables , and we have a left out join , optimizer convert to Hash Right Join, but it always try to seq scan on tablexxx 32 paritions. there are almost 250k rows per parition for table , so it's slow. As a workaround, I disable hashjoin the it run much fast with index scan on table ,nestloop join. With Hash Right Join, optimizer always use seq scan for outer table ? PGv13.11 -> Hash Right Join (cost=22.50..6760.46 rows=5961 width=78) Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text) -> Append (cost=0.00..6119.48 rows=149032 width=79) -> Seq Scan on table_p0 aa_2 (cost=0.00..89.71 rows=2471 width=78) -> Seq Scan on table_p1 aa_3 (cost=0.00..88.23 rows=2423 width=78) -> Seq Scan on table_p2 aa_4 (cost=0.00..205.26 rows=5726 width=79) -> Seq Scan on table_p3 aa_5 (cost=0.00..102.92 rows=2892 width=78) -> Seq Scan on table_p4 aa_6 (cost=0.00..170.27 rows=4727 width=78) ... -> Seq Scan on table_p31 aa_33 (cost=0.00..220.59 rows=6159 width=79) -> Append (cost=0.69..187.64 rows=4034 width=78) (actual time=0.030..0.035 rows=3 loops=3) index scan table_p0 b_2 index scan . table_p1 b_3 index scan ... table_p31 b_33 Thanks, James
Re: Hash Right join and seq scan
the query is select from table b join table aa on b.partitionkeyid=aa.partitionkeyid where b.id1= $1 and b.id2=$2 and b.rtime between $3 and $4; looks like optimizer try to "calculate cost for nestloop for scanning all partitions of tablexxx (32 hash partitions) " but actually , it only scan only a few partitions. that make the nestloop cost more than hashjoin with table seq scan cost. optimizer does not the partitioney passed in by tableyyy that got selected based on indexes on other columns. possible to make optimizer to calculate cost with partition pruning? since the join key is hash partition key . Thanks, James James Pang 於 2024年7月3日週三 下午12:57寫道: > Both tables are hash partition tables , and we have a left out join , > optimizer convert to Hash Right Join, but it always try to seq scan on > tablexxx 32 paritions. there are almost 250k rows per parition for > table , so it's slow. As a workaround, I disable hashjoin the it run > much fast with index scan on table ,nestloop join. > With Hash Right Join, optimizer always use seq scan for outer table ? > PGv13.11 > > -> Hash Right Join (cost=22.50..6760.46 rows=5961 width=78) > Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text) > -> Append (cost=0.00..6119.48 rows=149032 width=79) > -> Seq Scan on table_p0 aa_2 (cost=0.00..89.71 > rows=2471 width=78) > -> Seq Scan on table_p1 aa_3 (cost=0.00..88.23 > rows=2423 width=78) > -> Seq Scan on table_p2 aa_4 (cost=0.00..205.26 > rows=5726 width=79) > -> Seq Scan on table_p3 aa_5 (cost=0.00..102.92 > rows=2892 width=78) > -> Seq Scan on table_p4 aa_6 (cost=0.00..170.27 > rows=4727 width=78) > ... > -> Seq Scan on table_p31 aa_33 (cost=0.00..220.59 > rows=6159 width=79) > -> Append (cost=0.69..187.64 rows=4034 width=78) (actual > time=0.030..0.035 rows=3 loops=3) > index scan table_p0 b_2 > index scan . table_p1 b_3 > > index scan ... table_p31 b_33 > > Thanks, > > James >
Re: Hash Right join and seq scan
the join is "left out join" James Pang 於 2024年7月3日週三 下午2:51寫道: > > the query is >select >from table b join table aa >on b.partitionkeyid=aa.partitionkeyid >where b.id1= $1 and b.id2=$2 and b.rtime between $3 and $4; > > looks like optimizer try to "calculate cost for nestloop for > scanning all partitions of tablexxx (32 hash partitions) " but actually , > it only scan only a few partitions. that make the nestloop cost more than > hashjoin with table seq scan cost. optimizer does not the partitioney > passed in by tableyyy that got selected based on indexes on other columns. > possible to make optimizer to calculate cost with partition pruning? since > the join key is hash partition key . > > > Thanks, > > James > > > James Pang 於 2024年7月3日週三 下午12:57寫道: > >> Both tables are hash partition tables , and we have a left out join , >> optimizer convert to Hash Right Join, but it always try to seq scan on >> tablexxx 32 paritions. there are almost 250k rows per parition for >> table , so it's slow. As a workaround, I disable hashjoin the it run >> much fast with index scan on table ,nestloop join. >> With Hash Right Join, optimizer always use seq scan for outer table ? >> PGv13.11 >> >> -> Hash Right Join (cost=22.50..6760.46 rows=5961 width=78) >> Hash Cond: ((aa.partitionkeyid)::text = (b_9.paritionkeyid)::text) >> -> Append (cost=0.00..6119.48 rows=149032 width=79) >> -> Seq Scan on table_p0 aa_2 (cost=0.00..89.71 >> rows=2471 width=78) >> -> Seq Scan on table_p1 aa_3 (cost=0.00..88.23 >> rows=2423 width=78) >> -> Seq Scan on table_p2 aa_4 (cost=0.00..205.26 >> rows=5726 width=79) >> -> Seq Scan on table_p3 aa_5 (cost=0.00..102.92 >> rows=2892 width=78) >> -> Seq Scan on table_p4 aa_6 (cost=0.00..170.27 >> rows=4727 width=78) >> ... >> -> Seq Scan on table_p31 aa_33 (cost=0.00..220.59 >> rows=6159 width=79) >> -> Append (cost=0.69..187.64 rows=4034 width=78) (actual >> time=0.030..0.035 rows=3 loops=3) >> index scan table_p0 b_2 >> index scan . table_p1 b_3 >> >> index scan ... table_p31 b_33 >> >> Thanks, >> >> James >> >
