lotan created HIVE-26678: ---------------------------- Summary: In the filter criteria associated with multiple tables, the filter result of the subquery by not in or in is incorrect. Key: HIVE-26678 URL: https://issues.apache.org/jira/browse/HIVE-26678 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 3.1.0 Reporter: lotan Fix For: 4.0.0
create testtable as follow: create table test101 (id string,id2 string); create table test102 (id string,id2 string); create table test103 (id string,id2 string); create table test104 (id string,id2 string); when cbo is false,run the following SQL statement: explain select count(1) from test101 t1 left join test102 t2 on t1.id=t2.id left join test103 t3 on t1.id=t3.id2 where t1.id in (select s.id from test104 s) and t3.id2='123'; you will see: The filter criteria in the right table are lost. The execution plan is as follows: +-----------------------------------------------------------------------------------------------------+ | Explain | +-----------------------------------------------------------------------------------------------------+ | STAGE DEPENDENCIES: | | Stage-9 is a root stage | | Stage-3 depends on stages: Stage-9 | | Stage-0 depends on stages: Stage-3 | | | | STAGE PLANS: | | Stage: Stage-9 | | Map Reduce Local Work | | Alias -> Map Local Tables: | | sq_1:s | | Fetch Operator | | limit: -1 | | t2 | | Fetch Operator | | limit: -1 | | t3 | | Fetch Operator | | limit: -1 | | Alias -> Map Local Operator Tree: | | sq_1:s | | TableScan | | alias: s | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Filter Operator | | predicate: id is not null (type: boolean) | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Select Operator | | expressions: id (type: string) | | outputColumnNames: _col0 | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Group By Operator | | keys: _col0 (type: string) | | mode: hash | | outputColumnNames: _col0 | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | HashTable Sink Operator | | keys: | | 0 _col0 (type: string) | | 1 _col0 (type: string) | | t2 | | TableScan | | alias: t2 | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Filter Operator | | predicate: id is not null (type: boolean) | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | HashTable Sink Operator | | keys: | | 0 id (type: string) | | 1 id (type: string) | | 2 id2 (type: string) | | t3 | | TableScan | | alias: t3 | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Filter Operator | | predicate: id2 is not null (type: boolean) | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | HashTable Sink Operator | | keys: | | 0 id (type: string) | | 1 id (type: string) | | 2 id2 (type: string) | | | | Stage: Stage-3 | | Map Reduce | | Map Operator Tree: | | TableScan | | alias: t1 | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Filter Operator | | predicate: id is not null (type: boolean) | | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Map Join Operator | | condition map: | | Left Outer Join 0 to 1 | | Left Outer Join 0 to 2 | | keys: | | 0 id (type: string) | | 1 id (type: string) | | 2 id2 (type: string) | | outputColumnNames: _col0 | | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Map Join Operator | | condition map: | | Left Semi Join 0 to 1 | | keys: | | 0 _col0 (type: string) | | 1 _col0 (type: string) | | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE | | Group By Operator | | aggregations: count(1) | | mode: hash | | outputColumnNames: _col0 | | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE | | Reduce Output Operator | | sort order: | | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE | | value expressions: _col0 (type: bigint) | | Local Work: | +-----------------------------------------------------------------------------------------------------+ | Explain | +-----------------------------------------------------------------------------------------------------+ | Map Reduce Local Work | | Reduce Operator Tree: | | Group By Operator | | aggregations: count(VALUE._col0) | | mode: mergepartial | | outputColumnNames: _col0 | | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE | | File Output Operator | | compressed: false | | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE | | table: | | input format: org.apache.hadoop.mapred.SequenceFileInputFormat | | output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat | | serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | | | | Stage: Stage-0 | | Fetch Operator | | limit: -1 | | Processor Tree: | | ListSink | | | +-----------------------------------------------------------------------------------------------------+ -- This message was sent by Atlassian Jira (v8.20.10#820010)