Steve Carlin created HIVE-26737: ----------------------------------- Summary: Subquery returning wrong results when database has materialized views Key: HIVE-26737 URL: https://issues.apache.org/jira/browse/HIVE-26737 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Steve Carlin
When HS2 has materialized views in its registry, subqueries with correlated variables may return wrong results. An example of this: {code:java} CREATE TABLE t_test1( id int, int_col int, year int, month int ); CREATE TABLE t_test2( id int, int_col int, year int, month int ); CREATE TABLE dummy ( id int ) stored as orc TBLPROPERTIES ('transactional'='true'); CREATE MATERIALIZED VIEW need_a_mat_view_in_registry AS SELECT * FROM dummy where id > 5; INSERT INTO t_test1 VALUES (1, 1, 2009, 1), (10,0, 2009, 1); INSERT INTO t_test2 VALUES (1, 1, 2009, 1); select id, int_col, year, month from t_test1 s where s.int_col = (select count(*) from t_test2 t where s.id = t.id) order by id; {code} The select statement should produce 2 rows, but it is only producing one. The CBO plan produced has an inner join instead of a left join. {code:java} HiveSortLimit(sort0=[$0], dir0=[ASC]) HiveProject(id=[$0], int_col=[$1], year=[$2], month=[$3]) HiveJoin(condition=[AND(=($0, $5), =($4, $6))], joinType=[inner], algorithm=[none], cost=[not available]) HiveProject(id=[$0], int_col=[$1], year=[$2], month=[$3], CAST=[CAST($1):BIGINT]) HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL(CAST($1):BIGINT))]) HiveTableScan(table=[[default, t_test1]], table:alias=[s]) HiveProject(id=[$0], $f1=[$1]) HiveFilter(condition=[IS NOT NULL($1)]) HiveAggregate(group=[{0}], agg#0=[count()]) HiveFilter(condition=[IS NOT NULL($0)]) HiveTableScan(table=[[default, t_test2]], table:alias=[t]){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)