[ 
https://issues.apache.org/jira/browse/IMPALA-14716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18070971#comment-18070971
 ] 

ASF subversion and git services commented on IMPALA-14716:
----------------------------------------------------------

Commit 81c82c063e73b966b92b103287346f91b6c98b03 in impala's branch 
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=81c82c063 ]

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit brings the condition estimates closer to those of the original
Impala planner. Using similar calculations is better for the first pass
because it gives more of an "apples to apples" comparison between the Calcite
planner and the original planner. These calculations should be re-examined
later, especially once other features, such as histogram statistics
generation, are implemented.

The default for unknown estimates is now taken from Expr.DEFAULT_SELECTIVITY
(0.1). This default applies to many different expressions, including
comparisons such as >= and <=.

The disjunct condition was kept fairly straightforward and matches the logic
in CompoundPredicate.computeSelectivity().
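
For readers unfamiliar with that logic, the sketch below shows one common
way to estimate the selectivity of an OR condition: inclusion-exclusion
under an independence assumption. It is an illustration, not the committed
code. The class name, the method names, and the "negative means unknown"
convention are hypothetical; only the 0.1 default comes from the commit
message, and the exact formula should be checked against
CompoundPredicate.computeSelectivity() itself.

```java
public final class DisjunctSelectivitySketch {
    // Assumed to mirror Expr.DEFAULT_SELECTIVITY (0.1) named in the commit message.
    static final double DEFAULT_SELECTIVITY = 0.1;

    // Hypothetical convention: a negative value means "no estimate available".
    static double orDefault(double selectivity) {
        return selectivity < 0 ? DEFAULT_SELECTIVITY : selectivity;
    }

    // Estimates P(A or B) as P(A) + P(B) - P(A) * P(B), assuming the two
    // operands are independent, and clamps the result to [0, 1].
    static double or(double left, double right) {
        double l = orDefault(left);
        double r = orDefault(right);
        double combined = l + r - l * r;
        return Math.max(0.0, Math.min(1.0, combined));
    }

    public static void main(String[] args) {
        System.out.println(or(0.5, 0.2));   // both operands known
        System.out.println(or(-1.0, -1.0)); // both operands unknown, so defaults apply
    }
}
```

With two unknown operands the sketch falls back to the default on both
sides, so the estimate stays well below 1.0 rather than saturating.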

However, after debugging, the conjunction condition now uses the code found
in PlanNode.computeCombinedSelectivity() to obtain estimates closer to the
original planner's.
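
As a rough illustration of why a combined-selectivity routine can differ
from a plain product, the sketch below applies an exponential backoff: the
selectivities are sorted from most to least selective, and the i-th one
(zero-based) is raised to the power 1/(i+1) before being multiplied in.
This is believed to resemble the approach in the original planner, but that
is an assumption; the class and method names are hypothetical, and
PlanNode.computeCombinedSelectivity() remains the authoritative source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class ConjunctSelectivitySketch {
    // Combines conjunct selectivities with exponential backoff so that each
    // additional, less selective conjunct reduces the estimate less than a
    // plain product would. An empty list yields 1.0 (no filtering).
    static double combine(List<Double> selectivities) {
        List<Double> sorted = new ArrayList<>(selectivities);
        Collections.sort(sorted); // ascending: most selective first
        double result = 1.0;
        for (int i = 0; i < sorted.size(); i++) {
            result *= Math.pow(sorted.get(i), 1.0 / (i + 1));
        }
        return Math.max(0.0, Math.min(1.0, result));
    }

    public static void main(String[] args) {
        List<Double> conjuncts = List.of(0.25, 0.04);
        System.out.println(combine(conjuncts));
    }
}
```

The backoff softens the independence assumption: correlated predicates
would otherwise drive the product, and therefore the cardinality estimate,
too low.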

IMPALA-14867 has been created for "between" selectivity, which has not yet
been implemented.

An issue with distinct row counts on filters is also fixed with this commit.
The distinct row count on a filter only changes if the filter condition
contains an input reference that matches the column whose distinct rows are
being counted. This commit also fixes IMPALA-14640 by handling the case where
no statistics are provided.

Testing: Some TestCalciteStats tests and some TPC-DS query plans changed due
to this commit.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Reviewed-on: http://gerrit.cloudera.org:8080/23930
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>


> Calcite Planner: Make condition estimates more similar to original planner
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-14716
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14716
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Steve Carlin
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)