peter-toth commented on code in PR #10868:
URL: https://github.com/apache/datafusion/pull/10868#discussion_r1637814077


##########
datafusion/optimizer/src/common_subexpr_eliminate.rs:
##########
@@ -166,6 +166,15 @@ impl CommonSubexprEliminate {
     ) -> Result<(Vec<Vec<Expr>>, LogicalPlan)> {
         let mut common_exprs = IndexMap::new();
 
+        input.schema().iter().for_each(|(qualifier, field)| {

Review Comment:
   No, what I meant by "idexes go out of sync" is that if your modified CSE 
rule runs on a plan that we got in the 3rd step (i.e. there is no `#1` in the 
plan) e.g.:
   ```rust
   let plan = LogicalPlanBuilder::from(table_scan.clone())
       .project(vec![(col("a") + col("b")).alias("#2"), col("c")])?
       .project(vec![
           col("#2").alias("c1"),
           col("#2").alias("c2"),
           (col("c") + lit(2)).alias("c3"),
           (col("c") + lit(2)).alias("c4"),
       ])?
       .build()?;
   ```
   then it produces an incorrect plan:
   ```
   Projection: #2 AS c1, #2 AS c2, #2 AS c3, #2 AS c4 // The issue here is that 
`#2` gets aliased to `#1` below, but `#2` doesn't change here.
     Projection: #2 AS #1, test.c + Int32(2) AS #2, test.c
       Projection: test.a + test.b AS #2, test.c
         TableScan: test
   ```
   This is because you inject `#2` into `common_exprs`, but you don't inject it 
to `expr_stats` (and others).
   
   IMO modifying `common_exprs` is hacky if you don't do it in 
`CommonSubexprRewriter`, that's why I suggested the solution in my previous 
comment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to