yonatan-sevenai opened a new issue, #21490:
URL: https://github.com/apache/datafusion/issues/21490

   ### Describe the bug
   
   `plan_to_sql` generates invalid SQL when an outer Projection excludes a 
column that the Sort references by alias. The alias is defined in an inner 
Projection that the Unparser flattens away.
   
     Given this logical plan:
   
     ```
     Projection: [a, b]                         -- outer: excludes sort column 
"c"
       Sort: [c DESC, fetch=1]                   -- references alias "c"
         Projection: [X AS a, Y AS b, Z AS c]    -- defines alias "c" → 
physical column "Z"
           SubqueryAlias: t
             TableScan: phys_table [X, Y, Z]
     ```
   
     The Unparser produces:
   
     ```sql
     SELECT t."X" AS a, t."Y" AS b FROM phys_table AS t ORDER BY c DESC NULLS 
FIRST LIMIT 1
     ```
   
     Column `c` does not exist — it was defined as `Z AS c` in the inner 
Projection, but that definition was dropped during plan rewriting.
   
     ### Root cause
   
     The function `rewrite_plan_for_sort_on_non_projected_fields` in 
`datafusion/sql/src/unparser/rewrite.rs` detects that the outer Projection's 
columns plus the Sort's columns form the same set as the inner Projection's 
columns. It then flattens the two Projections
      into one by replacing the inner Projection's expressions with the outer 
Projection's mapped expressions (line 257: 
`inner_p.expr.clone_from(&new_exprs)`).
   
     The set comparison succeeds because:
     - Inner Projection aliases are collapsed to their alias names: `{a, b, c}`
     - Outer Projection `{a, b}` + Sort column `{c}` = `{a, b, c}`
   
     However, `new_exprs` is built from only the outer Projection's expressions 
(`[X AS a, Y AS b]`). The sort column's alias definition (`Z AS c`) is not 
included. The rewritten plan becomes:
   
     ```
     Sort: [c DESC, fetch=1]              -- still references "c"
       Projection: [X AS a, Y AS b]       -- "Z AS c" is gone
         SubqueryAlias: t
           TableScan: phys_table
     ```
   
     The Sort's reference to `c` is now dangling.
   
   ### To Reproduce
   
    1. The **outer Projection must exclude the Sort column** — if the outer 
Projection includes `c`, the alias is preserved in `new_exprs` and the bug does 
not manifest.
     2. The Sort column must be defined as an **alias** in the inner Projection 
(not a plain column passthrough).
     3. A **SubqueryAlias** (or similar) node below the inner Projection is 
necessary to create the two-Projection shape that triggers the rewrite.
   
   
   ### Expected behavior
   
   The generated SQL should either:
     - Inline the physical column into ORDER BY: `ORDER BY t."Z" DESC NULLS 
FIRST`
     - Or preserve the subquery boundary so the alias remains in scope
   
   ### Additional context
   
   The following test shows the bug in action
    
   ```rust
     #[test]
     fn test_sort_on_aliased_column_dropped_by_outer_projection() -> Result<()> 
{
         let schema = Schema::new(vec![
             Field::new("X", DataType::Utf8, true),
             Field::new("Y", DataType::Utf8, true),
             Field::new("Z", DataType::Utf8, true),
         ]);
   
         let plan = table_scan(Some("phys_table"), &schema, None)?
             .alias("t")?
             .project(vec![
                 Expr::Column(Column::new(Some(TableReference::bare("t")), 
"X")).alias("a"),
                 Expr::Column(Column::new(Some(TableReference::bare("t")), 
"Y")).alias("b"),
                 Expr::Column(Column::new(Some(TableReference::bare("t")), 
"Z")).alias("c"),
             ])?
             .sort_with_limit(
                 vec![Expr::Column(Column::new_unqualified("c")).sort(false, 
true)],
                 Some(1),
             )?
             .project(vec![
                 Expr::Column(Column::new_unqualified("a")),
                 Expr::Column(Column::new_unqualified("b")),
             ])?
             .build()?;
   
         let unparser = Unparser::default();
         let sql = unparser.plan_to_sql(&plan)?.to_string();
   
         // This assertion fails: ORDER BY references "c" but the alias was 
dropped
         assert!(
             sql.contains("Z"),
             "ORDER BY references 'c' but the alias was dropped: {sql}"
         );
   
         Ok(())
     }
     ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to