SubhamSinghal opened a new issue, #21597:
URL: https://github.com/apache/datafusion/issues/21597

   ### Is your feature request related to a problem or challenge?
   
   [Discussion thread](https://github.com/apache/datafusion/pull/21549#pullrequestreview-4094422196)
   
   Several optimizer rules need to determine whether an expression "rejects
nulls", i.e., evaluates to NULL or false whenever one or more of its input
columns are NULL. Today this logic lives in `EliminateOuterJoin`'s
`extract_non_nullable_columns()`, which pattern-matches explicitly on each
expression type (comparison, IN, BETWEEN, LIKE, IS TRUE/FALSE, etc.), so every
new expression type must be added by hand.
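
To make "rejects nulls" concrete, here is a minimal, self-contained sketch of SQL three-valued logic. It does not use DataFusion; the names `Sql3`, `gt`, `is_null`, and `filter_keeps` are hypothetical and exist only for illustration. The point it tries to show: an outer join pads unmatched rows with NULLs, and if a later filter can never be TRUE on those padded rows, the join could just as well be an inner join.

```rust
/// SQL three-valued boolean: `None` stands for NULL.
/// (Hypothetical alias for this sketch, not a DataFusion type.)
type Sql3 = Option<bool>;

/// `x > limit`: a NULL input yields a NULL result.
fn gt(x: Option<i64>, limit: i64) -> Sql3 {
    x.map(|v| v > limit)
}

/// `x IS NULL`: never returns NULL itself, and is TRUE for a NULL input.
fn is_null(x: Option<i64>) -> Sql3 {
    Some(x.is_none())
}

/// A WHERE clause keeps a row only when the predicate is exactly TRUE.
fn filter_keeps(pred: Sql3) -> bool {
    pred == Some(true)
}
```

Under these assumptions, `x > 5` rejects nulls because a NULL-padded row is never kept by the filter, while `x IS NULL` does not, because it keeps exactly those rows.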
   
   
   ### Describe the solution you'd like
   
   Introduce a `NullRejection` trait on expressions, similar to Apache Calcite's [`Strong`](https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/plan/Strong.java#L100) class:
   
    ```rust
    trait PhysicalExpr {
        // ... existing methods ...

        /// Returns whether this expression is guaranteed to be not-true
        /// (i.e., NULL or false) when all given columns are NULL.
        ///
        /// - `Some(true)`:  definitely rejects nulls (safe to eliminate outer join)
        /// - `Some(false)`: definitely does NOT reject nulls
        /// - `None`:        unknown (conservative default, assume not null-rejecting)
        fn is_not_true(&self, all_null_cols: &[&Column]) -> Option<bool> {
            None
        }
    }
    ```
   
     Each expression type overrides with simple structural logic:
   
    ```rust
    // Comparison (=, >, <, etc.): NULL on either side → NULL result.
    // Note: this rule only holds for null-propagating operators such as
    // comparisons; AND/OR follow three-valued logic and need separate handling.
    impl PhysicalExpr for BinaryExpr {
        fn is_not_true(&self, all_null_cols: &[&Column]) -> Option<bool> {
            match (
                self.left.is_not_true(all_null_cols),
                self.right.is_not_true(all_null_cols),
            ) {
                (Some(true), _) | (_, Some(true)) => Some(true),
                (Some(false), Some(false)) => Some(false),
                _ => None,
            }
        }
    }
    ```
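
As a rough illustration of how the structural rules could compose, the sketch below uses a toy expression tree instead of DataFusion's `PhysicalExpr`. Everything in it (`Expr`, `always_null`, the free function `is_not_true`) is hypothetical. It also makes two simplifying assumptions: the answer is a plain `bool`, with `false` meaning "unknown or not null-rejecting", and it separates "guaranteed NULL" from "guaranteed not TRUE", loosely following the split in Calcite's `Strong`.

```rust
/// Toy expression tree (hypothetical; not DataFusion's `PhysicalExpr`).
#[derive(Debug)]
enum Expr {
    Column(&'static str),
    Literal(Option<i64>),
    Gt(Box<Expr>, Box<Expr>),
    IsNull(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// True only if `e` is guaranteed to evaluate to NULL when every column
/// in `null_cols` is NULL. `false` means "not guaranteed".
fn always_null(e: &Expr, null_cols: &[&str]) -> bool {
    match e {
        Expr::Column(name) => null_cols.contains(name),
        Expr::Literal(v) => v.is_none(),
        // Comparisons propagate NULL from either side.
        Expr::Gt(l, r) => always_null(l, null_cols) || always_null(r, null_cols),
        // IS NULL never returns NULL.
        Expr::IsNull(_) => false,
        // NULL AND NULL = NULL and NULL OR NULL = NULL; one NULL side is not enough.
        Expr::And(l, r) | Expr::Or(l, r) => {
            always_null(l, null_cols) && always_null(r, null_cols)
        }
    }
}

/// True only if `e` is guaranteed to be NULL or FALSE when every column
/// in `null_cols` is NULL, i.e. the predicate rejects those nulls.
fn is_not_true(e: &Expr, null_cols: &[&str]) -> bool {
    match e {
        // AND is TRUE only if both sides are TRUE.
        Expr::And(l, r) => is_not_true(l, null_cols) || is_not_true(r, null_cols),
        // OR is not TRUE only if neither side can be TRUE.
        Expr::Or(l, r) => is_not_true(l, null_cols) && is_not_true(r, null_cols),
        // Everything else: not TRUE if guaranteed NULL (conservative).
        other => always_null(other, null_cols),
    }
}
```

With these rules, `x > 5` and `x > 5 AND y IS NULL` both reject a NULL `x`, while `x > 5 OR x IS NULL` does not. That difference is why AND and OR cannot share the comparison rule.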
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

