asolimando commented on code in PR #4878:
URL: https://github.com/apache/calcite/pull/4878#discussion_r3069390566


##########
core/src/main/java/org/apache/calcite/rel/rules/SetOpToFilterRule.java:
##########
@@ -205,13 +207,35 @@ private static RelBuilder buildSetOp(RelBuilder builder, 
int count, RelNode setO
         // Skip non-deterministic conditions or those containing subqueries
         return Pair.of(input, null);
       }
-      return Pair.of(filter.getInput().stripped(), filter.getCondition());
+      final RelNode source = filter.getInput().stripped();
+      if (containsSortInProjectFilterChain(source)) {
+        return Pair.of(input, null);
+      }
+      return Pair.of(source, filter.getCondition());
+    }
+    if (containsSortInProjectFilterChain(input)) {
+      return Pair.of(input, null);
     }
     // For non-filter inputs, use TRUE literal as default condition.
     return Pair.of(input.stripped(),
         input.getCluster().getRexBuilder().makeLiteral(true));
   }
 
+  private static boolean containsSortInProjectFilterChain(RelNode input) {
+    RelNode current = input.stripped();
+    while (true) {
+      if (current instanceof Sort) {
+        return true;

Review Comment:
   I wonder if this couldn't/shouldn't be generalized via a new metadata 
characterizing what makes this rewrite unsafe (and probably makes potentially 
other filter pushdowns unsound?
   
   After reading the comments in Jira I am still trying to wrap my head around 
the problem and I think it's a more general pattern, but I couldn't spend much 
time thinking it through so I am just sharing my intuition hoping it helps.
   
   I understand the LIMIT 2 unions up to a LIMIT 4 conceptually but it's an "at 
most 4" and non deterministic as discussed.
   
   So what characterizes here a property of the union operands that makes the 
rewrite unsafe?
   
   It's a sort+limit it seems, maybe this can become a metadata? Does this 
apply to other types of pushdowns like filter-sort transpose? Can they all be 
rewritten in terms of this new metadata?
   
   I don't mean to block this effort until we figure this out. But I think we 
might have a more general fix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to