viirya commented on code in PR #21833:
URL: https://github.com/apache/datafusion/pull/21833#discussion_r3172487631


##########
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##########
@@ -1680,10 +1723,58 @@ impl NestedLoopJoinStream {
         }
     }
 
-    /// Handle EmitRightUnmatched state - emit unmatched right rows
+    /// Handle EmitRightUnmatched state - emit unmatched right rows.
+    ///
+    /// In memory-limited mode, instead of emitting unmatched right rows
+    /// per-batch (which would be incorrect since more left chunks may
+    /// match those rows), we merge the bitmap into the global accumulator
+    /// and defer emission to `EmitGlobalRightUnmatched`.
     fn handle_emit_right_unmatched(
         &mut self,
     ) -> ControlFlow<Poll<Option<Result<RecordBatch>>>> {
+        // In memory-limited mode, merge bitmap into global and move on
+        if self.is_memory_limited() {
+            debug_assert!(
+                self.current_right_batch_matched.is_some(),
+                "right bitmap must be present"
+            );
+            let bitmap = std::mem::take(&mut self.current_right_batch_matched)
+                .expect("right bitmap should be available");
+            let (values, _nulls) = bitmap.into_parts();
+

Review Comment:
   Done. Extracted to `SpillStateActive::merge_current_right_bitmap(idx, values)`,
which centralizes the first-seen-vs-OR-merge behavior and the reservation
accounting. The state-machine call site is now a two-line invocation.
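
For readers without the PR open, the first-seen-vs-OR-merge behavior described above might look roughly like the sketch below. It is an illustration only, not the code from the PR: the `GlobalRightMatched` struct, its fields, and the `unmatched_rows` helper are hypothetical, a plain `Vec<u64>` of packed bits stands in for the Arrow buffer returned by `into_parts()`, and a byte counter stands in for the real memory reservation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's spill state; not the actual DataFusion type.
#[derive(Default)]
struct GlobalRightMatched {
    /// Right batch index -> packed "matched" bits accumulated across left chunks.
    bitmaps: HashMap<usize, Vec<u64>>,
    /// Bytes accounted for so far (assumed stand-in for a memory reservation).
    reserved_bytes: usize,
}

impl GlobalRightMatched {
    /// Merge the bitmap produced for right batch `idx` during one left-chunk pass.
    fn merge_current_right_bitmap(&mut self, idx: usize, values: Vec<u64>) {
        match self.bitmaps.get_mut(&idx) {
            // Seen before: OR the new matches into the accumulated bits.
            Some(existing) => {
                assert_eq!(existing.len(), values.len(), "bitmap length mismatch");
                for (acc, new) in existing.iter_mut().zip(&values) {
                    *acc |= *new;
                }
            }
            // First time this batch is seen: account for its size, then store it.
            None => {
                self.reserved_bytes += values.len() * std::mem::size_of::<u64>();
                self.bitmaps.insert(idx, values);
            }
        }
    }

    /// Rows of batch `idx` that no left chunk has matched so far.
    fn unmatched_rows(&self, idx: usize, num_rows: usize) -> Vec<usize> {
        let words = self.bitmaps.get(&idx);
        (0..num_rows)
            .filter(|&row| {
                let matched = words
                    .and_then(|w| w.get(row / 64))
                    .map_or(false, |word| ((*word >> (row % 64)) & 1) == 1);
                !matched
            })
            .collect()
    }
}
```

Under these assumptions, the unmatched right rows are only known after every left chunk has been merged, which is why emission is deferred to `EmitGlobalRightUnmatched` rather than done per batch.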



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

