viirya commented on code in PR #21833:
URL: https://github.com/apache/datafusion/pull/21833#discussion_r3172487171


##########
datafusion/sqllogictest/test_files/nested_loop_join_spill.slt:
##########
@@ -57,6 +57,66 @@ Plan with Metrics
 06)------ProjectionExec: expr=[value@0 as v2], metrics=[<slt:ignore>]
 07)--------LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=1, end=1, batch_size=8192], metrics=[<slt:ignore>]
 
+# --- RIGHT JOIN with non-equijoin predicate ---
+# Every (v1, v2=1) pair passes the predicate `v1 + v2 > 0`, so all
+# 100000 left rows match the single right row. Output count = 100000.
+query I nosort
+SELECT count(*)
+FROM generate_series(1, 100000) AS t1(v1)
+RIGHT JOIN generate_series(1, 1) AS t2(v2)
+  ON (t1.v1 + t2.v2) > 0
+----
+100000
+
+# RIGHT JOIN where NO right row matches any left row. All 3 right rows
+# get NULL-padded on the left side. This exercises the global right
+# bitmap: every right batch is seen across multiple left chunks, and
+# we must emit the correct unmatched rows at the end.
+query II rowsort
+SELECT t1.v1, t2.v2
+FROM generate_series(1, 5) AS t1(v1)

Review Comment:
   Good point: the small generate_series(1, 5) cases ran in a single pass and 
never exercised the global bitmap. I've replaced them with a 100K-left × 
200-right test using the predicate (t1.v1 + t2.v2) = 2 AND t2.v2 <= 100. The 
predicate is non-equi (which forces NLJ), and the left side (~800KB) exceeds 
the 150K limit (which forces a spill). The query produces exactly 1 matched 
pair plus 199 unmatched right rows, so each right batch has both set and unset 
bits that must be accumulated correctly across passes. A corresponding EXPLAIN 
ANALYZE assertion confirms spill_count=2. I added the same predicate for FULL 
JOIN too.
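
As a sanity check on those counts, here is a minimal Python model of a chunked outer join. It is only a sketch and not DataFusion's implementation: the function names, the `chunk_size` parameter, and the idea of modelling each pass as one left chunk are all assumptions made for illustration. The point it shows is that the right-side bitmap has to live across passes, and that unmatched right rows can only be emitted after the last one.

```python
# Illustrative model only; this is not DataFusion's NestedLoopJoinExec.
# All names here (predicate, chunked_outer_join, chunk_size) are hypothetical.

def predicate(v1, v2):
    # The join predicate quoted in the comment above.
    return (v1 + v2) == 2 and v2 <= 100


def chunked_outer_join(left, right, chunk_size, keep_unmatched_left=False):
    """Join `left` to `right`, processing one left chunk per pass.

    Returns (rows, passes). Unmatched right rows are always NULL-padded
    (RIGHT JOIN); pass keep_unmatched_left=True to model FULL JOIN.
    """
    right_matched = [False] * len(right)  # global bitmap shared by all passes
    rows = []
    passes = 0
    for start in range(0, len(left), chunk_size):
        passes += 1
        for v1 in left[start:start + chunk_size]:
            hit = False
            for j, v2 in enumerate(right):
                if predicate(v1, v2):
                    rows.append((v1, v2))
                    right_matched[j] = True  # never cleared between passes
                    hit = True
            if keep_unmatched_left and not hit:
                rows.append((v1, None))
    # Only after the final pass is it known which right rows never matched.
    rows.extend((None, v2) for j, v2 in enumerate(right) if not right_matched[j])
    return rows, passes
```

With left values 1..100000 and right values 1..200 (my reading of "100K-left × 200-right"), the only pair satisfying the predicate is (1, 1), so the RIGHT JOIN yields 1 + 199 = 200 rows and the FULL JOIN yields 1 + 199 + 99999 = 100199 rows. If the bitmap were reset at the start of each pass, right row 1 would wrongly be reported as unmatched whenever its match falls in an earlier chunk.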



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

