BiteTheDDDDt opened a new pull request, #64596:
URL: https://github.com/apache/doris/pull/64596

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: None
   
   Problem Summary:
   Nested loop join used to fully consume and buffer the build side before the 
probe side could produce any rows. With a LIMIT above the join, stopping early 
saved work only on the probe side, because the build side had already been read 
in full.
   
   This PR lets the nested loop join build side publish non-empty build blocks 
incrementally and yield to the probe side. For non-mark inner/cross joins, the 
probe side can produce rows from the currently published build prefix and 
request more build data only when needed. Other join types continue to wait for 
full build-side EOS before producing rows from the build data, preserving 
unmatched-row semantics. Once LIMIT or probe EOS is reached, the probe side 
marks the build side as no longer required so the build sink can finish early.
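
The interleaving described above can be sketched as follows. This is a minimal single-threaded illustration, not the code in this PR: `SharedBuildState`, `BuildSink`, `cross_join_limit`, and the `int`-valued rows are all hypothetical names and simplifications, and the real operators coordinate through pipeline dependencies rather than direct calls.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from the Doris code base.
using Block = std::vector<int>;   // one batch of build-side rows
using Row = std::pair<int, int>;  // (probe value, build value)

// State shared by the build sink and the probe side.
struct SharedBuildState {
    std::vector<Block> published;  // build prefix visible to the probe side
    bool build_eos = false;        // set once the build input is exhausted
    bool build_needed = true;      // cleared by the probe side to stop the build
};

struct BuildSink {
    std::vector<Block> input;  // pretend upstream operator
    std::size_t next = 0;
    std::size_t blocks_read = 0;

    // Publishes at most one non-empty block, then yields to the caller.
    // Returns true if a block was published.
    bool step(SharedBuildState& st) {
        assert(next <= input.size());
        if (!st.build_needed) return false;  // probe side no longer wants rows
        while (next < input.size()) {
            Block b = input[next++];
            ++blocks_read;
            if (b.empty()) continue;  // empty blocks are never published
            st.published.push_back(std::move(b));
            return true;
        }
        st.build_eos = true;
        return false;
    }
};

// Cross join that consumes the published build prefix block by block and
// asks for more build data only when the prefix is exhausted.
std::vector<Row> cross_join_limit(const std::vector<int>& probe, BuildSink& sink,
                                  SharedBuildState& st, std::size_t limit) {
    std::vector<Row> out;
    std::size_t consumed = 0;  // published blocks already joined
    while (out.size() < limit) {
        if (consumed == st.published.size()) {
            if (!sink.step(st)) break;  // build EOS: nothing more to join
            continue;
        }
        const Block& b = st.published[consumed++];
        for (std::size_t i = 0; i < probe.size() && out.size() < limit; ++i) {
            for (std::size_t j = 0; j < b.size() && out.size() < limit; ++j) {
                out.emplace_back(probe[i], b[j]);
            }
        }
    }
    st.build_needed = false;  // LIMIT or EOS reached: let the build sink finish
    return out;
}
```

With four build blocks (one of them empty), two probe rows, and a limit of 3, this sketch reads a single build block before the limit is hit and then clears `build_needed`, so later `step` calls do no work. With a large limit it reads all four blocks and sets `build_eos`.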
   
   Runtime filters are only published when the build side really reaches EOS, 
so a partial build prefix is not used to produce an invalid filter.
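
To illustrate why a filter built from a partial prefix would be invalid, here is a small sketch using a min/max filter. The filter kind and the names `MinMaxFilter` and `MinMaxFilterBuilder` are assumptions chosen for illustration (C++17); they are not the runtime filter classes used in Doris.

```cpp
#include <algorithm>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical min/max filter over integer join keys.
struct MinMaxFilter {
    int min;
    int max;
    bool may_match(int key) const { return key >= min && key <= max; }
};

// Accumulates build-side keys and hands out a filter only at build EOS.
class MinMaxFilterBuilder {
public:
    void add_block(const std::vector<int>& keys) {
        for (int k : keys) {
            lo_ = std::min(lo_, k);
            hi_ = std::max(hi_, k);
        }
    }

    // Returns no filter unless the build side has really reached EOS.
    // With zero keys seen, the result has min > max and matches nothing.
    std::optional<MinMaxFilter> finish(bool build_eos) const {
        if (!build_eos) return std::nullopt;
        return MinMaxFilter{lo_, hi_};
    }

private:
    int lo_ = std::numeric_limits<int>::max();
    int hi_ = std::numeric_limits<int>::min();
};
```

If the build stopped after keys `{5, 6}`, a filter over that prefix would cover only `[5, 6]` and would wrongly prune a probe row with key `9` that a later build block could still match. Returning no filter in that case keeps the probe side correct at the cost of no pruning.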
   
   ### Release note
   
   Improve nested loop join LIMIT execution by allowing the build side to stop 
early when enough rows have been produced.
   
   ### Check List (For Author)
   
   - Test: Format and static checks
       - `build-support/clang-format.sh be/src/exec/operator/nested_loop_join_build_operator.cpp be/src/exec/operator/nested_loop_join_probe_operator.cpp be/src/exec/pipeline/dependency.h`
       - `build-support/check-format.sh`
       - `git diff --check`
   - Behavior changed: Yes. Nested loop join can interleave build and probe 
execution and, for non-mark inner/cross joins, stop reading build rows once 
LIMIT is satisfied.
   - Does this need documentation: No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

