BiteTheDDDDt opened a new pull request, #64596:
URL: https://github.com/apache/doris/pull/64596
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary:
Nested loop join used to fully consume and buffer the build side before the
probe side could produce rows. With a LIMIT above the join, only the probe side
could stop early; by then the build side had already been read in full.
This PR lets the nested loop join build side publish non-empty build blocks
incrementally and yield to the probe side. For non-mark inner/cross joins, the
probe side can produce rows from the currently published build prefix and
request more build data only when needed. Other join types continue to wait for
full build-side EOS before producing rows from the build data, preserving
unmatched-row semantics. Once LIMIT or probe EOS is reached, the probe side
marks the build side as no longer required so the build sink can finish early.
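
The interleaving described above can be sketched as a single-threaded simulation. This is only an illustration of the idea for a cross join with LIMIT, not the code in this PR: `SharedBuildState`, `build_step`, `make_source`, and `cross_join_limit` are hypothetical names, blocks are plain `std::vector<int>`, and the sketch assumes a positive limit.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

using Block = std::vector<int>;
using BlockSource = std::function<std::optional<Block>()>;

// Hypothetical stand-in for the state shared by the build sink and the probe.
struct SharedBuildState {
    std::vector<Block> published;  // build blocks visible to the probe so far
    bool build_eos = false;        // the build child is exhausted
    bool build_needed = true;      // cleared by the probe on LIMIT or probe EOS
};

// Hypothetical build child: hands out `blocks` one by one, then std::nullopt.
// `pulls` counts how often the child was asked for data.
BlockSource make_source(std::vector<Block> blocks, std::size_t& pulls) {
    return [blocks = std::move(blocks), &pulls,
            i = std::size_t{0}]() mutable -> std::optional<Block> {
        ++pulls;
        if (i == blocks.size()) return std::nullopt;
        return blocks[i++];
    };
}

// One build step: pull at most one block, publish it if non-empty, then yield.
// Returns false once the build side has nothing more to do.
bool build_step(SharedBuildState& s, const BlockSource& next_build_block) {
    if (!s.build_needed || s.build_eos) return false;
    std::optional<Block> block = next_build_block();
    if (!block) {
        s.build_eos = true;
        return false;
    }
    if (!block->empty()) s.published.push_back(std::move(*block));
    return true;
}

// Cross join with LIMIT: consume the published prefix first and request more
// build data only when the prefix is exhausted for the current probe row.
std::vector<std::pair<int, int>> cross_join_limit(const std::vector<int>& probe_rows,
                                                  const BlockSource& next_build_block,
                                                  std::size_t limit,
                                                  SharedBuildState& s) {
    assert(limit > 0);  // assumption of this sketch
    std::vector<std::pair<int, int>> out;
    for (int p : probe_rows) {
        std::size_t next_block = 0;
        while (true) {
            if (next_block == s.published.size()) {
                if (!build_step(s, next_build_block)) break;  // build EOS
                continue;  // re-check: the pulled block may have been empty
            }
            for (int b : s.published[next_block]) {
                out.emplace_back(p, b);
                if (out.size() == limit) {
                    s.build_needed = false;  // LIMIT reached: build may stop
                    return out;
                }
            }
            ++next_block;
        }
    }
    s.build_needed = false;  // probe EOS
    return out;
}
```

With build blocks `{1, 2}`, `{}`, `{3}`, `{4, 5}`, probe rows `{10, 20}`, and a limit of 3, the sketch pulls only the first three build blocks and never reads `{4, 5}`. Join types that must report unmatched rows cannot take this path, because a row that looks unmatched against a prefix may still match later build data.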
Runtime filters are published only when the build side actually reaches EOS,
so a partial build prefix is never used to produce an invalid filter.
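
To see why a filter built from a prefix would be invalid, consider a min/max filter, used here purely as an illustration. The following is a minimal sketch under that assumption; `MinMaxFilter` and `BuildSink` are hypothetical types, not the Doris classes.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical runtime filter: keeps probe rows whose key is in [min, max].
struct MinMaxFilter {
    int min;
    int max;
};

// Hypothetical build sink that accumulates a filter over the blocks it sees.
struct BuildSink {
    std::optional<MinMaxFilter> acc;        // filter over the prefix seen so far
    std::optional<MinMaxFilter> published;  // set only on a real EOS
    bool finished = false;

    void sink(const std::vector<int>& block) {
        for (int v : block) {
            if (!acc) {
                acc = MinMaxFilter{v, v};
            } else {
                acc->min = std::min(acc->min, v);
                acc->max = std::max(acc->max, v);
            }
        }
    }

    // reached_eos is true when the build child is exhausted, and false when
    // the probe side reported that build data is no longer required.
    void finish(bool reached_eos) {
        assert(!finished);
        finished = true;
        if (reached_eos) published = acc;  // a prefix-only filter is discarded
    }
};
```

If the sink stops after the prefix `{1, 2}`, the accumulated range is [1, 2]; publishing it would wrongly prune a probe row with key 5 even though 5 may appear later on the build side. Publishing only from `finish(true)` avoids that.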
### Release note
Improve nested loop join LIMIT execution by allowing the build side to stop
early when enough rows have been produced.
### Check List (For Author)
- Test: Format and static checks
- `build-support/clang-format.sh
be/src/exec/operator/nested_loop_join_build_operator.cpp
be/src/exec/operator/nested_loop_join_probe_operator.cpp
be/src/exec/pipeline/dependency.h`
- `build-support/check-format.sh`
- `git diff --check`
- Behavior changed: Yes. Nested loop join can interleave build/probe
execution and, for non-mark inner/cross joins, stop reading build rows once
LIMIT is satisfied.
- Does this need documentation: No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]