github-actions[bot] commented on code in PR #64596:
URL: https://github.com/apache/doris/pull/64596#discussion_r3450779803
##########
be/src/exec/operator/nested_loop_join_build_operator.cpp:
##########
@@ -54,20 +54,26 @@ Status
NestedLoopJoinBuildSinkLocalState::open(RuntimeState* state) {
Status NestedLoopJoinBuildSinkLocalState::close(RuntimeState* state, Status
exec_status) {
if (!state->is_cancelled()) {
- RETURN_IF_ERROR(
- _runtime_filter_producer_helper->process(state,
_shared_state->build_blocks));
+ RETURN_IF_ERROR(_runtime_filter_producer_helper->process(
Review Comment:
When LIMIT/probe completion stops a partial-build join before the build
child reaches EOS, this close path still publishes runtime filters from
whatever prefix has been appended so far. The probe sets
`build_side_no_more_required` from
`_finish_probe_side_for_incremental_build()`, and `sink_impl()` can then return
early on `should_stop_build()` without ever setting `build_side_eos`; `close()`
still calls `process()` on the current `build_blocks`, which marks the filters
READY and publishes them. A runtime filter built from only a build-side prefix
can incorrectly filter rows in downstream scans. Please only publish the
cross-join runtime filter after real build EOS, or call
`skip_process()`/disable the helper on the partial-stop path.
##########
be/src/exec/operator/nested_loop_join_probe_operator.cpp:
##########
@@ -695,19 +721,28 @@ Status
NestedLoopJoinProbeLocalState::_generate_lazy_block_base_build(RuntimeSta
size_t processed_rows = 0;
while (processed_rows + probe_rows <= state->batch_size()) {
- if (_probe_block_pos == probe_rows) {
- _current_build_row_pos++;
- _probe_block_pos = 0;
Review Comment:
This check can observe `build_blocks.size()` and `build_side_eos` from
different moments. If the probe has consumed `N` published blocks, it can read
size `N` here, then the build sink appends the final block and sets
`build_side_eos = true`; the probe then takes the EOS branch and treats the
current probe block as complete without ever joining it with that final block.
That can drop inner/cross join rows. Please make the block count and EOS state
a consistent snapshot, or re-read the size after seeing EOS and continue when
`_current_build_pos` is now within the published range.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]