github-actions[bot] commented on code in PR #61535:
URL: https://github.com/apache/doris/pull/61535#discussion_r3033265862
##########
be/src/runtime/runtime_state.h:
##########
@@ -138,7 +138,40 @@ class RuntimeState {
const DescriptorTbl& desc_tbl() const { return *_desc_tbl; }
void set_desc_tbl(const DescriptorTbl* desc_tbl) { _desc_tbl = desc_tbl; }
+
+ int block_max_rows() const {
+ return config::enable_adaptive_batch_size
+ ? std::max(_query_options.batch_size,
+ int(preferred_block_size_rows()))
Review Comment:
`block_max_rows()` is now the shared cap for many operator/exchange/sort
paths, but this expression still raises the row limit when the session has
turned the adaptive byte-based limit off.
When `config::enable_adaptive_batch_size` is true and the session sets
`preferred_block_size_bytes = 0`, `block_max_bytes()` correctly goes back to
`0`, but `block_max_rows()` still becomes `max(batch_size,
preferred_block_size_rows)`. Since `preferred_block_size_rows` defaults to
`65535`, a session that intentionally uses `batch_size = 1024` will now let all
of the converted call sites accumulate up to 65535 rows. That makes the
`preferred_block_size_bytes = 0` disable path ineffective and can cause large
intermediate blocks, memory spikes, and OOM/latency regressions.
This also means users cannot lower the row cap below `batch_size` via
`preferred_block_size_rows`, because `max()` always widens it. Please gate the
row override on the byte-based feature actually being enabled (or otherwise
preserve legacy `batch_size` behavior when `preferred_block_size_bytes == 0`).
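A minimal sketch of the gating requested above, not the actual Doris code. `BlockLimits`, `block_max_rows_as_written`, and `block_max_rows_gated` are hypothetical names used only for illustration, and the fallback branch of the ternary is not visible in the hunk, so it is assumed here to return `batch_size`.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the config flag and session options involved;
// in Doris these live in config:: and _query_options.
struct BlockLimits {
    bool enable_adaptive_batch_size;     // config::enable_adaptive_batch_size
    int batch_size;                      // session batch_size
    int64_t preferred_block_size_bytes;  // 0 turns the byte-based limit off
    int64_t preferred_block_size_rows;   // 65535 by default, per the comment above
};

// Row cap as written in the diff. The ':' branch is an assumption: it is
// taken to fall back to batch_size.
constexpr int block_max_rows_as_written(const BlockLimits& o) {
    return o.enable_adaptive_batch_size
                   ? std::max(o.batch_size, static_cast<int>(o.preferred_block_size_rows))
                   : o.batch_size;
}

// Gated variant: the row override applies only when the byte-based feature
// is actually enabled; otherwise the legacy batch_size cap is preserved.
constexpr int block_max_rows_gated(const BlockLimits& o) {
    return (o.enable_adaptive_batch_size && o.preferred_block_size_bytes > 0)
                   ? std::max(o.batch_size, static_cast<int>(o.preferred_block_size_rows))
                   : o.batch_size;
}
```

With `batch_size = 1024`, `preferred_block_size_bytes = 0`, and the default `preferred_block_size_rows`, the first function returns 65535 while the gated one returns 1024. The sketch keeps `max()`, so it only restores the disable path; letting users lower the cap below `batch_size` would need a different combining rule.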
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]