Copilot commented on code in PR #60941:
URL: https://github.com/apache/doris/pull/60941#discussion_r2870298182
##########
be/src/pipeline/exec/streaming_aggregation_operator.cpp:
##########
@@ -728,10 +728,12 @@ bool StreamingAggLocalState::_emplace_into_hash_table_limit(vectorized::Aggregat
     };
     SCOPED_TIMER(_hash_table_emplace_timer);
-    for (i = 0; i < num_rows; ++i) {
-        places[i] = *agg_method.lazy_emplace(state, i, creator, creator_for_null_key);
-    }
+    vectorized::lazy_emplace_batch(agg_method, state, num_rows, creator, creator_for_null_key,
+                                   [&](uint32_t row, auto& mapped) {
+                                       i = row;
+                                       places[row] = mapped;
+                                   });
Review Comment:
In the new lazy_emplace_batch() usage, the `creator` lambda uses `i` (e.g.,
`_refresh_limit_heap(i, ...)`) but `i` is only updated inside the
result_handler, which is invoked *after* the emplace attempt. This means
inserts may refresh the limit heap using a stale row index, potentially
corrupting the LIMIT optimization logic. Consider extending lazy_emplace_batch
to take a pre_handler (like lazy_emplace_batch_void) that sets `i` before each
emplace, or otherwise ensure the current row index is set before `creator` can
run.
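
   To make the hazard and the suggested fix concrete, here is a minimal self-contained sketch (illustrative names only, not the actual Doris `vectorized::lazy_emplace_batch` API) of a batch-emplace helper that takes a `pre_handler` invoked before each emplace attempt, so any callback fired at insertion time observes the current row index:

   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <cstdio>
   #include <unordered_map>
   #include <vector>

   // Hypothetical sketch: `lazy_emplace_batch_with_pre` runs pre_handler(row)
   // BEFORE the emplace, so a `creator` that reads shared state (the `i` in the
   // agg operators) sees the row being inserted, not the previous one.
   template <typename Map, typename Creator, typename PreHandler, typename ResultHandler>
   void lazy_emplace_batch_with_pre(Map& map, const std::vector<int64_t>& keys,
                                    uint32_t num_rows, Creator&& creator,
                                    PreHandler&& pre_handler,
                                    ResultHandler&& result_handler) {
       for (uint32_t row = 0; row < num_rows; ++row) {
           pre_handler(row); // set shared state before creator can run
           auto [it, inserted] = map.try_emplace(keys[row]);
           if (inserted) {
               creator(it->second); // may read the row index set by pre_handler
           }
           result_handler(row, it->second);
       }
   }

   int main() {
       std::unordered_map<int64_t, int> map;
       std::vector<int64_t> keys = {7, 8, 7};
       uint32_t i = 0;                      // shared row index, as in the operators
       std::vector<uint32_t> creator_rows;  // rows observed at insertion time
       lazy_emplace_batch_with_pre(
               map, keys, 3,
               [&](int& mapped) {
                   creator_rows.push_back(i); // creator sees the CURRENT row
                   mapped = static_cast<int>(i);
               },
               [&](uint32_t row) { i = row; },
               [&](uint32_t row, int& mapped) { (void)row; (void)mapped; });
       // Keys 7 and 8 are first inserted at rows 0 and 1.
       assert(creator_rows.size() == 2);
       assert(creator_rows[0] == 0 && creator_rows[1] == 1);
       std::printf("insert rows: %u, %u\n", creator_rows[0], creator_rows[1]);
       return 0;
   }
   ```

   Without the `pre_handler` call (i.e., if `i` were only updated in `result_handler`), the creator for row 1 would still see `i == 0`, which is exactly the stale-index problem described above.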
##########
be/src/vec/common/hash_table/hash_map_context.h:
##########
@@ -121,18 +121,14 @@ struct MethodBaseInner {
     template <typename State>
     ALWAYS_INLINE auto find(State& state, size_t i) {
-        if constexpr (!is_string_hash_map()) {
-            prefetch<true>(i);
-        }
+        prefetch<true>(i);
         return state.find_key_with_hash(*hash_table, i, keys[i], hash_values[i]);
     }

     template <typename State, typename F, typename FF>
     ALWAYS_INLINE auto lazy_emplace(State& state, size_t i, F&& creator, FF&& creator_for_null_key) {
-        if constexpr (!is_string_hash_map()) {
-            prefetch<false>(i);
-        }
+        prefetch<false>(i);
         return state.lazy_emplace_key(*hash_table, i, keys[i], hash_values[i], creator, creator_for_null_key);
     }
Review Comment:
This change makes find()/lazy_emplace() always call
MethodBaseInner::prefetch() even for StringHashMap. For StringHashTable,
prefetch() performs per-row key-size dispatch/branching (and touches key
bytes), which the previous code explicitly avoided. If there isn't benchmarking
showing a win, consider restoring the previous `if constexpr
(!is_string_hash_map())` guard to avoid a potential performance regression for
string-key paths that still use per-row find/lazy_emplace (e.g., joins or other
callers not using the new batch helpers).
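
   For illustration, a minimal sketch of the compile-time guard pattern being suggested (the table types and counter are invented for the example, not the actual `hash_map_context.h` code). `if constexpr` drops the prefetch entirely from the string-keyed instantiation at compile time, with no runtime branch:

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Stand-ins for the two kinds of hash tables; in Doris this distinction
   // would come from something like is_string_hash_map().
   struct FixedKeyTable { static constexpr bool is_string = false; };
   struct StringKeyTable { static constexpr bool is_string = true; };

   template <typename Table>
   struct MethodSketch {
       std::vector<size_t> hash_values;
       size_t prefetch_calls = 0; // counted so the guard's effect is observable

       void prefetch(size_t i) {
           __builtin_prefetch(&hash_values[i]); // GCC/Clang builtin
           ++prefetch_calls;
       }

       void find(size_t i) {
           if constexpr (!Table::is_string) { // evaluated at compile time
               prefetch(i);
           }
           // ... actual lookup would follow here ...
       }
   };

   int main() {
       MethodSketch<FixedKeyTable> fixed{{1, 2, 3}};
       MethodSketch<StringKeyTable> str{{1, 2, 3}};
       fixed.find(0);
       str.find(0);
       assert(fixed.prefetch_calls == 1); // fixed-key path still prefetches
       assert(str.prefetch_calls == 0);   // string-key path skips it
       return 0;
   }
   ```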
##########
be/src/pipeline/exec/aggregation_sink_operator.cpp:
##########
@@ -650,10 +649,12 @@ bool AggSinkLocalState::_emplace_into_hash_table_limit(vectorized::AggregateData
     };
     SCOPED_TIMER(_hash_table_emplace_timer);
-    for (i = 0; i < num_rows; ++i) {
-        places[i] = *agg_method.lazy_emplace(state, i, creator, creator_for_null_key);
-    }
+    vectorized::lazy_emplace_batch(agg_method, state, num_rows, creator, creator_for_null_key,
+                                   [&](uint32_t row, auto& mapped) {
+                                       i = row;
+                                       places[row] = mapped;
+                                   });
Review Comment:
Same issue as StreamingAggLocalState: `creator`/`creator_for_null_key`
depend on `i` (`refresh_top_limit(i, ...)`), but `i` is only assigned in the
result_handler passed to lazy_emplace_batch, which runs after emplace. This can
refresh the top-limit state for the wrong row. Consider adding a pre_handler to
lazy_emplace_batch (or switching to an API that provides the row index to
`creator`) so `i` is set before any insertion-time callbacks execute.
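
   The second option mentioned (passing the row index directly into `creator`) could look like the following self-contained sketch. The helper name and signature are hypothetical, not the current Doris API; the point is that the creator no longer depends on a captured `i` being kept in sync:

   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <unordered_map>
   #include <vector>

   // Hypothetical variant: the creator receives the row index as a parameter,
   // eliminating the shared mutable `i` and thus the stale-index hazard.
   template <typename Map, typename Creator, typename ResultHandler>
   void lazy_emplace_batch_rowidx(Map& map, const std::vector<int64_t>& keys,
                                  uint32_t num_rows, Creator&& creator,
                                  ResultHandler&& result_handler) {
       for (uint32_t row = 0; row < num_rows; ++row) {
           auto [it, inserted] = map.try_emplace(keys[row]);
           if (inserted) {
               creator(row, it->second); // row index supplied, no stale capture
           }
           result_handler(row, it->second);
       }
   }

   int main() {
       std::unordered_map<int64_t, uint32_t> map;
       std::vector<int64_t> keys = {5, 5, 9};
       lazy_emplace_batch_rowidx(
               map, keys, 3,
               [&](uint32_t row, uint32_t& mapped) { mapped = row; },
               [&](uint32_t, uint32_t&) {});
       // Each key records the row at which it was first inserted.
       assert(map[5] == 0);
       assert(map[9] == 2);
       return 0;
   }
   ```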
##########
be/src/vec/common/hash_table/hash_map_context.h:
##########
@@ -365,8 +418,290 @@ struct MethodStringNoCache : public MethodBase<TData> {
         key_columns[0]->reserve(num_rows);
         key_columns[0]->insert_many_strings(input_keys.data(), num_rows);
     }
+
+    const StringKeySubTableGroups& get_sub_table_groups() const { return _sub_table_groups; }
+};
+
+/// Helper: detect whether HashMap is a nullable-wrapped StringHashMap.
+template <typename HashMap>
+struct IsNullableStringHashMap : std::false_type {};
+
+template <typename Mapped, typename Allocator>
+struct IsNullableStringHashMap<DataWithNullKey<StringHashMap<Mapped, Allocator>>> : std::true_type {
+};
+
+/// Helper: get the underlying StringHashTable from a hash table (handles DataWithNullKey wrapper).
+template <typename HashMap>
+ALWAYS_INLINE auto& get_string_hash_table(HashMap& data) {
+    return data;
+}
+
+/// Compile-time key conversion for each sub-table group.
+/// Groups 1-4 use to_string_key<T>(); groups 0 and 5 use StringRef directly.
+/// Returns a pair: (converted_key, needs_copy).
+/// For groups 0 and 5, the key must be a non-const copy (lazy_emplace_if_zero takes Key&).
Review Comment:
The comment says this helper returns a pair `(converted_key, needs_copy)`,
but the function returns only the converted key. Please update the comment to
match the actual return type/behavior to avoid misleading future changes.
```suggestion
/// Returns the converted key for the given group.
/// For groups 0 and 5, the key is returned as a non-const copy (lazy_emplace_if_zero takes Key&).
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]