github-actions[bot] commented on code in PR #60986:
URL: https://github.com/apache/doris/pull/60986#discussion_r2881492483
##########
be/src/pipeline/shuffle/exchange_writer.cpp:
##########
@@ -184,23 +184,46 @@ Status ExchangeTrivialWriter::_channel_add_rows(
RuntimeState* state,
std::vector<std::shared_ptr<vectorized::Channel>>& channels,
size_t channel_count, const std::vector<HashValType>& channel_ids,
size_t rows,
vectorized::Block* block, bool eos) {
Review Comment:
**[Observability]** The old code path went through `_add_rows_impl` →
`Channel::add_rows` → `next_serialized_block()`, which wraps the merge
operation in `SCOPED_TIMER(_parent->merge_block_timer())`. The new scatter path
completely bypasses this timer, losing performance profiling for the
scatter/merge phase.
Consider adding timer instrumentation around the scatter loop or in
`try_flush_after_scatter`, so the time spent scattering and serializing remains
visible in query profiles.
##########
be/src/vec/columns/column_string.cpp:
##########
@@ -281,6 +281,70 @@ void ColumnStr<T>::insert_indices_from(const IColumn& src,
const uint32_t* indic
sanity_check_simple();
}
+template <typename T>
+void ColumnStr<T>::insert_to_multi_column(const std::vector<IColumn*>& dsts,
+ const uint32_t* positions, size_t
rows) const {
+ const size_t num_dsts = dsts.size();
+ const auto* __restrict src_offset_data = offsets.data();
+ const auto* __restrict src_chars_data = chars.data();
+
+ // Phase 1: Compute per-destination row counts and total char sizes.
+ std::vector<size_t> dst_row_counts(num_dsts, 0);
+ std::vector<size_t> dst_chars_sizes(num_dsts, 0);
+
+ for (size_t i = 0; i < rows; ++i) {
Review Comment:
**[Correctness note]** When `i == 0`, this accesses `src_offset_data[-1]`.
The inline comment on line 302 notes this is OK due to `PaddedPODArray`, which
provides a padding element at index -1 initialized to 0. This is correct but
worth a brief comment here as well (same access pattern at index -1), since
it's a non-obvious invariant that readers unfamiliar with `PaddedPODArray`
might flag as a bug.
##########
be/src/vec/sink/vdata_stream_sender.h:
##########
@@ -86,6 +86,28 @@ class BlockSerializer {
MutableBlock* get_block() const { return _mutable_block.get(); }
+ // Ensure _mutable_block is initialized (using block's schema).
+ void ensure_mutable_block(Block* block) {
+ if (_mutable_block == nullptr) {
+ _mutable_block = MutableBlock::create_unique(block->clone_empty());
Review Comment:
**[Maintainability]** This flush condition (`rows() >= _batch_size || eos ||
(rows() > 0 && allocated_bytes() > _buffer_mem_limit)`) duplicates the logic in
`next_serialized_block()` (`vdata_stream_sender.cpp` around line 328). If the
flush heuristic is changed in one place, it could easily be missed in the other.
Consider extracting the condition into a single method and having both
`next_serialized_block()` and `need_flush()` share it.
Also minor: `_batch_size` is `const int` but `_mutable_block->rows()`
returns `size_t`. This is a signed/unsigned comparison. With `compile_check`
enabled, this would trigger a `-Wconversion` warning.
##########
be/test/pipeline/shuffle/exchange_writer_test.cpp:
##########
@@ -60,91 +60,6 @@ static std::vector<std::shared_ptr<Channel>>
make_disabled_channels(
return channels;
Review Comment:
**[Test Coverage]** Three unit tests (`BasicDistribution`,
`AllRowsToSingleChannel`, `EmptyInput`) are removed with no replacement. The
old tests verified the row-index scatter logic; the new scatter-based path has
zero test coverage.
Please add tests that:
1. Verify `insert_to_multi_column` correctness for `ColumnVector`,
`ColumnStr`, and `ColumnNullable` individually (unit tests in
`be/test/vec/columns/`)
2. Verify the new `ExchangeTrivialWriter::_channel_add_rows` scatter path
end-to-end (replacing these deleted tests)
3. Cover edge cases: empty input, all rows to single channel, single row,
nullable columns with nulls
##########
be/src/vec/columns/column.h:
##########
@@ -755,6 +762,14 @@ class IColumn : public COW<IColumn> {
// e.g. const(nullable(...)) is allowed.
// const(array(const(...))) is not allowed.
void check_const_only_in_top_level() const;
+
+ template <typename Derived>
+ void insert_to_multi_column_impl(const std::vector<IColumn*>& dsts, const
uint32_t* positions,
+ size_t rows) const {
+ for (size_t i = 0; i < rows; ++i) {
Review Comment:
**[Parallel Code Path / Performance]** `ColumnDecimal<T>` is structurally
identical to `ColumnVector<T>` — it stores a `PaddedPODArray<T>` and its
`insert_from` copies a single element. It's commonly used for DECIMAL,
DATETIME, and DATE columns. Without an optimized `insert_to_multi_column`
override, it falls through to this default row-by-row `insert_from` loop via
`COWHelper`, negating the performance benefit of the scatter optimization.
The implementation would be nearly identical to
`ColumnVector<T>::insert_to_multi_column`. Consider adding an optimized
override for `ColumnDecimal<T>`, or extracting a shared template for
POD-array-based columns.
Other column types without overrides (`ColumnArray`, `ColumnMap`,
`ColumnStruct`, `ColumnObject`, etc.) will also use this default path, but
`ColumnDecimal` is the most impactful since it's among the most common column
types in analytical queries.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]