This is an automated email from the ASF dual-hosted git repository.
Mryange pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 73b32d29744 [refine](array) introduce ColumnArrayView to unify array
column access in array functions (#63386)
73b32d29744 is described below
commit 73b32d29744638d76c7379db9a0261a4f1988e6c
Author: Mryange <[email protected]>
AuthorDate: Mon May 25 11:48:20 2026 +0800
[refine](array) introduce ColumnArrayView to unify array column access in
array functions (#63386)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
Array functions like `array_distance` and `array_join` previously
required hand-written boilerplate to unwrap `Const`, `Nullable`, and
plain `ColumnArray` variants before accessing element data. This led to
duplicated code, manual offset arithmetic, and a proliferation of helper
structs (`ConstArrayInfo`, `ColumnArrayExecutionData`, etc.).
Root cause: there was no shared abstraction for "read a row of an array
column regardless of its outer wrapper". Each function solved this
independently, accumulating inconsistent patterns.
This PR introduces `ColumnArrayView<PType>` (and its row-accessor
`ArrayDataView<PType>`) in `be/src/core/column/column_array_view.h`.
The view is created once via `ColumnArrayView::create(col)` and handles
Const/Nullable unwrapping automatically. Per-row access via
`operator[](row)` returns an `ArrayDataView` with `get_data()`, `size()`,
and `is_null_at()`: a uniform interface regardless of the underlying
column shape.
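
A minimal, self-contained sketch of the idea described above, not the actual Doris code: `MiniArrayColumn`, `MiniRowView`, and `MiniArrayView` are hypothetical stand-ins that only mimic the shape of `ColumnArrayView` / `ArrayDataView`. It assumes a const column stores a single physical row, and shows how resolving that once inside `operator[]` keeps the caller's loop identical for plain and const inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a flattened Array<Nullable(Int64)> column.
struct MiniArrayColumn {
    std::vector<int64_t> data;       // flattened element values
    std::vector<uint8_t> elem_nulls; // 1 = element is NULL
    std::vector<size_t> offsets;     // offsets[i] = end of row i in `data`
};

// Hypothetical row accessor, loosely modelled on ArrayDataView.
struct MiniRowView {
    const int64_t* data;
    const uint8_t* nulls;
    size_t len;

    size_t size() const { return len; }
    const int64_t* get_data() const { return data; }
    bool is_null_at(size_t i) const { return nulls[i] != 0; }
};

// Hypothetical view: the const-ness is decided once, callers never branch on it.
struct MiniArrayView {
    const MiniArrayColumn* col;
    bool is_const; // true: every logical row maps to physical row 0

    MiniRowView operator[](size_t row) const {
        size_t r = is_const ? 0 : row;
        size_t begin = (r == 0) ? 0 : col->offsets[r - 1];
        size_t end = col->offsets[r];
        return {col->data.data() + begin, col->elem_nulls.data() + begin, end - begin};
    }
};

// Caller code is the same for plain and const columns.
int64_t sum_row(const MiniArrayView& view, size_t row) {
    int64_t sum = 0;
    auto arr = view[row];
    for (size_t j = 0; j < arr.size(); ++j) {
        if (!arr.is_null_at(j)) sum += arr.get_data()[j];
    }
    return sum;
}
```

An outer `Nullable` wrapper could be handled the same way, by storing an optional row-level null map in the view; that part is omitted here to keep the sketch short.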
For ultra-light nullable primitive loops, `ColumnArrayView` also exposes
flat-access helpers (`get_data()`, `get_null_map_data()`, `row_begin()`,
`row_end()`) so callers can keep wrapper unwrapping centralized while
still iterating directly over the flattened buffers when benchmark data
shows that per-element row-view access would regress.
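
The flat-access pattern can be sketched as follows. This is an illustration under stated assumptions rather than the Doris implementation: `FlatArrayBuffers` is a hypothetical type whose method names mirror the helpers listed above, and it uses an explicit `row == 0` check instead of whatever offset convention the real column uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat buffers for Array<Nullable(Int64)>; not the Doris types.
struct FlatArrayBuffers {
    std::vector<int64_t> data;     // flattened element values
    std::vector<uint8_t> null_map; // 1 = element is NULL
    std::vector<size_t> offsets;   // offsets[i] = end of row i

    const int64_t* get_data() const { return data.data(); }
    const uint8_t* get_null_map_data() const { return null_map.data(); }
    size_t row_begin(size_t row) const { return row == 0 ? 0 : offsets[row - 1]; }
    size_t row_end(size_t row) const { return offsets[row]; }
    size_t rows() const { return offsets.size(); }
};

// Flat loop: buffer pointers are fetched once, the inner loop reads raw arrays.
int64_t sum_non_null_flat(const FlatArrayBuffers& view) {
    const int64_t* data = view.get_data();
    const uint8_t* null_map = view.get_null_map_data();
    int64_t sum = 0;
    for (size_t i = 0; i < view.rows(); ++i) {
        const size_t begin = view.row_begin(i);
        const size_t end = view.row_end(i);
        for (size_t j = begin; j < end; ++j) {
            if (!null_map[j]) sum += data[j];
        }
    }
    return sum;
}
```

The inner loop here indexes the flat buffers directly and builds no per-row object, which is the property the flat helpers are meant to preserve.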
**Benchmark results** (4096 rows, RELEASE build,
`--benchmark_repetitions=5` on a shared host with CPU scaling enabled;
raw outputs saved in `benchmark_array_view_raw_results_20260519.txt` and
`benchmark_array_view_distance_split_raw_results_20260519.txt`):
**Row-view access (`operator[]` / `ArrayDataView`)**
| Scenario | Handwritten CPU (ns) | ColumnArrayView CPU (ns) | Delta |
|---|---|---|---|
| Distance Plain/Plain | 322530 | 311276 | **-3.5%** |
| Distance Const/Plain | 301473 | 289794 | **-3.9%** |
| Distance Nullable/Plain | 305970 | 313687 | +2.5% |
| Int64 Plain sum | 15971 | 16036 | +0.4% |
| Int64 WithNulls sum | 26700 | 29497 | +10.5% |
| String Plain len-sum | 16857 | 17120 | +1.6% |
| Int64 Const sum | 16051 | 16148 | +0.6% |
| Int64 Nullable sum | 16198 | 16174 | -0.1% |
**Flat-access follow-up (`get_data()` / `get_null_map_data()` / `row_begin()` / `row_end()`)**
| Scenario | Handwritten CPU (ns) | ColumnArrayView Flat CPU (ns) | Delta |
|---|---|---|---|
| Int64 WithNulls sum | 26700 | 26765 | +0.2% |
| Distance Plain/Plain | 322530 | 301274 | **-6.6%** |
| Distance Const/Plain | 301473 | 314259 | +4.2% |
| Distance Nullable/Plain | 305970 | 314077 | +2.7% |
Most production-shaped cases stay within a few percent on this shared
host. The only stable double-digit regression is the synthetic
`Int64 WithNulls` microbenchmark, where each element performs only
`if (!null) sum += val`. The flat-access helper path removes that
regression (+0.2% vs handwritten) while keeping `Const` / `Nullable`
unwrapping centralized in `ColumnArrayView`.
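
For reference, the Delta column in the tables above is consistent with a plain relative change against the handwritten baseline. The helper below is a hypothetical illustration of that arithmetic (`delta_percent` is not part of the PR):

```cpp
#include <cassert>
#include <cmath>

// Relative change of the candidate timing against the baseline, in percent.
// Negative values mean the candidate was faster than the baseline.
double delta_percent(double baseline_ns, double candidate_ns) {
    return (candidate_ns - baseline_ns) / baseline_ns * 100.0;
}
```

For the `Int64 WithNulls` rows, `delta_percent(26700, 29497)` is about 10.5 and `delta_percent(26700, 26765)` is about 0.2, matching the +10.5% and +0.2% entries.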
Because these numbers were collected on a shared machine with CPU
scaling enabled, the distance cases show visible run-to-run noise.
---
be/benchmark/benchmark_column_array_view.hpp | 418 +++++++++++++++++++++
.../benchmark_column_array_view_distance.hpp | 353 +++++++++++++++++
be/benchmark/benchmark_main.cpp | 2 +
be/benchmark/binary_cast_benchmark.hpp | 49 +--
be/src/core/column/column_array_view.h | 135 +++++++
be/src/core/column/column_execute_util.h | 1 +
.../exprs/function/array/function_array_distance.h | 149 ++------
be/src/exprs/function/array/function_array_join.h | 39 +-
be/test/core/column/column_array_view_test.cpp | 290 ++++++++++++++
9 files changed, 1255 insertions(+), 181 deletions(-)
diff --git a/be/benchmark/benchmark_column_array_view.hpp b/be/benchmark/benchmark_column_array_view.hpp
new file mode 100644
index 00000000000..09baf2bd435
--- /dev/null
+++ b/be/benchmark/benchmark_column_array_view.hpp
@@ -0,0 +1,418 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// ============================================================
+// Benchmark: ColumnArrayView vs hand-written array column access
+//
+// ColumnArrayView (see column_array_view.h) provides a unified interface
+// to read array column elements regardless of whether the underlying
+// column is Plain, ColumnConst, or ColumnNullable.
+//
+// This benchmark measures whether ColumnArrayView introduces measurable
+// overhead compared to hand-written (direct) array column access code.
+//
+// Test scenarios:
+// 1. Int64 array: sum all elements across all rows
+// 2. String array: sum lengths of all elements across all rows
+// 3. Const array: same as above but with ColumnConst wrapper
+// 4. Nullable array: with outer nullable wrapper
+// ============================================================
+
+#include <benchmark/benchmark.h>
+
+#include <cstdint>
+#include <string>
+
+#include "core/assert_cast.h"
+#include "core/column/column_array.h"
+#include "core/column/column_array_view.h"
+#include "core/column/column_const.h"
+#include "core/column/column_nullable.h"
+#include "core/column/column_string.h"
+#include "core/column/column_vector.h"
+#include "core/data_type/primitive_type.h"
+
+namespace doris {
+
+static constexpr size_t ARR_NUM_ROWS = 4096;
+static constexpr size_t ARR_ELEM_PER_ROW = 8;
+
+// ============================================================
+// Array column factory helpers
+// ============================================================
+
+// Build Array<Nullable(Int64)> with ARR_NUM_ROWS rows, each having ARR_ELEM_PER_ROW elements.
+static ColumnPtr make_int64_array_column() {
+ auto data_col = ColumnInt64::create();
+ auto null_col = ColumnUInt8::create();
+ auto offsets = ColumnArray::ColumnOffsets::create();
+
+ data_col->reserve(ARR_NUM_ROWS * ARR_ELEM_PER_ROW);
+ null_col->reserve(ARR_NUM_ROWS * ARR_ELEM_PER_ROW);
+
+ size_t offset = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ for (size_t j = 0; j < ARR_ELEM_PER_ROW; ++j) {
+ data_col->insert_value(static_cast<int64_t>(i * ARR_ELEM_PER_ROW + j + 1));
+ null_col->insert_value(0);
+ }
+ offset += ARR_ELEM_PER_ROW;
+ offsets->insert_value(offset);
+ }
+
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+ return ColumnArray::create(std::move(nullable_data), std::move(offsets));
+}
+
+// Build Array<Nullable(Int64)> with some null elements (every 5th element is null).
+static ColumnPtr make_int64_array_column_with_nulls() {
+ auto data_col = ColumnInt64::create();
+ auto null_col = ColumnUInt8::create();
+ auto offsets = ColumnArray::ColumnOffsets::create();
+
+ data_col->reserve(ARR_NUM_ROWS * ARR_ELEM_PER_ROW);
+ null_col->reserve(ARR_NUM_ROWS * ARR_ELEM_PER_ROW);
+
+ size_t offset = 0;
+ size_t flat_idx = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ for (size_t j = 0; j < ARR_ELEM_PER_ROW; ++j) {
+ data_col->insert_value(static_cast<int64_t>(flat_idx + 1));
+ null_col->insert_value(flat_idx % 5 == 0 ? 1 : 0);
+ flat_idx++;
+ }
+ offset += ARR_ELEM_PER_ROW;
+ offsets->insert_value(offset);
+ }
+
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+ return ColumnArray::create(std::move(nullable_data), std::move(offsets));
+}
+
+// Build Array<Nullable(String)> with ARR_NUM_ROWS rows.
+static ColumnPtr make_string_array_column() {
+ auto data_col = ColumnString::create();
+ auto null_col = ColumnUInt8::create();
+ auto offsets = ColumnArray::ColumnOffsets::create();
+
+ size_t offset = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ for (size_t j = 0; j < ARR_ELEM_PER_ROW; ++j) {
+ std::string val = "str_" + std::to_string(i * ARR_ELEM_PER_ROW + j);
+ data_col->insert_data(val.data(), val.size());
+ null_col->insert_value(0);
+ }
+ offset += ARR_ELEM_PER_ROW;
+ offsets->insert_value(offset);
+ }
+
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+ return ColumnArray::create(std::move(nullable_data), std::move(offsets));
+}
+
+// Wrap with outer Nullable (no rows are actually null, just the wrapper overhead).
+static ColumnPtr wrap_nullable(const ColumnPtr& col) {
+ return ColumnNullable::create(col->assume_mutable(),
+ ColumnUInt8::create(col->size(), 0));
+}
+
+// Wrap as Const.
+static ColumnPtr wrap_const(const ColumnPtr& col) {
+ // Take the first row of the array column, make a 1-row column, then const-expand.
+ auto single = col->clone_empty();
+ single->insert_from(*col, 0);
+ return ColumnConst::create(std::move(single), ARR_NUM_ROWS);
+}
+
+// ============================================================
+// Hand-written accessor for Array<Nullable(Int64)>
+// ============================================================
+
+struct HandwrittenArrayAccessor {
+ const ColumnArray::Offsets64& offsets;
+ const ColumnInt64::Container& data;
+ const NullMap& nested_null_map;
+
+ explicit HandwrittenArrayAccessor(const ColumnPtr& col)
+ : offsets(assert_cast<const ColumnArray&>(*col).get_offsets()),
+ data(assert_cast<const ColumnInt64&>(
+ assert_cast<const ColumnNullable&>(
+ assert_cast<const ColumnArray&>(*col).get_data())
+ .get_nested_column())
+ .get_data()),
+ nested_null_map(assert_cast<const ColumnNullable&>(
+ assert_cast<const ColumnArray&>(*col).get_data())
+ .get_null_map_data()) {}
+
+ size_t row_begin(size_t row) const { return offsets[row - 1]; }
+ size_t row_end(size_t row) const { return offsets[row]; }
+ int64_t value_at(size_t flat_idx) const { return data[flat_idx]; }
+ bool is_null_at(size_t flat_idx) const { return nested_null_map[flat_idx]; }
+};
+
+// ============================================================
+// 1. Int64 Plain Array: sum all elements
+// ============================================================
+
+static void Handwritten_ArrayInt64_Plain(benchmark::State& state) {
+ const auto col = make_int64_array_column();
+ HandwrittenArrayAccessor acc(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ size_t begin = acc.row_begin(i);
+ size_t end = acc.row_end(i);
+ for (size_t j = begin; j < end; ++j) {
+ sum += acc.value_at(j);
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(Handwritten_ArrayInt64_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayInt64_Plain(benchmark::State& state) {
+ const auto col = make_int64_array_column();
+ const auto view = ColumnArrayView<TYPE_BIGINT>::create(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ auto arr = view[i];
+ for (size_t j = 0; j < arr.size(); ++j) {
+ sum += arr.value_at(j);
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayInt64_Plain)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 2. Int64 Array with null elements: sum non-null elements
+// ============================================================
+
+static void Handwritten_ArrayInt64_WithNulls(benchmark::State& state) {
+ const auto col = make_int64_array_column_with_nulls();
+ HandwrittenArrayAccessor acc(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ size_t begin = acc.row_begin(i);
+ size_t end = acc.row_end(i);
+ for (size_t j = begin; j < end; ++j) {
+ if (!acc.is_null_at(j)) {
+ sum += acc.value_at(j);
+ }
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(Handwritten_ArrayInt64_WithNulls)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayInt64_WithNulls(benchmark::State& state) {
+ const auto col = make_int64_array_column_with_nulls();
+ const auto view = ColumnArrayView<TYPE_BIGINT>::create(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ auto arr = view[i];
+ for (size_t j = 0; j < arr.size(); ++j) {
+ if (!arr.is_null_at(j)) {
+ sum += arr.value_at(j);
+ }
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayInt64_WithNulls)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayInt64_WithNulls_Flat(benchmark::State& state) {
+ const auto col = make_int64_array_column_with_nulls();
+ const auto view = ColumnArrayView<TYPE_BIGINT>::create(col);
+ const auto* data = view.get_data();
+ const auto* null_map = view.get_null_map_data();
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ size_t begin = view.row_begin(i);
+ size_t end = view.row_end(i);
+ for (size_t j = begin; j < end; ++j) {
+ if (!null_map[j]) {
+ sum += data[j];
+ }
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayInt64_WithNulls_Flat)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 3. String Array: sum string lengths
+// ============================================================
+
+struct HandwrittenStringArrayAccessor {
+ const ColumnArray::Offsets64& offsets;
+ const ColumnString& str_col;
+ const NullMap& nested_null_map;
+
+ explicit HandwrittenStringArrayAccessor(const ColumnPtr& col)
+ : offsets(assert_cast<const ColumnArray&>(*col).get_offsets()),
+ str_col(assert_cast<const ColumnString&>(
+ assert_cast<const ColumnNullable&>(
+ assert_cast<const ColumnArray&>(*col).get_data())
+ .get_nested_column())),
+ nested_null_map(assert_cast<const ColumnNullable&>(
+ assert_cast<const ColumnArray&>(*col).get_data())
+ .get_null_map_data()) {}
+
+ size_t row_begin(size_t row) const { return offsets[row - 1]; }
+ size_t row_end(size_t row) const { return offsets[row]; }
+ StringRef value_at(size_t flat_idx) const { return str_col.get_data_at(flat_idx); }
+ bool is_null_at(size_t flat_idx) const { return nested_null_map[flat_idx]; }
+};
+
+static void Handwritten_ArrayString_Plain(benchmark::State& state) {
+ const auto col = make_string_array_column();
+ HandwrittenStringArrayAccessor acc(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ size_t begin = acc.row_begin(i);
+ size_t end = acc.row_end(i);
+ for (size_t j = begin; j < end; ++j) {
+ sum += acc.value_at(j).size;
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(Handwritten_ArrayString_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayString_Plain(benchmark::State& state) {
+ const auto col = make_string_array_column();
+ const auto view = ColumnArrayView<TYPE_STRING>::create(col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ auto arr = view[i];
+ for (size_t j = 0; j < arr.size(); ++j) {
+ sum += arr.value_at(j).size;
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayString_Plain)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 4. Const Array: Const(Array<Int64>)
+// ============================================================
+
+static void Handwritten_ArrayInt64_Const(benchmark::State& state) {
+ const auto base = make_int64_array_column();
+ const auto const_col = wrap_const(base);
+ // Hand-written: unpack const, then access the single row repeatedly
+ const auto& inner = assert_cast<const ColumnConst&>(*const_col).get_data_column();
+ const auto& array_col = assert_cast<const ColumnArray&>(inner);
+ const auto& arr_offsets = array_col.get_offsets();
+ const auto& nested_nullable = assert_cast<const ColumnNullable&>(array_col.get_data());
+ const auto& int_data = assert_cast<const ColumnInt64&>(nested_nullable.get_nested_column()).get_data();
+
+ size_t begin = arr_offsets[-1]; // sentinel = 0
+ size_t end = arr_offsets[0];
+
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ for (size_t j = begin; j < end; ++j) {
+ sum += int_data[j];
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(Handwritten_ArrayInt64_Const)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayInt64_Const(benchmark::State& state) {
+ const auto base = make_int64_array_column();
+ const auto const_col = wrap_const(base);
+ const auto view = ColumnArrayView<TYPE_BIGINT>::create(const_col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ auto arr = view[i];
+ for (size_t j = 0; j < arr.size(); ++j) {
+ sum += arr.value_at(j);
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayInt64_Const)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 5. Nullable Array: Nullable(Array<Int64>)
+// ============================================================
+
+static void Handwritten_ArrayInt64_Nullable(benchmark::State& state) {
+ const auto base = make_int64_array_column();
+ const auto nullable_col = wrap_nullable(base);
+ // Hand-written: unpack nullable
+ const auto& nullable = assert_cast<const ColumnNullable&>(*nullable_col);
+ const auto& outer_null_map = nullable.get_null_map_data();
+ const auto& array_col = assert_cast<const ColumnArray&>(nullable.get_nested_column());
+ const auto& arr_offsets = array_col.get_offsets();
+ const auto& nested_nullable = assert_cast<const ColumnNullable&>(array_col.get_data());
+ const auto& int_data = assert_cast<const ColumnInt64&>(nested_nullable.get_nested_column()).get_data();
+
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ if (outer_null_map[i]) continue;
+ size_t begin = arr_offsets[i - 1];
+ size_t end = arr_offsets[i];
+ for (size_t j = begin; j < end; ++j) {
+ sum += int_data[j];
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(Handwritten_ArrayInt64_Nullable)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_ArrayInt64_Nullable(benchmark::State& state) {
+ const auto base = make_int64_array_column();
+ const auto nullable_col = wrap_nullable(base);
+ const auto view = ColumnArrayView<TYPE_BIGINT>::create(nullable_col);
+ for (auto _ : state) {
+ int64_t sum = 0;
+ for (size_t i = 0; i < ARR_NUM_ROWS; ++i) {
+ if (view.is_null_at(i)) continue;
+ auto arr = view[i];
+ for (size_t j = 0; j < arr.size(); ++j) {
+ sum += arr.value_at(j);
+ }
+ }
+ benchmark::DoNotOptimize(sum);
+ }
+}
+BENCHMARK(ArrayView_ArrayInt64_Nullable)->Unit(benchmark::kNanosecond);
+
+} // namespace doris
diff --git a/be/benchmark/benchmark_column_array_view_distance.hpp b/be/benchmark/benchmark_column_array_view_distance.hpp
new file mode 100644
index 00000000000..34fd287f203
--- /dev/null
+++ b/be/benchmark/benchmark_column_array_view_distance.hpp
@@ -0,0 +1,353 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// ============================================================
+// Benchmark: ColumnArrayView vs hand-written for array distance
+//
+// Simulates the FunctionArrayDistance pattern:
+// - Build Array<Nullable(Float32)> columns
+// - Extract raw float* pointers + dimensions per row
+// - Call faiss L2 distance on each row pair
+//
+// Compares:
+// 1. Hand-written: manual Const/Nullable unwrapping + offsets
+// 2. ColumnArrayView: original row-view access via ArrayDataView::get_data()
+// 3. ColumnArrayView flat access: prefetch flat data pointer + row offsets
+// ============================================================
+
+#include <benchmark/benchmark.h>
+
+#include <cmath>
+#include <cstdint>
+#include <random>
+
+#include "core/assert_cast.h"
+#include "core/column/column_array.h"
+#include "core/column/column_array_view.h"
+#include "core/column/column_const.h"
+#include "core/column/column_nullable.h"
+#include "core/column/column_vector.h"
+#include "core/data_type/primitive_type.h"
+
+namespace doris {
+
+// Inline L2 distance to avoid faiss build dependency in benchmark.
+// Both paths call the same function, so the measurement is purely
+// about pointer-extraction overhead, not about the distance kernel.
+static inline float inline_l2_distance(const float* x, const float* y, size_t d) {
+ float sum = 0.0f;
+ for (size_t i = 0; i < d; ++i) {
+ float diff = x[i] - y[i];
+ sum += diff * diff;
+ }
+ return std::sqrt(sum);
+}
+
+static constexpr size_t DIST_NUM_ROWS = 4096;
+static constexpr size_t DIST_DIM = 128; // typical embedding dimension
+
+// ============================================================
+// Column factory: Array<Nullable(Float32)> with fixed dimension
+// ============================================================
+
+static ColumnPtr make_float_array_column_for_dist(size_t num_rows, size_t dim) {
+ auto data_col = ColumnFloat32::create();
+ auto null_col = ColumnUInt8::create();
+ auto offsets = ColumnArray::ColumnOffsets::create();
+
+ data_col->reserve(num_rows * dim);
+ null_col->reserve(num_rows * dim);
+
+ std::mt19937 rng(42);
+ std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
+
+ size_t offset = 0;
+ for (size_t i = 0; i < num_rows; ++i) {
+ for (size_t j = 0; j < dim; ++j) {
+ data_col->insert_value(dist(rng));
+ null_col->insert_value(0);
+ }
+ offset += dim;
+ offsets->insert_value(offset);
+ }
+
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+ return ColumnArray::create(std::move(nullable_data), std::move(offsets));
+}
+
+static ColumnPtr make_const_float_array_for_dist(size_t dim) {
+ auto single = make_float_array_column_for_dist(1, dim);
+ return ColumnConst::create(std::move(single), DIST_NUM_ROWS);
+}
+
+// ============================================================
+// 1. Both columns non-const: L2 distance per row
+// ============================================================
+
+static void Handwritten_Distance_Plain_Plain(benchmark::State& state) {
+ const auto col1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ // Hand-written extraction (mirrors FunctionArrayDistance::execute_impl)
+ const auto& arr1 = assert_cast<const ColumnArray&>(*col1);
+ const auto& arr2 = assert_cast<const ColumnArray&>(*col2);
+ const auto& nested1 = assert_cast<const ColumnNullable&>(arr1.get_data());
+ const auto& nested2 = assert_cast<const ColumnNullable&>(arr2.get_data());
+ const auto& float1 = assert_cast<const ColumnFloat32&>(nested1.get_nested_column());
+ const auto& float2 = assert_cast<const ColumnFloat32&>(nested2.get_nested_column());
+ const auto* fdata1 = float1.get_data().data();
+ const auto* fdata2 = float2.get_data().data();
+ const auto& offsets1 = arr1.get_offsets();
+ const auto& offsets2 = arr2.get_offsets();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto prev1 = offsets1[row - 1];
+ auto prev2 = offsets2[row - 1];
+ auto size1 = offsets1[row] - prev1;
+ dst_data[row] = inline_l2_distance(fdata1 + prev1, fdata2 + prev2, size1);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(Handwritten_Distance_Plain_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Plain_Plain(benchmark::State& state) {
+ const auto col1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(col1);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto a1 = view1[row];
+ auto a2 = view2[row];
+ const float* p1 = a1.get_data();
+ const float* p2 = a2.get_data();
+ dst_data[row] = inline_l2_distance(p1, p2, a1.size());
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Plain_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Plain_Plain_Flat(benchmark::State& state) {
+ const auto col1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(col1);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+ const auto* data1 = view1.get_data();
+ const auto* data2 = view2.get_data();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ size_t begin1 = view1.row_begin(row);
+ size_t begin2 = view2.row_begin(row);
+ size_t dim1 = view1.row_end(row) - begin1;
+ dst_data[row] = inline_l2_distance(data1 + begin1, data2 + begin2, dim1);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Plain_Plain_Flat)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 2. One column const (query vs many vectors)
+// ============================================================
+
+static void Handwritten_Distance_Const_Plain(benchmark::State& state) {
+ const auto const_col = make_const_float_array_for_dist(DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ // Extract const array once
+ const auto& const_inner = assert_cast<const ColumnConst&>(*const_col).get_data_column();
+ const auto& const_arr = assert_cast<const ColumnArray&>(const_inner);
+ const auto& const_nested = assert_cast<const ColumnNullable&>(const_arr.get_data());
+ const auto& const_float = assert_cast<const ColumnFloat32&>(const_nested.get_nested_column());
+ const float* const_data = const_float.get_data().data();
+ size_t const_dim = const_float.size();
+
+ // Extract non-const array
+ const auto& arr2 = assert_cast<const ColumnArray&>(*col2);
+ const auto& nested2 = assert_cast<const ColumnNullable&>(arr2.get_data());
+ const auto& float2 = assert_cast<const ColumnFloat32&>(nested2.get_nested_column());
+ const auto* fdata2 = float2.get_data().data();
+ const auto& offsets2 = arr2.get_offsets();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto prev2 = offsets2[row - 1];
+ dst_data[row] = inline_l2_distance(const_data, fdata2 + prev2, const_dim);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(Handwritten_Distance_Const_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Const_Plain(benchmark::State& state) {
+ const auto const_col = make_const_float_array_for_dist(DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(const_col);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto a1 = view1[row];
+ auto a2 = view2[row];
+ const float* p1 = a1.get_data();
+ const float* p2 = a2.get_data();
+ dst_data[row] = inline_l2_distance(p1, p2, a1.size());
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Const_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Const_Plain_Flat(benchmark::State& state) {
+ const auto const_col = make_const_float_array_for_dist(DIST_DIM);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(const_col);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+ const auto* data1 = view1.get_data();
+ const auto* data2 = view2.get_data();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ size_t begin1 = view1.row_begin(row);
+ size_t begin2 = view2.row_begin(row);
+ size_t dim1 = view1.row_end(row) - begin1;
+ dst_data[row] = inline_l2_distance(data1 + begin1, data2 + begin2, dim1);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Const_Plain_Flat)->Unit(benchmark::kNanosecond);
+
+// ============================================================
+// 3. Nullable(Array) vs plain Array
+// ============================================================
+
+static ColumnPtr wrap_nullable_for_dist(const ColumnPtr& col) {
+ return ColumnNullable::create(col->assume_mutable(), ColumnUInt8::create(col->size(), 0));
+}
+
+static void Handwritten_Distance_Nullable_Plain(benchmark::State& state) {
+ const auto base1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto nullable_col1 = wrap_nullable_for_dist(base1);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ // Unwrap nullable
+ const auto& nullable1 = assert_cast<const ColumnNullable&>(*nullable_col1);
+ const auto& arr1 = assert_cast<const ColumnArray&>(nullable1.get_nested_column());
+ const auto& nested1 = assert_cast<const ColumnNullable&>(arr1.get_data());
+ const auto& float1 = assert_cast<const ColumnFloat32&>(nested1.get_nested_column());
+ const auto* fdata1 = float1.get_data().data();
+ const auto& offsets1 = arr1.get_offsets();
+
+ const auto& arr2 = assert_cast<const ColumnArray&>(*col2);
+ const auto& nested2 = assert_cast<const ColumnNullable&>(arr2.get_data());
+ const auto& float2 = assert_cast<const ColumnFloat32&>(nested2.get_nested_column());
+ const auto* fdata2 = float2.get_data().data();
+ const auto& offsets2 = arr2.get_offsets();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto prev1 = offsets1[row - 1];
+ auto prev2 = offsets2[row - 1];
+ auto size1 = offsets1[row] - prev1;
+ dst_data[row] = inline_l2_distance(fdata1 + prev1, fdata2 + prev2, size1);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(Handwritten_Distance_Nullable_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Nullable_Plain(benchmark::State& state) {
+ const auto base1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto nullable_col1 = wrap_nullable_for_dist(base1);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(nullable_col1);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ auto a1 = view1[row];
+ auto a2 = view2[row];
+ const float* p1 = a1.get_data();
+ const float* p2 = a2.get_data();
+ dst_data[row] = inline_l2_distance(p1, p2, a1.size());
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Nullable_Plain)->Unit(benchmark::kNanosecond);
+
+static void ArrayView_Distance_Nullable_Plain_Flat(benchmark::State& state) {
+ const auto base1 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+ const auto nullable_col1 = wrap_nullable_for_dist(base1);
+ const auto col2 = make_float_array_column_for_dist(DIST_NUM_ROWS, DIST_DIM);
+
+ const auto view1 = ColumnArrayView<TYPE_FLOAT>::create(nullable_col1);
+ const auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
+ const auto* data1 = view1.get_data();
+ const auto* data2 = view2.get_data();
+
+ auto dst = ColumnFloat32::create(DIST_NUM_ROWS);
+ auto& dst_data = dst->get_data();
+
+ for (auto _ : state) {
+ for (size_t row = 0; row < DIST_NUM_ROWS; ++row) {
+ size_t begin1 = view1.row_begin(row);
+ size_t begin2 = view2.row_begin(row);
+ size_t dim1 = view1.row_end(row) - begin1;
+ dst_data[row] = inline_l2_distance(data1 + begin1, data2 + begin2, dim1);
+ }
+ benchmark::ClobberMemory();
+ }
+}
+BENCHMARK(ArrayView_Distance_Nullable_Plain_Flat)->Unit(benchmark::kNanosecond);
+
+} // namespace doris
diff --git a/be/benchmark/benchmark_main.cpp b/be/benchmark/benchmark_main.cpp
index 8d68717fedf..f3d0aa5001d 100644
--- a/be/benchmark/benchmark_main.cpp
+++ b/be/benchmark/benchmark_main.cpp
@@ -20,6 +20,8 @@
#include "benchmark_bit_pack.hpp"
#include "benchmark_bits.hpp"
#include "benchmark_block_bloom_filter.hpp"
+#include "benchmark_column_array_view.hpp"
+#include "benchmark_column_array_view_distance.hpp"
#include "benchmark_column_view.hpp"
#include "benchmark_fastunion.hpp"
#include "benchmark_hll_merge.hpp"
diff --git a/be/benchmark/binary_cast_benchmark.hpp b/be/benchmark/binary_cast_benchmark.hpp
index a130b3fc528..00f2dc53980 100644
--- a/be/benchmark/binary_cast_benchmark.hpp
+++ b/be/benchmark/binary_cast_benchmark.hpp
@@ -53,51 +53,10 @@ To old_binary_cast(From from) {
from_decv2_to_i128 || from_decv2_to_i256 ||
from_ui32_to_date_v2 ||
from_date_v2_to_ui32 || from_ui64_to_datetime_v2 ||
from_datetime_v2_to_ui64);
- if constexpr (from_u64_to_db) {
- TypeConverter conv;
- conv.u64 = from;
- return conv.dbl;
- } else if constexpr (from_i64_to_db) {
- TypeConverter conv;
- conv.i64 = from;
- return conv.dbl;
- } else if constexpr (from_db_to_i64) {
- TypeConverter conv;
- conv.dbl = from;
- return conv.i64;
- } else if constexpr (from_db_to_u64) {
- TypeConverter conv;
- conv.dbl = from;
- return conv.u64;
- } else if constexpr (from_i64_to_vec_dt) {
- VecDateTimeInt64Union conv = {.i64 = from};
- return conv.dt;
- } else if constexpr (from_ui32_to_date_v2) {
- DateV2UInt32Union conv = {.ui32 = from};
- return conv.dt;
- } else if constexpr (from_date_v2_to_ui32) {
- DateV2UInt32Union conv = {.dt = from};
- return conv.ui32;
- } else if constexpr (from_ui64_to_datetime_v2) {
- DateTimeV2UInt64Union conv = {.ui64 = from};
- return conv.dt;
- } else if constexpr (from_datetime_v2_to_ui64) {
- DateTimeV2UInt64Union conv = {.dt = from};
- return conv.ui64;
- } else if constexpr (from_vec_dt_to_i64) {
- VecDateTimeInt64Union conv = {.dt = from};
- return conv.i64;
- } else if constexpr (from_i128_to_decv2) {
- DecimalInt128Union conv;
- conv.i128 = from;
- return conv.decimal;
- } else if constexpr (from_decv2_to_i128) {
- DecimalInt128Union conv;
- conv.decimal = from;
- return conv.i128;
- } else {
- throw Exception(Status::FatalError("__builtin_unreachable"));
- }
+ static_assert(sizeof(From) == sizeof(To));
+ To to;
+ std::memcpy(&to, &from, sizeof(To));
+ return to;
}
// Generate random datetime values in uint64_t format for testing
diff --git a/be/src/core/column/column_array_view.h b/be/src/core/column/column_array_view.h
new file mode 100644
index 00000000000..cc74d6e3c70
--- /dev/null
+++ b/be/src/core/column/column_array_view.h
@@ -0,0 +1,135 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "core/column/column_array.h"
+#include "core/column/column_execute_util.h"
+
+namespace doris {
+
+// ArrayDataView represents a read-only view of a single row's array data
+// (a slice of ColumnArray's flat nested data).
+// Used as the return type of ColumnArrayView::operator[].
+template <PrimitiveType PType>
+struct ArrayDataView {
+ using ElementType = typename ColumnElementView<PType>::ElementType;
+
+ const ColumnElementView<PType>& data;
+ const NullMap& nested_null_map;
+ const size_t offset;
+ const size_t length;
+
+ size_t size() const { return length; }
+
+ const ElementType* get_data() const {
+ const ElementType* raw_data = data.get_data();
+ return raw_data + offset;
+ }
+
+ const UInt8* get_null_map_data() const { return nested_null_map.data() + offset; }
+
+ // ColumnArray's data column is always Nullable, no need to check nullptr
+ bool is_null_at(size_t idx) const { return nested_null_map[offset + idx]; }
+
+ ElementType value_at(size_t idx) const { return data.get_element(offset + idx); }
+};
+
+// ColumnArrayView provides a read-only view over a column of Array<scalar_type>,
+// handling Const / Nullable wrapping automatically.
+//
+// Supports index-based access: operator[](row) returns ArrayDataView, uses offsets[row-1] (sentinel)
+template <PrimitiveType PType>
+struct ColumnArrayView {
+ const ColumnElementView<PType> element_data;
+ const ColumnArray::Offsets64& offsets;
+ const NullMap* outer_null_map;
+ const NullMap& nested_null_map;
+ const bool is_const;
+ const size_t count;
+
+ static ColumnArrayView create(const ColumnPtr& column_ptr) {
+ // Step 1: unpack const
+ const auto& [unpacked, is_const] = unpack_if_const(column_ptr);
+
+ // Step 2: unpack outer nullable
+ const NullMap* outer_null_map = nullptr;
+ const IColumn* array_raw = nullptr;
+ if (const auto* nullable = check_and_get_column<ColumnNullable>(unpacked.get())) {
+ outer_null_map = &nullable->get_null_map_data();
+ array_raw = nullable->get_nested_column_ptr().get();
+ } else {
+ array_raw = unpacked.get();
+ }
+
+ // Step 3: get ColumnArray
+ const auto& array_column = assert_cast<const ColumnArray&>(*array_raw);
+
+ // Step 4: unpack inner nullable (data column is always Nullable)
+ if (!array_column.get_data().is_nullable()) {
+ throw doris::Exception(ErrorCode::INTERNAL_ERROR,
+ "ColumnArray's data column is expected to be Nullable");
+ }
+
+ const auto& nested_nullable = assert_cast<const ColumnNullable&>(array_column.get_data());
+ const NullMap& nested_null_map = nested_nullable.get_null_map_data();
+ const IColumn* data_column = nested_nullable.get_nested_column_ptr().get();
+
+ return ColumnArrayView {.element_data = ColumnElementView<PType>(*data_column),
+ .offsets = array_column.get_offsets(),
+ .outer_null_map = outer_null_map,
+ .nested_null_map = nested_null_map,
+ .is_const = is_const,
+ .count = column_ptr->size()};
+ }
+
+ size_t size() const { return count; }
+
+ auto get_data() const { return element_data.get_data(); }
+
+ const UInt8* get_null_map_data() const { return nested_null_map.data(); }
+
+ size_t row_begin(size_t idx) const {
+ size_t actual = is_const ? 0 : idx;
+ return offsets[actual - 1];
+ }
+
+ size_t row_end(size_t idx) const {
+ size_t actual = is_const ? 0 : idx;
+ return offsets[actual];
+ }
+
+ bool is_null_at(size_t idx) const {
+ if (outer_null_map) {
+ return (*outer_null_map)[is_const ? 0 : idx];
+ }
+ return false;
+ }
+
+ // Index-based access: uses offsets[actual - 1] (PaddedPODArray sentinel guarantees [-1] is valid)
+ ArrayDataView<PType> operator[](size_t idx) const {
+ size_t actual = is_const ? 0 : idx;
+ size_t off = offsets[actual - 1];
+ size_t len = offsets[actual] - off;
+ return ArrayDataView<PType> {.data = element_data,
+ .nested_null_map = nested_null_map,
+ .offset = off,
+ .length = len};
+ }
+};
+
+} // namespace doris
diff --git a/be/src/core/column/column_execute_util.h b/be/src/core/column/column_execute_util.h
index 187f439d2f7..a9966807455 100644
--- a/be/src/core/column/column_execute_util.h
+++ b/be/src/core/column/column_execute_util.h
@@ -39,6 +39,7 @@ struct ColumnElementView {
using ElementType = typename ColumnType::value_type;
const typename ColumnType::Container& data;
ElementType get_element(size_t idx) const { return data[idx]; }
+ const ElementType* get_data() const { return data.data(); }
ColumnElementView(const IColumn& column)
: data(assert_cast<const ColumnType&>(column).get_data()) {}
diff --git a/be/src/exprs/function/array/function_array_distance.h b/be/src/exprs/function/array/function_array_distance.h
index 8749364d51a..12969c23e34 100644
--- a/be/src/exprs/function/array/function_array_distance.h
+++ b/be/src/exprs/function/array/function_array_distance.h
@@ -21,13 +21,12 @@
#include <faiss/utils/distances.h>
#include <gen_cpp/Types_types.h>
-#include <optional>
-
#include "common/exception.h"
#include "common/status.h"
#include "core/assert_cast.h"
#include "core/column/column.h"
#include "core/column/column_array.h"
+#include "core/column/column_array_view.h"
#include "core/column/column_const.h"
#include "core/column/column_nullable.h"
#include "core/data_type/data_type.h"
@@ -37,7 +36,6 @@
#include "core/data_type/primitive_type.h"
#include "core/types.h"
#include "exec/common/util.hpp"
-#include "exprs/function/array/function_array_utils.h"
#include "exprs/function/function.h"
namespace doris {
@@ -118,133 +116,64 @@ public:
// We want to make sure throw exception if input columns contain NULL.
bool use_default_implementation_for_nulls() const override { return false; }
- // Extract the ColumnArray from a column, unwrapping Nullable if present.
- // Validates that no NULL values exist.
- static const ColumnArray* _extract_array_column(const IColumn* col, const char* arg_name,
- const String& func_name) {
- if (col->is_nullable()) {
- if (col->has_null()) {
- throw doris::Exception(ErrorCode::INVALID_ARGUMENT,
- "{} for function {} cannot be null", arg_name, func_name);
- }
- auto nullable = assert_cast<const ColumnNullable*>(col);
- return assert_cast<const ColumnArray*>(nullable->get_nested_column_ptr().get());
+ // Validate that neither outer column nor inner array elements contain NULL.
+ // Distance functions always throw on NULL input.
+ static void _validate_no_nulls(const ColumnPtr& col, const char* arg_name,
+ const String& func_name) {
+ const IColumn* raw = col.get();
+
+ // Unwrap const
+ if (is_column_const(*raw)) {
+ raw = assert_cast<const ColumnConst*>(raw)->get_data_column_ptr().get();
}
- return assert_cast<const ColumnArray*>(col);
- }
- // Extract the ColumnFloat32 data from an array column, unwrapping Nullable if present.
- // Validates that no NULL elements exist within the array.
- static const ColumnFloat32* _extract_float_data(const ColumnArray* arr, const char* arg_name,
- const String& func_name) {
- if (arr->get_data_ptr()->is_nullable()) {
- if (arr->get_data_ptr()->has_null()) {
+ // Check outer nullable
+ if (raw->is_nullable()) {
+ if (raw->has_null()) {
throw doris::Exception(ErrorCode::INVALID_ARGUMENT,
- "{} for function {} cannot have null", arg_name, func_name);
+ "{} for function {} cannot be null", arg_name, func_name);
}
- auto nullable = assert_cast<const ColumnNullable*>(arr->get_data_ptr().get());
- return assert_cast<const ColumnFloat32*>(nullable->get_nested_column_ptr().get());
+ raw = assert_cast<const ColumnNullable*>(raw)->get_nested_column_ptr().get();
}
- return assert_cast<const ColumnFloat32*>(arr->get_data_ptr().get());
- }
- // Holds the extracted float data pointer and dimension for a const array argument,
- // avoiding repeated per-row extraction.
- struct ConstArrayInfo {
- const float* data = nullptr;
- ssize_t dim = 0;
- };
-
- // Try to extract const array info from a column. If the column is ColumnConst,
- // extract the float data pointer and dimension once; otherwise return nullopt.
- std::optional<ConstArrayInfo> _try_extract_const(const ColumnPtr& col,
- const char* arg_name) const {
- if (!is_column_const(*col)) {
- return std::nullopt;
+ // Check inner nullable (array elements)
+ const auto& array_col = assert_cast<const ColumnArray&>(*raw);
+ if (array_col.get_data_ptr()->is_nullable() && array_col.get_data_ptr()->has_null()) {
+ throw doris::Exception(ErrorCode::INVALID_ARGUMENT,
+ "{} for function {} cannot have null", arg_name, func_name);
}
- auto const_col = assert_cast<const ColumnConst*>(col.get());
- const IColumn* inner = const_col->get_data_column_ptr().get();
- const ColumnArray* arr = _extract_array_column(inner, arg_name, get_name());
- const ColumnFloat32* float_col = _extract_float_data(arr, arg_name, get_name());
- ssize_t dim = static_cast<ssize_t>(float_col->size());
- return ConstArrayInfo {float_col->get_data().data(), dim};
}
Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
uint32_t result, size_t input_rows_count) const override {
- const auto& arg1 = block.get_by_position(arguments[0]);
- const auto& arg2 = block.get_by_position(arguments[1]);
-
- // Try to handle const columns without expanding them.
- auto const_info1 = _try_extract_const(arg1.column, "First argument");
- auto const_info2 = _try_extract_const(arg2.column, "Second argument");
-
- // For non-const columns, expand and extract normally.
- ColumnPtr materialized_col1, materialized_col2;
- const ColumnArray* arr1 = nullptr;
- const ColumnArray* arr2 = nullptr;
- const ColumnFloat32* float1 = nullptr;
- const ColumnFloat32* float2 = nullptr;
- const ColumnOffset64* offset1 = nullptr;
- const ColumnOffset64* offset2 = nullptr;
- const IColumn::Offsets64* offsets_data1 = nullptr;
- const IColumn::Offsets64* offsets_data2 = nullptr;
- const float* float_data1 = nullptr;
- const float* float_data2 = nullptr;
-
- if (!const_info1) {
- materialized_col1 = arg1.column->convert_to_full_column_if_const();
- arr1 = _extract_array_column(materialized_col1.get(), "First argument", get_name());
- float1 = _extract_float_data(arr1, "First argument", get_name());
- offset1 = assert_cast<const ColumnArray::ColumnOffsets*>(arr1->get_offsets_ptr().get());
- offsets_data1 = &offset1->get_data();
- float_data1 = float1->get_data().data();
- }
+ const auto& col1 = block.get_by_position(arguments[0]).column;
+ const auto& col2 = block.get_by_position(arguments[1]).column;
- if (!const_info2) {
- materialized_col2 = arg2.column->convert_to_full_column_if_const();
- arr2 = _extract_array_column(materialized_col2.get(), "Second argument", get_name());
- float2 = _extract_float_data(arr2, "Second argument", get_name());
- offset2 = assert_cast<const ColumnArray::ColumnOffsets*>(arr2->get_offsets_ptr().get());
- offsets_data2 = &offset2->get_data();
- float_data2 = float2->get_data().data();
- }
+ // Validate no NULLs (distance functions always throw on NULL input)
+ _validate_no_nulls(col1, "First argument", get_name());
+ _validate_no_nulls(col2, "Second argument", get_name());
+
+ // Create views — handles Const/Nullable unwrapping automatically
+ auto view1 = ColumnArrayView<TYPE_FLOAT>::create(col1);
+ auto view2 = ColumnArrayView<TYPE_FLOAT>::create(col2);
- // prepare return data
auto dst = ColumnType::create(input_rows_count);
auto& dst_data = dst->get_data();
for (size_t row = 0; row < input_rows_count; ++row) {
- const float* data_ptr1;
- const float* data_ptr2;
- ssize_t size1, size2;
- const auto idx = static_cast<ssize_t>(row);
-
- if (const_info1) {
- data_ptr1 = const_info1->data;
- size1 = const_info1->dim;
- } else {
- // -1 is valid for PaddedPODArray-backed offsets.
- const auto prev_offset1 = (*offsets_data1)[idx - 1];
- size1 = (*offsets_data1)[idx] - prev_offset1;
- data_ptr1 = float_data1 + prev_offset1;
- }
-
- if (const_info2) {
- data_ptr2 = const_info2->data;
- size2 = const_info2->dim;
- } else {
- const auto prev_offset2 = (*offsets_data2)[idx - 1];
- size2 = (*offsets_data2)[idx] - prev_offset2;
- data_ptr2 = float_data2 + prev_offset2;
- }
-
- if (size1 != size2) [[unlikely]] {
+ auto a1 = view1[row];
+ auto a2 = view2[row];
+ const float* p1 = a1.get_data();
+ const float* p2 = a2.get_data();
+ auto dim1 = a1.size();
+ auto dim2 = a2.size();
+
+ if (dim1 != dim2) [[unlikely]] {
return Status::InvalidArgument(
"function {} have different input element sizes of array: {} and {}",
- get_name(), size1, size2);
+ get_name(), dim1, dim2);
}
- dst_data[row] = DistanceImpl::distance(data_ptr1, data_ptr2, size1);
+ dst_data[row] = DistanceImpl::distance(p1, p2, dim1);
}
block.replace_by_position(result, std::move(dst));
diff --git a/be/src/exprs/function/array/function_array_join.h b/be/src/exprs/function/array/function_array_join.h
index a51b056cf40..d674851060c 100644
--- a/be/src/exprs/function/array/function_array_join.h
+++ b/be/src/exprs/function/array/function_array_join.h
@@ -18,12 +18,12 @@
#include "core/block/block.h"
#include "core/column/column_array.h"
+#include "core/column/column_array_view.h"
#include "core/column/column_const.h"
#include "core/column/column_execute_util.h"
#include "core/data_type/data_type_array.h"
#include "core/data_type/data_type_string.h"
#include "core/string_ref.h"
-#include "exprs/function/array/function_array_utils.h"
namespace doris {
@@ -58,22 +58,15 @@ public:
static Status execute(Block& block, const ColumnNumbers& arguments, uint32_t result,
const DataTypeArray* data_type_array, const ColumnArray& array) {
- ColumnPtr src_column =
- block.get_by_position(arguments[0]).column->convert_to_full_column_if_const();
- ColumnArrayExecutionData src;
- if (!extract_column_array_info(*src_column, src)) {
- return Status::RuntimeError(fmt::format(
- "execute failed, unsupported types for function {}({})", "array_join",
- block.get_by_position(arguments[0]).type->get_name()));
- }
+ ColumnPtr src_column = block.get_by_position(arguments[0]).column;
+ auto array_view = ColumnArrayView<TYPE_STRING>::create(src_column);
- auto nested_type = data_type_array->get_nested_type();
auto dest_column_ptr = ColumnString::create();
auto& dest_chars = dest_column_ptr->get_chars();
auto& dest_offsets = dest_column_ptr->get_offsets();
- dest_offsets.resize_fill(src_column->size(), 0);
+ dest_offsets.resize_fill(array_view.size(), 0);
auto sep_column =
ColumnView<TYPE_STRING>::create(block.get_by_position(arguments[1]).column);
@@ -82,8 +75,7 @@ public:
auto null_replace_column =
ColumnView<TYPE_STRING>::create(block.get_by_position(arguments[2]).column);
- _execute_string(*src.nested_col, *src.offsets_ptr, src.nested_nullmap_data, sep_column,
- null_replace_column, dest_chars, dest_offsets);
+ _execute_string(array_view, sep_column, null_replace_column, dest_chars, dest_offsets);
} else {
auto tmp_column_string = ColumnString::create();
@@ -94,8 +86,7 @@ public:
auto null_replace_column =
ColumnView<TYPE_STRING>::create(tmp_const_column);
- _execute_string(*src.nested_col, *src.offsets_ptr, src.nested_nullmap_data, sep_column,
- null_replace_column, dest_chars, dest_offsets);
+ _execute_string(array_view, sep_column, null_replace_column, dest_chars, dest_offsets);
}
block.replace_by_position(result, std::move(dest_column_ptr));
@@ -129,27 +120,23 @@ private:
}
}
- static void _execute_string(const IColumn& src_column,
- const ColumnArray::Offsets64& src_offsets,
- const UInt8* src_null_map, ColumnView<TYPE_STRING>& sep_column,
+ static void _execute_string(const ColumnArrayView<TYPE_STRING>& array_view,
+ ColumnView<TYPE_STRING>& sep_column,
ColumnView<TYPE_STRING>& null_replace_column,
ColumnString::Chars& dest_chars,
ColumnString::Offsets& dest_offsets) {
- const auto& src_data = assert_cast<const ColumnString&>(src_column);
-
uint32_t total_size = 0;
- for (int64_t i = 0; i < src_offsets.size(); ++i) {
- auto begin = src_offsets[i - 1];
- auto end = src_offsets[i];
+ for (int64_t i = 0; i < array_view.size(); ++i) {
+ auto arr = array_view[i];
auto sep_str = sep_column.value_at(i);
auto null_replace_str = null_replace_column.value_at(i);
bool is_first_elem = true;
- for (size_t j = begin; j < end; ++j) {
- if (src_null_map && src_null_map[j]) {
+ for (size_t j = 0; j < arr.size(); ++j) {
+ if (arr.is_null_at(j)) {
if (null_replace_str.size != 0) {
_fill_result_string(i, null_replace_str, sep_str,
dest_chars, total_size,
is_first_elem);
@@ -157,7 +144,7 @@ private:
continue;
}
- StringRef src_str_ref = src_data.get_data_at(j);
+ StringRef src_str_ref = arr.value_at(j);
_fill_result_string(i, src_str_ref, sep_str, dest_chars,
total_size, is_first_elem);
}
diff --git a/be/test/core/column/column_array_view_test.cpp b/be/test/core/column/column_array_view_test.cpp
new file mode 100644
index 00000000000..57492c95817
--- /dev/null
+++ b/be/test/core/column/column_array_view_test.cpp
@@ -0,0 +1,290 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "core/column/column_array_view.h"
+
+#include <gtest/gtest.h>
+
+#include "core/column/column_array.h"
+#include "core/column/column_const.h"
+#include "core/column/column_nullable.h"
+#include "core/column/column_string.h"
+#include "core/column/column_vector.h"
+#include "core/data_type/data_type_number.h"
+#include "core/data_type/data_type_string.h"
+#include "testutil/column_helper.h"
+
+namespace doris {
+
+// Helper: build a ColumnArray with Nullable(ColumnInt32) nested data.
+// arrays: each inner vector is one row's array elements
+// element_nulls: parallel to the flattened data, 1 = null
+// row_nulls: per-row outer null (empty means no outer nullable wrapper)
+static ColumnPtr build_int32_array_column(const std::vector<std::vector<int32_t>>& arrays,
+ const std::vector<uint8_t>& element_nulls,
+ const std::vector<uint8_t>& row_nulls = {}) {
+ // Build nested data column (Nullable(Int32))
+ auto data_col = ColumnInt32::create();
+ auto null_col = ColumnUInt8::create();
+ size_t flat_idx = 0;
+ for (const auto& arr : arrays) {
+ for (auto val : arr) {
+ data_col->insert_value(val);
+ null_col->insert_value(flat_idx < element_nulls.size() ? element_nulls[flat_idx] : 0);
+ flat_idx++;
+ }
+ }
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+
+ // Build offsets
+ auto offsets = ColumnArray::ColumnOffsets::create();
+ size_t offset = 0;
+ for (const auto& arr : arrays) {
+ offset += arr.size();
+ offsets->insert_value(offset);
+ }
+
+ ColumnPtr array_col = ColumnArray::create(std::move(nullable_data), std::move(offsets));
+
+ // Wrap in outer Nullable if row_nulls provided
+ if (!row_nulls.empty()) {
+ auto outer_null = ColumnUInt8::create();
+ for (auto v : row_nulls) {
+ outer_null->insert_value(v);
+ }
+ array_col = ColumnNullable::create(array_col->assume_mutable(), std::move(outer_null));
+ }
+ return array_col;
+}
+
+// Helper: build a ColumnArray with Nullable(ColumnString) nested data.
+static ColumnPtr build_string_array_column(const std::vector<std::vector<std::string>>& arrays,
+ const std::vector<uint8_t>& element_nulls,
+ const std::vector<uint8_t>& row_nulls = {}) {
+ auto data_col = ColumnString::create();
+ auto null_col = ColumnUInt8::create();
+ size_t flat_idx = 0;
+ for (const auto& arr : arrays) {
+ for (const auto& val : arr) {
+ data_col->insert_data(val.data(), val.size());
+ null_col->insert_value(flat_idx < element_nulls.size() ? element_nulls[flat_idx] : 0);
+ flat_idx++;
+ }
+ }
+ auto nullable_data = ColumnNullable::create(std::move(data_col), std::move(null_col));
+
+ auto offsets = ColumnArray::ColumnOffsets::create();
+ size_t offset = 0;
+ for (const auto& arr : arrays) {
+ offset += arr.size();
+ offsets->insert_value(offset);
+ }
+
+ ColumnPtr array_col = ColumnArray::create(std::move(nullable_data), std::move(offsets));
+
+ if (!row_nulls.empty()) {
+ auto outer_null = ColumnUInt8::create();
+ for (auto v : row_nulls) {
+ outer_null->insert_value(v);
+ }
+ array_col = ColumnNullable::create(array_col->assume_mutable(), std::move(outer_null));
+ }
+ return array_col;
+}
+
+// ==================== ArrayDataView (index-based) Tests ====================
+
+// Test basic non-nullable, non-const array column
+// Row 0: [10, 20, 30], Row 1: [40], Row 2: [50, 60]
+TEST(ColumnArrayViewTest, IndexAccess_basic) {
+ auto col = build_int32_array_column({{10, 20, 30}, {40}, {50, 60}}, {0, 0, 0, 0, 0, 0});
+ auto view = ColumnArrayView<TYPE_INT>::create(col);
+
+ EXPECT_EQ(view.size(), 3);
+ EXPECT_FALSE(view.is_const);
+
+ // Row 0
+ EXPECT_FALSE(view.is_null_at(0));
+ auto arr0 = view[0];
+ EXPECT_EQ(arr0.size(), 3);
+ EXPECT_EQ(arr0.value_at(0), 10);
+ EXPECT_EQ(arr0.value_at(1), 20);
+ EXPECT_EQ(arr0.value_at(2), 30);
+ EXPECT_FALSE(arr0.is_null_at(0));
+ EXPECT_FALSE(arr0.is_null_at(1));
+ EXPECT_FALSE(arr0.is_null_at(2));
+
+ // Row 1
+ auto arr1 = view[1];
+ EXPECT_EQ(arr1.size(), 1);
+ EXPECT_EQ(arr1.value_at(0), 40);
+
+ // Row 2
+ auto arr2 = view[2];
+ EXPECT_EQ(arr2.size(), 2);
+ EXPECT_EQ(arr2.value_at(0), 50);
+ EXPECT_EQ(arr2.value_at(1), 60);
+}
+
+TEST(ColumnArrayViewTest, IndexAccess_get_data) {
+ auto col = build_int32_array_column({{10, 20, 30}, {40}, {50, 60}}, {0, 0, 0, 0, 0, 0});
+ auto view = ColumnArrayView<TYPE_INT>::create(col);
+
+ auto arr0 = view[0];
+ const auto* data0 = arr0.get_data();
+ ASSERT_NE(data0, nullptr);
+ EXPECT_EQ(data0[0], 10);
+ EXPECT_EQ(data0[1], 20);
+ EXPECT_EQ(data0[2], 30);
+
+ auto arr1 = view[1];
+ const auto* data1 = arr1.get_data();
+ ASSERT_NE(data1, nullptr);
+ EXPECT_EQ(data1[0], 40);
+
+ auto arr2 = view[2];
+ const auto* data2 = arr2.get_data();
+ ASSERT_NE(data2, nullptr);
+ EXPECT_EQ(data2[0], 50);
+ EXPECT_EQ(data2[1], 60);
+}
+
+// Test with null elements inside arrays
+// Row 0: [1, NULL, 3], Row 1: [NULL]
+TEST(ColumnArrayViewTest, IndexAccess_with_null_elements) {
+ auto col = build_int32_array_column({{1, 0, 3}, {0}}, {0, 1, 0, 1});
+ auto view = ColumnArrayView<TYPE_INT>::create(col);
+
+ EXPECT_EQ(view.size(), 2);
+
+ auto arr0 = view[0];
+ EXPECT_EQ(arr0.size(), 3);
+ EXPECT_FALSE(arr0.is_null_at(0));
+ EXPECT_TRUE(arr0.is_null_at(1));
+ EXPECT_FALSE(arr0.is_null_at(2));
+ EXPECT_EQ(arr0.value_at(0), 1);
+ EXPECT_EQ(arr0.value_at(2), 3);
+
+ auto arr1 = view[1];
+ EXPECT_EQ(arr1.size(), 1);
+ EXPECT_TRUE(arr1.is_null_at(0));
+}
+
+// Test with outer nullable (some rows are entirely null)
+// Row 0: [1, 2], Row 1: NULL, Row 2: [5]
+TEST(ColumnArrayViewTest, IndexAccess_outer_nullable) {
+ auto col = build_int32_array_column({{1, 2}, {0}, {5}}, {0, 0, 0, 0}, {0, 1, 0});
+ auto view = ColumnArrayView<TYPE_INT>::create(col);
+
+ EXPECT_EQ(view.size(), 3);
+ EXPECT_FALSE(view.is_null_at(0));
+ EXPECT_TRUE(view.is_null_at(1));
+ EXPECT_FALSE(view.is_null_at(2));
+
+ auto arr0 = view[0];
+ EXPECT_EQ(arr0.size(), 2);
+ EXPECT_EQ(arr0.value_at(0), 1);
+ EXPECT_EQ(arr0.value_at(1), 2);
+
+ auto arr2 = view[2];
+ EXPECT_EQ(arr2.size(), 1);
+ EXPECT_EQ(arr2.value_at(0), 5);
+}
+
+// Test const column: Const(Array([10, 20])) with 4 rows
+TEST(ColumnArrayViewTest, IndexAccess_const) {
+ auto inner = build_int32_array_column({{10, 20}}, {0, 0});
+ ColumnPtr const_col = ColumnConst::create(inner, 4);
+ auto view = ColumnArrayView<TYPE_INT>::create(const_col);
+
+ EXPECT_EQ(view.size(), 4);
+ EXPECT_TRUE(view.is_const);
+
+ for (size_t i = 0; i < 4; ++i) {
+ EXPECT_FALSE(view.is_null_at(i));
+ auto arr = view[i];
+ EXPECT_EQ(arr.size(), 2);
+ EXPECT_EQ(arr.value_at(0), 10);
+ EXPECT_EQ(arr.value_at(1), 20);
+ }
+}
+
+// Test Const(Nullable(Array([7, 8, 9]))) with 3 rows, non-null
+TEST(ColumnArrayViewTest, IndexAccess_const_nullable) {
+ auto inner = build_int32_array_column({{7, 8, 9}}, {0, 0, 0}, {0});
+ ColumnPtr const_col = ColumnConst::create(inner, 3);
+ auto view = ColumnArrayView<TYPE_INT>::create(const_col);
+
+ EXPECT_EQ(view.size(), 3);
+ EXPECT_TRUE(view.is_const);
+
+ for (size_t i = 0; i < 3; ++i) {
+ EXPECT_FALSE(view.is_null_at(i));
+ auto arr = view[i];
+ EXPECT_EQ(arr.size(), 3);
+ EXPECT_EQ(arr.value_at(0), 7);
+ EXPECT_EQ(arr.value_at(1), 8);
+ EXPECT_EQ(arr.value_at(2), 9);
+ }
+}
+
+// Test Const(Nullable(NULL)) with 3 rows, all null
+TEST(ColumnArrayViewTest, IndexAccess_const_nullable_null) {
+ // Build one-row array, then wrap as nullable with null=1, then const
+ auto inner = build_int32_array_column({{0}}, {0}, {1});
+ ColumnPtr const_col = ColumnConst::create(inner, 3);
+ auto view = ColumnArrayView<TYPE_INT>::create(const_col);
+
+ EXPECT_EQ(view.size(), 3);
+ EXPECT_TRUE(view.is_const);
+
+ for (size_t i = 0; i < 3; ++i) {
+ EXPECT_TRUE(view.is_null_at(i));
+ }
+}
+
+// Test empty array rows
+// Row 0: [], Row 1: [100], Row 2: []
+TEST(ColumnArrayViewTest, IndexAccess_empty_arrays) {
+ auto col = build_int32_array_column({{}, {100}, {}}, {0});
+ auto view = ColumnArrayView<TYPE_INT>::create(col);
+
+ EXPECT_EQ(view.size(), 3);
+ EXPECT_EQ(view[0].size(), 0);
+ EXPECT_EQ(view[1].size(), 1);
+ EXPECT_EQ(view[1].value_at(0), 100);
+ EXPECT_EQ(view[2].size(), 0);
+}
+
+// Test string array
+// Row 0: ["hello", "world"], Row 1: ["test"]
+TEST(ColumnArrayViewTest, IndexAccess_string) {
+ auto col = build_string_array_column({{"hello", "world"}, {"test"}}, {0, 0, 0});
+ auto view = ColumnArrayView<TYPE_STRING>::create(col);
+
+ EXPECT_EQ(view.size(), 2);
+ auto arr0 = view[0];
+ EXPECT_EQ(arr0.size(), 2);
+ EXPECT_EQ(arr0.value_at(0).to_string(), "hello");
+ EXPECT_EQ(arr0.value_at(1).to_string(), "world");
+
+ auto arr1 = view[1];
+ EXPECT_EQ(arr1.size(), 1);
+ EXPECT_EQ(arr1.value_at(0).to_string(), "test");
+}
+
+} // namespace doris
\ No newline at end of file
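
For readers without a Doris checkout, the access pattern that `ColumnArrayView` centralizes can be sketched in standalone C++. This is a simplified illustration under stated assumptions, not the code from this commit: `RowView`, `FlatArrayView`, `squared_l2`, and `row_distances` are hypothetical names, `std::vector` stands in for the Doris column containers, and row 0 is special-cased because `std::vector` has no `[-1]` sentinel like the `PaddedPODArray` offsets the patch relies on. Nullable unwrapping is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ArrayDataView: one row's slice of the flat data.
struct RowView {
    const float* data;
    std::size_t length;

    std::size_t size() const { return length; }
    const float* get_data() const { return data; }
};

// Hypothetical stand-in for ColumnArrayView: flat values plus cumulative
// end offsets (offsets[i] is one past the last element of row i).
struct FlatArrayView {
    const std::vector<float>& values;
    const std::vector<std::uint64_t>& offsets;
    bool is_const;      // a const column stores one row, repeated `count` times
    std::size_t count;  // logical number of rows

    std::size_t size() const { return count; }

    std::size_t row_begin(std::size_t idx) const {
        std::size_t actual = is_const ? 0 : idx;
        // std::vector has no [-1] sentinel, so row 0 starts at 0 explicitly.
        return actual == 0 ? 0 : offsets[actual - 1];
    }

    std::size_t row_end(std::size_t idx) const {
        std::size_t actual = is_const ? 0 : idx;
        return offsets[actual];
    }

    RowView operator[](std::size_t idx) const {
        std::size_t begin = row_begin(idx);
        return RowView{values.data() + begin, row_end(idx) - begin};
    }
};

// Squared L2 distance between two equally sized float spans.
inline float squared_l2(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Per-row distance; the loop body is identical for const and plain inputs.
inline std::vector<float> row_distances(const FlatArrayView& lhs, const FlatArrayView& rhs) {
    assert(lhs.size() == rhs.size());
    std::vector<float> out(lhs.size());
    for (std::size_t row = 0; row < lhs.size(); ++row) {
        RowView a = lhs[row];
        RowView b = rhs[row];
        assert(a.size() == b.size());
        out[row] = squared_l2(a.get_data(), b.get_data(), a.size());
    }
    return out;
}
```

The design point this sketch tries to mirror is that the const check lives inside the view (`is_const ? 0 : idx`), so the calling loop no longer branches on the shape of the column.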