(doris) branch master updated: [Refactor](block) remove all index_by_name usage (#57860)

panxiaolei Sun, 16 Nov 2025 20:33:27 -0800

This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git



The following commit(s) were added to refs/heads/master by this push:
     new d4287941c51 [Refactor](block) remove all index_by_name usage (#57860)
d4287941c51 is described below

commit d4287941c5188778d22486f37bae4d9edc9e161d
Author: Pxl <[email protected]>
AuthorDate: Mon Nov 17 12:33:13 2025 +0800

    [Refactor](block) remove all index_by_name usage (#57860)
    
    This pull request refactors how columns are accessed by name in the
    `Block` class and related code, removing the internal name-to-index map
    and switching to linear search. It also removes temporary column
    handling and updates code throughout the backend to use the new column
    access pattern. Additionally, it improves error handling and updates
    some operator logic for schema scans.
    
    ### Block class refactor (core theme):
    
    * Removed the internal `index_by_name` map from the `Block` class and
    replaced map-based name lookups with a linear search in
    `get_position_by_name`, which now returns -1 instead of throwing when
    the name is missing. `get_by_name` is built on it, while
    `try_get_by_name`, `has`, and the map maintenance in the insert/erase
    paths are removed.
    
    * Removed the ability to erase columns by name and the logic for
    maintaining the name-to-index map, simplifying column management.
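
The linear lookup that replaces the map can be sketched as follows. This is a minimal illustration, not the Doris code itself: `NamedColumn` is a hypothetical stand-in for `ColumnWithTypeAndName`, and the function takes the column vector as a parameter instead of being a `Block` member.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for ColumnWithTypeAndName; only the name matters here.
struct NamedColumn {
    std::string name;
};

// Linear scan over the columns; returns -1 when no column has that name.
// If names are duplicated, the first match wins.
int get_position_by_name(const std::vector<NamedColumn>& data, const std::string& name) {
    for (int i = 0; i < static_cast<int>(data.size()); ++i) {
        if (data[i].name == name) {
            return i;
        }
    }
    return -1;
}
```

Because the scan is O(N) in the number of columns, callers on hot paths are probably better off resolving the position once and reusing it.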
    
    ### Temporary column handling (cleanup theme):
    
    * Removed the definition and usage of temporary block column prefixes
    and the method `erase_tmp_columns`, as well as related cleanup logic in
    scan operators.
    
    ### Column access updates in backend logic (consistency theme):
    
    * Updated all code that previously accessed columns by name using the
    old map to use the new position-based access, including changes in
    partial update logic, push broker reader, and schema scan operator.
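
For callers that look up the same names repeatedly, the header comment in `block.h` suggests keeping a name -> position map on the caller side. A hedged sketch of that idea, where `build_name_to_position` and `find_position` are hypothetical helpers and column names are reduced to plain strings:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical caller-side cache: resolve each column name to its position once.
// For duplicate names the first position is kept, matching a front-to-back scan.
std::unordered_map<std::string, int> build_name_to_position(
        const std::vector<std::string>& column_names) {
    std::unordered_map<std::string, int> positions;
    positions.reserve(column_names.size());
    for (int i = 0; i < static_cast<int>(column_names.size()); ++i) {
        positions.emplace(column_names[i], i); // emplace keeps an existing entry
    }
    return positions;
}

// Returns the cached position, or -1 when the name is unknown.
int find_position(const std::unordered_map<std::string, int>& positions,
                  const std::string& name) {
    auto it = positions.find(name);
    return it == positions.end() ? -1 : it->second;
}
```

Such a cache is only valid while the block's column layout is unchanged, so it would need rebuilding after columns are inserted, erased, or shuffled.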
    
    ### Schema scan operator improvements (feature theme):
    
    * Added a slot-to-column index mapping (`_slot_offsets`) to
    `SchemaScanOperatorX`. It is filled in `prepare` and lets `get_block`
    read source columns by position instead of by name.
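
How such a mapping might be built is sketched below under simplifying assumptions: slots and source columns are reduced to name lists, `build_slot_offsets` and `iequals_ascii` are hypothetical names, and the ASCII-only comparison stands in for the `boost::iequals` call used in `prepare`.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// ASCII case-insensitive equality; a stand-in for boost::iequals.
bool iequals_ascii(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
               return std::tolower(x) == std::tolower(y);
           });
}

// Maps each slot index to the index of the first source column with the same
// name (ignoring case). Slots without a match are left out of the map.
std::unordered_map<int, int> build_slot_offsets(const std::vector<std::string>& slot_names,
                                                const std::vector<std::string>& source_names) {
    std::unordered_map<int, int> offsets;
    for (int i = 0; i < static_cast<int>(slot_names.size()); ++i) {
        for (int j = 0; j < static_cast<int>(source_names.size()); ++j) {
            if (iequals_ascii(slot_names[i], source_names[j])) {
                offsets[i] = j;
                break;
            }
        }
    }
    return offsets;
}
```

Resolving the names once up front means the per-block loop only needs positional access, which is the point of the change described above.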
    
    ### Error handling and robustness (quality theme):
    
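    * Improved error handling for missing columns, notably in partial update
    logic: `FixedReadPlan::fill_missing_columns` now returns
    `Status::InternalError` with a `dump_structure()` dump of the block
    instead of relying on a `DCHECK` when the delete sign column is not
    found.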
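
The pattern of turning a failed lookup into an error that carries the block layout could look roughly like this. `Status` here is a hypothetical two-field struct rather than Doris's `Status` class, and `dump_names` only joins column names, whereas the real `dump_structure()` prints more detail.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, minimal status type for illustration only.
struct Status {
    bool ok;
    std::string message;
};

// Joins column names with ", " so an error message can show the block layout.
std::string dump_names(const std::vector<std::string>& names) {
    std::string out;
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0) {
            out += ", ";
        }
        out += names[i];
    }
    return out;
}

// Resolves `wanted` to a position, or reports an error that includes the layout.
Status find_required_column(const std::vector<std::string>& names, const std::string& wanted,
                            int* position) {
    for (int i = 0; i < static_cast<int>(names.size()); ++i) {
        if (names[i] == wanted) {
            *position = i;
            return {true, ""};
        }
    }
    return {false, "column not found: " + wanted + ", block: " + dump_names(names)};
}
```

Unlike a debug-only check, an error returned this way also surfaces in release builds, with enough context to see which columns the block actually held.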
    
    ---
    
    This refactor simplifies the `Block` class, improves code consistency,
    and enhances error handling across the backend.
---
 be/src/common/consts.h                             |   2 -
 be/src/olap/base_tablet.cpp                        |   7 +-
 be/src/olap/partial_update_info.cpp                |  25 +--
 be/src/olap/push_handler.cpp                       |   4 +-
 be/src/olap/tablet_schema.h                        |   1 -
 be/src/pipeline/exec/scan_operator.cpp             |   7 -
 be/src/pipeline/exec/schema_scan_operator.cpp      |   4 +-
 be/src/pipeline/exec/schema_scan_operator.h        |   5 +
 be/src/vec/core/block.cpp                          | 193 ++-------------------
 be/src/vec/core/block.h                            |  49 +-----
 be/src/vec/core/sort_block.cpp                     |   9 +-
 be/src/vec/core/sort_description.h                 |  11 +-
 .../vec/exec/format/arrow/arrow_stream_reader.cpp  |   9 +-
 be/src/vec/exec/format/orc/vorc_reader.cpp         |  27 ++-
 .../exec/format/parquet/vparquet_group_reader.cpp  |  23 ++-
 be/src/vec/exec/format/table/equality_delete.cpp   |  35 ++--
 be/src/vec/exec/format/table/iceberg_reader.cpp    |   9 +-
 be/src/vec/exec/scan/file_scanner.cpp              |   8 +-
 be/src/vec/exec/scan/olap_scanner.cpp              |  16 +-
 be/src/vec/exec/scan/scanner.cpp                   |   7 -
 be/src/vec/functions/function_helpers.cpp          |   8 -
 be/src/vec/olap/block_reader.cpp                   |   1 -
 be/src/vec/olap/vcollect_iterator.cpp              |  15 +-
 be/src/vec/olap/vertical_block_reader.cpp          |   1 -
 be/src/vec/sink/vtablet_block_convertor.cpp        |   4 -
 be/test/vec/core/block_test.cpp                    |  51 +-----
 .../vec/exec/format/parquet/parquet_read_lines.cpp |   2 +-
 be/test/vec/exec/orc/orc_read_lines.cpp            |   2 +-
 28 files changed, 144 insertions(+), 391 deletions(-)

diff --git a/be/src/common/consts.h b/be/src/common/consts.h
index 4618ffe7d74..ca4662c839e 100644
--- a/be/src/common/consts.h
+++ b/be/src/common/consts.h
@@ -24,8 +24,6 @@ namespace BeConsts {
 const std::string CSV = "csv";
 const std::string CSV_WITH_NAMES = "csv_with_names";
 const std::string CSV_WITH_NAMES_AND_TYPES = "csv_with_names_and_types";
-const std::string BLOCK_TEMP_COLUMN_PREFIX = "__TEMP__";
-const std::string BLOCK_TEMP_COLUMN_SCANNER_FILTERED = "__TEMP__scanner_filtered";
 const std::string ROWID_COL = "__DORIS_ROWID_COL__";
 const std::string GLOBAL_ROWID_COL = "__DORIS_GLOBAL_ROWID_COL__";
 const std::string ROW_STORE_COL = "__DORIS_ROW_STORE_COL__";
diff --git a/be/src/olap/base_tablet.cpp b/be/src/olap/base_tablet.cpp
index 9cde2bbdf45..2c858e557bb 100644
--- a/be/src/olap/base_tablet.cpp
+++ b/be/src/olap/base_tablet.cpp
@@ -941,11 +941,10 @@ Status BaseTablet::fetch_value_by_rowids(RowsetSharedPtr input_rowset, uint32_t
 
 const signed char* BaseTablet::get_delete_sign_column_data(const vectorized::Block& block,
                                                            size_t rows_at_least) {
-    if (const vectorized::ColumnWithTypeAndName* delete_sign_column =
-                block.try_get_by_name(DELETE_SIGN);
-        delete_sign_column != nullptr) {
+    if (int pos = block.get_position_by_name(DELETE_SIGN); pos != -1) {
+        const vectorized::ColumnWithTypeAndName& delete_sign_column = block.get_by_position(pos);
         const auto& delete_sign_col =
-                reinterpret_cast<const vectorized::ColumnInt8&>(*(delete_sign_column->column));
+                assert_cast<const vectorized::ColumnInt8&>(*(delete_sign_column.column));
         if (delete_sign_col.size() >= rows_at_least) {
             return delete_sign_col.get_data().data();
         }
diff --git a/be/src/olap/partial_update_info.cpp b/be/src/olap/partial_update_info.cpp
index 5e6082c5c13..518b29dcaa4 100644
--- a/be/src/olap/partial_update_info.cpp
+++ b/be/src/olap/partial_update_info.cpp
@@ -311,9 +311,7 @@ Status FixedReadPlan::read_columns_by_plan(
         const signed char* __restrict cur_delete_signs) const {
     if (force_read_old_delete_signs) {
         // always read delete sign column from historical data
-        if (const vectorized::ColumnWithTypeAndName* old_delete_sign_column =
-                    block.try_get_by_name(DELETE_SIGN);
-            old_delete_sign_column == nullptr) {
+        if (block.get_position_by_name(DELETE_SIGN) == -1) {
             auto del_col_cid = tablet_schema.field_index(DELETE_SIGN);
             cids_to_read.emplace_back(del_col_cid);
             block.swap(tablet_schema.create_block_by_cids(cids_to_read));
@@ -384,7 +382,10 @@ Status FixedReadPlan::fill_missing_columns(
                                          old_value_block, &read_index, true, nullptr));
 
     const auto* old_delete_signs = BaseTablet::get_delete_sign_column_data(old_value_block);
-    DCHECK(old_delete_signs != nullptr);
+    if (old_delete_signs == nullptr) {
+        return Status::InternalError("old delete signs column not found, block: {}",
+                                     old_value_block.dump_structure());
+    }
     // build default value columns
     auto default_value_block = old_value_block.clone_empty();
     RETURN_IF_ERROR(BaseTablet::generate_default_value_block(
@@ -420,27 +421,27 @@ Status FixedReadPlan::fill_missing_columns(
             }
 
             if (should_use_default) {
-                // clang-format off
                 if (tablet_column.has_default_value()) {
                     missing_col->insert_from(*mutable_default_value_columns[i], 0);
                 } else if (tablet_column.is_nullable()) {
-                    auto* nullable_column = assert_cast<vectorized::ColumnNullable*, TypeCheckOnRelease::DISABLE>(missing_col.get());
+                    auto* nullable_column =
+                            assert_cast<vectorized::ColumnNullable*>(missing_col.get());
                     nullable_column->insert_many_defaults(1);
                 } else if (tablet_schema.auto_increment_column() == tablet_column.name()) {
-                    const auto& column = *DORIS_TRY(rowset_ctx->tablet_schema->column(tablet_column.name()));
+                    const auto& column =
+                            *DORIS_TRY(rowset_ctx->tablet_schema->column(tablet_column.name()));
                     DCHECK(column.type() == FieldType::OLAP_FIELD_TYPE_BIGINT);
                     auto* auto_inc_column =
-                            assert_cast<vectorized::ColumnInt64*, TypeCheckOnRelease::DISABLE>(missing_col.get());
-                    auto_inc_column->insert(vectorized::Field::create_field<TYPE_BIGINT>(
-assert_cast<const vectorized::ColumnInt64*, TypeCheckOnRelease::DISABLE>(
-block->get_by_name(BeConsts::PARTIAL_UPDATE_AUTO_INC_COL).column.get())->get_element(idx)));
+                            assert_cast<vectorized::ColumnInt64*>(missing_col.get());
+                    auto_inc_column->insert_from(
+                            *block->get_by_name(BeConsts::PARTIAL_UPDATE_AUTO_INC_COL).column.get(),
+                            idx);
                 } else {
                     // If the control flow reaches this branch, the column neither has default value
                     // nor is nullable. It means that the row's delete sign is marked, and the value
                     // columns are useless and won't be read. So we can just put arbitary values in the cells
                     missing_col->insert(tablet_column.get_vec_type()->get_default());
                 }
-                // clang-format on
             } else {
                 missing_col->insert_from(*old_value_block.get_by_position(i).column,
                                          pos_in_old_block);
diff --git a/be/src/olap/push_handler.cpp b/be/src/olap/push_handler.cpp
index 677531d8df7..22d51630e00 100644
--- a/be/src/olap/push_handler.cpp
+++ b/be/src/olap/push_handler.cpp
@@ -478,10 +478,10 @@ Status PushBrokerReader::_cast_to_input_block() {
         if (slot_desc->type()->get_primitive_type() == PrimitiveType::TYPE_VARIANT) {
             continue;
         }
-        auto& arg = _src_block_ptr->get_by_name(slot_desc->col_name());
         // remove nullable here, let the get_function decide whether nullable
         auto return_type = slot_desc->get_data_type_ptr();
         idx = _src_block_name_to_idx[slot_desc->col_name()];
+        auto& arg = _src_block_ptr->get_by_position(idx);
         // bitmap convert：src -> to_base64 -> bitmap_from_base64
         if (slot_desc->type()->get_primitive_type() == TYPE_BITMAP) {
             auto base64_return_type = vectorized::DataTypeFactory::instance().create_data_type(
@@ -491,7 +491,7 @@ Status PushBrokerReader::_cast_to_input_block() {
             RETURN_IF_ERROR(func_to_base64->execute(nullptr, *_src_block_ptr, {idx}, idx,
                                                     arg.column->size()));
             _src_block_ptr->get_by_position(idx).type = std::move(base64_return_type);
-            auto& arg_base64 = _src_block_ptr->get_by_name(slot_desc->col_name());
+            auto& arg_base64 = _src_block_ptr->get_by_position(idx);
             auto func_bitmap_from_base64 =
                     vectorized::SimpleFunctionFactory::instance().get_function(
                             "bitmap_from_base64", {arg_base64}, return_type);
diff --git a/be/src/olap/tablet_schema.h b/be/src/olap/tablet_schema.h
index 57463316cdf..b839231464b 100644
--- a/be/src/olap/tablet_schema.h
+++ b/be/src/olap/tablet_schema.h
@@ -463,7 +463,6 @@ public:
     void set_skip_write_index_on_load(bool skip) { _skip_write_index_on_load = skip; }
     bool skip_write_index_on_load() const { return _skip_write_index_on_load; }
     int32_t delete_sign_idx() const { return _delete_sign_idx; }
-    void set_delete_sign_idx(int32_t delete_sign_idx) { _delete_sign_idx = delete_sign_idx; }
     bool has_sequence_col() const { return _sequence_col_idx != -1; }
     int32_t sequence_col_idx() const { return _sequence_col_idx; }
     void set_version_col_idx(int32_t version_col_idx) { _version_col_idx = version_col_idx; }
diff --git a/be/src/pipeline/exec/scan_operator.cpp b/be/src/pipeline/exec/scan_operator.cpp
index d6b91723884..886d2f919e5 100644
--- a/be/src/pipeline/exec/scan_operator.cpp
+++ b/be/src/pipeline/exec/scan_operator.cpp
@@ -1332,13 +1332,6 @@ Status ScanOperatorX<LocalStateType>::get_block(RuntimeState* state, vectorized:
                                                 bool* eos) {
     auto& local_state = get_local_state(state);
     SCOPED_TIMER(local_state.exec_time_counter());
-    // in inverted index apply logic, in order to optimize query performance,
-    // we built some temporary columns into block, these columns only used in scan node level,
-    // remove them when query leave scan node to avoid other nodes use block->columns() to make a wrong decision
-    Defer drop_block_temp_column {[&]() {
-        std::unique_lock l(local_state._block_lock);
-        block->erase_tmp_columns();
-    }};
 
     if (state->is_cancelled()) {
         if (local_state._scanner_ctx) {
diff --git a/be/src/pipeline/exec/schema_scan_operator.cpp b/be/src/pipeline/exec/schema_scan_operator.cpp
index e57fbd75573..79987c001de 100644
--- a/be/src/pipeline/exec/schema_scan_operator.cpp
+++ b/be/src/pipeline/exec/schema_scan_operator.cpp
@@ -179,6 +179,7 @@ Status SchemaScanOperatorX::prepare(RuntimeState* state) {
         int j = 0;
         for (; j < columns_desc.size(); ++j) {
             if (boost::iequals(_dest_tuple_desc->slots()[i]->col_name(), columns_desc[j].name)) {
+                _slot_offsets[i] = j;
                 break;
             }
         }
@@ -250,11 +251,10 @@ Status SchemaScanOperatorX::get_block(RuntimeState* state, vectorized::Block* bl
         if (src_block.rows()) {
             // block->check_number_of_rows();
             for (int i = 0; i < _slot_num; ++i) {
-                auto* dest_slot_desc = _dest_tuple_desc->slots()[i];
                 vectorized::MutableColumnPtr column_ptr =
                         std::move(*block->get_by_position(i).column).mutate();
                 column_ptr->insert_range_from(
-                        *src_block.get_by_name(dest_slot_desc->col_name()).column, 0,
+                        *src_block.safe_get_by_position(_slot_offsets[i]).column, 0,
                         src_block.rows());
             }
             RETURN_IF_ERROR(local_state.filter_block(local_state._conjuncts, block,
diff --git a/be/src/pipeline/exec/schema_scan_operator.h b/be/src/pipeline/exec/schema_scan_operator.h
index 6846d9b59af..0634b9d5693 100644
--- a/be/src/pipeline/exec/schema_scan_operator.h
+++ b/be/src/pipeline/exec/schema_scan_operator.h
@@ -19,6 +19,8 @@
 
 #include <stdint.h>
 
+#include <unordered_map>
+
 #include "common/status.h"
 #include "exec/schema_scanner.h"
 #include "operator.h"
@@ -85,6 +87,9 @@ private:
     int _tuple_idx;
     // slot num need to fill in and return
     int _slot_num;
+
+    // slot index mapping to src column index
+    std::unordered_map<int, int> _slot_offsets;
 };
 
 #include "common/compile_check_end.h"
diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp
index 5914ffe3f31..6a513d9517b 100644
--- a/be/src/vec/core/block.cpp
+++ b/be/src/vec/core/block.cpp
@@ -82,13 +82,9 @@ template void clear_blocks<Block>(moodycamel::ConcurrentQueue<Block>&,
 template void clear_blocks<BlockUPtr>(moodycamel::ConcurrentQueue<BlockUPtr>&,
                                       RuntimeProfile::Counter* memory_used_counter);
 
-Block::Block(std::initializer_list<ColumnWithTypeAndName> il) : data {il} {
-    initialize_index_by_name();
-}
+Block::Block(std::initializer_list<ColumnWithTypeAndName> il) : data {il} {}
 
-Block::Block(ColumnsWithTypeAndName data_) : data {std::move(data_)} {
-    initialize_index_by_name();
-}
+Block::Block(ColumnsWithTypeAndName data_) : data {std::move(data_)} {}
 
 Block::Block(const std::vector<SlotDescriptor*>& slots, size_t block_size,
              bool ignore_trivial_slot) {
@@ -161,22 +157,14 @@ Status Block::deserialize(const PBlock& pblock, size_t* uncompressed_bytes,
                 buf = type->deserialize(buf, &data_column, pblock.be_exec_version()));
         data.emplace_back(data_column->get_ptr(), type, pcol_meta.name());
     }
-    initialize_index_by_name();
 
     return Status::OK();
 }
 
 void Block::reserve(size_t count) {
-    index_by_name.reserve(count);
     data.reserve(count);
 }
 
-void Block::initialize_index_by_name() {
-    for (size_t i = 0, size = data.size(); i < size; ++i) {
-        index_by_name[data[i].name] = i;
-    }
-}
-
 void Block::insert(size_t position, const ColumnWithTypeAndName& elem) {
     if (position > data.size()) {
         throw Exception(ErrorCode::INTERNAL_ERROR,
@@ -184,13 +172,6 @@ void Block::insert(size_t position, const ColumnWithTypeAndName& elem) {
                         data.size(), dump_names());
     }
 
-    for (auto& name_pos : index_by_name) {
-        if (name_pos.second >= position) {
-            ++name_pos.second;
-        }
-    }
-
-    index_by_name.emplace(elem.name, position);
     data.emplace(data.begin() + position, elem);
 }
 
@@ -201,30 +182,20 @@ void Block::insert(size_t position, ColumnWithTypeAndName&& elem) {
                         data.size(), dump_names());
     }
 
-    for (auto& name_pos : index_by_name) {
-        if (name_pos.second >= position) {
-            ++name_pos.second;
-        }
-    }
-
-    index_by_name.emplace(elem.name, position);
     data.emplace(data.begin() + position, std::move(elem));
 }
 
 void Block::clear_names() {
-    index_by_name.clear();
     for (auto& entry : data) {
         entry.name.clear();
     }
 }
 
 void Block::insert(const ColumnWithTypeAndName& elem) {
-    index_by_name.emplace(elem.name, data.size());
     data.emplace_back(elem);
 }
 
 void Block::insert(ColumnWithTypeAndName&& elem) {
-    index_by_name.emplace(elem.name, data.size());
     data.emplace_back(std::move(elem));
 }
 
@@ -238,13 +209,6 @@ void Block::erase_tail(size_t start) {
     DCHECK(start <= data.size()) << fmt::format(
             "Position out of bound in Block::erase(), max position = {}", data.size());
     data.erase(data.begin() + start, data.end());
-    for (auto it = index_by_name.begin(); it != index_by_name.end();) {
-        if (it->second >= start) {
-            index_by_name.erase(it++);
-        } else {
-            ++it;
-        }
-    }
 }
 
 void Block::erase(size_t position) {
@@ -256,36 +220,7 @@ void Block::erase(size_t position) {
 }
 
 void Block::erase_impl(size_t position) {
-    bool need_maintain_index_by_name = true;
-    if (position + 1 == data.size()) {
-        index_by_name.erase(data.back().name);
-        need_maintain_index_by_name = false;
-    }
-
     data.erase(data.begin() + position);
-
-    if (need_maintain_index_by_name) {
-        for (auto it = index_by_name.begin(); it != index_by_name.end();) {
-            if (it->second == position) {
-                index_by_name.erase(it++);
-            } else {
-                if (it->second > position) {
-                    --it->second;
-                }
-                ++it;
-            }
-        }
-    }
-}
-
-void Block::erase(const String& name) {
-    auto index_it = index_by_name.find(name);
-    if (index_it == index_by_name.end()) {
-        throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
-                        name, dump_names());
-    }
-
-    erase_impl(index_it->second);
 }
 
 ColumnWithTypeAndName& Block::safe_get_by_position(size_t position) {
@@ -307,53 +242,30 @@ const ColumnWithTypeAndName& Block::safe_get_by_position(size_t position) const
 }
 
 ColumnWithTypeAndName& Block::get_by_name(const std::string& name) {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
+    int pos = get_position_by_name(name);
+    if (pos == -1) {
         throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
                         name, dump_names());
     }
-
-    return data[it->second];
+    return data[pos];
 }
 
 const ColumnWithTypeAndName& Block::get_by_name(const std::string& name) const {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
+    int pos = get_position_by_name(name);
+    if (pos == -1) {
         throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
                         name, dump_names());
     }
-
-    return data[it->second];
-}
-
-ColumnWithTypeAndName* Block::try_get_by_name(const std::string& name) {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
-        return nullptr;
-    }
-    return &data[it->second];
-}
-
-const ColumnWithTypeAndName* Block::try_get_by_name(const std::string& name) const {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
-        return nullptr;
-    }
-    return &data[it->second];
-}
-
-bool Block::has(const std::string& name) const {
-    return index_by_name.end() != index_by_name.find(name);
+    return data[pos];
 }
 
-size_t Block::get_position_by_name(const std::string& name) const {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
-        throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
-                        name, dump_names());
+int Block::get_position_by_name(const std::string& name) const {
+    for (int i = 0; i < data.size(); i++) {
+        if (data[i].name == name) {
+            return i;
+        }
     }
-
-    return it->second;
+    return -1;
 }
 
 void Block::check_number_of_rows(bool allow_null_columns) const {
@@ -771,7 +683,6 @@ DataTypes Block::get_data_types() const {
 
 void Block::clear() {
     data.clear();
-    index_by_name.clear();
 }
 
 void Block::clear_column_data(int64_t column_size) noexcept {
@@ -792,15 +703,6 @@ void Block::clear_column_data(int64_t column_size) noexcept {
     }
 }
 
-void Block::erase_tmp_columns() noexcept {
-    auto all_column_names = get_names();
-    for (auto& name : all_column_names) {
-        if (name.rfind(BeConsts::BLOCK_TEMP_COLUMN_PREFIX, 0) == 0) {
-            erase(name);
-        }
-    }
-}
-
 void Block::clear_column_mem_not_keep(const std::vector<bool>& column_keep_flags,
                                       bool need_keep_first) {
     if (data.size() >= column_keep_flags.size()) {
@@ -822,17 +724,14 @@ void Block::clear_column_mem_not_keep(const std::vector<bool>& column_keep_flags
 void Block::swap(Block& other) noexcept {
     SCOPED_SKIP_MEMORY_CHECK();
     data.swap(other.data);
-    index_by_name.swap(other.index_by_name);
 }
 
 void Block::swap(Block&& other) noexcept {
     SCOPED_SKIP_MEMORY_CHECK();
     data = std::move(other.data);
-    index_by_name = std::move(other.index_by_name);
 }
 
 void Block::shuffle_columns(const std::vector<int>& result_column_ids) {
-    index_by_name.clear();
     Container tmp_data;
     tmp_data.reserve(result_column_ids.size());
     for (const int result_column_id : result_column_ids) {
@@ -1034,24 +933,6 @@ Status Block::serialize(int be_exec_version, PBlock* pblock,
     return Status::OK();
 }
 
-MutableBlock::MutableBlock(const std::vector<TupleDescriptor*>& tuple_descs, int reserve_size,
-                           bool ignore_trivial_slot) {
-    for (auto* const tuple_desc : tuple_descs) {
-        for (auto* const slot_desc : tuple_desc->slots()) {
-            if (ignore_trivial_slot && !slot_desc->is_materialized()) {
-                continue;
-            }
-            _data_types.emplace_back(slot_desc->get_data_type_ptr());
-            _columns.emplace_back(_data_types.back()->create_column());
-            if (reserve_size != 0) {
-                _columns.back()->reserve(reserve_size);
-            }
-            _names.push_back(slot_desc->col_name());
-        }
-    }
-    initialize_index_by_name();
-}
-
 size_t MutableBlock::rows() const {
     for (const auto& column : _columns) {
         if (column) {
@@ -1067,7 +948,6 @@ void MutableBlock::swap(MutableBlock& another) noexcept {
     _columns.swap(another._columns);
     _data_types.swap(another._data_types);
     _names.swap(another._names);
-    index_by_name.swap(another.index_by_name);
 }
 
 void MutableBlock::add_row(const Block* block, int row) {
@@ -1130,31 +1010,6 @@ Status MutableBlock::add_rows(const Block* block, const std::vector<int64_t>& ro
     return Status::OK();
 }
 
-void MutableBlock::erase(const String& name) {
-    auto index_it = index_by_name.find(name);
-    if (index_it == index_by_name.end()) {
-        throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
-                        name, dump_names());
-    }
-
-    auto position = index_it->second;
-
-    _columns.erase(_columns.begin() + position);
-    _data_types.erase(_data_types.begin() + position);
-    _names.erase(_names.begin() + position);
-
-    for (auto it = index_by_name.begin(); it != index_by_name.end();) {
-        if (it->second == position) {
-            index_by_name.erase(it++);
-        } else {
-            if (it->second > position) {
-                --it->second;
-            }
-            ++it;
-        }
-    }
-}
-
 Block MutableBlock::to_block(int start_column) {
     return to_block(start_column, (int)_columns.size());
 }
@@ -1294,26 +1149,6 @@ void MutableBlock::clear_column_data() noexcept {
     }
 }
 
-void MutableBlock::initialize_index_by_name() {
-    for (size_t i = 0, size = _names.size(); i < size; ++i) {
-        index_by_name[_names[i]] = i;
-    }
-}
-
-bool MutableBlock::has(const std::string& name) const {
-    return index_by_name.end() != index_by_name.find(name);
-}
-
-size_t MutableBlock::get_position_by_name(const std::string& name) const {
-    auto it = index_by_name.find(name);
-    if (index_by_name.end() == it) {
-        throw Exception(ErrorCode::INTERNAL_ERROR, "No such name in Block, name={}, block_names={}",
-                        name, dump_names());
-    }
-
-    return it->second;
-}
-
 std::string MutableBlock::dump_names() const {
     std::string out;
     for (auto it = _names.begin(); it != _names.end(); ++it) {
diff --git a/be/src/vec/core/block.h b/be/src/vec/core/block.h
index a0d325ae547..d2128b964e4 100644
--- a/be/src/vec/core/block.h
+++ b/be/src/vec/core/block.h
@@ -73,9 +73,7 @@ class Block {
 
 private:
     using Container = ColumnsWithTypeAndName;
-    using IndexByName = phmap::flat_hash_map<String, size_t>;
     Container data;
-    IndexByName index_by_name;
 
 public:
     Block() = default;
@@ -108,8 +106,6 @@ public:
     void erase_tail(size_t start);
     /// remove the columns at the specified positions
     void erase(const std::set<size_t>& positions);
-    /// remove the column with the specified name
-    void erase(const String& name);
     // T was std::set<int>, std::vector<int>, std::list<int>
     template <class T>
     void erase_not_in(const T& container) {
@@ -124,8 +120,6 @@ public:
     // This is a temporary compromise; index_by_name may be removed in the future
     void simple_insert(const ColumnWithTypeAndName& elem) { data.emplace_back(elem); }
 
-    void initialize_index_by_name();
-
     /// References are invalidated after calling functions above.
     ColumnWithTypeAndName& get_by_position(size_t position) {
         DCHECK(data.size() > position)
@@ -150,13 +144,11 @@ public:
     ColumnWithTypeAndName& safe_get_by_position(size_t position);
     const ColumnWithTypeAndName& safe_get_by_position(size_t position) const;
 
+    // Get column by name. Throws an exception if there is no column with that name.
+    // ATTN: this method is O(N). better maintain name -> position map in caller if you need to call it frequently.
     ColumnWithTypeAndName& get_by_name(const std::string& name);
     const ColumnWithTypeAndName& get_by_name(const std::string& name) const;
 
-    // return nullptr when no such column name
-    ColumnWithTypeAndName* try_get_by_name(const std::string& name);
-    const ColumnWithTypeAndName* try_get_by_name(const std::string& name) const;
-
     Container::iterator begin() { return data.begin(); }
     Container::iterator end() { return data.end(); }
     Container::const_iterator begin() const { return data.begin(); }
@@ -164,9 +156,9 @@ public:
     Container::const_iterator cbegin() const { return data.cbegin(); }
     Container::const_iterator cend() const { return data.cend(); }
 
-    bool has(const std::string& name) const;
-
-    size_t get_position_by_name(const std::string& name) const;
+    // Get position of column by name. Returns -1 if there is no column with that name.
+    // ATTN: this method is O(N). better maintain name -> position map in caller if you need to call it frequently.
+    int get_position_by_name(const std::string& name) const;
 
     const ColumnsWithTypeAndName& get_columns_with_type_and_name() const;
 
@@ -362,11 +354,6 @@ public:
     // for String type or Array<String> type
     void shrink_char_type_column_suffix_zero(const std::vector<size_t>& char_type_idx);
 
-    // remove tmp columns in block
-    // in inverted index apply logic, in order to optimize query performance,
-    // we built some temporary columns into block
-    void erase_tmp_columns() noexcept;
-
     void clear_column_mem_not_keep(const std::vector<bool>& column_keep_flags,
                                    bool need_keep_first);
 
@@ -387,9 +374,6 @@ private:
     DataTypes _data_types;
     std::vector<std::string> _names;
 
-    using IndexByName = phmap::flat_hash_map<String, size_t>;
-    IndexByName index_by_name;
-
 public:
     static MutableBlock build_mutable_block(Block* block) {
         return block == nullptr ? MutableBlock() : MutableBlock(block);
@@ -397,27 +381,19 @@ public:
     MutableBlock() = default;
     ~MutableBlock() = default;
 
-    MutableBlock(const std::vector<TupleDescriptor*>& tuple_descs, int reserve_size = 0,
-                 bool igore_trivial_slot = false);
-
     MutableBlock(Block* block)
             : _columns(block->mutate_columns()),
               _data_types(block->get_data_types()),
-              _names(block->get_names()) {
-        initialize_index_by_name();
-    }
+              _names(block->get_names()) {}
     MutableBlock(Block&& block)
             : _columns(block.mutate_columns()),
               _data_types(block.get_data_types()),
-              _names(block.get_names()) {
-        initialize_index_by_name();
-    }
+              _names(block.get_names()) {}
 
     void operator=(MutableBlock&& m_block) {
         _columns = std::move(m_block._columns);
         _data_types = std::move(m_block._data_types);
         _names = std::move(m_block._names);
-        initialize_index_by_name();
     }
 
     size_t rows() const;
@@ -548,7 +524,6 @@ public:
                     _columns[i] = _data_types[i]->create_column();
                 }
             }
-            initialize_index_by_name();
         } else {
             if (_columns.size() != block.columns()) {
                 return Status::Error<ErrorCode::INTERNAL_ERROR>(
@@ -595,9 +570,6 @@ public:
     Status add_rows(const Block* block, size_t row_begin, size_t length);
     Status add_rows(const Block* block, const std::vector<int64_t>& rows);
 
-    /// remove the column with the specified name
-    void erase(const String& name);
-
     std::string dump_data(size_t row_limit = 100) const;
     std::string dump_data_json(size_t row_limit = 100) const;
 
@@ -623,15 +595,8 @@ public:
 
     std::vector<std::string>& get_names() { return _names; }
 
-    bool has(const std::string& name) const;
-
-    size_t get_position_by_name(const std::string& name) const;
-
     /** Get a list of column names separated by commas. */
     std::string dump_names() const;
-
-private:
-    void initialize_index_by_name();
 };
 
 struct IteratorRowRef {
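
The new header comments advise callers that look columns up by name frequently to keep their own name -> position map. A minimal sketch of that idea follows, under stated assumptions: `NamedColumn`, `position_by_name` and `build_position_map` are hypothetical stand-ins for illustration and are not part of the Doris `Block` API, and the choice that the first occurrence wins for duplicate names is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a block column; not the Doris type.
struct NamedColumn {
    std::string name;
};

// O(N) scan following the documented contract: the position of the first
// column with this name, or -1 if there is no such column.
int position_by_name(const std::vector<NamedColumn>& columns, const std::string& name) {
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (columns[i].name == name) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Caller-side cache, built once and reused for repeated lookups.
// emplace() does not overwrite, so the first occurrence of a name is kept,
// which matches the linear scan above.
std::unordered_map<std::string, int> build_position_map(const std::vector<NamedColumn>& columns) {
    std::unordered_map<std::string, int> positions;
    positions.reserve(columns.size());
    for (std::size_t i = 0; i < columns.size(); ++i) {
        positions.emplace(columns[i].name, static_cast<int>(i));
    }
    return positions;
}
```

Such a cache is only valid while the column layout is unchanged; any insert or erase on the underlying container would require rebuilding it.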
diff --git a/be/src/vec/core/sort_block.cpp b/be/src/vec/core/sort_block.cpp
index 20bf1f952af..75aa9d85a12 100644
--- a/be/src/vec/core/sort_block.cpp
+++ b/be/src/vec/core/sort_block.cpp
@@ -32,10 +32,7 @@ ColumnsWithSortDescriptions get_columns_with_sort_description(const Block& block
 
     for (size_t i = 0; i < size; ++i) {
         const IColumn* column =
-                !description[i].column_name.empty()
-                        ? block.get_by_name(description[i].column_name).column.get()
-                        : block.safe_get_by_position(description[i].column_number).column.get();
-
+                block.safe_get_by_position(description[i].column_number).column.get();
         res.emplace_back(column, description[i]);
     }
 
@@ -53,9 +50,7 @@ void sort_block(Block& src_block, Block& dest_block, const SortDescription& desc
         bool reverse = description[0].direction == -1;
 
         const IColumn* column =
-                !description[0].column_name.empty()
-                        ? src_block.get_by_name(description[0].column_name).column.get()
-                        : src_block.safe_get_by_position(description[0].column_number).column.get();
+                src_block.safe_get_by_position(description[0].column_number).column.get();
 
         IColumn::Permutation perm;
         column->get_permutation(reverse, limit, description[0].nulls_direction, perm);
diff --git a/be/src/vec/core/sort_description.h b/be/src/vec/core/sort_description.h
index cdee17f7651..4d1543d6904 100644
--- a/be/src/vec/core/sort_description.h
+++ b/be/src/vec/core/sort_description.h
@@ -20,20 +20,15 @@
 
 #pragma once
 
-#include "cstddef"
-#include "memory"
-#include "string"
-#include "vec/core/field.h"
 #include "vector"
 
 namespace doris::vectorized {
 
 /// Description of the sorting rule by one column.
 struct SortColumnDescription {
-    std::string column_name; /// The name of the column.
-    int column_number;       /// Column number (used if no name is given).
-    int direction;           /// 1 - ascending, -1 - descending.
-    int nulls_direction;     /// 1 - NULLs and NaNs are greater, -1 - less.
+    int column_number;   /// Column number (used if no name is given).
+    int direction;       /// 1 - ascending, -1 - descending.
+    int nulls_direction; /// 1 - NULLs and NaNs are greater, -1 - less.
 
     SortColumnDescription(int column_number_, int direction_, int nulls_direction_)
             : column_number(column_number_),
diff --git a/be/src/vec/exec/format/arrow/arrow_stream_reader.cpp b/be/src/vec/exec/format/arrow/arrow_stream_reader.cpp
index def1645f323..dfd4cdcfc91 100644
--- a/be/src/vec/exec/format/arrow/arrow_stream_reader.cpp
+++ b/be/src/vec/exec/format/arrow/arrow_stream_reader.cpp
@@ -103,12 +103,17 @@ Status ArrowStreamReader::get_next_block(Block* block, size_t* read_rows, bool*
         auto num_columns = batch.num_columns();
         for (int c = 0; c < num_columns; ++c) {
             arrow::Array* column = batch.column(c).get();
-
             std::string column_name = batch.schema()->field(c)->name();
 
             try {
                 const vectorized::ColumnWithTypeAndName& column_with_name =
-                        block->get_by_name(column_name);
+                        block->safe_get_by_position(c);
+
+                if (column_with_name.name != column_name) {
+                    return Status::InternalError("Column name mismatch: expected {}, got {}",
+                                                 column_with_name.name, column_name);
+                }
+
                 RETURN_IF_ERROR(column_with_name.type->get_serde()->read_column_from_arrow(
                         column_with_name.column->assume_mutable_ref(), column, 0, num_rows, _ctzz));
             } catch (Exception& e) {
diff --git a/be/src/vec/exec/format/orc/vorc_reader.cpp b/be/src/vec/exec/format/orc/vorc_reader.cpp
index bd996f2189b..59d3ece9356 100644
--- a/be/src/vec/exec/format/orc/vorc_reader.cpp
+++ b/be/src/vec/exec/format/orc/vorc_reader.cpp
@@ -1337,9 +1337,13 @@ Status OrcReader::_fill_missing_columns(
                 result_column_ptr = result_column_ptr->convert_to_full_column_if_const();
                 auto origin_column_type = block->get_by_name(kv.first).type;
                 bool is_nullable = origin_column_type->is_nullable();
+                int pos = block->get_position_by_name(kv.first);
+                if (pos == -1) {
+                    return Status::InternalError("Failed to find column: {}, block: {}", kv.first,
+                                                 block->dump_structure());
+                }
                 block->replace_by_position(
-                        block->get_position_by_name(kv.first),
-                        is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
+                        pos, is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
                 block->erase(result_column_id);
             }
         }
@@ -2054,7 +2058,12 @@ Status OrcReader::_get_next_block_impl(Block* block, size_t* read_rows, bool* eo
         if (!_dict_cols_has_converted && !_dict_filter_cols.empty()) {
             for (auto& dict_filter_cols : _dict_filter_cols) {
                 MutableColumnPtr dict_col_ptr = ColumnInt32::create();
-                size_t pos = block->get_position_by_name(dict_filter_cols.first);
+                int pos = block->get_position_by_name(dict_filter_cols.first);
+                if (pos == -1) {
+                    return Status::InternalError(
+                            "Failed to find dict filter column '{}' in block {}",
+                            dict_filter_cols.first, block->dump_structure());
+                }
                 auto& column_with_type_and_name = block->get_by_position(pos);
                 auto& column_type = column_with_type_and_name.type;
                 if (column_type->is_nullable()) {
@@ -2215,7 +2224,11 @@ Status OrcReader::filter(orc::ColumnVectorBatch& data, uint16_t* sel, uint16_t s
     if (!_dict_cols_has_converted && !_dict_filter_cols.empty()) {
         for (auto& dict_filter_cols : _dict_filter_cols) {
             MutableColumnPtr dict_col_ptr = ColumnInt32::create();
-            size_t pos = block->get_position_by_name(dict_filter_cols.first);
+            int pos = block->get_position_by_name(dict_filter_cols.first);
+            if (pos == -1) {
+                return Status::InternalError("Wrong read column '{}' in orc file, block: {}",
+                                             dict_filter_cols.first, block->dump_structure());
+            }
             auto& column_with_type_and_name = block->get_by_position(pos);
             auto& column_type = column_with_type_and_name.type;
             if (column_type->is_nullable()) {
@@ -2615,7 +2628,11 @@ Status OrcReader::_convert_dict_cols_to_string_cols(
     }
     if (!_dict_filter_cols.empty()) {
         for (auto& dict_filter_cols : _dict_filter_cols) {
-            size_t pos = block->get_position_by_name(dict_filter_cols.first);
+            int pos = block->get_position_by_name(dict_filter_cols.first);
+            if (pos == -1) {
+                return Status::InternalError("Wrong read column '{}' in orc file, block: {}",
+                                             dict_filter_cols.first, block->dump_structure());
+            }
             ColumnWithTypeAndName& column_with_type_and_name = block->get_by_position(pos);
             const ColumnPtr& column = column_with_type_and_name.column;
 
diff --git a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
index 7d1f9d27430..f9de3648d07 100644
--- a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
+++ b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
@@ -400,7 +400,12 @@ Status RowGroupReader::_read_column_data(Block* block,
         for (auto& _dict_filter_col : _dict_filter_cols) {
             if (_dict_filter_col.first == read_col_name) {
                 MutableColumnPtr dict_column = ColumnInt32::create();
-                size_t pos = block->get_position_by_name(read_col_name);
+                int pos = block->get_position_by_name(read_col_name);
+                if (pos == -1) {
+                    return Status::InternalError(
+                            "Wrong read column '{}' in parquet file, block: {}", read_col_name,
+                            block->dump_structure());
+                }
                 if (column_type->is_nullable()) {
                     block->get_by_position(pos).type =
                             std::make_shared<DataTypeNullable>(std::make_shared<DataTypeInt32>());
@@ -714,9 +719,14 @@ Status RowGroupReader::_fill_missing_columns(
                 result_column_ptr = result_column_ptr->convert_to_full_column_if_const();
                 auto origin_column_type = block->get_by_name(kv.first).type;
                 bool is_nullable = origin_column_type->is_nullable();
+                int pos = block->get_position_by_name(kv.first);
+                if (pos == -1) {
+                    return Status::InternalError(
+                            "Wrong missing column '{}' in parquet file, block: {}", kv.first,
+                            block->dump_structure());
+                }
                 block->replace_by_position(
-                        block->get_position_by_name(kv.first),
-                        is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
+                        pos, is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
                 block->erase(result_column_id);
             }
         }
@@ -1073,7 +1083,12 @@ Status RowGroupReader::_rewrite_dict_conjuncts(std::vector<int32_t>& dict_codes,
 
 void RowGroupReader::_convert_dict_cols_to_string_cols(Block* block) {
     for (auto& dict_filter_cols : _dict_filter_cols) {
-        size_t pos = block->get_position_by_name(dict_filter_cols.first);
+        int pos = block->get_position_by_name(dict_filter_cols.first);
+        if (pos == -1) {
+            throw Exception(ErrorCode::INTERNAL_ERROR,
+                            "Wrong read column '{}' in parquet file, block: {}",
+                            dict_filter_cols.first, block->dump_structure());
+        }
         ColumnWithTypeAndName& column_with_type_and_name = block->get_by_position(pos);
         const ColumnPtr& column = column_with_type_and_name.column;
         if (auto* nullable_column = check_and_get_column<ColumnNullable>(*column)) {
diff --git a/be/src/vec/exec/format/table/equality_delete.cpp b/be/src/vec/exec/format/table/equality_delete.cpp
index 7f8452f4b18..b2bc1408fc5 100644
--- a/be/src/vec/exec/format/table/equality_delete.cpp
+++ b/be/src/vec/exec/format/table/equality_delete.cpp
@@ -45,15 +45,11 @@ Status SimpleEqualityDelete::_build_set() {
 
 Status SimpleEqualityDelete::filter_data_block(Block* data_block) {
     SCOPED_TIMER(equality_delete_time);
-    auto* column_and_type = data_block->try_get_by_name(_delete_column_name);
-    if (column_and_type == nullptr) {
-        return Status::InternalError("Can't find the delete column '{}' in data file",
-                                     _delete_column_name);
-    }
-    if (column_and_type->type->get_primitive_type() != _delete_column_type) {
+    auto column_and_type = data_block->get_by_name(_delete_column_name);
+    if (column_and_type.type->get_primitive_type() != _delete_column_type) {
         return Status::InternalError(
                 "Not support type change in column '{}', src type: {}, target type: {}",
-                _delete_column_name, column_and_type->type->get_name(), (int)_delete_column_type);
+                _delete_column_name, column_and_type.type->get_name(), (int)_delete_column_type);
     }
     size_t rows = data_block->rows();
     // _filter: 1 => in _hybrid_set; 0 => not in _hybrid_set
@@ -64,12 +60,12 @@ Status SimpleEqualityDelete::filter_data_block(Block* data_block) {
         _filter->assign(rows, UInt8(0));
     }
 
-    if (column_and_type->column->is_nullable()) {
+    if (column_and_type.column->is_nullable()) {
         const NullMap& null_map =
-                reinterpret_cast<const ColumnNullable*>(column_and_type->column.get())
+                reinterpret_cast<const ColumnNullable*>(column_and_type.column.get())
                         ->get_null_map_data();
         _hybrid_set->find_batch_nullable(
-                remove_nullable(column_and_type->column)->assume_mutable_ref(), rows, null_map,
+                remove_nullable(column_and_type.column)->assume_mutable_ref(), rows, null_map,
                 *_filter);
         if (_hybrid_set->contain_null()) {
             auto* filter_data = _filter->data();
@@ -78,7 +74,7 @@ Status SimpleEqualityDelete::filter_data_block(Block* data_block) {
             }
         }
     } else {
-        _hybrid_set->find_batch(column_and_type->column->assume_mutable_ref(), rows, *_filter);
+        _hybrid_set->find_batch(column_and_type.column->assume_mutable_ref(), rows, *_filter);
     }
     // should reverse _filter
     auto* filter_data = _filter->data();
@@ -109,18 +105,19 @@ Status MultiEqualityDelete::filter_data_block(Block* data_block) {
     SCOPED_TIMER(equality_delete_time);
     size_t column_index = 0;
     for (std::string column_name : _delete_block->get_names()) {
-        auto* column_and_type = data_block->try_get_by_name(column_name);
-        if (column_and_type == nullptr) {
-            return Status::InternalError("Can't find the delete column '{}' in data file",
-                                         column_name);
-        }
-        if (!_delete_block->get_by_name(column_name).type->equals(*column_and_type->type)) {
+        auto column_and_type = data_block->get_by_name(column_name);
+        if (!_delete_block->get_by_name(column_name).type->equals(*column_and_type.type)) {
             return Status::InternalError(
                     "Not support type change in column '{}', src type: {}, target type: {}",
                     column_name, _delete_block->get_by_name(column_name).type->get_name(),
-                    column_and_type->type->get_name());
+                    column_and_type.type->get_name());
+        }
+        int pos = data_block->get_position_by_name(column_name);
+        if (pos == -1) {
+            return Status::InternalError("Column '{}' not found in data block: {}", column_name,
+                                         data_block->dump_structure());
         }
-        _data_column_index[column_index++] = data_block->get_position_by_name(column_name);
+        _data_column_index[column_index++] = pos;
     }
     size_t rows = data_block->rows();
     _data_hashes.clear();
diff --git a/be/src/vec/exec/format/table/iceberg_reader.cpp b/be/src/vec/exec/format/table/iceberg_reader.cpp
index db6e44c3d4b..f26cb2d7c44 100644
--- a/be/src/vec/exec/format/table/iceberg_reader.cpp
+++ b/be/src/vec/exec/format/table/iceberg_reader.cpp
@@ -225,7 +225,7 @@ void IcebergTableReader::_generate_equality_delete_block(
 Status IcebergTableReader::_expand_block_if_need(Block* block) {
     for (auto& col : _expand_columns) {
         col.column->assume_mutable()->clear();
-        if (block->try_get_by_name(col.name)) {
+        if (block->get_position_by_name(col.name) != -1) {
             return Status::InternalError("Wrong expand column '{}'", col.name);
         }
         block->insert(col);
@@ -235,7 +235,12 @@ Status IcebergTableReader::_expand_block_if_need(Block* block) {
 
 Status IcebergTableReader::_shrink_block_if_need(Block* block) {
     for (const std::string& expand_col : _expand_col_names) {
-        block->erase(expand_col);
+        int pos = block->get_position_by_name(expand_col);
+        if (pos == -1) {
+            return Status::InternalError("Wrong erase column '{}', block: {}", expand_col,
+                                         block->dump_names());
+        }
+        block->erase(pos);
     }
     return Status::OK();
 }
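
The hunk above replaces erase-by-name with an explicit lookup, a `-1` check, and an erase by position. A minimal sketch of that calling pattern follows, assuming a plain vector of names in place of a block: `find_position` and `erase_by_name` are hypothetical helpers written for illustration and are not Doris functions. The position is held in an `int` because storing the `-1` sentinel in a `size_t` would wrap to a huge index and the check would need a different comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: index of the first entry equal to `name`, or -1.
int find_position(const std::vector<std::string>& names, const std::string& name) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == name) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Hypothetical helper: erase `name` if present.
// Returns false and leaves `names` untouched when the name is missing, so the
// caller can turn that into an error instead of erasing at a bogus index.
bool erase_by_name(std::vector<std::string>& names, const std::string& name) {
    int pos = find_position(names, name);
    if (pos == -1) {
        return false;
    }
    names.erase(names.begin() + pos);
    return true;
}
```

In the real code the failure branch returns a `Status::InternalError` carrying the block's column names; the boolean here only stands in for that.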
diff --git a/be/src/vec/exec/scan/file_scanner.cpp b/be/src/vec/exec/scan/file_scanner.cpp
index 3ee1208994f..db06796d78f 100644
--- a/be/src/vec/exec/scan/file_scanner.cpp
+++ b/be/src/vec/exec/scan/file_scanner.cpp
@@ -693,9 +693,13 @@ Status FileScanner::_fill_missing_columns(size_t rows) {
                 result_column_ptr = result_column_ptr->convert_to_full_column_if_const();
                 auto origin_column_type = _src_block_ptr->get_by_name(kv.first).type;
                 bool is_nullable = origin_column_type->is_nullable();
+                int pos = _src_block_ptr->get_position_by_name(kv.first);
+                if (pos == -1) {
+                    return Status::InternalError("Column {} not found in src block {}", kv.first,
+                                                 _src_block_ptr->dump_structure());
+                }
                 _src_block_ptr->replace_by_position(
-                        _src_block_ptr->get_position_by_name(kv.first),
-                        is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
+                        pos, is_nullable ? make_nullable(result_column_ptr) : result_column_ptr);
                 _src_block_ptr->erase(result_column_id);
             }
         }
diff --git a/be/src/vec/exec/scan/olap_scanner.cpp b/be/src/vec/exec/scan/olap_scanner.cpp
index b1b67e8300d..0bdf793872f 100644
--- a/be/src/vec/exec/scan/olap_scanner.cpp
+++ b/be/src/vec/exec/scan/olap_scanner.cpp
@@ -428,8 +428,8 @@ Status OlapScanner::_init_tablet_reader_params(
     DBUG_EXECUTE_IF("NewOlapScanner::_init_tablet_reader_params.block", DBUG_BLOCK);
 
     if (!_state->skip_storage_engine_merge()) {
-        TOlapScanNode& olap_scan_node =
-                ((pipeline::OlapScanLocalState*)_local_state)->olap_scan_node();
+        auto* olap_scan_local_state = (pipeline::OlapScanLocalState*)_local_state;
+        TOlapScanNode& olap_scan_node = olap_scan_local_state->olap_scan_node();
         // order by table keys optimization for topn
         // will only read head/tail of data file since it's already sorted by keys
         if (olap_scan_node.__isset.sort_info && !olap_scan_node.sort_info.is_asc_order.empty()) {
@@ -441,16 +441,20 @@ Status OlapScanner::_init_tablet_reader_params(
             _tablet_reader_params.read_orderby_key_num_prefix_columns =
                     olap_scan_node.sort_info.is_asc_order.size();
             _tablet_reader_params.read_orderby_key_limit = _limit;
-            _tablet_reader_params.filter_block_conjuncts = _conjuncts;
+
+            if (_tablet_reader_params.read_orderby_key_limit > 0 &&
+                olap_scan_local_state->_storage_no_merge()) {
+                _tablet_reader_params.filter_block_conjuncts = _conjuncts;
+                _conjuncts.clear();
+            }
         }
 
         // set push down topn filter
         _tablet_reader_params.topn_filter_source_node_ids =
-                ((pipeline::OlapScanLocalState*)_local_state)
-                        ->get_topn_filter_source_node_ids(_state, true);
+                olap_scan_local_state->get_topn_filter_source_node_ids(_state, true);
         if (!_tablet_reader_params.topn_filter_source_node_ids.empty()) {
             _tablet_reader_params.topn_filter_target_node_id =
-                    ((pipeline::OlapScanLocalState*)_local_state)->parent()->node_id();
+                    olap_scan_local_state->parent()->node_id();
         }
     }
 
diff --git a/be/src/vec/exec/scan/scanner.cpp b/be/src/vec/exec/scan/scanner.cpp
index 1e163dd80d2..1f4dace2521 100644
--- a/be/src/vec/exec/scan/scanner.cpp
+++ b/be/src/vec/exec/scan/scanner.cpp
@@ -113,8 +113,6 @@ Status Scanner::get_block(RuntimeState* state, Block* block, bool* eof) {
                 RETURN_IF_ERROR(_get_block_impl(state, block, eof));
                 if (*eof) {
                     DCHECK(block->rows() == 0);
-                    // clear TEMP columns to avoid column align problem
-                    block->erase_tmp_columns();
                     break;
                 }
                 _num_rows_read += block->rows();
@@ -145,11 +143,6 @@ Status Scanner::get_block(RuntimeState* state, Block* block, bool* eof) {
 }
 
 Status Scanner::_filter_output_block(Block* block) {
-    Defer clear_tmp_block([&]() { block->erase_tmp_columns(); });
-    if (block->has(BeConsts::BLOCK_TEMP_COLUMN_SCANNER_FILTERED)) {
-        // scanner filter_block is already done (only by _topn_next currently), just skip it
-        return Status::OK();
-    }
     auto old_rows = block->rows();
     Status st = VExprContext::filter_block(_conjuncts, block, block->columns());
     _counter.num_rows_unselected += old_rows - block->rows();
diff --git a/be/src/vec/functions/function_helpers.cpp b/be/src/vec/functions/function_helpers.cpp
index 6862e9addb9..877e47b5318 100644
--- a/be/src/vec/functions/function_helpers.cpp
+++ b/be/src/vec/functions/function_helpers.cpp
@@ -97,14 +97,6 @@ std::tuple<Block, ColumnNumbers> create_block_with_nested_columns(const Block& b
         }
     }
 
-    // TODO: only support match function, rethink the logic
-    for (const auto& ctn : block) {
-        if (ctn.name.size() > BeConsts::BLOCK_TEMP_COLUMN_PREFIX.size() &&
-            starts_with(ctn.name, BeConsts::BLOCK_TEMP_COLUMN_PREFIX)) {
-            res.insert(ctn);
-        }
-    }
-
     return {std::move(res), std::move(res_args)};
 }
 
diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp
index 7f51bf7c1f0..a3186d2d982 100644
--- a/be/src/vec/olap/block_reader.cpp
+++ b/be/src/vec/olap/block_reader.cpp
@@ -411,7 +411,6 @@ Status BlockReader::_unique_key_next_block(Block* block, bool* eof) {
         block->insert(column_with_type_and_name);
         RETURN_IF_ERROR(Block::filter_block(block, target_columns_size, target_columns_size));
         _stats.rows_del_filtered += target_block_row - block->rows();
-        DCHECK(block->try_get_by_name("__DORIS_COMPACTION_FILTER__") == nullptr);
         if (UNLIKELY(_reader_context.record_rowids)) {
             DCHECK_EQ(_block_row_locations.size(), block->rows() + delete_count);
         }
diff --git a/be/src/vec/olap/vcollect_iterator.cpp b/be/src/vec/olap/vcollect_iterator.cpp
index 3f375eb2c82..deb384de0ba 100644
--- a/be/src/vec/olap/vcollect_iterator.cpp
+++ b/be/src/vec/olap/vcollect_iterator.cpp
@@ -93,6 +93,7 @@ void VCollectIterator::init(TabletReader* reader, bool ori_data_overlapping, boo
         _topn_limit = _reader->_reader_context.read_orderby_key_limit;
     } else {
         _topn_limit = 0;
+        DCHECK_EQ(_reader->_reader_context.filter_block_conjuncts.size(), 0);
     }
 }
 
@@ -259,8 +260,6 @@ Status VCollectIterator::_topn_next(Block* block) {
         return Status::Error<END_OF_FILE>("");
     }
 
-    // clear TEMP columns to avoid column align problem
-    block->erase_tmp_columns();
     auto clone_block = block->clone_empty();
     /*
     select id, "${tR2}",
@@ -316,8 +315,6 @@ Status VCollectIterator::_topn_next(Block* block) {
                 if (status.is<END_OF_FILE>()) {
                     eof = true;
                     if (block->rows() == 0) {
-                        // clear TEMP columns to avoid column align problem in segment iterator
-                        block->erase_tmp_columns();
                         break;
                     }
                 } else {
@@ -328,8 +325,6 @@ Status VCollectIterator::_topn_next(Block* block) {
             // filter block
             RETURN_IF_ERROR(VExprContext::filter_block(
                     _reader->_reader_context.filter_block_conjuncts, block, block->columns()));
-            // clear TMPE columns to avoid column align problem in mutable_block.add_rows bellow
-            block->erase_tmp_columns();
 
             // update read rows
             read_rows += block->rows();
@@ -452,12 +447,6 @@ Status VCollectIterator::_topn_next(Block* block) {
                << " sorted_row_pos.size()=" << sorted_row_pos.size()
                << " mutable_block.rows()=" << mutable_block.rows();
     *block = mutable_block.to_block();
-    // append a column to indicate scanner filter_block is already done
-    auto filtered_datatype = std::make_shared<DataTypeUInt8>();
-    auto filtered_column = filtered_datatype->create_column_const(
-            block->rows(), Field::create_field<TYPE_BOOLEAN>(1));
-    block->insert(
-            {filtered_column, filtered_datatype, BeConsts::BLOCK_TEMP_COLUMN_SCANNER_FILTERED});
 
     _topn_eof = true;
     return block->rows() > 0 ? Status::OK() : Status::Error<END_OF_FILE>("");
@@ -894,8 +883,6 @@ Status VCollectIterator::Level1Iterator::_normal_next(Block* block) {
     while (res.is<END_OF_FILE>() && !_children.empty()) {
         _cur_child = std::move(*(_children.begin()));
         _children.pop_front();
-        // clear TEMP columns to avoid column align problem
-        block->erase_tmp_columns();
         res = _cur_child->next(block);
     }
 
diff --git a/be/src/vec/olap/vertical_block_reader.cpp b/be/src/vec/olap/vertical_block_reader.cpp
index 5f6e376367d..24bdf4e87a6 100644
--- a/be/src/vec/olap/vertical_block_reader.cpp
+++ b/be/src/vec/olap/vertical_block_reader.cpp
@@ -515,7 +515,6 @@ Status VerticalBlockReader::_unique_key_next_block(Block* block, bool* eof) {
             RETURN_IF_ERROR(
                     Block::filter_block(block, target_columns.size(), target_columns.size()));
             _stats.rows_del_filtered += block_rows - block->rows();
-            DCHECK(block->try_get_by_name("__DORIS_COMPACTION_FILTER__") == nullptr);
             if (UNLIKELY(_reader_context.record_rowids)) {
                 DCHECK_EQ(_block_row_locations.size(), block->rows() + delete_count);
             }
diff --git a/be/src/vec/sink/vtablet_block_convertor.cpp b/be/src/vec/sink/vtablet_block_convertor.cpp
index 7c61024af76..e9bff30dba9 100644
--- a/be/src/vec/sink/vtablet_block_convertor.cpp
+++ b/be/src/vec/sink/vtablet_block_convertor.cpp
@@ -691,10 +691,6 @@ Status OlapTableBlockConvertor::_fill_auto_inc_cols(vectorized::Block* block, si
 
 Status OlapTableBlockConvertor::_partial_update_fill_auto_inc_cols(vectorized::Block* block,
                                                                    size_t rows) {
-    // avoid duplicate PARTIAL_UPDATE_AUTO_INC_COL
-    if (block->has(BeConsts::PARTIAL_UPDATE_AUTO_INC_COL)) {
-        return Status::OK();
-    }
     auto dst_column = vectorized::ColumnInt64::create();
     vectorized::ColumnInt64::Container& dst_values = dst_column->get_data();
     size_t null_value_count = rows;
diff --git a/be/test/vec/core/block_test.cpp b/be/test/vec/core/block_test.cpp
index 001ff8710ef..ba5e99c9d2b 100644
--- a/be/test/vec/core/block_test.cpp
+++ b/be/test/vec/core/block_test.cpp
@@ -1125,26 +1125,6 @@ TEST(BlockTest, ctor) {
     ASSERT_EQ(block.columns(), 2);
     ASSERT_EQ(block.get_by_position(0).type->get_primitive_type(), TYPE_INT);
     ASSERT_TRUE(block.get_by_position(1).type->is_nullable());
-
-    {
-        auto mutable_block =
-                vectorized::MutableBlock::create_unique(tbl->get_tuple_descs(), 10, false);
-        ASSERT_EQ(mutable_block->columns(), 2);
-        auto mutable_block2 = vectorized::MutableBlock::create_unique();
-        mutable_block->swap(*mutable_block2);
-        ASSERT_EQ(mutable_block->columns(), 0);
-        ASSERT_EQ(mutable_block2->columns(), 2);
-    }
-
-    {
-        auto mutable_block =
-                vectorized::MutableBlock::create_unique(tbl->get_tuple_descs(), 10, true);
-        ASSERT_EQ(mutable_block->columns(), 1);
-        auto mutable_block2 = vectorized::MutableBlock::create_unique();
-        mutable_block->swap(*mutable_block2);
-        ASSERT_EQ(mutable_block->columns(), 0);
-        ASSERT_EQ(mutable_block2->columns(), 1);
-    }
 }
 
 TEST(BlockTest, insert_erase) {
@@ -1175,39 +1155,23 @@ TEST(BlockTest, insert_erase) {
     block.erase_tail(0);
     ASSERT_EQ(block.columns(), 0);
 
-    EXPECT_ANY_THROW(block.erase("column"));
     column_with_name =
             vectorized::ColumnHelper::create_column_with_name<vectorized::DataTypeString>({});
     block.insert(0, column_with_name);
-    EXPECT_NO_THROW(block.erase("column"));
-    ASSERT_EQ(block.columns(), 0);
-
-    EXPECT_ANY_THROW(block.safe_get_by_position(0));
+    ASSERT_EQ(block.columns(), 1);
 
-    ASSERT_EQ(block.try_get_by_name("column"), nullptr);
-    EXPECT_ANY_THROW(block.get_by_name("column"));
-    EXPECT_ANY_THROW(block.get_position_by_name("column"));
     block.insert(0, column_with_name);
 
     EXPECT_NO_THROW(auto item = block.get_by_name("column"));
-    ASSERT_NE(block.try_get_by_name("column"), nullptr);
     EXPECT_EQ(block.get_position_by_name("column"), 0);
 
-    block.insert({nullptr, nullptr, BeConsts::BLOCK_TEMP_COLUMN_PREFIX});
-    EXPECT_NO_THROW(auto item = block.get_by_name(BeConsts::BLOCK_TEMP_COLUMN_PREFIX));
-
-    block.erase_tmp_columns();
-    ASSERT_EQ(block.try_get_by_name(BeConsts::BLOCK_TEMP_COLUMN_PREFIX), nullptr);
-
     {
         // test const block
         const auto const_block = block;
-        EXPECT_EQ(const_block.try_get_by_name("column2"), nullptr);
         EXPECT_ANY_THROW(const_block.get_by_name("column2"));
-        EXPECT_ANY_THROW(const_block.get_position_by_name("column2"));
+        EXPECT_EQ(const_block.get_position_by_name("column2"), -1);
 
         EXPECT_NO_THROW(auto item = const_block.get_by_name("column"));
-        ASSERT_NE(const_block.try_get_by_name("column"), nullptr);
         EXPECT_EQ(const_block.get_position_by_name("column"), 0);
     }
 
@@ -1216,14 +1180,7 @@ TEST(BlockTest, insert_erase) {
 
     block.insert({nullptr, std::make_shared<vectorized::DataTypeString>(), "col2"});
 
-    vectorized::MutableBlock mutable_block(&block);
-    mutable_block.erase("col1");
-    ASSERT_EQ(mutable_block.columns(), 2);
-
-    EXPECT_ANY_THROW(mutable_block.erase("col1"));
-    ASSERT_EQ(mutable_block.columns(), 2);
-    mutable_block.erase("col2");
-    ASSERT_EQ(mutable_block.columns(), 1);
+    ASSERT_EQ(block.columns(), 3);
 }
 
 TEST(BlockTest, check_number_of_rows) {
@@ -1402,8 +1359,6 @@ TEST(BlockTest, others) {
 
     mutable_block.clear_column_data();
     ASSERT_EQ(mutable_block.get_column_by_position(0)->size(), 0);
-    ASSERT_TRUE(mutable_block.has("column"));
-    ASSERT_EQ(mutable_block.get_position_by_name("column"), 0);
 
     auto dumped_names = mutable_block.dump_names();
     ASSERT_TRUE(dumped_names.find("column") != std::string::npos);
diff --git a/be/test/vec/exec/format/parquet/parquet_read_lines.cpp b/be/test/vec/exec/format/parquet/parquet_read_lines.cpp
index 9eec232f75f..19416fe84e8 100644
--- a/be/test/vec/exec/format/parquet/parquet_read_lines.cpp
+++ b/be/test/vec/exec/format/parquet/parquet_read_lines.cpp
@@ -184,7 +184,7 @@ static void read_parquet_lines(std::vector<std::string> numeric_types,
         EXPECT_EQ(info.backend_id, BackendOptions::get_backend_id());
         EXPECT_EQ(info.version, IdManager::ID_VERSION);
     }
-    block->erase("row_id");
+    block->erase(block->get_position_by_name("row_id"));
 
     EXPECT_EQ(block->dump_data(), block_dump);
     std::cout << block->dump_data();
diff --git a/be/test/vec/exec/orc/orc_read_lines.cpp b/be/test/vec/exec/orc/orc_read_lines.cpp
index 7bdd529c7bd..f81fdc076bc 100644
--- a/be/test/vec/exec/orc/orc_read_lines.cpp
+++ b/be/test/vec/exec/orc/orc_read_lines.cpp
@@ -167,7 +167,7 @@ static void read_orc_line(int64_t line, std::string block_dump) {
         EXPECT_EQ(info.backend_id, BackendOptions::get_backend_id());
         EXPECT_EQ(info.version, IdManager::ID_VERSION);
     }
-    block->erase("row_id");
+    block->erase(block->get_position_by_name("row_id"));
 
     std::cout << block->dump_data();
     EXPECT_EQ(block->dump_data(), block_dump);
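
The updated tests above expect `get_position_by_name` to return -1 for a missing column, while the parquet and orc tests pass its result straight to `erase`. The following is a minimal sketch of how a caller might guard that pattern when the column may be absent. `NamedColumn`, `MiniBlock`, and `erase_if_present` are hypothetical names used only for illustration; this is a simplified model of linear-search lookup, not the actual Doris `Block` implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a named column; not Doris's ColumnWithTypeAndName.
struct NamedColumn {
    std::string name;
    std::vector<int> values;
};

// Hypothetical, simplified model of a block: columns live in a vector and
// there is no name-to-index map.
class MiniBlock {
public:
    void insert(NamedColumn column) { columns_.push_back(std::move(column)); }

    std::size_t columns() const { return columns_.size(); }

    // Linear search: returns the first matching position, or -1 if absent.
    int get_position_by_name(const std::string& name) const {
        for (std::size_t i = 0; i < columns_.size(); ++i) {
            if (columns_[i].name == name) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Erase by position; the caller must pass a valid index.
    void erase(std::size_t position) {
        columns_.erase(columns_.begin() + static_cast<std::ptrdiff_t>(position));
    }

private:
    std::vector<NamedColumn> columns_;
};

// Erase a column by name only if it exists; returns whether a column was erased.
bool erase_if_present(MiniBlock& block, const std::string& name) {
    const int position = block.get_position_by_name(name);
    if (position < 0) {
        return false;  // Not found: avoid passing -1 on to erase().
    }
    block.erase(static_cast<std::size_t>(position));
    return true;
}
```

In tests such as the ones above, where the column is known to exist, the direct `erase(get_position_by_name(...))` call is sufficient; a guard like this only matters when the name may be missing.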


