airborne12 commented on code in PR #63389:
URL: https://github.com/apache/doris/pull/63389#discussion_r3497252105


##########
be/src/storage/segment/segment.cpp:
##########
@@ -85,6 +89,63 @@ namespace doris::segment_v2 {
 
 class InvertedIndexIterator;
 
+namespace {
+
+Status build_segment_zonemap_context(Segment* segment, const Schema& schema,
+                                     const StorageReadOptions& read_options,
+                                     const VExprContextSPtrs& conjuncts, ZoneMapEvalContext* ctx) {
+    DORIS_CHECK(segment != nullptr);
+    DORIS_CHECK(ctx != nullptr);
+    std::set<int> slot_indexes;
+    for (const auto& conjunct : conjuncts) {
+        DORIS_CHECK(conjunct != nullptr);
+        const auto& root = conjunct->root();
+        DORIS_CHECK(root != nullptr);
+        if (!root->can_evaluate_zonemap_filter()) {
+            continue;
+        }
+        // Segment zone maps have one min/max/null summary per column for the whole segment, so a
+        // segment-level context can safely hold every slot referenced by a compound expression.
+        // Page zone maps are page-aligned per column and still use single-slot filtering in
+        // SegmentIterator.
+        root->collect_slot_column_ids(slot_indexes);
+    }
+    for (const int slot_index : slot_indexes) {
+        if (slot_index < 0 || cast_set<size_t>(slot_index) >= schema.num_column_ids()) {
+            continue;
+        }
+        const auto column_id = schema.column_id(cast_set<size_t>(slot_index));

Review Comment:
   Segment-level expression zonemap pruning runs here, before the pushed-down
expression is rebound to the reader schema. For non-direct AGG and normal
UNIQUE-MOR scans, the reader schema may be expanded to all key columns plus the
requested value columns, so the slot ordinal in the expression may not match
the ordinal in this schema. In that case, schema.column_id(slot_index) could
read the zonemap of a different column, and a kNoMatch result could incorrectly
skip the whole segment.
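
   To make the concern concrete, here is a minimal standalone sketch of the
ordinal mismatch. It does not use Doris code: ToyZoneMap, ToySchema,
column_id_at, segment_may_match and demo_ordinal_mismatch are all hypothetical
names, and the column ranges are made up. It only assumes that the expression
was bound against a narrower schema than the one used for the lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Doris types.
struct ToyZoneMap {
    int64_t min;
    int64_t max;
};

// column_ids[i] is the table column id stored at ordinal i of this schema.
struct ToySchema {
    std::vector<int> column_ids;
};

// Map an ordinal to a table column id, or nullopt if the ordinal is out of range.
std::optional<int> column_id_at(const ToySchema& schema, int ordinal) {
    if (ordinal < 0 || static_cast<size_t>(ordinal) >= schema.column_ids.size()) {
        return std::nullopt;
    }
    return schema.column_ids[static_cast<size_t>(ordinal)];
}

// Evaluate `col == value` against the zone map of the column that `ordinal`
// resolves to in `schema`. Returns false only when the segment can be skipped.
bool segment_may_match(const std::vector<ToyZoneMap>& zone_maps, const ToySchema& schema,
                       int ordinal, int64_t value) {
    const auto column_id = column_id_at(schema, ordinal);
    if (!column_id.has_value()) {
        return true; // Unknown slot: stay conservative and keep the segment.
    }
    assert(static_cast<size_t>(*column_id) < zone_maps.size());
    const ToyZoneMap& zm = zone_maps[static_cast<size_t>(*column_id)];
    return zm.min <= value && value <= zm.max;
}

// Table columns: id 0 = k1 with range [1, 10], id 1 = k2 with range [0, 100].
// The predicate `k2 == 50` was bound against a schema holding only k2 (ordinal 0).
// The expanded reader schema holds all keys, so ordinal 0 there is k1.
std::pair<bool, bool> demo_ordinal_mismatch() {
    const std::vector<ToyZoneMap> zone_maps = {{1, 10}, {0, 100}};
    const ToySchema bound_schema{std::vector<int>{1}};
    const ToySchema reader_schema{std::vector<int>{0, 1}};
    const int slot_ordinal = 0;

    // Lookup through the expanded schema reads k1's zone map and prunes wrongly.
    const bool via_reader = segment_may_match(zone_maps, reader_schema, slot_ordinal, 50);
    // Lookup through the schema the expression was bound to reads k2's zone map.
    const bool via_bound = segment_may_match(zone_maps, bound_schema, slot_ordinal, 50);
    return {via_reader, via_bound};
}
```

   In this toy setup the lookup through the expanded schema reports that the
segment can be skipped, while the lookup through the schema the expression was
bound to keeps it. Resolving slots after the rebind, or by a stable column
identifier rather than an ordinal, would avoid that kind of false kNoMatch.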



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

