github-actions[bot] commented on code in PR #59263:
URL: https://github.com/apache/doris/pull/59263#discussion_r3487344195


##########
regression-test/suites/datatype_p0/complex_types/test_pruned_columns.groovy:
##########
@@ -98,10 +213,87 @@ suite("test_pruned_columns") {
     """
 
     qt_sql6 """
-        select count(element_at(dynamic_attributes['theme_preference'], 
'confidence_score')) from `tbl_test_pruned_columns_map`;
+        select
+            id
+            , element_at(element_at(s, 'data')[2][3], 'b')
+        from `tbl_test_pruned_columns`
+        where element_at(s, 'city') = 'chengdu'
+        order by 1, 2 limit 0, 20;
+    """
+
+    sql "clean all profile"

Review Comment:
   This global profile cleanup can make the new assertion interfere with other 
profile-based regression suites. `clean all profile` routes to 
`ProfileManager.cleanProfile()`, which clears the shared FE 
`queryIdToProfileMap` and `queryIdToExecutionProfiles`; this suite is declared 
only as `suite("test_pruned_columns")`, so it is not isolated from concurrent 
suites that may be polling `/rest/v1/query_profile` for their own UUID-tagged 
query. Since the code below already searches by `lazyPrunedToken`, please drop 
the global cleanup (or mark the suite `nonConcurrent` if clearing all profiles 
is required) so this new test does not cause unrelated profile assertions to 
flake.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java:
##########
@@ -304,7 +302,7 @@ private static Map<Integer, AccessPathInfo> pruneDataType(
                     buildColumnAccessPaths(slot, predicateAccessPaths);
             AccessPathInfo accessPathInfo = 
result.get(slot.getExprId().asInt());
             if (accessPathInfo != null) {
-                retainPredicatePathsInFinalAllAccessPaths(
+                addPredicatePathsToFinalAllAccessPaths(

Review Comment:
   This changes the FE/BE access-path contract without a mixed-version guard. 
The new BE reader code knows how to handle `allAccessPaths` that contain both 
data paths and predicate metadata paths, but an old BE does not: its 
`_check_and_set_meta_read_mode()` switches to `OFFSET_ONLY`/`NULL_MAP_ONLY` as 
soon as it sees any current-level `OFFSET` or `NULL` subpath, and then skips 
the child iterators. During a rolling upgrade, a new FE can therefore send 
paths such as `[s.NULL, s.city]` (or map/array data paths plus parent metadata) 
to an old BE, which will default-fill or skip the projected child data. Please 
either keep the old stripped path shape until all selected BEs advertise 
support for mixed metadata/data access paths, or gate this new superset form on 
a BE capability/version.



##########
be/src/storage/segment/column_reader.cpp:
##########
@@ -2255,6 +2633,14 @@ Status FileColumnIterator::next_batch(size_t* n, 
MutableColumnPtr& dst, bool* ha
 
 Status FileColumnIterator::read_by_rowids(const rowid_t* rowids, const size_t 
count,
                                           MutableColumnPtr& dst) {
+    if (!need_to_read()) {
+        DLOG(INFO) << "File column iterator column " << _column_name << " skip 
reading.";
+        _convert_to_place_holder_column(dst, count);
+        return Status::OK();
+    }
+
+    _recovery_from_place_holder_column(dst);
+
     if (read_null_map_only()) {

Review Comment:
   This `NULL_MAP_ONLY` rowid path can return the wrong null map for sparse 
rowid batches. In `NULL_MAP_ONLY` mode, 
`seek_to_ordinal(rowids[total_read_count])` positions on the first requested 
row's page, but when that page has no null bitmap the branch below fills 
`min(remaining, _page.remaining())` zeros without checking whether the next 
requested rowids are still covered by the current page. 
`SegmentIterator::_read_columns_by_index()` calls `read_by_rowids()` for 
non-contiguous batches, so a batch like `[row in a no-null page, row in a later 
page that does contain a null]` is consumed entirely there and the later null 
row is reported as non-null. Please mirror the `_page.has_null` / normal data 
path behavior by only advancing over requested rowids that belong to the 
current page before filling zeros.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to