kaka11chen opened a new issue, #16023:
URL: https://github.com/apache/doris/issues/16023

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Description
   
   Currently, after reading the delete positions from Iceberg position delete
   files, the Parquet reader filters them by splitting the rows to read into
   separate row ranges around the deleted positions. When there are many delete
   positions, the number of row ranges grows with them, so the number of virtual
   function calls needed to read the decoded columns increases and
   `column_data.resize()` is called too many times.
   
   parquet_common.h
   ```
   template <typename Numeric>
   Status FixLengthDecoder::_decode_numeric(MutableColumnPtr& doris_column,
                                            ColumnSelectVector& select_vector) {
       auto& column_data = static_cast<ColumnVector<Numeric>&>(*doris_column).get_data();
       size_t data_index = column_data.size();
       column_data.resize(data_index + select_vector.num_values() - select_vector.num_filtered());
       ...
   ```
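
   To illustrate why the number of decode calls grows, here is a minimal
   sketch of splitting a batch of rows into surviving row ranges. The helper
   `split_into_row_ranges` is hypothetical and is not taken from the Doris code
   base; it assumes the delete positions are sorted, unique, and inside the
   batch.

   ```cpp
   #include <cassert>
   #include <cstdint>
   #include <utility>
   #include <vector>

   // Hypothetical helper, not Doris code: given the sorted delete positions
   // that fall inside [first_row, last_row), return the row ranges that
   // survive. Each range is half-open: [begin, end).
   std::vector<std::pair<int64_t, int64_t>> split_into_row_ranges(
           int64_t first_row, int64_t last_row,
           const std::vector<int64_t>& delete_positions) {
       std::vector<std::pair<int64_t, int64_t>> ranges;
       int64_t begin = first_row;
       for (int64_t pos : delete_positions) {
           // Assumption: positions are sorted, unique, and in bounds.
           assert(pos >= begin && pos < last_row);
           if (pos > begin) {
               ranges.emplace_back(begin, pos);
           }
           begin = pos + 1; // skip the deleted row
       }
       if (begin < last_row) {
           ranges.emplace_back(begin, last_row);
       }
       return ranges;
   }
   ```

   In the worst case every other row is deleted, which produces roughly one
   range per delete position. If each range triggers its own decode call, each
   range also pays for a virtual call and a `resize()`.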
   
   ### Solution
   
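   Merge the delete position filter with the condition filter, so that deleted
   rows are filtered out together with the rows rejected by the conditions
   instead of through separate row ranges.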
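
   A rough sketch of the idea, under stated assumptions: the condition filter
   is modeled as a per-row byte mask (1 = keep, 0 = drop), and the helpers
   `merge_delete_positions` and `apply_filter` are hypothetical names used only
   for illustration. The actual change in Doris may be structured differently.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <vector>

   // Hypothetical helper, not Doris code: clear the mask entry of every
   // deleted row. `filter` already holds the condition-filter result for the
   // rows [first_row, first_row + filter.size()): 1 = keep, 0 = drop.
   void merge_delete_positions(std::vector<uint8_t>& filter, int64_t first_row,
                               const std::vector<int64_t>& delete_positions) {
       const int64_t last_row = first_row + static_cast<int64_t>(filter.size());
       for (int64_t pos : delete_positions) {
           // Positions outside this batch are ignored.
           if (pos >= first_row && pos < last_row) {
               filter[static_cast<size_t>(pos - first_row)] = 0;
           }
       }
   }

   // Apply the merged mask in one pass, reserving the output only once.
   std::vector<int32_t> apply_filter(const std::vector<int32_t>& values,
                                     const std::vector<uint8_t>& filter) {
       assert(values.size() == filter.size());
       std::vector<int32_t> out;
       out.reserve(values.size());
       for (size_t i = 0; i < values.size(); ++i) {
           if (filter[i]) {
               out.push_back(values[i]);
           }
       }
       return out;
   }
   ```

   With a single merged mask, the batch is decoded and filtered once, so the
   cost no longer depends on how many row ranges the delete positions would
   otherwise create.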
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

