alamb commented on code in PR #21907:
URL: https://github.com/apache/datafusion/pull/21907#discussion_r3170158713


##########
datafusion/datasource-parquet/src/row_group_filter.rs:
##########
@@ -272,6 +274,7 @@ impl RowGroupAccessPlanFilter {
             parquet_schema,
             row_group_metadatas,
             arrow_schema,
+            missing_null_counts_as_zero: true,

Review Comment:
   > It also threads with_missing_null_counts_as_zero through 
RowGroupPruningStatistics so normal row-group pruning keeps the existing 
default behavior, while fully matched proofs treat missing null counts as 
unknown. This reuses the existing statistics conversion path instead of adding 
a separate null-count conversion pass.
   
   I think this is a setting on the reader that controls how missing statistics 
are interpreted (older versions of arrow-rs didn't write null counts when 
there were 0 nulls).
   
   I am not sure why this code path is changing its value.
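
   To make the two interpretations concrete, here is a minimal, self-contained 
sketch. It is not the DataFusion or arrow-rs code: `ColumnChunkStats`, 
`resolve_null_counts` and `proven_no_nulls` are hypothetical names used only 
for illustration, and only the flag name `missing_null_counts_as_zero` comes 
from the diff above.

```rust
/// Hypothetical stand-in for one column's statistics in one row group.
/// `null_count` is `None` when the writer did not record a null count.
#[derive(Debug, Clone, Copy)]
pub struct ColumnChunkStats {
    pub null_count: Option<u64>,
}

/// Resolve null counts under the two interpretations discussed above.
///
/// * `true`: a missing count is read as `Some(0)`, which matches files from
///   writers that omitted the count when there were no nulls.
/// * `false`: a missing count stays `None`, i.e. unknown.
pub fn resolve_null_counts(
    stats: &[ColumnChunkStats],
    missing_null_counts_as_zero: bool,
) -> Vec<Option<u64>> {
    stats
        .iter()
        .map(|s| match s.null_count {
            // A recorded count is always used as-is.
            Some(n) => Some(n),
            // Missing count: the flag decides between "zero" and "unknown".
            None if missing_null_counts_as_zero => Some(0),
            None => None,
        })
        .collect()
}

/// A row group can be treated as null-free only when the resolved count is
/// known to be exactly zero; an unknown count proves nothing.
pub fn proven_no_nulls(null_count: Option<u64>) -> bool {
    null_count == Some(0)
}
```

   With the flag set to `true`, a row group whose statistics lack a null count 
is indistinguishable from one that recorded zero nulls, so the choice of 
value matters for any code path that relies on the absence of nulls.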



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

