alamb commented on code in PR #21907:
URL: https://github.com/apache/datafusion/pull/21907#discussion_r3170158713
##########
datafusion/datasource-parquet/src/row_group_filter.rs:
##########
@@ -272,6 +274,7 @@ impl RowGroupAccessPlanFilter {
parquet_schema,
row_group_metadatas,
arrow_schema,
+ missing_null_counts_as_zero: true,
Review Comment:
> It also threads with_missing_null_counts_as_zero through
> RowGroupPruningStatistics so normal row-group pruning keeps the existing
> default behavior, while fully matched proofs treat missing null counts as
> unknown. This reuses the existing statistics conversion path instead of adding
> a separate null-count conversion pass.

I think this is a setting on the reader that controls how missing statistics
are interpreted (older versions of arrow-rs didn't write null counts when
there were 0 nulls).

I am not sure why this code path is changing its value.
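
For context, here is a minimal sketch of the two interpretations under discussion. It assumes a simplified model in which a null count is just an `Option<u64>`; the helper names `interpret_null_count` and `proves_no_nulls` are hypothetical and are not DataFusion or arrow-rs APIs.

```rust
/// Hypothetical helper: interpret a null count read from Parquet statistics.
/// `raw` is `None` when the writer did not record a null count at all.
fn interpret_null_count(raw: Option<u64>, missing_as_zero: bool) -> Option<u64> {
    match (raw, missing_as_zero) {
        // A recorded count is always used as-is.
        (Some(n), _) => Some(n),
        // Legacy-friendly reading: a missing count is taken to mean "0 nulls".
        (None, true) => Some(0),
        // Conservative reading: a missing count stays unknown.
        (None, false) => None,
    }
}

/// Hypothetical "fully matched" check for a predicate such as `col IS NOT NULL`:
/// every row matches only if the column provably contains no nulls.
fn proves_no_nulls(null_count: Option<u64>) -> bool {
    null_count == Some(0)
}

fn main() {
    let missing: Option<u64> = None;

    // With missing-as-zero, an absent statistic is enough to "prove" no nulls.
    assert!(proves_no_nulls(interpret_null_count(missing, true)));

    // With missing-as-unknown, the same statistic proves nothing.
    assert!(!proves_no_nulls(interpret_null_count(missing, false)));

    // A recorded count is unaffected by the flag.
    assert_eq!(interpret_null_count(Some(3), true), Some(3));
    assert_eq!(interpret_null_count(Some(3), false), Some(3));
}
```

In this model the flag only matters when the statistic is absent, which is why the choice of value at each call site deserves an explicit justification.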
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]