aihuaxu commented on PR #12658:
URL: https://github.com/apache/iceberg/pull/12658#issuecomment-2808132998

   > > @aihuaxu and @rdblue is there a reason we need to explicitly restrict
   > > the lower/upper bounds to shredded fields? I would think that the stats
   > > pruning would be useful for any field that a writer would want to include
   > > in the bound (regardless of whether it was shredded or not).
   > 
   > What we were thinking is that the bounds are collected from shredded
   > column stats during the shredding process. But it does seem reasonable to
   > me that bounds and shredding can be separated: if a writer knows the
   > bounds and chooses not to shred, the bounds can still be used in pruning.
   
   @danielcweeks We reconsidered this approach. It would cause a stats
mismatch between Iceberg manifest files and the Parquet footer, i.e., the
Parquet footer may not have the stats while the Iceberg manifest files do. Do
you see common use cases where the fields are not shredded but the writer still
knows the stats? I would prefer to keep the stats in sync between manifest files
and the Parquet footer, the same as for other columns, and keep it simpler.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

