kevinjqliu commented on issue #1637:
URL: 
https://github.com/apache/iceberg-python/issues/1637#issuecomment-2652677058

   hey @iyad-f sure thing. 
   
   Iceberg has the concept of sort order 
https://iceberg.apache.org/spec/#sorting
   An Iceberg table can declare that its data is sorted in a certain way so 
that the engine can read the data more efficiently. 
   Write support for sort order is in #271
   In this issue, I want to explore read support. Given an Iceberg table that 
is sorted, can we efficiently leverage the sort order when reading? 
   
   I think there are two components to this. 
   1. pruning manifests. use the table's sort order to efficiently skip 
manifests based on their min/max values (I think this should be part of 
_InclusiveMetricsEvaluator above)
   2. pruning data. push down the sort order to the [data file 
scan](https://github.com/apache/iceberg-python/blob/86b83e85754b32d864cde764364c2022a0bab92b/pyiceberg/io/pyarrow.py#L1379)
 (we should investigate whether this is supported in pyarrow)
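
   To make the min/max idea in (1) concrete, here is a rough sketch in plain 
Python. It is not pyiceberg code: `FileStats` and `prune_by_bounds` are 
hypothetical names made up for illustration, and the bounds are assumed to be 
plain integers for a single sort column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical stand-in for a file entry and its column bounds."""
    path: str
    lower: int  # min value of the sort column in this file
    upper: int  # max value of the sort column in this file


def prune_by_bounds(files, lo, hi):
    """Keep only files whose [lower, upper] range can overlap [lo, hi]."""
    return [f for f in files if f.upper >= lo and f.lower <= hi]


files = [
    FileStats("a.parquet", 0, 99),
    FileStats("b.parquet", 100, 199),
    FileStats("c.parquet", 200, 299),
]

# Filter: 120 <= id <= 150. Only b.parquet can contain matching rows.
selected = [f.path for f in prune_by_bounds(files, 120, 150)]
print(selected)
```

   When the data is written in sorted order, the per-file ranges tend to be 
narrow and mostly non-overlapping, so a check like this can skip most files 
for a selective filter.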
   
   Let me know if that's clear. Happy to chat more 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

