corleyma commented on PR #1614:
URL: https://github.com/apache/iceberg-python/pull/1614#issuecomment-2641053258

   > Polars 'scan_iceberg' uses PyIceberg to create the LazyFrame:
   > 
   > https://github.com/pola-rs/polars/blob/9359ed576d972dce257346fcd62c8857f3d23277/py-polars/polars/io/iceberg.py#L139
   > The filtering can be done in PyIceberg, so aren't the 2 approaches similar?
   
   The difference is that the approach as documented encourages folks to write 
their own filter predicates for pyiceberg before materializing a dataframe with 
polars, whereas the "polars way" (as a lazy dataframe API) is to just create 
the LazyFrame, construct your compute graph with whatever polars predicates 
etc. make sense for you, and rely on polars to push that down at `.collect()` 
time, so that data is filtered before load where possible.
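
To make the contrast concrete, here is a rough sketch of the two styles. The column names `id` and `name`, the `min_id` parameter, and both function names are hypothetical, and `table` is assumed to be an already-loaded PyIceberg `Table`. How much of the polars query actually gets pushed down depends on the polars and pyiceberg versions in use, so treat this as an illustration rather than a statement of what either library guarantees.

```python
def filter_in_pyiceberg(table, min_id: int):
    """Documented style: write the predicate for PyIceberg, then materialize."""
    # Imports are deferred so the sketch can be loaded without either library.
    import polars as pl
    from pyiceberg.expressions import GreaterThanOrEqual

    # The filter and projection are expressed in PyIceberg's own expression API.
    arrow_table = table.scan(
        row_filter=GreaterThanOrEqual("id", min_id),  # "id" is a hypothetical column
        selected_fields=("id", "name"),               # hypothetical columns
    ).to_arrow()
    return pl.from_arrow(arrow_table)


def filter_in_polars(table, min_id: int):
    """Lazy style: build the query in polars and let it push down on collect()."""
    import polars as pl

    return (
        pl.scan_iceberg(table)            # LazyFrame, nothing is read yet
        .filter(pl.col("id") >= min_id)   # ordinary polars predicate
        .select("id", "name")             # ordinary polars projection
        .collect()                        # pushdown, where supported, happens here
    )
```

In the second function the user never touches a PyIceberg expression; the predicate and projection are plain polars, which is the point being made above.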


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

