tustvold commented on issue #172:
URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2482458320

   > point about the existing Datafusion machinery
   
   DataFusion provides an
[ObjectStoreRegistry](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html)
as part of the
[SessionContext](https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html).
Abstractions such as
[ParquetExec](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/parquet/struct.ParquetExec.html)
then hook into this registry to resolve their object stores.
   
   By integrating with this machinery, iceberg-rs would interoperate better
with the rest of the DataFusion ecosystem, whether that means other catalogs
such as listing tables, deltalake, and Hive, or unusual deployment scenarios
such as custom caching object stores. It seems unfortunate for users to have to
configure iceberg-rs separately from the rest of DataFusion. iceberg-rs would
also benefit from the ongoing work to improve those components and systems.
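
   To make the shared-configuration argument concrete, here is a toy model of
the pattern using only the standard library. It is not DataFusion's API:
`Store`, `CountingStore`, `Registry`, `read_listing_file`, `read_iceberg_file`
and the `s3://warehouse` URL are all hypothetical names chosen for
illustration. The point is only that a store registered once (for example a
caching wrapper) serves every reader that resolves URLs through the same
registry.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an object store (not the `object_store` trait).
trait Store: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory store that counts reads, so the sharing is observable.
struct CountingStore {
    objects: HashMap<String, Vec<u8>>,
    reads: AtomicUsize,
}

impl Store for CountingStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.reads.fetch_add(1, Ordering::SeqCst);
        self.objects.get(path).cloned()
    }
}

/// Toy registry keyed by "scheme://authority".
#[derive(Default)]
struct Registry {
    stores: Mutex<HashMap<String, Arc<dyn Store>>>,
}

impl Registry {
    fn register(&self, base: &str, store: Arc<dyn Store>) {
        self.stores.lock().unwrap().insert(base.to_string(), store);
    }

    /// Split "s3://bucket/key" into the registered base and the object path.
    fn resolve(&self, url: &str) -> Option<(Arc<dyn Store>, String)> {
        let scheme_end = url.find("://")? + 3;
        let path_start = url[scheme_end..]
            .find('/')
            .map(|i| scheme_end + i)
            .unwrap_or(url.len());
        let base = &url[..path_start];
        let path = url[path_start..].trim_start_matches('/');
        let store = self.stores.lock().unwrap().get(base)?.clone();
        Some((store, path.to_string()))
    }
}

/// Two hypothetical consumers; neither carries its own storage configuration.
fn read_listing_file(reg: &Registry, url: &str) -> Option<Vec<u8>> {
    let (store, path) = reg.resolve(url)?;
    store.get(&path)
}

fn read_iceberg_file(reg: &Registry, url: &str) -> Option<Vec<u8>> {
    let (store, path) = reg.resolve(url)?;
    store.get(&path)
}

/// Registers one store, reads through both consumers, returns the read count.
fn demo() -> usize {
    let mut objects = HashMap::new();
    objects.insert("listing/part-0.parquet".to_string(), b"listing".to_vec());
    objects.insert("iceberg/data/part-0.parquet".to_string(), b"iceberg".to_vec());
    let store = Arc::new(CountingStore { objects, reads: AtomicUsize::new(0) });

    let registry = Registry::default();
    registry.register("s3://warehouse", store.clone());

    let a = read_listing_file(&registry, "s3://warehouse/listing/part-0.parquet");
    let b = read_iceberg_file(&registry, "s3://warehouse/iceberg/data/part-0.parquet");
    assert!(a.is_some() && b.is_some());

    store.reads.load(Ordering::SeqCst)
}
```

   Both readers hit the same `CountingStore`, so `demo()` observes two reads
on a single store instance. In a real integration the registry would be the
one DataFusion already exposes, and iceberg-rs would be one more consumer of
it.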
   
   I don't know to what extent the goal is for iceberg-rs to be a standalone
library that mirrors the Java APIs and configuration, but I thought it
worthwhile to at least make the case for closer integration with DataFusion.
Rebuilding the subtle logic around predicate pushdown, concurrent decode, and
so on seems like a lot of undifferentiated toil.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
