tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2482458320
> point about the existing Datafusion machinery

DataFusion provides an [ObjectStoreRegistry](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html) as part of the [SessionContext](https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html). This is what abstractions like [ParquetExec](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/parquet/struct.ParquetExec.html) hook into. By integrating with this, iceberg-rs would interoperate better with the rest of the DataFusion ecosystem, be that other catalogs like listing table, deltalake, Hive, etc., or unusual deployment scenarios with custom caching object stores. It seems unfortunate for users to need to configure iceberg-rs separately from the rest of DataFusion. It would also benefit from the ongoing work to improve those components and systems.

I don't know to what extent the desire is to make iceberg-rs a standalone library that mirrors the Java APIs and configuration, but I thought it worthwhile to at least make the case for closer integration with DataFusion. It seems like quite a lot of undifferentiated toil to rebuild the quite subtle logic around predicate pushdown, concurrent decode, etc.
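To illustrate the shared-registry idea, here is a minimal std-only sketch. It is a toy model, not DataFusion's API: `Store`, `NamedStore`, `Registry`, `store_key`, and `resolve_name` are hypothetical names, and keying stores by scheme plus authority is an assumption made for illustration. The point is that a store configured once is visible to every reader that resolves URLs through the same registry.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for an object store (not the `object_store` crate's trait).
trait Store: Send + Sync {
    fn name(&self) -> String;
}

/// A store identified only by a label, e.g. a caching wrapper around S3.
struct NamedStore(&'static str);

impl Store for NamedStore {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

/// Toy registry keyed by "scheme://authority", shared by every reader.
#[derive(Default)]
struct Registry {
    stores: RwLock<HashMap<String, Arc<dyn Store>>>,
}

impl Registry {
    /// Registers a store for a prefix, returning any store it replaced.
    fn register(&self, prefix: &str, store: Arc<dyn Store>) -> Option<Arc<dyn Store>> {
        self.stores.write().unwrap().insert(prefix.to_string(), store)
    }

    /// Looks up the store responsible for a full object URL.
    fn resolve(&self, url: &str) -> Option<Arc<dyn Store>> {
        let key = store_key(url)?;
        self.stores.read().unwrap().get(key).cloned()
    }
}

/// Extracts "scheme://authority" from a URL such as "s3://bucket/path/file".
fn store_key(url: &str) -> Option<&str> {
    let scheme_end = url.find("://")? + 3;
    let authority_len = url[scheme_end..]
        .find('/')
        .unwrap_or(url.len() - scheme_end);
    Some(&url[..scheme_end + authority_len])
}

/// Any reader (Iceberg-style, listing-table-style, ...) resolves the same way.
fn resolve_name(registry: &Registry, url: &str) -> Option<String> {
    registry.resolve(url).map(|s| s.name())
}

fn main() {
    let registry = Registry::default();
    // Configure the custom store once.
    registry.register("s3://warehouse", Arc::new(NamedStore("caching-s3")));

    // Two different table formats reading from the same bucket share that configuration.
    println!("{:?}", resolve_name(&registry, "s3://warehouse/iceberg/t/data/a.parquet"));
    println!("{:?}", resolve_name(&registry, "s3://warehouse/listing/b.parquet"));
    // A bucket nobody registered resolves to nothing.
    println!("{:?}", resolve_name(&registry, "s3://unknown/c.parquet"));
}
```

In this model, a library that keeps its own separate IO configuration would correspond to a second `Registry` that users have to populate again, which is the duplication the comment argues against.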