liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2483125592
> > point about the existing Datafusion machinery
>
> DataFusion provides an [ObjectStoreRegistry](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html) as part of the [SessionContext](https://docs.rs/datafusion/latest/datafusion/execution/context/struct.SessionContext.html). This is then what various abstractions like [ParquetExec](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/parquet/struct.ParquetExec.html) hook into.
>
> By integrating with this iceberg-rs would better interoperate with the rest of the DataFusion ecosystem, be they other catalogs like listing table, deltalake, Hive, etc... or unusual deployment scenarios with custom caching object stores, etc... It seems unfortunate for users to need to configure iceberg-rs separately from the rest of DataFusion. It would also benefit from the ongoing work to improve those components and systems.
>
> I don't know to what extent the desire is to make iceberg-rs a standalone library that mirrors the Java APIs and configuration, but I thought it worthwhile to at least make the case for closer integration with DataFusion. It seems like quite a lot of undifferentiated toil to rebuild the quite subtle logic around predicate pushdown, concurrent decode, etc...
>
> Edit: to ground this a bit more, the advantage of a trait based approach, is the DF bindings could provide a component wrapping SessionContext or similar, without forcing iceberg to take a dependency on DF or maintain this mapping

If we want to allow integrating with `ObjectStoreRegistry`, we would need one more trait, such as `StorageProvider`:

```rust
#[async_trait]
pub trait StorageProvider {
    async fn build(&self, configs: &HashMap<String, String>, url: &str) -> Arc<dyn Storage>;
}
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org