tustvold opened a new issue, #172: URL: https://github.com/apache/iceberg-rust/issues/172
I have debated filing this ticket for a while, but largely held off as I wasn't sure how well it would be received, especially as I am acutely aware that this crate currently makes use of OpenDAL and @Xuanwo is an active contributor to both repositories. However, I feel it is important to have these discussions, and part of my role as a maintainer of object_store is to engage with others in the community and hear about how its offering could be made more compelling.

That all being said, I think [object_store](https://crates.io/crates/object_store) provides some quite compelling functionality that might be of particular interest to this project:

* First-party integration with [arrow-rs](https://docs.rs/arrow-csv/latest/arrow_csv/reader/index.html#async-usage), [parquet](https://docs.rs/parquet/50.0.0/parquet/arrow/async_reader/struct.ParquetObjectReader.html), [DataFusion](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html) and [polars](https://docs.rs/polars-io/0.36.2/polars_io/cloud/fn.build_object_store.html), including sophisticated [vectored](https://docs.rs/object_store/latest/object_store/#vectored-read) and [streaming](https://docs.rs/object_store/latest/object_store/struct.GetResult.html#method.into_stream) IO
* Support for [conditional writes](https://docs.rs/object_store/latest/object_store/#conditional-put), which would allow iceberg-rs to support multiple concurrent writers directly against object storage, without needing an external catalog
* A flexible [configuration system](https://docs.rs/object_store/latest/object_store/#configuration-system) developed in partnership with, and used by, both the polars and delta-rs communities
* Extensive support for the various cloud provider credential sources, with [extension points](https://docs.rs/object_store/latest/object_store/trait.CredentialProvider.html) for users to further customise this
* APIs that mirror those of object stores rather than [filesystems](https://docs.rs/object_store/latest/object_store/#why-not-a-filesystem-interface), which makes it easier to understand what IO is being performed and how, and allows support for object-store-specific functionality like [tags](https://docs.rs/object_store/latest/object_store/struct.PutOptions.html#structfield.tags), [partial range](https://docs.rs/object_store/latest/object_store/enum.GetRange.html) requests, and [more](https://docs.rs/object_store/latest/object_store/#conditional-fetch)...
* Battle-tested in multiple production systems, and with a substantial and [growing](https://crates.io/crates/object_store/reverse_dependencies) user-base

The major area where object_store is limited, somewhat intentionally, is the number of first-party implementations: it only supports S3-compatible stores, Google Cloud Storage, Azure Blob Storage, in-memory and local filesystems. However, the object-safe design does allow for third-party implementations, such as [HDFS](https://github.com/datafusion-contrib/datafusion-objectstore-hdfs).

I look forward to hearing your thoughts, but also fully understand if this is not a discussion you would like to engage with at this time.
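
To make the conditional-writes point above more concrete, here is a minimal sketch of how a "put if absent" primitive could let several writers commit table metadata without an external catalog. It uses a toy in-memory store built only from the standard library; `ToyStore`, `put_if_absent`, `commit` and the `metadata/v{N}.json` layout are hypothetical names chosen for illustration and are not the object_store API or the Iceberg commit protocol.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy stand-in for an object store with a "put if absent" operation.
/// Hypothetical type for illustration only; not the object_store API.
#[derive(Default)]
struct ToyStore {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

#[derive(Debug, PartialEq)]
enum PutError {
    AlreadyExists,
}

impl ToyStore {
    /// Write `data` at `path` only if no object exists there yet.
    fn put_if_absent(&self, path: &str, data: Vec<u8>) -> Result<(), PutError> {
        let mut objects = self.objects.lock().unwrap();
        if objects.contains_key(path) {
            return Err(PutError::AlreadyExists);
        }
        objects.insert(path.to_string(), data);
        Ok(())
    }

    /// Read back the object at `path`, if any.
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.lock().unwrap().get(path).cloned()
    }
}

/// Try to commit at `version`, moving to the next version on conflict.
/// Returns the version that this writer actually committed.
fn commit(store: &ToyStore, mut version: u64, writer: &str) -> u64 {
    loop {
        let path = format!("metadata/v{}.json", version);
        match store.put_if_absent(&path, writer.as_bytes().to_vec()) {
            Ok(()) => return version,
            // A real writer would re-read the winning metadata and rebase
            // its changes before retrying; this sketch only bumps the version.
            Err(PutError::AlreadyExists) => version += 1,
        }
    }
}

fn main() {
    let store = Arc::new(ToyStore::default());

    // Two writers race to commit version 1; exactly one can win it.
    let handles: Vec<_> = vec!["writer-a", "writer-b"]
        .into_iter()
        .map(|name| {
            let store = Arc::clone(&store);
            thread::spawn(move || commit(&store, 1, name))
        })
        .collect();

    let mut versions: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    versions.sort();
    println!("committed versions: {:?}", versions);

    let winner = store.get("metadata/v1.json").unwrap();
    println!("v1 was written by {}", String::from_utf8_lossy(&winner));
}
```

Because the store itself rejects the second write to the same path, the losing writer learns about the conflict from the write result and never overwrites the winner. That arbitration is what an external catalog would otherwise provide.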