tustvold opened a new issue, #172:
URL: https://github.com/apache/iceberg-rust/issues/172

   I have debated filing this ticket for a while, but largely held off as I 
wasn't sure how well it would be received, especially as I am acutely aware 
that this crate currently makes use of OpenDAL and @Xuanwo is an active 
contributor to both repositories. However, I feel it is important to have these 
discussions, and part of my role as a maintainer of object_store is to engage 
with others in the community and hear about how its offering could be made more 
compelling.
   
   All that said, I think 
[object_store](https://crates.io/crates/object_store) provides functionality 
that might be of particular interest to this project:
   
   * First-party integration with 
[arrow-rs](https://docs.rs/arrow-csv/latest/arrow_csv/reader/index.html#async-usage), 
[parquet](https://docs.rs/parquet/50.0.0/parquet/arrow/async_reader/struct.ParquetObjectReader.html), 
[DataFusion](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreRegistry.html) and 
[polars](https://docs.rs/polars-io/0.36.2/polars_io/cloud/fn.build_object_store.html), 
including sophisticated 
[vectored](https://docs.rs/object_store/latest/object_store/#vectored-read) and 
[streaming](https://docs.rs/object_store/latest/object_store/struct.GetResult.html#method.into_stream) IO
   * Support for [conditional 
writes](https://docs.rs/object_store/latest/object_store/#conditional-put), 
which would allow iceberg-rs to support multiple concurrent writers directly 
against object storage, without needing an external catalog
   * A flexible [configuration 
system](https://docs.rs/object_store/latest/object_store/#configuration-system) 
developed in partnership with, and used by both the polars and delta-rs 
communities
   * Extensive support for the various cloud provider credential sources, with 
[extension points](https://docs.rs/object_store/latest/object_store/trait.CredentialProvider.html) 
for users to further customise this
   * APIs that mirror those of object stores rather than 
[filesystems](https://docs.rs/object_store/latest/object_store/#why-not-a-filesystem-interface), 
which makes it easier to understand what IO is being performed and how, and 
allows support for object-store-specific functionality like 
[tags](https://docs.rs/object_store/latest/object_store/struct.PutOptions.html#structfield.tags), 
[partial range](https://docs.rs/object_store/latest/object_store/enum.GetRange.html) 
requests, and 
[more](https://docs.rs/object_store/latest/object_store/#conditional-fetch)...
   * Battle-tested in multiple production systems, and with a substantial and 
[growing](https://crates.io/crates/object_store/reverse_dependencies) user-base
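
   To make the conditional-write point concrete, here is a minimal, 
std-only sketch of how a put-if-unchanged primitive lets several writers 
commit safely without an external coordinator. Everything in it (`MemStore`, 
`PutMode`, `PutError`, `commit`, the integer version token) is a hypothetical 
stand-in for illustration, not object_store's actual API; a real store would 
use something like an ETag as the version.

```rust
use std::collections::HashMap;

/// Hypothetical version token, standing in for an ETag.
type Version = u64;

#[derive(Debug, PartialEq)]
enum PutError {
    /// The object already exists (Create) or its version changed (Update).
    PreconditionFailed,
}

/// Hypothetical put modes: create-if-absent and update-if-version-matches.
enum PutMode {
    Create,
    Update(Version),
}

/// In-memory model of a store that supports conditional puts.
#[derive(Default)]
struct MemStore {
    objects: HashMap<String, (Version, Vec<u8>)>,
}

impl MemStore {
    /// Writes `data` only if the precondition in `mode` holds.
    fn put(&mut self, path: &str, data: Vec<u8>, mode: PutMode) -> Result<Version, PutError> {
        let current = self.objects.get(path).map(|(v, _)| *v);
        let next = match (mode, current) {
            (PutMode::Create, None) => 1,
            (PutMode::Update(expected), Some(actual)) if expected == actual => actual + 1,
            _ => return Err(PutError::PreconditionFailed),
        };
        self.objects.insert(path.to_string(), (next, data));
        Ok(next)
    }

    /// Returns the current version and contents, if the object exists.
    fn get(&self, path: &str) -> Option<(Version, &[u8])> {
        self.objects.get(path).map(|(v, d)| (*v, d.as_slice()))
    }
}

/// Optimistic commit: read the current state, append an entry, and write it
/// back conditionally, re-reading and retrying if another writer won the race.
fn commit(
    store: &mut MemStore,
    path: &str,
    entry: &str,
    max_retries: usize,
) -> Result<Version, PutError> {
    for _ in 0..=max_retries {
        let (mode, mut body) = match store.get(path) {
            Some((version, data)) => (PutMode::Update(version), data.to_vec()),
            None => (PutMode::Create, Vec::new()),
        };
        body.extend_from_slice(entry.as_bytes());
        body.push(b'\n');
        match store.put(path, body, mode) {
            Ok(version) => return Ok(version),
            Err(PutError::PreconditionFailed) => continue,
        }
    }
    Err(PutError::PreconditionFailed)
}
```

   If two writers both read version 1 and each attempt 
`PutMode::Update(1)`, only the first succeeds; the second receives 
`PreconditionFailed` and has to re-read before retrying, which is what 
prevents lost updates.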
   
   The main area in which object_store is limited, somewhat intentionally, is 
the number of first-party implementations: it only supports S3-compatible 
stores, Google Cloud Storage, Azure Blob Storage, in-memory storage and local 
filesystems. However, the object-safe design does allow for third-party 
implementations, such as 
[HDFS](https://github.com/datafusion-contrib/datafusion-objectstore-hdfs).
   
   I look forward to hearing your thoughts, but also fully understand if this 
is not a discussion you would like to engage with at this time.

