liurenjie1024 commented on issue #172:
URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2481267603

   Thanks to everyone for joining the discussion. I think we have reached 
some conclusions:
   1. We need to support different storages, such as S3 and Google Cloud Storage.
   2. We need to support different IO layers, such as opendal and `object_store`.
   
   One thing still undetermined is the relationship with the Java API. The current 
design tries to stay close to the Java API while remaining idiomatic for Rust users. That's 
why `FileIO`, `InputFile`, and `OutputFile` are structs rather than traits.
   
   While both @tustvold's and @alamb's suggestions are great, one of my concerns 
is breaking the current API. In fact, the current design leaves enough room for 
extensions. For example, if we want to use `object_store` for S3, we could 
extend the `Storage` enum, which is invisible to users. Currently `InputFile` and 
`OutputFile` each have a field holding `opendal`'s `Operator`, but these are private 
fields hidden from users. So I would suggest the following changes to extend 
`FileIO` to support the `object_store` crate:
   
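   1. Make `Storage` a trait rather than an enum, like the following:
   ```rust
   use std::sync::Arc;

   use async_trait::async_trait;

   // `Result`, `FileRead`, and `FileWrite` are the crate's existing types.
   #[async_trait]
   pub(crate) trait Storage {
       /// Opens a reader for the file at `path`.
       async fn create_reader(&self, path: &str) -> Result<Arc<dyn FileRead>>;
       /// Opens a writer for the file at `path`.
       async fn create_writer(&self, path: &str) -> Result<Arc<dyn FileWrite>>;
   }
   ```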
   
   2. Change `FileIOBuilder`'s behavior to take one extra parameter into 
account:
   ```
   s3.provider = (opendal) | (object_store)
   ```
   
   3. Add different implementations of the `Storage` trait, for example 
`OpenDALStorage` and `ObjectStoreStorage`.
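
To make the three steps above concrete, here is a rough sketch of how the provider property could select a `Storage` implementation. It is simplified to be synchronous and dependency-free, so the trait here is only a stand-in for the async one above. The `backend` method, the `build_storage` function, and the choice to default to opendal when the property is absent are all assumptions for illustration, not existing crate API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified, synchronous stand-in for the proposed `Storage` trait.
trait Storage {
    /// Hypothetical helper that reports which backend is in use.
    fn backend(&self) -> &'static str;
}

// Placeholder types: a real implementation would wrap an opendal
// `Operator` or an `object_store` client respectively.
struct OpenDalStorage;
struct ObjectStoreStorage;

impl Storage for OpenDalStorage {
    fn backend(&self) -> &'static str {
        "opendal"
    }
}

impl Storage for ObjectStoreStorage {
    fn backend(&self) -> &'static str {
        "object_store"
    }
}

/// Picks a `Storage` implementation from the `s3.provider` property.
/// Assumption: a missing property falls back to opendal, which keeps
/// the current behavior for existing users.
fn build_storage(props: &HashMap<String, String>) -> Result<Arc<dyn Storage>, String> {
    let storage: Arc<dyn Storage> = match props.get("s3.provider").map(String::as_str) {
        None | Some("opendal") => Arc::new(OpenDalStorage),
        Some("object_store") => Arc::new(ObjectStoreStorage),
        Some(other) => return Err(format!("unsupported s3.provider: {other}")),
    };
    Ok(storage)
}
```

Because callers only ever see `Arc<dyn Storage>`, the choice of backend stays an internal detail, which is the point of keeping `Storage` crate-private.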
   
   
   One gap in this design is that we don't allow users to 
provide an external `Storage` implementation that is not included in this crate. While I'm 
not sure whether there is a requirement for this, it would still be possible to allow 
users to inject something like a `StorageProvider` in `FileIOBuilder`.
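
As a rough idea of what such an injection point might look like, here is a minimal sketch. Everything in it is hypothetical: the `StorageProvider` trait shape, the `with_storage_provider` method, and the stand-in `FileIOBuilder` and `Storage` types are simplified, synchronous placeholders rather than the crate's real API.

```rust
use std::sync::Arc;

/// Simplified stand-in for the proposed `Storage` trait.
trait Storage {
    fn backend(&self) -> &'static str;
}

/// Hypothetical factory that users implement to supply their own storage.
trait StorageProvider {
    /// Returns a storage for `scheme`, or `None` to defer to built-ins.
    fn create(&self, scheme: &str) -> Option<Arc<dyn Storage>>;
}

/// Stand-in for a built-in storage shipped with the crate.
struct BuiltinMemoryStorage;

impl Storage for BuiltinMemoryStorage {
    fn backend(&self) -> &'static str {
        "memory"
    }
}

/// Stand-in builder; the real `FileIOBuilder` has a different shape.
#[derive(Default)]
struct FileIOBuilder {
    provider: Option<Arc<dyn StorageProvider>>,
}

impl FileIOBuilder {
    /// Hypothetical method: registers a user-supplied provider.
    fn with_storage_provider(mut self, provider: Arc<dyn StorageProvider>) -> Self {
        self.provider = Some(provider);
        self
    }

    /// Asks the injected provider first, then falls back to built-ins.
    fn build(self, scheme: &str) -> Result<Arc<dyn Storage>, String> {
        if let Some(provider) = &self.provider {
            if let Some(storage) = provider.create(scheme) {
                return Ok(storage);
            }
        }
        let storage: Arc<dyn Storage> = match scheme {
            "memory" => Arc::new(BuiltinMemoryStorage),
            other => return Err(format!("unsupported scheme: {other}")),
        };
        Ok(storage)
    }
}

/// Example of a storage and provider defined outside the crate.
struct CustomStorage;

impl Storage for CustomStorage {
    fn backend(&self) -> &'static str {
        "custom"
    }
}

struct CustomProvider;

impl StorageProvider for CustomProvider {
    fn create(&self, scheme: &str) -> Option<Arc<dyn Storage>> {
        if scheme == "custom" {
            let storage: Arc<dyn Storage> = Arc::new(CustomStorage);
            Some(storage)
        } else {
            None
        }
    }
}
```

Note that this would require making the `Storage` trait public, which is a trade-off against the crate-private design described above.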
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

