CTTY commented on code in PR #1755:
URL: https://github.com/apache/iceberg-rust/pull/1755#discussion_r2473332732


##########
crates/iceberg/src/io/storage_gcs.rs:
##########
@@ -104,3 +110,104 @@ pub(crate) fn gcs_config_build(cfg: &GcsConfig, path: &str) -> Result<Operator>
     cfg.bucket = bucket.to_string();
     Ok(Operator::from_config(cfg)?.finish())
 }
+
+/// GCS storage implementation using OpenDAL
+#[derive(Debug, Clone)]
+pub struct OpenDALGcsStorage {
+    config: Arc<GcsConfig>,
+}
+
+impl OpenDALGcsStorage {
+    /// Creates operator from path.
+    fn create_operator<'a>(&self, path: &'a str) -> Result<(Operator, &'a str)> {
+        let operator = gcs_config_build(&self.config, path)?;
+        let prefix = format!("gs://{}/", operator.info().name());
+
+        if path.starts_with(&prefix) {
+            let op = operator.layer(opendal::layers::RetryLayer::new());
+            Ok((op, &path[prefix.len()..]))
+        } else {
+            Err(Error::new(
+                ErrorKind::DataInvalid,
+                format!("Invalid gcs url: {}, should start with {}", path, prefix),
+            ))
+        }
+    }
+}
+
+#[async_trait]
+impl Storage for OpenDALGcsStorage {

Review Comment:
   I think this thread is relevant: 
https://docs.google.com/document/d/1-CEvRvb52vPTDLnzwJRBx5KLpej7oSlTu_rg0qKEGZ8/edit?disco=AAABrRO9Prk
   
   The benefit of doing this is that users can implement `Storage` for only 
the schemes they need. The annoyance of duplicated code across schemes mostly 
applies to a versatile storage implementation like OpenDAL, which already has 
a convenient operator layer. For custom storage, I don't expect users to 
implement every scheme anyway (I may be wrong about this assumption).
   
   As for code duplication, I consider the OpenDAL storage to be the "managed" 
default storage that lives in this repo, so we will have more control over its 
implementation. Once we have a separate crate for each storage implementation 
(iceberg-storage-opendal), we can add some helpers to reduce the code 
duplication.
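
   To illustrate the kind of helper I have in mind, here is a rough sketch, 
not the PR's actual code. The `Storage` trait below is a simplified, 
synchronous stand-in for the real async trait, and `StorageError`, 
`strip_scheme_prefix`, `GcsStorage`, and `S3Storage` are hypothetical names 
made up for this example. The idea is that each scheme keeps its own small 
`Storage` impl while the prefix check lives in one place:

```rust
/// Hypothetical, simplified stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
pub struct StorageError(pub String);

/// Hypothetical synchronous stand-in for the `Storage` trait
/// (the real trait in the PR is async and has more methods).
pub trait Storage {
    /// Returns the part of `path` after `<scheme>://<bucket>/`,
    /// or an error if `path` does not belong to this storage.
    fn relative_path<'a>(&self, path: &'a str) -> Result<&'a str, StorageError>;
}

/// Hypothetical shared helper: the prefix check that every per-scheme
/// implementation would otherwise repeat.
pub fn strip_scheme_prefix<'a>(
    scheme: &str,
    bucket: &str,
    path: &'a str,
) -> Result<&'a str, StorageError> {
    let prefix = format!("{}://{}/", scheme, bucket);
    path.strip_prefix(prefix.as_str()).ok_or_else(|| {
        StorageError(format!(
            "Invalid {} url: {}, should start with {}",
            scheme, path, prefix
        ))
    })
}

/// One storage type per scheme, each implementing `Storage` on its own.
pub struct GcsStorage {
    pub bucket: String,
}

pub struct S3Storage {
    pub bucket: String,
}

impl Storage for GcsStorage {
    fn relative_path<'a>(&self, path: &'a str) -> Result<&'a str, StorageError> {
        strip_scheme_prefix("gs", &self.bucket, path)
    }
}

impl Storage for S3Storage {
    fn relative_path<'a>(&self, path: &'a str) -> Result<&'a str, StorageError> {
        strip_scheme_prefix("s3", &self.bucket, path)
    }
}
```

   With this shape, a user who only cares about one scheme implements one 
small type, and the OpenDAL-backed storages in this repo share the helper 
instead of copying the prefix logic.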



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

