blackmwk commented on code in PR #2216:
URL: https://github.com/apache/iceberg-rust/pull/2216#discussion_r2902733920
##########
crates/storage/opendal/tests/file_io_s3_test.rs:
##########
@@ -203,4 +204,46 @@ mod tests {
}
}
}
+
+ #[tokio::test]
+ async fn test_file_io_s3_delete_stream() {
Review Comment:
Why is this test added only for S3?
##########
crates/storage/opendal/src/lib.rs:
##########
@@ -400,6 +497,33 @@ impl Storage for OpenDalStorage {
Ok(op.remove_all(&path).await.map_err(from_opendal_error)?)
}
+ async fn delete_stream(&self, mut paths: BoxStream<'static, String>) -> Result<()> {
+ // Get the first path to create the operator
+ let Some(first_path) = paths.next().await else {
+ return Ok(());
+ };
+
+ let (op, first_relative) = self.create_operator(&first_path)?;
+
+ let mut deleter = op.deleter().await.map_err(from_opendal_error)?;
+ deleter
+ .delete(first_relative)
+ .await
+ .map_err(from_opendal_error)?;
+
+ // Use relativize_path for remaining paths to avoid rebuilding the operator each time.
+ while let Some(path) = paths.next().await {
+ let relative_path = self.relativize_path(&path)?;
Review Comment:
I see a problem with this approach: the operator is built from the first
path only. What if we are deleting paths in different buckets, like the
following?
```
s3://bucket1/a.txt
s3://bucketb/b.txt
```
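To illustrate the concern, here is a minimal sketch of one possible way to
handle mixed buckets: group the paths by their `scheme://bucket` root so that
each root can get its own operator and deleter. The helper names
`split_bucket` and `group_by_bucket` are hypothetical and not part of the PR,
and the sketch uses only the standard library rather than the real
`create_operator` / `relativize_path` functions.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: split an absolute path such as `s3://bucket/dir/file`
/// into (`scheme://bucket`, relative key). Returns `None` for malformed input.
fn split_bucket(path: &str) -> Option<(String, String)> {
    let (scheme, rest) = path.split_once("://")?;
    let (bucket, key) = rest.split_once('/')?;
    if scheme.is_empty() || bucket.is_empty() || key.is_empty() {
        return None;
    }
    Some((format!("{scheme}://{bucket}"), key.to_string()))
}

/// Hypothetical helper: group paths by storage root, so that a caller could
/// build one operator per root instead of reusing the first path's operator.
fn group_by_bucket<I>(paths: I) -> Result<BTreeMap<String, Vec<String>>, String>
where
    I: IntoIterator<Item = String>,
{
    let mut groups: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for path in paths {
        let (root, key) =
            split_bucket(&path).ok_or_else(|| format!("invalid path: {path}"))?;
        groups.entry(root).or_default().push(key);
    }
    Ok(groups)
}
```

With the two paths above, this yields two groups, one per bucket. A streaming
implementation would probably not collect everything up front; it could
instead keep a map from root to deleter and look it up per path.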
##########
crates/iceberg/src/io/file_io.rs:
##########
@@ -146,6 +147,15 @@ impl FileIO {
self.get_storage()?.delete_prefix(path.as_ref()).await
}
+ /// Delete multiple files from a stream of paths.
+ ///
+ /// # Arguments
+ ///
+ /// * paths: A stream of absolute paths starting with the scheme string used to construct [`FileIO`].
+ pub async fn delete_stream(&self, paths: BoxStream<'static, String>) -> Result<()> {
Review Comment:
nit: We could make this API more general, as follows:
```rust
fn delete_stream(&self, paths: impl Stream<Item=String>)
```
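A rough sketch of the shape this could take: the public method is generic and
does the boxing itself, while the object-safe storage trait keeps its boxed
parameter. To stay within the standard library, the sketch swaps `Stream` for
a synchronous `Iterator`; the `Storage`, `CountingStorage`, and `FileIo` types
here are hypothetical stand-ins, not the real iceberg-rust types.

```rust
/// Hypothetical stand-in for the storage trait. It takes a boxed iterator,
/// mirroring how the real trait takes a boxed stream.
trait Storage {
    /// Returns the number of paths processed, just to make the sketch testable.
    fn delete_all(&self, paths: Box<dyn Iterator<Item = String>>) -> usize;
}

/// Hypothetical backend that only counts the paths it receives.
struct CountingStorage;

impl Storage for CountingStorage {
    fn delete_all(&self, paths: Box<dyn Iterator<Item = String>>) -> usize {
        paths.count()
    }
}

/// Hypothetical stand-in for `FileIO`.
struct FileIo<S: Storage> {
    storage: S,
}

impl<S: Storage> FileIo<S> {
    /// Generic entry point: callers pass any owned sequence of paths and the
    /// boxing happens here instead of at every call site.
    fn delete_all<I>(&self, paths: I) -> usize
    where
        I: IntoIterator<Item = String>,
        I::IntoIter: 'static,
    {
        self.storage.delete_all(Box::new(paths.into_iter()))
    }
}
```

Callers can then pass a `Vec<String>` or a lazily mapped iterator directly,
without boxing it themselves.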
##########
crates/storage/opendal/tests/file_io_s3_test.rs:
##########
@@ -203,4 +204,46 @@ mod tests {
}
}
}
+
Review Comment:
Sorry, I don't get your point.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]