steveloughran commented on code in PR #11021: URL: https://github.com/apache/iceberg/pull/11021#discussion_r1734701790
########## aws/src/integration/java/org/apache/iceberg/aws/s3/S3TestUtil.java: ########## @@ -29,4 +33,12 @@ public static String getBucketFromUri(String s3Uri) { public static String getKeyFromUri(String s3Uri) { return new S3URI(s3Uri).key(); } + + /** + * A helper method that checks if the s3 bucket is a general purpose bucket or directory bucket based on the suffix. + * @return true if it is an s3 express bucket otherwise false. + */ + public static boolean checkIfS3Express() { + return AwsIntegTestUtil.testBucketName().endsWith(EXPRESS_BUCKET_SUFFIX); Review Comment: fwiw hadoop org.apache.hadoop.fs.s3a.impl.S3ExpressStorage looks at endpoint and only switches to s3express if its an aws store, irrespective of name. probably overkill ```java public static boolean isAwsEndpoint(final String endpoint) { return (endpoint.isEmpty() || endpoint.endsWith(".amazonaws.com") || endpoint.endsWith(".amazonaws.com.cn")); } ``` ########## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ########## @@ -127,6 +129,39 @@ public static void cleanS3Bucket(S3Client s3, String bucketName, String prefix) } } + public static void cleanS3ExpressBucket(S3Client s3, String bucketName, String prefix) { + // For S3 express we just need to delete the objects not the versions. + ListObjectsV2Request listRequest = + ListObjectsV2Request.builder() + .bucket(bucketName) + .prefix(prefix) + .build(); + + ListObjectsV2Iterable paginatedListResponse = s3.listObjectsV2Paginator(listRequest); + List<ObjectIdentifier> objectsToDelete = Lists.newArrayList(); + paginatedListResponse.contents().stream() + .forEach(s3Object -> { + if (objectsToDelete.size() == BATCH_DELETION_SIZE) { + deleteObjects(s3, bucketName, objectsToDelete); Review Comment: this is potentially a good implicit test for `SupportsBulkOperations.deleteFiles()` which, being used in production, is something to stress as much as possible ########## aws/src/integration/java/org/apache/iceberg/aws/AwsIntegTestUtil.java: ########## @@ -127,6 +129,39 @@ public static void cleanS3Bucket(S3Client s3, String bucketName, String prefix) } } + public static void cleanS3ExpressBucket(S3Client s3, String bucketName, String prefix) { + // For S3 express we just need to delete the objects not the versions. Review Comment: 1. prefer as a javadoc 2. ideally both this and the general cleanup should delete in progress uploads under the path, so maybe a `abortUploadsUnderPath()` operation should do that and be invoked in both places. saves money everywhere and avoids confusion about deleted paths existing on s3 express -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org