[
https://issues.apache.org/jira/browse/HADOOP-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034083#comment-14034083
]
Steve Loughran commented on HADOOP-10714:
-----------------------------------------
* Can this just be merged into a new HADOOP-10400 patch? As that isn't
checked in yet, it should just be updated.
* Scale tests are good. For the Swift stuff we have some scalable ones which
you can tune via the test config file; that lets you run smaller tests over
slower links. File sizes can be kept low for better performance.
Tests for a large set of files should, as sketched below:
# verify that the results of a directory listing are complete
# try a rename() (as this has a delete inside)
# do the delete()
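A minimal sketch of such a scale test follows. The base class, the getConf()/path()/getFileSystem() helpers, and the scale.test.file.count property are assumptions for illustration, not anything defined by the HADOOP-10400 patch:

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical scale test -- base class and config property are assumed.
public class TestS3ABulkDeleteScale extends S3AScaleTestBase {

  @Test
  public void testBulkRenameAndDelete() throws Exception {
    FileSystem fs = getFileSystem();  // assumed helper on the base class
    // Tunable from the test config file so smaller runs work over slow links.
    int count = getConf().getInt("scale.test.file.count", 2000);
    Path srcDir = path("testBulkRenameAndDelete/src");  // assumed helper
    Path dstDir = path("testBulkRenameAndDelete/dst");
    fs.mkdirs(srcDir);
    // Zero-byte files: the test exercises key counts, not bandwidth.
    for (int i = 0; i < count; i++) {
      fs.create(new Path(srcDir, "file-" + i)).close();
    }
    // 1. the directory listing must be complete past the 1000-key page size
    assertEquals(count, fs.listStatus(srcDir).length);
    // 2. rename() has a bulk delete of the source keys inside
    assertTrue(fs.rename(srcDir, dstDir));
    assertEquals(count, fs.listStatus(dstDir).length);
    // 3. delete() must chunk its multi-object delete requests too
    assertTrue(fs.delete(dstDir, true));
    assertFalse(fs.exists(dstDir));
  }
}
{code}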
> AmazonS3Client.deleteObjects() needs to be limited to 1000 entries per call
> ---------------------------------------------------------------------------
>
> Key: HADOOP-10714
> URL: https://issues.apache.org/jira/browse/HADOOP-10714
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.5.0
> Reporter: David S. Wang
> Assignee: David S. Wang
> Priority: Critical
> Labels: s3
> Attachments: HADOOP-10714-1.patch
>
>
> In the patch for HADOOP-10400, calls to AmazonS3Client.deleteObjects() need
> to be limited to 1000 entries per call. Otherwise we get a MalformedXML
> error similar to:
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 6626AD56A3C76F5B, AWS Error Code: MalformedXML, AWS Error Message: The XML you provided was not well-formed or did not validate against our published schema, S3 Extended Request ID: DOt6C+Y84mGSoDuaQTCo33893VaoKGEVC3y1k2zFIQRm+AJkFH2mTyrDgnykSL+v
> at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3480)
> at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:1739)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:388)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:829)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:874)
> at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:878)
> Note that this is mentioned in the AWS documentation:
> http://docs.aws.amazon.com/AmazonS3/latest/API/multiobjectdeleteapi.html
> "The Multi-Object Delete request contains a list of up to 1000 keys that you
> want to delete. In the XML, you provide the object key names, and optionally,
> version IDs if you want to delete a specific version of the object from a
> versioning-enabled bucket. For each key, Amazon S3..."
> Thanks to Matteo Bertozzi and Rahul Bhartia from AWS for identifying the
> problem.
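The fix the description asks for comes down to chunking: split the key list and issue one deleteObjects() call per batch of at most 1000 keys. A minimal sketch against the AWS SDK v1 classes named in the stack trace above; the helper class and method names are illustrative assumptions, not the actual HADOOP-10400 patch code:

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;
import java.util.List;

// Illustrative helper -- name and shape are assumptions, not the patch.
class S3BulkDelete {
  // S3's multi-object delete rejects requests with more than 1000 keys.
  private static final int MAX_ENTRIES_PER_REQUEST = 1000;

  static void deleteAll(AmazonS3 s3, String bucket, List<KeyVersion> keys) {
    for (int start = 0; start < keys.size(); start += MAX_ENTRIES_PER_REQUEST) {
      int end = Math.min(start + MAX_ENTRIES_PER_REQUEST, keys.size());
      // One request per chunk of <= 1000 keys; subList avoids copying.
      s3.deleteObjects(new DeleteObjectsRequest(bucket)
          .withKeys(keys.subList(start, end)));
    }
  }
}
{code}

A real patch would also want to inspect each DeleteObjectsResult (or catch MultiObjectDeleteException) for partially failed batches rather than assuming every chunk succeeds.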