[ 
https://issues.apache.org/jira/browse/HADOOP-15628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570910#comment-16570910
 ] 

Steve Loughran commented on HADOOP-15628:
-----------------------------------------

Steve, thanks for the update. 

* If your store is correct, you don't need S3Guard, and you won't need to worry 
about any inconsistency from partial delete failures.
* This is potentially a mismatch between the store's behaviour and the AWS S3 
implementation.

bq. the response codes coming back were correct, and the xml was as well, it 
just wasn't being parsed by hdfs. 

We rely on the AWS SDK to parse this stuff, so if it's not doing it right, 
there are problems. Again, the latest SDKs should be best here. FWIW, our role 
assert stuff does seem to pick up partial deletes, though it may be that we 
aren't replicating the specific details of the failure you're seeing. 

If you've evidence that this isn't happening, even with the latest SDK 
offered by HADOOP-15642, then the SDK is possibly the best place to fix it.

# raise it with the AWS SDK issue tracker on github; include a ref to this JIRA 
(and add a link from this JIRA to the open issue afterwards)
# include as much diagnostic info there as you can; the s3a docs show how to 
print out the HTTP-level logs, which it sounds like you already have (see the 
log4j sketch below). 
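
For reference, a minimal log4j sketch of the wire-level logging those docs 
describe; treat the logger names as assumptions to double-check against the 
SDK/httpclient versions you're actually running:

{code}
# log4j.properties: AWS SDK request logging plus httpclient wire logging
log4j.logger.com.amazonaws.request=DEBUG
log4j.logger.org.apache.http.wire=DEBUG
{code}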

In the S3A code, we're going to need something which checks that return code 
and at least escalates it to an IOE. For now I'd suggest an 
{{org.apache.hadoop.fs.PathPermissionException}}. Because this is all happening 
with your store, you'll have to be the one showing it works there; you'll also 
need to get set up to build/test hadoop branch-3 against that store (which is 
really useful for us for better coverage).
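
To illustrate, here's a minimal sketch of the kind of check I mean, assuming 
the existing {{s3}} client and {{bucket}} fields in {{S3AFileSystem}}; the 
helper name is made up and the exact exception/wording is open:

{code:java}
import java.io.IOException;

import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsResult;

import org.apache.hadoop.fs.PathPermissionException;

/**
 * Sketch: issue a bulk delete and escalate a silent partial failure to an IOE.
 * (Recent SDKs may already raise MultiObjectDeleteException on partial failure;
 * this covers the case where the result comes back "successful" but short.)
 */
private void deleteObjectsChecked(DeleteObjectsRequest deleteRequest)
    throws IOException {
  DeleteObjectsResult result = s3.deleteObjects(deleteRequest);
  int requested = deleteRequest.getKeys().size();
  int deleted = result.getDeletedObjects().size();
  if (deleted != requested) {
    throw new PathPermissionException("s3a://" + bucket
        + ": bulk delete returned " + deleted + " of " + requested + " keys");
  }
}
{code}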

+[~Thomas Demoor] [~ehiggs]

> S3A Filesystem does not check return from AmazonS3Client deleteObjects
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-15628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15628
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.9.1, 2.8.4, 3.1.1, 3.0.3
>         Environment: Hadoop 3.0.2 / Hadoop 2.8.3
> Hive 2.3.2 / Hive 2.3.3 / Hive 3.0.0
>            Reporter: Steve Jacobs
>            Assignee: Steve Loughran
>            Priority: Minor
>
> Deletes in S3A that use the Multi-Delete functionality in the Amazon S3 API 
> do not check to see if all objects have been successfully deleted. In the 
> event of a failure, the API will still return a 200 OK (which isn't checked 
> currently):
> [Delete Code from Hadoop 
> 2.8|https://github.com/apache/hadoop/blob/a0da1ec01051108b77f86799dd5e97563b2a3962/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L574]
>  
> {code:java}
> if (keysToDelete.size() == MAX_ENTRIES_TO_DELETE) {
>   DeleteObjectsRequest deleteRequest =
>       new DeleteObjectsRequest(bucket).withKeys(keysToDelete);
>   s3.deleteObjects(deleteRequest);
>   statistics.incrementWriteOps(1);
>   keysToDelete.clear();
> }
> {code}
> This should be converted to use the DeleteObjectsResult class from the 
> S3Client: 
> [Amazon Code 
> Example|https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.htm]
> {code:java}
> // Verify that the objects were deleted successfully.
> DeleteObjectsResult delObjRes = s3Client.deleteObjects(multiObjectDeleteRequest);
> int successfulDeletes = delObjRes.getDeletedObjects().size();
> System.out.println(successfulDeletes + " objects successfully deleted.");
> {code}
> Bucket policies can be misconfigured, and deletes will then fail without any 
> warning to the S3A client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
