[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13786:
------------------------------------
    Attachment: HADOOP-13786-HADOOP-13345-004.patch

Patch 004

This is passing the tests in a suite derived from 
{{org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter}}; still 
looking at ways to simulate failure conditions, and at the failure semantics 
we want.

Essentially: once a pending commit has happened, there is no *retry*. That is, 
once a task has committed, any later attempt to commit it should fail, which 
it does with an FNFE (FileNotFoundException) on the task attempt dir.
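
A minimal sketch of that commit-once behaviour, simulated on the local 
filesystem with {{java.nio.file}} rather than S3. The class and method names 
are hypothetical and are not taken from the patch; the local move is only a 
stand-in for whatever the committer does to complete pending output. The point 
is that the first commit removes the task attempt dir, so a second commit has 
nothing to work on and fails with an FNFE.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem model of "a task can only be committed once".
final class CommitOnceSketch {

    /**
     * Commit a task attempt: publish its children into destDir, then delete
     * the attempt dir. A second call fails because the attempt dir is gone.
     */
    static void commitTask(Path attemptDir, Path destDir) throws IOException {
        if (!Files.isDirectory(attemptDir)) {
            throw new FileNotFoundException("Task attempt dir not found: " + attemptDir);
        }
        Files.createDirectories(destDir);
        // Snapshot the listing first so the directory is not modified while it is iterated.
        List<Path> children;
        try (Stream<Path> listing = Files.list(attemptDir)) {
            children = listing.collect(Collectors.toList());
        }
        for (Path child : children) {
            // Stand-in for completing the pending output (assumption, not the patch's mechanism).
            Files.move(child, destDir.resolve(child.getFileName()));
        }
        // Removing the attempt dir is what makes any retry fail.
        Files.delete(attemptDir);
    }
}
```

Calling {{commitTask}} twice on the same attempt dir succeeds the first time 
and throws {{FileNotFoundException}} the second time.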

Similarly you can only commit a job once, even if all the job does is delete 
all child directories.

One change in this patch is the need to support pending subtrees, e.g. map 
output to the directories part-0000/index and part-0000/data in the 
destination dir. This is done by adding the notion of a {{__base}} path 
element in the pending tree. When a {{__base}} path element is a parent, the 
destination path is the parent of the {{__pending}} dir, with all children 
under {{__base}} retained. With each task attempt dir being 
{{dest/__pending/$app/$app-attempt/$task_attempt/__base}}, this ensures that 
all data created in the task working dir ends up under the destination in the 
same directory tree.
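
The {{__base}} mapping can be sketched with plain string handling. This is 
only an illustration of the rule stated above, not code from the patch: the 
class {{PendingPaths}} and the method {{finalDestination}} are hypothetical 
names, and paths without a {{__base}} element are simply rejected here because 
their handling is not covered in this comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the __pending/__base path mapping.
final class PendingPaths {
    static final String PENDING = "__pending";
    static final String BASE = "__base";

    /**
     * Map a path under dest/__pending/.../__base/ to its final destination:
     * everything before __pending, followed by everything after __base.
     */
    static String finalDestination(String pendingPath) {
        List<String> elements = Arrays.asList(pendingPath.split("/"));
        int pendingIdx = elements.indexOf(PENDING);
        if (pendingIdx < 0) {
            throw new IllegalArgumentException("No " + PENDING + " element: " + pendingPath);
        }
        // Look for the first __base element below __pending.
        int offset = elements.subList(pendingIdx + 1, elements.size()).indexOf(BASE);
        if (offset < 0) {
            throw new IllegalArgumentException("No " + BASE + " under " + PENDING + ": " + pendingPath);
        }
        int baseIdx = pendingIdx + 1 + offset;
        // Parent of the __pending dir ...
        List<String> result = new ArrayList<>(elements.subList(0, pendingIdx));
        // ... plus all children under __base, retained as-is.
        result.addAll(elements.subList(baseIdx + 1, elements.size()));
        return String.join("/", result);
    }
}
```

With this sketch, 
{{dest/__pending/app/app-attempt/task_attempt/__base/part-0000/index}} maps to 
{{dest/part-0000/index}}, so the subtree written in the task working dir is 
reproduced under the destination.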

Issues:

* What about cleaning up {{__pending}}? Do it in job commit?
* Need to stop someone creating a path {{__base/__pending}} and so sneaking in 
pending data or getting things very confused. Actually, stop any 
{{__pending}} under {{__pending}}.
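
One possible shape for the second check, as a sketch only: reject any path in 
which {{__pending}} appears more than once. Since {{__base}} always sits below 
a {{__pending}} element, this also covers {{__base/__pending}}. The class and 
method names are hypothetical and not part of the patch.

```java
// Hypothetical validator: at most one __pending element is allowed in a path.
final class PendingPathValidator {
    static final String PENDING = "__pending";

    /** Returns true if the path contains no __pending nested under another __pending. */
    static boolean isValid(String path) {
        int count = 0;
        for (String element : path.split("/")) {
            if (element.equals(PENDING)) {
                count++;
            }
        }
        return count <= 1;
    }
}
```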

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output 
> streams (ie. not visible until the close()), if we need to use that to allow 
> us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
