[ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597511#comment-16597511
 ] 

Steve Loughran commented on HADOOP-15107:
-----------------------------------------

Thanks, committed to 3.1.x & trunk!



bq. New commit attempts always get the same attempt id +1? (I don't know how 
those are allocated)

It's how they know to recover from the previous attempt. The YARN app ID is 
used to guarantee uniqueness across applications. Spark always starts off with 
0 as its app ID/attempt ID, so more than one query from different Spark 
instances can clash.

bq. The mergePathsV1 seems pretty straightforward. Not sure why the actual code 
is so complicated.

Unplanned evolution is my guess, possibly with a goal of not breaking any 
explicit subclasses of FileOutputCommitter. I didn't try that for the new 
commit stuff.

v2 resilience? 

It is broken in that nothing can handle a task which fails during commit: its 
(non-atomic) state is unknown. Neither MapReduce nor Spark is aware of, or 
resilient to, this issue. There's also the partitioning failure mode: a task 
doesn't fail during commit, it merely hangs for a while (GC?), then completes 
its commit when it resumes, without noticing that it has been superseded by a 
second task attempt, or indeed that the entire job has now completed. Oops. 
Bear in mind, though, that outside object stores with slow renames, the 
probability of a failure during task commit is likely to be low.
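That "zombie task" race can be sketched in a few lines. This is purely 
illustrative (none of these names are MapReduce code): each file of a 
non-atomic commit lands separately, so a stalled first attempt can wake up and 
add its output after a second attempt has already committed.

```java
// Sketch of the v2 "zombie task" failure mode: attempt_0 is granted
// permission to commit, stalls (GC pause, partition), and attempt_1 is
// scheduled and commits fully. When attempt_0 resumes, its non-atomic
// commit still goes ahead, leaving mixed output. Illustrative only.
import java.util.ArrayList;
import java.util.List;

class ZombieCommitSketch {
    static final List<String> destDir = new ArrayList<>();

    // Non-atomic commit: each file is published by a separate operation,
    // with no check that this attempt is still the live one.
    static void nonAtomicCommit(String attempt, List<String> files) {
        for (String f : files) {
            destDir.add(attempt + "/" + f);
        }
    }

    static List<String> run() {
        // attempt_0 stalls after being told to commit; attempt_1 wins...
        nonAtomicCommit("attempt_1", List.of("part-0000"));
        // ...then attempt_0 wakes up and commits anyway.
        nonAtomicCommit("attempt_0", List.of("part-0000"));
        return destDir;  // both attempts' output is now in the destination
    }
}
```

The point is that without an atomic "publish everything or nothing" step, 
there is no single moment at which the superseded attempt could have been 
fenced off.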

Really, committers should expose their semantics here so that MR and Spark 
can handle this failure condition. 

V1 doesn't have this problem, as task commit is atomic; job commit is not, but 
since {{isCommitJobRepeatable()}} returns false there, an MR AM restart knows 
to give up (something is saved to HDFS to indicate an in-progress job commit). 
Spark doesn't restart a failed AM/driver, so it's moot there.
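For illustration, the restart-time decision looks roughly like this. 
{{isCommitJobRepeatable()}} is the real method on 
{{org.apache.hadoop.mapreduce.OutputCommitter}}; the surrounding logic and 
names are a simplified assumption, not the actual AM code.

```java
// Sketch: how a restarted MR Application Master might decide whether a job
// commit can be retried. Only isCommitJobRepeatable() mirrors the real
// Hadoop API; everything else here is a simplified assumption.
class JobCommitRecoverySketch {
    interface Committer {
        boolean isCommitJobRepeatable();
    }

    /** Returns true if a restarted AM may (re)run the job commit. */
    static boolean canRetryJobCommit(Committer committer, boolean commitWasStarted) {
        if (!commitWasStarted) {
            return true;  // commit never began: safe to start it now
        }
        // Commit had started; only a repeatable committer may re-run it.
        // A non-repeatable one (e.g. a rename-based V1 job commit) must
        // give up, since the destination's state is unknown.
        return committer.isCommitJobRepeatable();
    }
}
```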

S3A committers:

* Staging: relies on V1 commit semantics in the cluster HDFS.
* Magic: task commit writes a {{PendingSet}} of all the files to commit in a 
single atomic PUT; task commit is therefore also atomic. After a job 
completes, we purge all pending uploads under $dest, so any failed tasks' 
output is deleted.
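As a rough sketch of why a single-PUT manifest makes task commit atomic (the 
class, method, and key names below are made up for illustration, not the 
actual S3A committer code):

```java
// Sketch: task commit as a single atomic manifest write. A reader either
// sees the whole PendingSet or nothing; there is no intermediate state.
// All names are illustrative, not real S3A internals.
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MagicCommitSketch {
    // Stand-in for the object store: each put() publishes one complete value.
    static final Map<String, List<String>> store = new ConcurrentHashMap<>();

    /** "Task commit": one PUT of the full list of pending uploads. */
    static void commitTask(String taskAttemptId, List<String> pendingUploads) {
        // A single store operation: the manifest appears all at once.
        store.put("job/" + taskAttemptId + ".pendingset",
                List.copyOf(pendingUploads));
    }

    /** "Job commit" would read each manifest and complete those uploads. */
    static List<String> readManifest(String taskAttemptId) {
        return store.get("job/" + taskAttemptId + ".pendingset");
    }
}
```

Contrast this with the rename-based commit above: there, each file is a 
separate operation, which is exactly what makes the zombie-task race possible.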


> Stabilize/tune S3A committers; review correctness & docs
> --------------------------------------------------------
>
>                 Key: HADOOP-15107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15107
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch, 
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing about the paper on the committers, one which, being a proper 
> paper, requires me to show the committers work.
> # define the requirements of a "Correct" committed job (this applies to the 
> FileOutputCommitter too)
> # show that the Staging committer meets these requirements (most of this is 
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed).
> # Show the magic committer also works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
