[
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597511#comment-16597511
]
Steve Loughran commented on HADOOP-15107:
-----------------------------------------
Thanks, committed to 3.1.x & trunk!
bq. New commit attempts always get the same attempt id +1? (I don't know how
those are allocated)
It's how they know to recover from the previous attempt. The YARN app ID is
used to guarantee uniqueness across apps. Spark always starts off with 0 as its
app ID/attempt ID, so >1 query from different Spark instances can clash.
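To make the clash concrete, here is a toy sketch (hypothetical function and path layout, not the actual Hadoop committer code) of why folding the cluster-unique YARN application ID into the working path keeps two Spark jobs apart even though both start at app/attempt ID 0:

```python
# Illustrative sketch only: hypothetical path layout, not the real
# Hadoop committer code.

def attempt_path(dest, yarn_app_id, job_attempt, task_attempt):
    """Derive a working directory unique per (app, job attempt, task attempt)."""
    return f"{dest}/_temporary/{yarn_app_id}/{job_attempt}/attempt_{task_attempt}"

# Two Spark instances both begin with app ID/attempt ID 0; without the
# cluster-unique YARN application ID their working paths would collide.
a = attempt_path("s3a://bucket/out", "application_1700000000000_0001", 0, 0)
b = attempt_path("s3a://bucket/out", "application_1700000000000_0002", 0, 0)
assert a != b  # the YARN app ID is what keeps them distinct
```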
bq. The mergePathsV1 seems pretty straightforward. Not sure why the actual code
is so complicated.
Unplanned evolution is my guess, possibly with a goal of not breaking any
explicit subclasses of FileOutputCommitter. I didn't try that for the new
commit stuff.
bq. v2 resilience?
It is broken in that nothing can handle a task which fails during commit: its
(non-atomic) state is unknown. Neither MapReduce nor Spark is aware
of/resilient to this issue. There's also the partitioning failure mode: the
task doesn't fail during commit, it merely hangs for a while (GC?), then
completes its commit when it resumes, without noticing that it's been
superseded by a second task attempt, or indeed that the entire job has now
completed. Oops.
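That partitioning failure mode can be sketched as a toy simulation (hypothetical code, not the real v2 committer): because v2 task commit promotes output straight into the destination with no job-commit barrier, a hung attempt that wakes up late can silently clobber the committed result.

```python
# Toy simulation of the v2 "zombie task attempt" race. Hypothetical
# code, not the real committer: v2-style commit promotes files directly
# into the destination, with no check for supersession or job completion.

dest = {}  # simulated destination directory: filename -> committing attempt

def v2_task_commit(attempt_id, files):
    """v2-style commit: promote files straight into dest, no checks."""
    for f in files:
        dest[f] = attempt_id  # silently overwrites any prior attempt's output

# Attempt 1 hangs (GC pause?) before committing; attempt 2 is launched,
# commits, and the job is declared complete.
v2_task_commit("attempt_2", ["part-0000"])
job_complete = True

# Attempt 1 now resumes and commits anyway: nothing notices that it was
# superseded, or that the job has already finished.
v2_task_commit("attempt_1", ["part-0000"])

assert dest["part-0000"] == "attempt_1"  # the stale attempt won the race
```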
Bear in mind, though, that outside of object stores with slow renames, the
probability of a failure during task commit is likely to be low.
Really, committers should expose their semantics here so that MR & Spark can
handle this failure condition.
V1 doesn't have this problem, as the task commit is atomic; job commit is not,
but as {{isCommitJobRepeatable()}} returns false for that, MR AM restart knows
to give up then (something is saved to HDFS to indicate an in-progress job
commit). Spark doesn't restart a failed AM/driver, so it's moot there.
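The v1 task commit's atomicity comes down to a single directory rename, which HDFS and POSIX filesystems perform atomically: either the whole attempt directory is promoted or none of it is. A minimal sketch with illustrative paths (not the exact v1 layout, which also merges task output into the destination at job commit):

```python
import os
import tempfile

# Sketch of a v1-style atomic task commit: all of a task's output lives
# under an attempt directory, and commit is one rename. The rename either
# fully happens or doesn't, so a task that fails during commit leaves no
# partially-committed state. Paths are illustrative, not the real layout.
root = tempfile.mkdtemp()
attempt_dir = os.path.join(root, "_temporary", "attempt_0")
committed_dir = os.path.join(root, "_temporary", "task_0")
os.makedirs(attempt_dir)
with open(os.path.join(attempt_dir, "part-0000"), "w") as f:
    f.write("data")

os.rename(attempt_dir, committed_dir)  # the atomic task commit

assert os.path.exists(os.path.join(committed_dir, "part-0000"))
assert not os.path.exists(attempt_dir)
```

On an object store the "rename" is a non-atomic copy-and-delete, which is exactly why this guarantee evaporates there.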
S3A committers
* Staging: relies on V1 semantics in cluster HDFS
* Magic: task commit writes a {{PendingSet}} of all the files to commit in a
single atomic PUT; task commit is therefore also atomic. After a job completes
we purge all pending uploads under $dest, so any failed tasks' output is
deleted.
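The post-job purge can be sketched as follows, using an in-memory stand-in for S3's pending multipart uploads (a toy model, not the real S3A code, which drives the store's list/abort multipart-upload operations):

```python
# Toy sketch of post-job cleanup: abort every pending multipart upload
# whose key sits under the job destination. The dict is an in-memory
# stand-in for S3's pending-upload listing; real code calls the store's
# list/abort multipart-upload operations instead.

pending = {
    "out/year=2018/part-0001": "upload-aaa",  # from a failed task attempt
    "out/year=2018/part-0002": "upload-bbb",  # from a failed task attempt
    "other/job/part-0000": "upload-ccc",      # a different job: untouched
}

def purge_pending_uploads(dest_prefix):
    """Abort (here: drop) all pending uploads under dest_prefix."""
    for key in [k for k in pending if k.startswith(dest_prefix)]:
        del pending[key]  # real code: abort the multipart upload by (key, id)

purge_pending_uploads("out/")
assert list(pending) == ["other/job/part-0000"]
```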
> Stabilize/tune S3A committers; review correctness & docs
> --------------------------------------------------------
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Blocker
> Fix For: 3.1.2
>
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch,
> HADOOP-15107-003.patch, HADOOP-15107-004.patch
>
>
> I'm writing up the paper on the committers, one which, being a proper
> paper, requires me to show the committers work.
> # define the requirements of a "Correct" committed job (this applies to the
> FileOutputCommitter too)
> # show that the Staging committer meets these requirements (most of this is
> implicit in that it uses the V1 FileOutputCommitter to marshall .pendingset
> lists from committed tasks to the final destination, where they are read and
> committed)
> # Show the magic committer also works.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)