[
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488861#comment-16488861
]
Steve Loughran edited comment on HADOOP-15107 at 5/24/18 11:59 AM:
-------------------------------------------------------------------
Also tune the logs @ info. Specifically, during cleanup it talks about aborting all
pending commits, which is a bit worrying unless you know it's just cleaning up:
{code}
18/05/22 16:41:17 INFO DAGScheduler: Job 10 finished: save at
NativeMethodAccessorImpl.java:0, took 0.571971 s
18/05/22 16:41:17 INFO AbstractS3ACommitter: Starting: Task committer
attempt_20180522164116_0000_m_000000_0: commitJob((no job ID))
18/05/22 16:41:18 INFO AbstractS3ACommitter: Starting: Cleanup job (no job ID)
18/05/22 16:41:18 INFO AbstractS3ACommitter: Starting: Aborting all pending
commits under s3a://hwdev-steve-new/spark_shell/csv
18/05/22 16:41:18 INFO AbstractS3ACommitter: Aborting all pending commits under
s3a://hwdev-steve-new/spark_shell/csv: duration 0:00.024s
18/05/22 16:41:18 INFO AbstractS3ACommitter: Cleanup job (no job ID): duration
0:00.024s
{code}
Proposed: don't log anything until the cleanup succeeds, and then just list how
many were deleted, iff > 0, plus the duration. Nothing to delete => silence,
except for the stats on cleanup duration.
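A minimal sketch of that logging policy, to make the proposal concrete. This is not the AbstractS3ACommitter code: the class `QuietCleanup`, the method `cleanupJob`, the `IntSupplier` standing in for the real abort operation, and the in-memory `INFO` list standing in for a logger are all hypothetical names used for illustration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

/** Hypothetical sketch of the proposed INFO-level logging; not the committer's code. */
final class QuietCleanup {

    /** Stand-in for an INFO logger: collects the lines that would be logged. */
    static final List<String> INFO = new ArrayList<>();

    /** Formats a duration as m:ss.SSSs, similar to the log output quoted above. */
    static String format(Duration d) {
        long millis = d.toMillis();
        return String.format("%d:%02d.%03ds",
            millis / 60000, (millis / 1000) % 60, millis % 1000);
    }

    /**
     * Runs the cleanup and logs only after it has succeeded.
     * abortPendingCommits is an assumed callback returning how many
     * pending commits were aborted; if it throws, nothing is logged here
     * and the exception propagates to the caller.
     */
    static int cleanupJob(String path, IntSupplier abortPendingCommits) {
        long start = System.nanoTime();
        // No "Starting: Aborting all pending commits" message up front.
        int aborted = abortPendingCommits.getAsInt();
        Duration took = Duration.ofNanos(System.nanoTime() - start);
        if (aborted > 0) {
            // Only mention aborted commits when there actually were some.
            INFO.add(String.format(
                "Aborted %d pending commit(s) under %s", aborted, path));
        }
        // The duration statistic is always reported.
        INFO.add("Cleanup job: duration " + format(took));
        return aborted;
    }
}
```

With nothing to abort, the only INFO output is the single duration line; with N > 0 aborted commits there is one extra line stating the count and the path.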
> Prove the correctness of the new committers, or fix where they are not correct
> ------------------------------------------------------------------------------
>
> Key: HADOOP-15107
> URL: https://issues.apache.org/jira/browse/HADOOP-15107
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch
>
>
> I'm writing a paper on the committers, one which, being a proper
> paper, requires me to show that the committers work.
> # Define the requirements of a "Correct" committed job (this applies to the
> FileOutputCommitter too).
> # Show that the Staging committer meets these requirements (most of this is
> implicit in that it uses the V1 FileOutputCommitter to marshal .pendingset
> lists from committed tasks to the final destination, where they are read and
> committed).
> # Show the magic committer also works.