[ 
https://issues.apache.org/jira/browse/HADOOP-15107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488861#comment-16488861
 ] 

Steve Loughran edited comment on HADOOP-15107 at 5/24/18 11:59 AM:
-------------------------------------------------------------------

Also tune the logs at INFO. Specifically, during cleanup it talks about aborting 
all pending commits, which is a bit worrying unless you know it's just cleaning up:

{code}
18/05/22 16:41:17 INFO DAGScheduler: Job 10 finished: save at 
NativeMethodAccessorImpl.java:0, took 0.571971 s
18/05/22 16:41:17 INFO AbstractS3ACommitter: Starting: Task committer 
attempt_20180522164116_0000_m_000000_0: commitJob((no job ID))
18/05/22 16:41:18 INFO AbstractS3ACommitter: Starting: Cleanup job (no job ID)
18/05/22 16:41:18 INFO AbstractS3ACommitter: Starting: Aborting all pending 
commits under s3a://hwdev-steve-new/spark_shell/csv
18/05/22 16:41:18 INFO AbstractS3ACommitter: Aborting all pending commits under 
s3a://hwdev-steve-new/spark_shell/csv: duration 0:00.024s
18/05/22 16:41:18 INFO AbstractS3ACommitter: Cleanup job (no job ID): duration 
0:00.024s
{code}

Proposed: don't log anything until the operation succeeds, and then just list 
how many were deleted (iff > 0) plus the duration. Nothing to delete => silence, 
except for the stats on cleanup duration.
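
A minimal sketch of what that quieter logging could look like, assuming the cleanup already knows how many pending commits it aborted and how long it took. The class and method names here (`QuietCleanupLog`, `cleanupMessages`, `formatDuration`) and the bucket name are hypothetical and are not taken from `AbstractS3ACommitter`; the messages are returned as strings rather than sent to a logger so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class QuietCleanupLog {

    /** Format milliseconds as m:ss.SSSs, mirroring the duration style in the log above. */
    static String formatDuration(long millis) {
        long minutes = millis / 60_000;
        long seconds = (millis / 1000) % 60;
        return String.format(Locale.ROOT, "%d:%02d.%03ds", minutes, seconds, millis % 1000);
    }

    /**
     * Build the INFO-level messages for a cleanup that has already succeeded.
     * No "Starting: ..." lines are produced; the abort count is reported
     * only when something was actually aborted.
     */
    static List<String> cleanupMessages(String jobName, String dest, int aborted, long millis) {
        List<String> messages = new ArrayList<>();
        if (aborted > 0) {
            messages.add(String.format(Locale.ROOT,
                "Aborted %d pending commit(s) under %s", aborted, dest));
        }
        // The duration statistic is always reported.
        messages.add(String.format(Locale.ROOT,
            "Cleanup job %s: duration %s", jobName, formatDuration(millis)));
        return messages;
    }

    public static void main(String[] args) {
        // Nothing to abort: only the duration line is printed.
        cleanupMessages("(no job ID)", "s3a://example-bucket/csv", 0, 24)
            .forEach(System.out::println);
    }
}
```

With a count of zero, the only output is the cleanup duration line, so a successful job no longer mentions aborting anything.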



> Prove the correctness of the new committers, or fix where they are not correct
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-15107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15107
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15107-001.patch, HADOOP-15107-002.patch
>
>
> I'm writing a paper on the committers, one which, being a proper 
> paper, requires me to show that the committers work.
> # Define the requirements of a "correct" committed job (this applies to the 
> FileOutputCommitter too).
> # Show that the Staging committer meets these requirements (most of this is 
> implicit, in that it uses the V1 FileOutputCommitter to marshal .pendingset 
> lists from committed tasks to the final destination, where they are read and 
> committed).
> # Show that the magic committer also works.


