[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-13786:
------------------------------------
Attachment: HADOOP-13786-HADOOP-13345-013.patch
HADOOP-13786 patch 013.
h3. All mock tests are working.
More specifically, they were first made to pass, and then the commit logic was
tuned to reduce the number of S3 calls:
* no {{exists(path)}} check before a {{delete()}} if the policy is replace
* no checking for return code of {{delete()}}, as we know it never signals an
error, merely that the destination path didn't exist at the time of call.
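A minimal sketch of that call-count reduction, not the patch's actual code: {{CountingStore}}, {{ConflictPolicy}} and {{commit()}} are hypothetical names invented for illustration, and the {{FAIL}} branch is an assumption about what a non-replace policy might do.

```java
import java.util.HashSet;
import java.util.Set;

public class ReplaceCommitSketch {

  /** Hypothetical in-memory stand-in for the store; counts simulated remote calls. */
  static class CountingStore {
    private final Set<String> paths = new HashSet<>();
    int calls = 0;

    boolean exists(String path) {
      calls++;
      return paths.contains(path);
    }

    /** Returns false when the path was absent; that is not an error. */
    boolean delete(String path) {
      calls++;
      return paths.remove(path);
    }

    void put(String path) {
      calls++;
      paths.add(path);
    }
  }

  /** Hypothetical conflict policies; only "replace" is named in the comment above. */
  enum ConflictPolicy { FAIL, REPLACE }

  static void commit(CountingStore store, String dest, ConflictPolicy policy) {
    if (policy == ConflictPolicy.REPLACE) {
      // No exists() probe first, and the return value is deliberately ignored:
      // false only means the destination was not there at the time of the call.
      store.delete(dest);
    } else if (store.exists(dest)) {
      throw new IllegalStateException("Destination exists: " + dest);
    }
    store.put(dest);
  }

  public static void main(String[] args) {
    CountingStore store = new CountingStore();
    commit(store, "out/part-0000", ConflictPolicy.REPLACE);
    System.out.println("calls for a replace commit: " + store.calls);
  }
}
```

With the replace policy the sketch issues one {{delete()}} and one {{put()}}; an {{exists()}} probe before the delete would add a third call per destination.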
The tests have also been tuned to be a bit more explicit about what they are
declaring and asserting; less repetition in mock object setup.
Also: the ability to turn up logging of mock operations, down to the stack
trace of each invocation. Useful to work out things like why more {{delete()}}
calls are made than expected.
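One way such invocation logging could look, sketched with a plain JDK dynamic proxy rather than whatever mocking library the patch uses. {{Store}}, {{logged()}} and {{LOG}} are hypothetical names, and the technique is a stand-in, not the patch's mechanism.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoggingProxySketch {

  /** Hypothetical minimal store interface, for illustration only. */
  interface Store {
    boolean delete(String path);
  }

  /** Collected log entries, one per intercepted invocation. */
  static final List<String> LOG = new ArrayList<>();

  /** Wraps target so every call is recorded, optionally with the caller's stack. */
  @SuppressWarnings("unchecked")
  static <T> T logged(Class<T> iface, T target, boolean withStack) {
    InvocationHandler handler = (proxy, method, args) -> {
      StringBuilder entry =
          new StringBuilder(method.getName()).append(Arrays.toString(args));
      if (withStack) {
        // Capture where the call came from, to explain unexpected invocations.
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
          entry.append("\n    at ").append(frame);
        }
      }
      LOG.add(entry.toString());
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        throw e.getCause(); // rethrow the target's own exception
      }
    };
    return (T) Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] {iface}, handler);
  }

  public static void main(String[] args) {
    Store store = logged(Store.class, path -> false, true);
    store.delete("out/part-0000");
    LOG.forEach(System.out::println);
  }
}
```

Each entry starts with the method name and arguments, so counting the entries that begin with {{delete}} and reading their stacks shows which code path issued the extra calls.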
h3. Most of the committer IT tests are working
Everything is working except the IT protocol tests
{{testMapFileOutputCommitter}} and {{testConcurrentCommitTaskWithSubDir}},
which expect directories to be handled. I will do that at least for the
Directory Committer, with some mock tests as well as a fixed IT test, and skip
them in the partition committer.
h3. {{LambdaTestUtils}} tuning
Ryan's patch had some {{assertThrown()}} assertions which I've been moving to
the common test base.
While {{intercept()}} has some features {{assertThrown()}} lacked, it didn't
support passing down extra diagnostic messages. Fixed. We could go one step
further and allow callers to provide a closure {{() -> String}} for
diagnostics, perhaps, though maybe we can wait to see what JUnit 5 has first.
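To illustrate the closure idea, here is a self-contained sketch of an {{intercept()}}-style helper taking a {{Supplier<String>}} for diagnostics. It is not the {{LambdaTestUtils}} signature: the method shape and messages are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

public class InterceptSketch {

  /**
   * Evaluates the callable and returns the expected exception if it is raised.
   * The diagnostics supplier is only invoked when the assertion fails, so
   * building an expensive message costs nothing on the success path.
   */
  static <T, E extends Throwable> E intercept(
      Class<E> expected, Supplier<String> diagnostics, Callable<T> eval) {
    T result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      // Wrong exception type: fail, keeping the original as the cause.
      throw new AssertionError("Expected " + expected.getName()
          + " but caught " + t + ": " + diagnostics.get(), t);
    }
    // No exception at all: fail and report what the call returned.
    throw new AssertionError("Expected " + expected.getName()
        + " but the call returned " + result + ": " + diagnostics.get());
  }

  public static void main(String[] args) {
    IllegalStateException e = intercept(
        IllegalStateException.class,
        () -> "committer state: not started",
        () -> {
          throw new IllegalStateException("not started");
        });
    System.out.println("caught: " + e.getMessage());
  }
}
```

Deferring the message to a closure matters most when the diagnostics need their own calls to build, such as listing a directory, since that work then happens only on failure.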
h3. TODO
* directory trees in the directory committer
* move to direct API calls on S3A
* when s3guard is enabled, make sure PUT commits are updating the entire
metastore tree.
> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13786-HADOOP-13345-001.patch,
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch,
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch,
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch,
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch,
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch,
> HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch,
> s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output
> streams (ie. not visible until the close()), if we need to use that to allow
> us to abort commit operations.