[
https://issues.apache.org/jira/browse/HADOOP-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839362#comment-15839362
]
Aaron Fabbri commented on HADOOP-14020:
---------------------------------------
Thanks for doing this [~mackrorysd]. Looks really good. Simple logic but huge
improvement in behavior.
Also impressed at your effort writing a test case for this. I would have been
tempted to confirm behavior with log messages then call it good.
Minor optimizations suggested below...
{code}
+ public boolean put(FileStatus childFileStatus) {
Preconditions.checkNotNull(childFileStatus,
"childFileStatus must be non-null");
Path childPath = childStatusToPathKey(childFileStatus);
+ PathMetadata oldValue = listMap.get(childPath);
+ PathMetadata newValue = new PathMetadata(childFileStatus);
+ if (oldValue != null && oldValue.equals(newValue)) {
+ return false;
+ }
listMap.put(childPath, new PathMetadata(childFileStatus));
{code}
How about listMap.put(childPath, newValue) here. Save a little garbage.
{code}
public static FileStatus[] dirListingUnion(MetadataStore ms, Path path,
- List<FileStatus> backingStatuses, DirListingMetadata dirMeta)
- throws IOException {
+ List<FileStatus> backingStatuses, DirListingMetadata dirMeta,
+ Configuration conf) throws IOException {
{code}
Looks like you only use conf to check the value of
Constants.METADATASTORE_AUTHORITATIVE.
Can we just pass in a boolean isMetadataStoreAuthoritative instead of doing a
conf.get() each time? More efficient (and explicit).
> Optimize dirListingUnion
> ------------------------
>
> Key: HADOOP-14020
> URL: https://issues.apache.org/jira/browse/HADOOP-14020
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Attachments: HADOOP-14020-HADOOP-13345.001.patch,
> HADOOP-14020-HADOOP-13345.002.patch
>
>
> There's a TODO in dirListingUnion:
> {quote}// TODO optimize for when allowAuthoritative = false{quote}
> There will be cases when we can intelligently avoid a round trip: if S3A
> results are a subset or the metadatastore results (including them being equal
> or empty) then writing back will do nothing (although perhaps that should set
> the authoritative flag if it isn't set already).
> There may also be cases where users want to just skip that altogether. It's
> wasted work if authoritative mode is disabled, so perhaps we want to trigger
> a skip if that's false, or perhaps it should be a separate property. First
> one makes for simpler config, second is more flexible...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]