[
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893273#comment-15893273
]
Sean Mackrory commented on HADOOP-13914:
----------------------------------------
Already discussed this a bit with you while reviewing, but sharing all thoughts
for everyone else's sake:
* In case other reviewers get confused by it, the hints from Git in each hunk
about which function a change is in are misleading S3AFileSystem, and in the
places where putAndReturn is removed it is only removed in the S3-only version
of the function that does no metadatastore operation. putAndReturn now happens
higher in the callstack only when there is a metadatastore context.
* I'm a bit concerned about all the S3AFileStatus -> FileStatus changes
(although I could've sworn that we already removed that assertion in
PathMetadataTranslation...). I think it's definitely a change for the better,
but I'm a little worried there is some application out there that this change
will break, despite being @Private and @Evolving... Let's definitely at least
call this out in a release note or something.
* Along similar lines, I wondered if a `public S3AFileStatus
getFileStatus(Path, bool)` function might be in order in S3AFileSystem? No idea
how much it'll get used, but if anyone IS depending on the current
S3AFileStatus return type and wants that work done it'd be useful.
* I also thought about a S3AFileStatus.setIsEmptyDirectory(bool) wrapper to
avoid any unnecessary change in compatibility, but again, very unlikely to be
used publicly, so feel free to ignore. isEmptyDirectory is probably going break
the same set of applications that might use it anyway (which for all I know is
an empty set), so no perfectly compatible way to do this anyway...
* JavaDoc comment on ITestSGuardEmptyDirs recommends a refactoring after
HADOOP-13345 is merged - we should file a JIRA dependent on HADOOP-13345 for
that when this is committed.
* There's an assertion in TestDynamoDBMetadataStore.java that
isEmptyDirectory() is what we expect. Why is that removed? Is it a Boolean ->
Tristate issue? If so, shouldn't we modify the logic to accept either the
expected one or UNKNOWN?
* +1 to ignoring the findbugs issue that way.
Tests are running now - almost done and no related failures, will report back
if the end result is otherwise.
> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> --------------------------------------------------------
>
> Key: HADOOP-13914
> URL: https://issues.apache.org/jira/browse/HADOOP-13914
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Aaron Fabbri
> Assignee: Aaron Fabbri
> Attachments: HADOOP-13914-HADOOP-13345.000.patch,
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch,
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB
> implementation, and also sacrifices good code separation to minimize
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and
> preferred solution. I suggest we do this work after merging the HADOOP-13345
> branch to trunk, but am open to suggestions.
> I can also attach a patch of a integration test that exercises the missing
> case and demonstrates a failure with DynamoDBMetadataStore.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]