[ 
https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893273#comment-15893273
 ] 

Sean Mackrory commented on HADOOP-13914:
----------------------------------------

Already discussed this a bit with you while reviewing, but sharing all thoughts 
for everyone else's sake:
* In case other reviewers get confused by it, the hints from Git in each hunk 
about which function a change is in are misleading S3AFileSystem, and in the 
places where putAndReturn is removed it is only removed in the S3-only version 
of the function that does no metadatastore operation. putAndReturn now happens 
higher in the callstack only when there is a metadatastore context.
* I'm a bit concerned about all the S3AFileStatus -> FileStatus changes 
(although I could've sworn that we already removed that assertion in 
PathMetadataTranslation...). I think it's definitely a change for the better, 
but I'm a little worried there is some application out there that this change 
will break, despite being @Private and @Evolving... Let's definitely at least 
call this out in a release note or something.
* Along similar lines, I wondered if a `public S3AFileStatus 
getFileStatus(Path, bool)` function might be in order in S3AFileSystem? No idea 
how much it'll get used, but if anyone IS depending on the current 
S3AFileStatus return type and wants that work done it'd be useful.
* I also thought about a S3AFileStatus.setIsEmptyDirectory(bool) wrapper to 
avoid any unnecessary change in compatibility, but again, very unlikely to be 
used publicly, so feel free to ignore. isEmptyDirectory is probably going break 
the same set of applications that might use it anyway (which for all I know is 
an empty set), so no perfectly compatible way to do this anyway...
* JavaDoc comment on ITestSGuardEmptyDirs recommends a refactoring after 
HADOOP-13345 is merged - we should file a JIRA dependent on HADOOP-13345 for 
that when this is committed.
* There's an assertion in TestDynamoDBMetadataStore.java that 
isEmptyDirectory() is what we expect. Why is that removed? Is it a Boolean -> 
Tristate issue? If so, shouldn't we modify the logic to accept either the 
expected one or UNKNOWN?
* +1 to ignoring the findbugs issue that way.
Tests are running now - almost done and no related failures, will report back 
if the end result is otherwise.

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> --------------------------------------------------------
>
>                 Key: HADOOP-13914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13914
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13914-HADOOP-13345.000.patch, 
> HADOOP-13914-HADOOP-13345.002.patch, HADOOP-13914-HADOOP-13345.003.patch, 
> s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag 
> stored in S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB 
> implementation, and also sacrifices good code separation to minimize 
> S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and 
> preferred solution.  I suggest we do this work after merging the HADOOP-13345 
> branch to trunk, but am open to suggestions.
> I can also attach a patch of a integration test that exercises the missing 
> case and demonstrates a failure with DynamoDBMetadataStore.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to