[
https://issues.apache.org/jira/browse/HADOOP-16384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874226#comment-16874226
]
Steve Loughran commented on HADOOP-16384:
-----------------------------------------
Here are some behaviours whose correctness I want to verify as part of this:
# If you try to list a directory which has a tombstone, then the result of
DynamoDBMetadataStore.listChildren is null
# If you are only pruning tombstones, then the auth status of the parent dir
doesn't change
# If you are pruning normal entries, then the auth status of the parent does
change
# If you list a directory, then the auth status is updated
# mkdirs() will create an auth dir entry for the new dir; any on-demand parent
entries will be non-auth
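A minimal in-memory sketch of the first of those invariants, assuming a toy model of the store: {{PathEntry}}, {{MiniStore}} and their methods are invented names for illustration, not the real DynamoDBMetadataStore API. It shows a tombstoned directory listing as null, and tombstoned children being filtered out client-side.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical in-memory model of an S3Guard-style metadata store,
// only to illustrate the tombstone semantics listed above.
public class TombstoneModel {
    // An entry is either a live file/dir record or a tombstone (isDeleted).
    record PathEntry(String path, boolean isDeleted, boolean authoritative) {}

    static class MiniStore {
        final Map<String, PathEntry> table = new HashMap<>();

        void put(PathEntry e) { table.put(e.path(), e); }

        // Listing a path whose own entry is a tombstone yields null,
        // mirroring "listChildren is null" for a tombstoned directory.
        List<PathEntry> listChildren(String dir) {
            PathEntry self = table.get(dir);
            if (self != null && self.isDeleted()) {
                return null;
            }
            // Tombstoned children are filtered out client-side, as the
            // comment notes the current API forces.
            return table.values().stream()
                .filter(e -> isDirectChild(dir, e.path()))
                .filter(e -> !e.isDeleted())
                .collect(Collectors.toList());
        }

        static boolean isDirectChild(String dir, String path) {
            return path.startsWith(dir + "/")
                && path.indexOf('/', dir.length() + 1) < 0;
        }
    }

    public static void main(String[] args) {
        MiniStore store = new MiniStore();
        store.put(new PathEntry("/base", false, true));
        store.put(new PathEntry("/base/live", false, false));
        store.put(new PathEntry("/base/gone", true, false));

        // Live dir: the tombstoned child is filtered from the listing.
        System.out.println(store.listChildren("/base").size());

        // Tombstone the dir itself: the listing becomes null.
        store.put(new PathEntry("/base", true, true));
        System.out.println(store.listChildren("/base"));
    }
}
```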
Looking also at FileSystem.delete in this world, we can update the table with
each batch of deleted entries, even as we trigger another final enum + delete
of the base path. That final list would be better if the S3Guard API let us
request that tombstones be excluded from a listing; for now it will just be
extra inefficient, as the filtering is done client-side. Incremental delete
reduces the risk of entries existing in the store which aren't in the FS. I'm
not going to be clever here and do a parallel delete page run, though it seems
a nice target for some speedup, especially given that each page will be
deleting 1000 DDB entries along with an HTTP DELETE request; the overhead of
that DDB update will be more than that of the S3 call.
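The incremental-delete idea above can be sketched as follows; {{deleteIncrementally}} and its types are invented names, not the real S3AFileSystem.delete code. The point is that the store is updated after every page rather than once at the end, so a mid-delete failure leaves the store trailing the FS by at most one page. The page size of 1000 matches the S3 bulk-delete limit mentioned above.

```java
import java.util.*;

// Hypothetical sketch of incremental delete: update the metadata store
// after each deleted page rather than once at the end.
public class IncrementalDelete {
    static final int PAGE_SIZE = 1000;

    // Returns the number of pages issued; each page would map to one
    // S3 bulk DELETE plus removal of the same keys from the store.
    static int deleteIncrementally(List<String> keys, Set<String> store) {
        int pages = 0;
        for (int start = 0; start < keys.size(); start += PAGE_SIZE) {
            List<String> page =
                keys.subList(start, Math.min(start + PAGE_SIZE, keys.size()));
            // 1. issue the S3 bulk delete for this page (elided here)
            // 2. immediately remove the same entries from the store, so the
            //    store never trails the FS by more than one page
            page.forEach(store::remove);
            pages++;
        }
        return pages;
    }

    public static void main(String[] args) {
        Set<String> store = new HashSet<>();
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("base/key-" + i);
        }
        store.addAll(keys);
        // 2500 keys at 1000 per page -> 3 pages, store emptied.
        System.out.println(deleteIncrementally(keys, store));
        System.out.println(store.size());
    }
}
```

A parallel variant would issue the pages concurrently; as the comment says, that is left out here in favour of the simple sequential loop.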
> ITestS3AContractRootDir failing: inconsistent DDB tables
> --------------------------------------------------------
>
> Key: HADOOP-16384
> URL: https://issues.apache.org/jira/browse/HADOOP-16384
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Attachments: hwdev-ireland-new.csv
>
>
> HADOOP-15183 added detection and rejection of prune updates when the store is
> inconsistent (i.e. when it tries to update an entry twice in the same
> operation, the second time with one that is inconsistent with the first).
> Now that we can detect this, we should address it. We are lucky here in that
> my DDB table is currently inconsistent: prune is failing.
> Plan
> # new test to run in the sequential phase, which does a s3guard prune against
> the bucket used in tests
> # use this to identify/debug the issue
> # replicate the problem in the ITestDDBMetastore tests
> # decide what to do in this world. Tell the user to run fsck? skip?