[ https://issues.apache.org/jira/browse/HADOOP-16384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874226#comment-16874226 ]

Steve Loughran commented on HADOOP-16384:
-----------------------------------------

Here are some things I want to verify the correctness of as part of this

# If you try to list a directory which has a tombstone, then the result of
DynamoDBMetadataStore.listChildren is null
# If you are only pruning tombstones, then the auth status of the parent dir
doesn't change
# If you are pruning normal entries, then the auth status of the parent does
change
# If you list a directory, then its auth status is updated
# mkdirs() will create an auth dir entry for the new dir; any on-demand parent
entries will be non-auth
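To pin down what items 1–3 above should mean, here is a minimal in-memory sketch of the intended tombstone/auth semantics. All names here (`MetaStore`, `Entry`, the map-backed fields) are illustrative stand-ins, not the real S3Guard/DDB API: the point is just that listings filter tombstones, pruning tombstones leaves the parent authoritative, and pruning live entries downgrades it.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical model of the tombstone/auth rules listed above;
// not the real DynamoDBMetadataStore implementation.
public class TombstoneModel {
    static final class Entry {
        final String path;
        final boolean tombstone;
        Entry(String path, boolean tombstone) {
            this.path = path;
            this.tombstone = tombstone;
        }
    }

    static final class MetaStore {
        final Map<String, List<Entry>> children = new HashMap<>();
        final Map<String, Boolean> authoritative = new HashMap<>();

        // Item 1 (approximately): listings never surface tombstoned entries.
        List<String> listChildren(String dir) {
            return children.getOrDefault(dir, List.of()).stream()
                .filter(e -> !e.tombstone)
                .map(e -> e.path)
                .collect(Collectors.toList());
        }

        // Items 2 and 3: pruning only tombstones keeps the parent's auth
        // status; pruning normal entries must downgrade it to non-auth.
        void prune(String dir, boolean tombstonesOnly) {
            children.computeIfPresent(dir, (d, list) -> list.stream()
                .filter(e -> tombstonesOnly ? !e.tombstone : e.tombstone)
                .collect(Collectors.toList()));
            if (!tombstonesOnly) {
                authoritative.put(dir, false);
            }
        }
    }

    public static void main(String[] args) {
        MetaStore store = new MetaStore();
        store.authoritative.put("/d", true);
        store.children.put("/d", new ArrayList<>(List.of(
            new Entry("/d/a", false), new Entry("/d/b", true))));

        System.out.println(store.listChildren("/d"));      // [/d/a] — tombstone hidden
        store.prune("/d", true);
        System.out.println(store.authoritative.get("/d")); // true — still authoritative
        store.prune("/d", false);
        System.out.println(store.authoritative.get("/d")); // false — downgraded
    }
}
```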

Looking also at FileSystem.delete in this world: we can update the table for
each batch of deleted entries, even as we trigger another final enumerate +
delete of the base path. That final list would be better if the S3Guard API
let us request to exclude tombstones from a listing; for now it will just be
extra inefficient, as the filtering is done client-side. Incremental delete
reduces the risk of entries in the store which aren't in the FS. I'm not going
to be clever here and do a parallel delete-page run, though it seems a nice
target for some speedup, especially given that each page will be deleting
1000 DDB entries along with an HTTP DELETE request; it's the overhead of that
DDB update which will be more than that of the S3 call.
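The incremental-delete idea above can be sketched as follows. This is a toy model, not the real FileSystem.delete: plain sets stand in for the object store and the DDB table, and the page size mirrors the 1000-key limit of a bulk S3 delete. The property being illustrated is that the metadata store is updated after every page, so a failure mid-way strands fewer store entries whose objects are already gone.

```java
import java.util.*;

// Illustrative sketch of incremental delete: update the metadata store
// per page of deleted objects rather than once at the end. The stores
// here are plain sets, not the real S3/DDB clients.
public class IncrementalDelete {
    static final int PAGE_SIZE = 1000; // matches a bulk S3 DELETE's key limit

    public static int deleteTree(List<String> keys,
                                 Set<String> objectStore,
                                 Set<String> metadataStore) {
        int pages = 0;
        for (int i = 0; i < keys.size(); i += PAGE_SIZE) {
            List<String> page =
                keys.subList(i, Math.min(i + PAGE_SIZE, keys.size()));
            objectStore.removeAll(page);   // one bulk delete per page
            metadataStore.removeAll(page); // store kept in step per page
            pages++;
        }
        return pages;
    }

    public static void main(String[] args) {
        Set<String> s3 = new HashSet<>();
        Set<String> ddb = new HashSet<>();
        for (int i = 0; i < 2500; i++) {
            String k = "base/f" + i;
            s3.add(k);
            ddb.add(k);
        }
        int pages = deleteTree(new ArrayList<>(s3), s3, ddb);
        System.out.println(pages + " pages, " + s3.size()
            + " objects left, " + ddb.size() + " entries left");
        // → 3 pages, 0 objects left, 0 entries left
    }
}
```

A parallel variant would run each page's object delete and store update concurrently; as noted above, that's left out here for simplicity.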
 



> ITestS3AContractRootDir failing: inconsistent DDB tables
> --------------------------------------------------------
>
>                 Key: HADOOP-16384
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16384
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: hwdev-ireland-new.csv
>
>
> HADOOP-15183 added detection and rejection of prune updates when the store is 
> inconsistent (i.e. when it tries to update an entry twice in the same 
> operation, the second time with one that is inconsistent with the first)
> Now that we can detect this, we should address it. We are lucky here in that 
> my DDB table is currently inconsistent: prune is failing. 
> Plan
> # new test to run in the sequential phase, which does a s3guard prune against 
> the bucket used in tests
> # use this to identify/debug the issue
> # replicate the problem in the ITestDDBMetastore tests
> # decide what to do in this world. Tell the user to run fsck? skip?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
