[ https://issues.apache.org/jira/browse/HADOOP-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123792#comment-16123792 ]
Aaron Fabbri commented on HADOOP-14749:
---------------------------------------
{quote}
If we added a field for each entry as to when the record itself was created,
then we could have AWS TTL do the pruning automatically.
{quote}
I think we will want an "entry last written" modification-time field in DDB,
but I don't think we can use DynamoDB's TTL feature without breaking the "all
ancestors of any path P in DDB must be present" invariant. I chatted with a
friend who works on the DynamoDB team, and he did not believe that their TTL
deletion is ordered strongly enough to guarantee it, even if we could ensure
we always wrote ancestors before children. Maybe there is another algorithm
I'm not thinking of, though.
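To make the ancestor invariant concrete, here is a minimal Python sketch of a
checker. It is not S3Guard code: the table is modelled as a plain set of
absolute paths, the names ancestors and missing_ancestors are hypothetical,
and I am assuming the root "/" is not stored as an entry. It only shows what
goes wrong if a TTL sweep happens to delete a parent before its child.

```python
from posixpath import dirname


def ancestors(path):
    """Yield each proper ancestor of an absolute path, nearest first.

    The root "/" is skipped (assumption: the root is not stored as an entry).
    """
    parent = dirname(path)
    while parent not in ("/", ""):
        yield parent
        parent = dirname(parent)


def missing_ancestors(table):
    """Return {path: [absent ancestors]} for every entry breaking the invariant."""
    violations = {}
    for path in table:
        gone = [a for a in ancestors(path) if a not in table]
        if gone:
            violations[path] = gone
    return violations


# A consistent table: every ancestor of every path is present.
consistent = {"/a", "/a/b", "/a/b/c"}

# The same table after an unordered expiry removed "/a/b" before "/a/b/c".
after_unordered_expiry = {"/a", "/a/b/c"}
```

With these inputs, missing_ancestors(consistent) is empty, while
missing_ancestors(after_unordered_expiry) reports that "/a/b/c" has lost its
parent "/a/b".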
I do think we want a v2 prune implementation for DynamoDB which works better
(i.e. actually expires directories properly). I think that the authoritative
mode support for DynamoDB will be a big motivator for this: if you are
relying on DDB as the source of truth for listings, then reliable expiry of
stale data becomes more important. I've also been thinking about an online
variant of prune (doing it on demand in the client, perhaps probabilistically /
randomized, or on access).
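As one possible reading of these two ideas, here is a rough Python sketch.
It is not the existing prune code: the table is a plain dict from path to an
assumed "last written" timestamp, and prune, maybe_prune, cutoff and
probability are all hypothetical names. Entries are visited deepest first,
and an expired entry is removed only once nothing remains beneath it, so a
directory is never dropped while a descendant is still present.

```python
import random


def prune(table, cutoff):
    """Remove entries last written before cutoff, children before parents.

    table maps path -> last-written timestamp (hypothetical field).
    An expired entry survives if any descendant is still in the table.
    """
    # Deepest paths first, so children are examined before their parents.
    for path in sorted(table, key=lambda p: p.count("/"), reverse=True):
        if table[path] >= cutoff:
            continue  # still fresh
        prefix = path.rstrip("/") + "/"
        has_children = any(other.startswith(prefix) for other in table)
        if not has_children:
            del table[path]
    return table


def maybe_prune(table, cutoff, probability=0.01, rng=random):
    """Online variant: run prune on a randomly chosen fraction of calls."""
    if rng.random() < probability:
        prune(table, cutoff)
        return True
    return False


example = {"/a": 1, "/a/b": 1, "/a/b/c": 10, "/a/d": 1, "/e": 1, "/e/f": 1}
prune(example, cutoff=5)
```

After the call, "/a" and "/a/b" are kept despite being old because "/a/b/c"
is still fresh, while "/a/d", "/e/f" and then "/e" are removed. The scan for
children is quadratic here; a real implementation would need an indexed
query instead.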
> review s3guard docs & code prior to merge
> -----------------------------------------
>
> Key: HADOOP-14749
> URL: https://issues.apache.org/jira/browse/HADOOP-14749
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: documentation, fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-14749-HADOOP-13345-001.patch,
> HADOOP-14749-HADOOP-13345-002.patch, HADOOP-14749-HADOOP-13345-003.patch,
> HADOOP-14749-HADOOP-13345-004.patch, HADOOP-14749-HADOOP-13345-005.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Pre-merge cleanup while it's still easy to do
> * Read through all the docs, tune
> * Diff the trunk/branch files to see if we can reduce the delta (and hence
> the changes)
> * Review the new tests