Steve Loughran created HADOOP-15297:
---------------------------------------
Summary: Make s3a etag -> checksum publishing an option
Key: HADOOP-15297
URL: https://issues.apache.org/jira/browse/HADOOP-15297
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.1.0
Reporter: Steve Loughran
Assignee: Steve Loughran
HADOOP-15273 shows that distcp doesn't handle non-HDFS filesystems which
serve checksums.
Exposing Etags as checksums, HADOOP-13282, breaks workflows which back up to
s3a.
Rather than revert it, I want to make it an option, off by default. Once we
are happy with distcp in future, we can turn it on.
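
As a rough illustration of what "an option, off by default" could look like,
here is a minimal standalone sketch. It does not use the Hadoop APIs; the class
name, the property name and the map-based configuration are all hypothetical
and only stand in for whatever the real patch ends up using.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of an opt-in "etag as checksum" switch; not S3A code. */
final class EtagChecksumOption {
    // Hypothetical property name, made up for this example.
    public static final String ETAG_CHECKSUM_ENABLED = "example.s3a.etag.checksum.enabled";
    // Off by default, as proposed in the issue.
    public static final boolean DEFAULT_ETAG_CHECKSUM_ENABLED = false;

    private final boolean enabled;

    public EtagChecksumOption(Map<String, String> conf) {
        String value = conf.get(ETAG_CHECKSUM_ENABLED);
        this.enabled = (value == null)
            ? DEFAULT_ETAG_CHECKSUM_ENABLED
            : Boolean.parseBoolean(value.trim());
    }

    /**
     * Returns the etag as a checksum only when the option is switched on.
     * An empty result means "this store serves no checksum".
     */
    public Optional<String> checksumFor(String etag) {
        if (!enabled || etag == null) {
            return Optional.empty();
        }
        return Optional.of(etag);
    }
}
```

With the option unset, callers see no checksum at all, which is the behaviour
from before etags were exposed.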
Why an option? Because it lines up for a successor to distcp which saves src
and dest checksums to a file and can then verify whether or not files have
really changed. Currently distcp relies on the dest checksum algorithm being
the same as the src for incremental updates, but if either of the stores
doesn't serve checksums, it silently downgrades to not checking.
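
The comparison behaviour described above can be sketched roughly as follows.
This is not the distcp code; the types and names are hypothetical. A null
checksum stands for "the store serves none". Reporting differing algorithms as
a separate outcome is an assumption of the sketch: the description only says
that distcp relies on the two being the same.

```java
/** Hypothetical sketch of a src/dest checksum comparison; not distcp code. */
final class ChecksumCompare {

    public enum Result { MATCH, MISMATCH, NOT_CHECKED, DIFFERENT_ALGORITHMS }

    /** Minimal stand-in for a filesystem checksum: algorithm name plus value. */
    public static final class Checksum {
        final String algorithm;
        final String value;

        public Checksum(String algorithm, String value) {
            this.algorithm = algorithm;
            this.value = value;
        }
    }

    public static Result compare(Checksum src, Checksum dest) {
        // If either store serves no checksum, there is nothing to compare:
        // this is the silent downgrade to "not checking".
        if (src == null || dest == null) {
            return Result.NOT_CHECKED;
        }
        // Values from different algorithms cannot be compared meaningfully.
        if (!src.algorithm.equals(dest.algorithm)) {
            return Result.DIFFERENT_ALGORITHMS;
        }
        return src.value.equals(dest.value) ? Result.MATCH : Result.MISMATCH;
    }
}
```

A tool which records both checksums in a file could tell these outcomes apart
later, instead of treating "not checked" the same as "unchanged".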