[
https://issues.apache.org/jira/browse/HADOOP-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939235#comment-17939235
]
ASF GitHub Bot commented on HADOOP-15224:
-----------------------------------------
raphaelazzolini commented on PR #7396:
URL: https://github.com/apache/hadoop/pull/7396#issuecomment-2761425196
> +1
>
> we're good! merging to trunk and cherry-picking (manually) to branch-3.4,
where I'll retest.
>
> Thanks for this work @raphaelazzolini - great code and great tests
Thanks, @steveloughran!
This is the cherry-pick PR for branch-3.4:
https://github.com/apache/hadoop/pull/7550.
I resolved https://issues.apache.org/jira/browse/HADOOP-15224; we can also
resolve https://issues.apache.org/jira/browse/HADOOP-19080, since this change
allows people to use object-lock buckets when they set a checksum.
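For anyone following along, here is a sketch of how the new option might be configured. Note the property name `fs.s3a.create.checksum.algorithm` and the `SHA256` value are assumptions inferred from this PR, not confirmed syntax; check the s3a documentation for the release you run:

```xml
<!-- Hypothetical example: ask S3A to attach a checksum to uploads.
     The property name is an assumption from this PR; verify it against
     the s3a docs for your Hadoop release. -->
<property>
  <name>fs.s3a.create.checksum.algorithm</name>
  <value>SHA256</value>
</property>
```

With a checksum algorithm set, writes to buckets with S3 Object Lock enabled (the HADOOP-19080 scenario) should no longer fail for lack of a required checksum on the upload.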
> Add option to set checksum on S3 object uploads
> -----------------------------------------------
>
> Key: HADOOP-15224
> URL: https://issues.apache.org/jira/browse/HADOOP-15224
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
> Assignee: Raphael Azzolini
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> [~rdblue] reports sometimes he sees corrupt data on S3. Given the MD5 checks
> from upload to S3, it's likelier to have happened in VM RAM, HDD, or nearby.
> If the MD5 checksum for each block were built up as data was written to it,
> and checked against the etag, the RAM/HDD storage of the saved blocks could
> be removed as a source of corruption.
> The obvious place would be
> {{org.apache.hadoop.fs.s3a.S3ADataBlocks.DataBlock}}
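The idea in the description above can be sketched roughly as follows. This is a minimal, hypothetical illustration of building up an MD5 digest incrementally as data is written to a block, not the actual S3A implementation; the class and method names are made up for the example:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: accumulate an MD5 digest as bytes are written to a
// block, so the block's RAM/HDD copy can later be checked against the S3
// etag (which, for a single-part PUT, is the MD5 of the object) without
// re-reading the stored data.
public class ChecksummedBlock {
    private final MessageDigest md5;

    public ChecksummedBlock() {
        try {
            this.md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is guaranteed to be present in every JDK.
            throw new IllegalStateException(e);
        }
    }

    // Called on every write to the block: update the running digest
    // alongside whatever buffering the block does.
    public void write(byte[] data, int off, int len) {
        md5.update(data, off, len);
    }

    // Lower-case hex digest, the form S3 reports as the etag for a
    // non-multipart upload, suitable for direct comparison.
    public String hexDigest() {
        byte[] digest = md5.digest();
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

If the hex digest disagrees with the etag returned by S3, the corruption happened between the write path and the upload, which is exactly the window the description wants to close.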
--
This message was sent by Atlassian Jira
(v8.20.10#820010)