[ https://issues.apache.org/jira/browse/HADOOP-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389953#comment-16389953 ]
Steve Loughran commented on HADOOP-15273:
-----------------------------------------
An initial distcp always does a checksum check during upload. This sequence will fail:
{code}
hadoop fs -rm -R -skipTrash s3a://hwdev-steve-new/\*
hadoop distcp /user/steve/data s3a://hwdev-steve-new/data
{code}
Here's the failure:
{code}
18/03/07 17:08:03 INFO mapreduce.Job: Task Id : attempt_1520388269891_0019_m_000004_2, Status : FAILED
Error: java.io.IOException: File copy failed: hdfs://mycluster/user/steve/data/example.py --> s3a://hwdev-steve-new/data/example.py
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:259)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:217)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://mycluster/user/steve/data/example.py to s3a://hwdev-steve-new/data/example.py
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:256)
    ... 10 more
Caused by: java.io.IOException: Check-sum mismatch between hdfs://mycluster/user/steve/data/example.py and s3a://hwdev-steve-new/data-connectors/.distcp.tmp.attempt_1520388269891_0019_m_000004_2. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareCheckSums(RetriableFileCopyCommand.java:223)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:133)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    ... 11 more
{code}
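The block-size advice is misleading here. A quick sketch of how to see the real difference, asking each store for its checksum over the same paths as above (output shapes are illustrative):
{code}
# HDFS reports its composite CRC checksum, whose algorithm name
# (e.g. MD5-of-0MD5-of-512CRC32C) and value depend on block/chunk sizes
hadoop fs -checksum hdfs://mycluster/user/steve/data/example.py

# with etag checksums enabled, s3a reports an etag-derived checksum instead,
# so the two algorithms differ and no block-size tuning can make them agree
hadoop fs -checksum s3a://hwdev-steve-new/data/example.py
{code}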
You cannot use -skipcrccheck as the options validator forbids it here, yet without it you can't upload from hdfs to s3a now that s3a serves up its checksums as etags.
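For the record, what happens when you try the two workarounds the error message suggests, with the same paths as above (annotations are my reading of each failure):
{code}
# -pb preserves the HDFS block size on the copy, but the mismatch here is the
# checksum algorithm, not the block size, so the check still fails
hadoop distcp -pb hdfs://mycluster/user/steve/data s3a://hwdev-steve-new/data

# -skipcrccheck never gets that far: the options validator only accepts it
# alongside -update, so a plain initial upload can't turn the check off
hadoop distcp -skipcrccheck hdfs://mycluster/user/steve/data s3a://hwdev-steve-new/data
{code}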
> distcp can't handle remote stores with different checksum algorithms
> --------------------------------------------------------------------
>
> Key: HADOOP-15273
> URL: https://issues.apache.org/jira/browse/HADOOP-15273
> Project: Hadoop Common
> Issue Type: Bug
> Components: tools/distcp
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Priority: Critical
> Attachments: HADOOP-15273-001.patch
>
>
> When using distcp without {{-skipcrccheck}}, if there's a checksum mismatch
> between src and dest store types (e.g. hdfs to s3), then the error message
> will talk about blocksize, even when it's the underlying checksum protocol
> itself which is the cause of the failure:
> bq. Source and target differ in block-size. Use -pb to preserve block-sizes
> during copy. Alternatively, skip checksum-checks altogether, using -skipCrc.
> (NOTE: By skipping checksums, one runs the risk of masking data-corruption
> during file-transfer.)
> update: the CRC check always takes place on a distcp upload before the file
> is renamed into place. *and you can't disable it then*