distcp from plain java program

Hendrik Haddorp Sat, 21 Apr 2018 07:01:37 -0700

Hi,

I'm trying to use distcp (org.apache.hadoop.tools.DistCp) from a simple Java program to copy files from HDFS to S3 storage. This works quite well, except that it is very slow. Copying the files to the local disk is not much faster either. It seems that the files are copied sequentially. My understanding, however, was that distcp would create map jobs that could be executed in parallel. Is there any configuration setting required to get the map jobs executed in parallel?

Thanks,
Hendrik
