Hi,
I'm trying to use distcp (org.apache.hadoop.tools.DistCp) from a
simple Java program to copy files from HDFS to S3 storage. This works,
but it is very slow. Copying the files to the local disk is not much
faster either. It looks as if the files are copied sequentially. My
understanding, however, was that distcp creates map tasks that can be
executed in parallel. Is there a configuration setting required to get
the map tasks to run in parallel?
Thanks,
Hendrik