Hendrik,
Have you tried raising maxMaps (the -m option on the distcp command line)?
The default is 20, so a larger value may allow more copies to run in parallel.
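
In case it helps, here is a minimal sketch of building the argument list with
a higher map limit. The class name DistCpArgs, the value 50, and both paths are
made up for illustration; only the -m option itself comes from distcp. I am
assuming your program passes string arguments to DistCp the way the command
line does (for example through org.apache.hadoop.util.ToolRunner).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles distcp-style arguments with a map limit.
public class DistCpArgs {

    // Returns {"-m", "<maxMaps>", source, target}.
    static String[] buildArgs(int maxMaps, String source, String target) {
        if (maxMaps < 1) {
            throw new IllegalArgumentException("maxMaps must be at least 1");
        }
        List<String> args = new ArrayList<>();
        args.add("-m");                       // upper bound on map tasks
        args.add(Integer.toString(maxMaps));
        args.add(source);
        args.add(target);
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Placeholder paths; substitute your own HDFS and S3 locations.
        String[] args = buildArgs(50, "hdfs://namenode/data", "s3a://my-bucket/data");
        System.out.println(String.join(" ", args));
        // Assumption: with Hadoop on the classpath, these arguments would be
        // handed to DistCp wherever your program currently invokes it.
    }
}
```

Note that -m is only an upper bound, so the number of maps actually started
can still be lower than the value you pass.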

-Gour 

On 4/21/18, 7:01 AM, "Hendrik Haddorp" <[email protected]> wrote:

    Hi,
    
    I'm trying to use distcp (org.apache.hadoop.tools.DistCp) from a 
    simple Java program to copy files from HDFS to S3 storage. This works 
    fine, except that it is very slow. Copying the files to the local 
    disk is not much faster either. It seems that the files are copied 
    sequentially. My understanding, however, was that distcp would create 
    map tasks that can be executed in parallel. Is any configuration 
    setting required to get the map tasks executed in parallel?
    
    thanks,
    Hendrik
    
    
    
