For each file inside the directory $output, I cat the file and generate a
SHA-256 hash. The script takes 9 minutes to read the 105 files (556 MB in
total) and generate the digests. Is there a way to make this script faster?
Maybe by generating the digests in parallel?

count=0
for path in $output
do
    # Stream the file from HDFS and keep only the hash field of sha256sum's output
    digests[$count]=$( "$HADOOP_HOME"/bin/hdfs dfs -cat "$path" | sha256sum | awk '{ print $1 }' )
    (( count++ ))
done
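
Something like the sketch below is what I have in mind for the parallel
version. It is untested and assumes bash, GNU xargs with -P support, and
paths without whitespace; instead of filling the digests array it writes
"path digest" pairs to a placeholder file (digests.txt), and the output
order may not match the input order:

# Hash one HDFS file and print "<path> <sha256>".
hash_one() {
    "$HADOOP_HOME"/bin/hdfs dfs -cat "$1" | sha256sum | awk -v p="$1" '{ print p, $1 }'
}
export -f hash_one
export HADOOP_HOME

# Run up to 8 hashes at a time; -n 1 passes one path per invocation.
printf '%s\n' $output | xargs -n 1 -P 8 bash -c 'hash_one "$0"' > digests.txt

Would that be a reasonable approach, or is there a better way to do this
against HDFS?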


Thanks,
