Lennart Schultz wrote:
> Bash Version: 4.0
> Patch Level: 10
> Release Status: release
>
> Description:
>
> I have a bash script which reads about 250000 lines of xml code, generating
> about 850 files with information extracted from the xml file.
> It uses the construct:
>
> while read line
> do
>   case "$line" in
>   ....
> done < file
>
> and this takes a little less than 2 minutes.
>
> Trying to use mapfile, I changed the above construct to:
>
> mapfile < file
> for i in "${MAPFILE[@]}"
> do
>   line=$(echo $i) # strip leading blanks
>   case "$line" in
>   ....
> done
>
> With this change the job now takes more than 48 minutes. :(
The most important thing is using the right tool for the job. If you have
to introduce a command substitution for each line read with mapfile, you
probably don't have the problem mapfile is intended to solve: quickly
reading exact copies of lines from a file descriptor into an array. If
another approach works better, you should use it.

If you're interested in why the mapfile solution is slower, you could run
the loop using a version of bash built for profiling and check where the
time goes. I believe you'd find that the command substitution is
responsible for much of it, and the rest is due to the significant
increase in memory usage resulting from the 250000-line array (which also
slows down fork and process creation).

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
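[A sketch of the point about the per-line command substitution: the leading
blanks can be stripped with parameter expansion inside the shell itself,
avoiding the fork that `line=$(echo $i)` costs on every iteration. The
sample data and variable names below are hypothetical, standing in for the
original 250000-line xml file; `${i##+([[:blank:]])}` requires extglob.]

```shell
#!/bin/bash
shopt -s extglob

# Hypothetical two-line sample standing in for the reporter's xml file.
sample=$'  <a>1</a>\n\t<b>2</b>'

mapfile -t lines <<< "$sample"

results=()
for i in "${lines[@]}"; do
    # Strip leading blanks/tabs with parameter expansion -- no subshell,
    # unlike line=$(echo $i), which forks once per line.
    line=${i##+([[:blank:]])}
    case "$line" in
        '<a>'*) results+=("a: $line") ;;
        *)      results+=("other: $line") ;;
    esac
done

printf '%s\n' "${results[@]}"
```

[With the expansion doing the trimming, each iteration stays in the current
shell process, so the per-line cost is a string operation rather than a
fork plus exec of nothing.]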