On 9/23/16 7:15 PM, Tom McCurdy wrote: > Bash Version: 4.3 > Patch Level: 11 > Release Status: release > > Description: > Two nearly identical "For Loop" setups have large deltas in > performance. See test program below. Was confirmed in IRC chat room by > multiple users. Input is very noticeable with around 100,000 values. > > The script was originally used to take an input file where the first line > would determine how many int values were to follow. The remaining lines > each had values that were sorted, and then the smallest difference between > any two values were found. > > Repeat-By: > > Using script below comment/uncomment each method to test. > On my machine > Method-1 finishes in around 2 seconds > Method-2 finishes in around 72 seconds > > > ---- Code Below (also attached script and test input file) --- > START=$(date +%s.%N) > > read -r N && mapfile -t Pi < <(sort -n) > #echo "Done" > > min=10000000 > > #Find the smallest difference between two numbers in sorted array with max > array element = 10000000 > > # For-Loop Method-1: is super fast > for (( i=0; i<N-1; i++ )) > do > current=${Pi[$i]} > next=${Pi[$i+1]} > diff=$(( next - current)) > if [ $diff -lt $min ] > then > min=$diff > fi > done
Yes, bash array access is heavily optimized for (increasing) sequential access, the most common case. Bash arrays don't have a size limit, and they're allowed to be very sparse, so the internal implementation is a doubly-linked list. You can make this very fast by keeping a roving pointer that indicates the last-accessed element. Now, what bash does as a consequence of this roving pointer optimization is bias towards references that increase monotonically. Christian Franke hit on the reason that the first construct is so much faster: the reference to Pi[i+1] in the second for loop construct increments the last-referenced element, and the immediate reference of Pi[i] forces bash into a search for it from the beginning of the array. This potentially happens twice: once for the test, and once for the assignment. There are a couple of other ways to speed this up, but that doesn't help you right now, of course. I'll be looking at additional optimizations before bash-5.0 gets released. You'd be better off sticking with method 1 for the time being. The other details don't really make all that much difference, though I will look at the `let' anomaly Christian noticed. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU c...@case.edu http://cnswww.cns.cwru.edu/~chet/