Hello, I am reporting a problem with performance, not correctness.
While preparing some examples for a course lecture where I code the same algorithm in many languages to compare them, I ran some code that was reasonably quick with ksh but would apparently just hang at 100% CPU in bash. I finally let it run overnight and it does complete correctly in bash, but what takes ksh less than a minute takes bash 6 1/2 hours (keeping one core at 100% for the entire 6.5 hours) on the same hardware. I suspect there may be some special way to compile bash that I don't know about that maybe works with arrays differently, so I am reporting this.

I am not subscribed, so please cc: me. I cannot use bashbug since my university blocks outgoing mail.

I used exactly the same file, unmodified, for the tests in ksh and bash. My hope is that bash would be at least 'competitive' and complete it without being more than 10x slower. As it is, I cannot use bash in the lecture (for this), since the lecture is only 3 hours long and the program won't complete in that amount of time.

    BLS2 $bash --version
    GNU bash, version 4.2.20(2)-release (x86_64-unknown-linux-gnu)
    Copyright (C) 2011 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    BLS2 $file /bin/bash
    /bin/bash: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.38, stripped

Compiled like this:

    CC="gcc" \
    CFLAGS="-O2 -march=x86-64 -mtune=generic -pipe -fPIC -DPIC" \
    ./configure \
      --prefix=/usr \
      --sysconfdir=/etc \
      --libdir=/lib \
      --sbindir=/sbin \
      --bindir=/bin \
      --mandir=/usr/man \
      --infodir=/usr/info \
      --with-curses

The program run was a simple prime number sieve using an array with only 500000 elements:

    #! /bin/bash
    if ((${#}==1))
    then
      n=${1}
    else
      n=500000
    fi
    for ((i=1;i<=n;i++))
    do
      ((a[i]=1))
    done
    for ((i=2;i<=(n/2);i++))
    do
      for ((j=2;j<=(n/i);j++))
      do
        ((k=i*j))
        ((a[k]=0))
      done
    done
    for ((i=1;i<=n;i++))
    do
      if ((a[i]!=0))
      then
        printf "%d\n" ${i}
      fi
    done
    exit 0

When run like:

    time bash ./sieve.sh
    .....
    499717
    499729
    499739
    499747
    499781
    499787
    499801
    499819
    499853
    499879
    499883
    499897
    499903
    499927
    499943
    499957
    499969
    499973
    499979

    real 396m9.884s
    user 395m43.102s
    sys 0m8.913s

The exact same file run with:

    time ksh ./sieve.sh
    ....
    499717
    499729
    499739
    499747
    499781
    499787
    499801
    499819
    499853
    499879
    499883
    499897
    499903
    499927
    499943
    499957
    499969
    499973
    499979

    real 0m34.835s
    user 0m34.368s
    sys 0m0.147s

The ksh in question, if it matters, is:

    LS2 $ksh --version
    version sh (AT&T Research) 93u 2011-02-08

    BLS2 $file /bin/ksh
    /bin/ksh: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped

The computer is a Core2 Duo with 3GB of RAM running a pure 64-bit Linux distribution with kernel 3.2.9, gcc 4.5.3 and glibc 2.14.1.

Both programs get the same answer, so this is not a correctness issue but a performance issue.
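For anyone who wants to reproduce the scaling without waiting hours, here is a minimal sketch of a driver that runs the same sieve at a few small sizes and times each run. The function name sieve_count and the sizes 1000/2000/4000 are my own choices for illustration and are not part of the script I actually ran; it also counts the surviving entries instead of printing them, and uses a plain assignment for the zeroing step so the script does not stop if someone runs it under `set -e`.

```shell
#!/bin/bash
# Sketch only: sieve_count is a hypothetical helper, not the script from
# the report. It runs the same three loops and echoes how many array
# entries are still nonzero (index 1 plus every prime <= n).
sieve_count() {
    local n=$1 i j k c=0
    local -a a

    # Phase 1: mark every index 1..n
    for ((i=1;i<=n;i++)); do
        ((a[i]=1))
    done

    # Phase 2: clear every multiple i*j <= n
    for ((i=2;i<=(n/2);i++)); do
        for ((j=2;j<=(n/i);j++)); do
            ((k=i*j))
            # Plain assignment: ((a[k]=0)) evaluates to 0 and would
            # return a non-zero exit status.
            a[k]=0
        done
    done

    # Phase 3: count what is left
    for ((i=1;i<=n;i++)); do
        if ((a[i]!=0)); then
            c=$((c+1))
        fi
    done
    echo "$c"
}

# Time a few small sizes; if the cost per element grows with n, the
# times will grow much faster than n does.
for n in 1000 2000 4000; do
    echo "n=$n"
    time sieve_count "$n" >/dev/null
done
```

Raising the sizes in the final loop (for example towards the 29999 region) should make any non-linear growth easier to see, at the cost of longer runs.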
System limits for the user running the code (sorry about the way gmail ruins column alignment) show there were no small limits, and I've got overcommit off for the machine and a 3GB swap partition 'just in case' that wasn't used on either run:

    BLS2 $ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 24069
    max locked memory       (kbytes, -l) 1048576
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 24934
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) unlimited
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 1024
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

Experimenting a bit shows me that at 9999 elements bash is still reasonably fast, but at 29999 elements it takes:

    real 0m39.077s
    user 0m38.807s
    sys 0m0.150s

For that, ksh takes:

    real 0m1.631s
    user 0m1.560s
    sys 0m0.007s

Perhaps that shorter total time still shows the problem dramatically enough that runs of that size can be used to track down the problem without having to wait hours for the test runs.

JGH