On 9/24/10 8:19 AM, Greg Wooledge wrote:
> On Thu, Sep 23, 2010 at 10:12:28PM +0900, sky wrote:
>> #
>> # prepare 1000 strings of 6 digits
>> #
>> TEST_LIST=`seq 100100 100 200000`
>> echo $TEST_LIST | wc
>
> Actually, this is one gigantic string, not 1000 strings.
>
>> #
>> # delete "150000"
>> #
>> T0=$SECONDS
>> A=${TEST_LIST//150000}
>> T1=$SECONDS
>> B=`echo $TEST_LIST | sed s/150000//g`
>> T2=$SECONDS
>
> Yes, it's known that operations on very large strings in bash can take
> a long time.  (Chet may be able to address that problem; I can't.)

The problem involves performing pattern substitution on very large
strings when the locale indicates that multibyte characters are
supported.  The solution is to bound the number of comparisons the
matcher has to do, which reduces the number of multibyte string
conversions and wide character comparisons necessary.

I've done a lot of work on this, and the changes will be in bash-4.2.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
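
For anyone who wants to check whether the locale is the factor on their own machine, here is a minimal sketch based on the test quoted above. It runs the same substitution once under the inherited locale and once in a subshell with LC_ALL=C. The assumption (not stated in the message above) is that the C locale lets the matcher skip the multibyte conversions; that is only safe here because the test data is plain ASCII digits. The variable B is reused for the second result, and the timings have whole-second resolution, so on a fast machine or a newer bash both may print 0.

```shell
#!/usr/bin/env bash
# Sketch only: compare pattern substitution under the current locale
# and under the C locale. Not a fix, just a way to observe the effect.

# One large newline-separated string of 1000 six-digit numbers.
TEST_LIST=$(seq 100100 100 200000)

# Substitution under whatever locale the shell inherited.
T0=$SECONDS
A=${TEST_LIST//150000}
T1=$SECONDS

# Same substitution in a subshell with LC_ALL=C (assumption: this avoids
# the multibyte code path). The locale change does not leak into the
# parent shell because command substitution runs in a subshell.
B=$(LC_ALL=C; printf '%s' "${TEST_LIST//150000}")
T2=$SECONDS

echo "current locale: $((T1 - T0))s, C locale: $((T2 - T1))s"
if [ "$A" = "$B" ]; then
    echo "results match"
fi
```

If the two results differ, the data contains bytes that the two locales interpret differently, and the C-locale shortcut should not be used for it.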