Bruno Haible wrote:
> Hello Ralf,
>
> Thank you for your speedups to gnulib-tool. At first I was, of course,
> excited about the 2x speedup. But when looking at the maintainability
> of the code that you propose, I'm not fine with all of it any more.
>
> My four objections are:
>
> 1) You observe that forking programs in a shell script is slow, and
> therefore propose to use more shell built-ins. The problem with it
> is that I chose to implement gnulib-tool in shell (for the control
> structure) and sed (for the text processing).
>
> If you want to achieve good speedups for scripts that use 'sed':
> can you work towards making 'sed' a bash built-in? This is challenging,
> but if you are after performance, that would be promising.

The only microoptimization that is worth doing to speed up shell scripts is to avoid forking. This is *no exaggeration*. (I am not talking about algorithmic improvements, though in some cases, as Ralf showed, forking can hide the benefits of algorithmic improvements.)

We (Ralf, Eric and I) saved over 30% of the execution time of Autoconf scripts on Cygwin, and a few percent on Linux too, just by removing one or two forks here and there.

It's not about making sed a bash builtin, it's about not forking for things such as

  (...)
  echo abc | ...

A while ago I took a lot of timings of various shell constructs; you can find them on the Autoconf list. Here are the relevant ones:

  $ time sh -c 'for i in `seq 1 1000`; do :; done'
  user 0m0.034s
  sys 0m0.024s

  $ time sh -c 'for i in `seq 1 1000`; do (:); done'
  user 0m0.486s
  sys 0m2.377s

  $ time sh -c 'for i in `seq 1 1000`; do echo abc | :; done'
  user 0m0.958s
  sys 0m4.657s

echo and : are shell builtins, but in a subshell or a pipeline they still fork, so they're slow. s/:/sed/ and you see my point. If this 10x-30x improvement affected 20% of the shell execution time, one could expect a decent speedup.

That would probably amount to a rewrite of bash, dash, or whatever else. You would have to make the main shell loop centered around event-driven processing of file descriptors, to provide all the pipes with a single process. A fun project, but probably not one that I or anyone else will attempt without funding. :-)

Paolo
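
As an illustration of the kind of rewrite discussed above, here is a minimal sketch of replacing an "echo ... | sed" pipeline with POSIX parameter expansion, which runs entirely inside the shell. The function names and the "result" variable are hypothetical and are not taken from gnulib-tool or Autoconf; the suffix rewrite is just an assumed example task.

```shell
# Hypothetical helpers: both map "foo.c" to "foo.o" and leave
# other names unchanged.

# Forking variant: the pipeline starts extra processes, one of
# them an external sed.
to_obj_fork() {
  echo "$1" | sed 's/\.c$/.o/'
}

# Fork-free variant: only a case statement and parameter expansion.
# The answer is left in the variable "result" so that the caller
# does not need a $(...) command substitution, which would fork again.
to_obj_nofork() {
  case $1 in
    *.c) result=${1%.c}.o ;;
    *)   result=$1 ;;
  esac
}
```

A caller would write "to_obj_nofork foo.c" and then read "$result". Returning the value through a variable is less tidy than printing it, and that loss of tidiness is the maintainability cost the quoted objection is about.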