Collin Funk wrote:
> > Wow it is very much faster!  \o/
> 
> That result is interesting. I wonder what part takes the shell script
> so long.

Simon's oath-toolkit package invokes gnulib-tool 5 times. So, the average
run time of each gnulib-tool.sh invocation was 9 sec, which is not
terribly long.

It looks like each of the steps (computing the set of modules, computing
the set of files, actually copying the files) takes a couple of seconds.

IMO, the main problem with sh or bash as a scripting language is that
it lacks in-process string processing facilities. Simply dissecting
a string like
   'foo    [test $HAVE_FOO = 0]'
into 'foo' and 'test $HAVE_FOO = 0' takes one or more pipes and 'sed'
invocations. Even if 'sed', like 'echo', were a shell built-in, there
would still be the overhead of a fork() and a pipe() system call.
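
For illustration, such a dissection in portable sh typically looks
something like this (a sketch of mine, not the actual gnulib-tool.sh
code):

   line='foo    [test $HAVE_FOO = 0]'
   # Each command substitution forks a subshell plus a 'sed' process
   # and sets up a pipe, just to split one string in two:
   module=`echo "$line" | sed -e 's/ *\[.*//'`
   condition=`echo "$line" | sed -e 's/^[^[]*\[//' -e 's/]$//'`

Repeat that for hundreds of module descriptions, and it becomes clear
where the seconds go.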

This way of doing computations, i.e. delegating them to
specialized programs, is influenced by
  (1) the "Unix philosophy" [1],
  (2) composition through program invocation.

(2) has nowadays mostly been replaced by composition through libraries.
It was an artifact of the small memory sizes (max. 64 KB) that a program
could use up to ca. 1987. Even on a computer with 1 MB of RAM, you
could not map 25 libraries of 10 KB each into the 64 KB address space
of a single process. Thus, developers used program invocation as a way
of building larger software from small pieces.

Nowadays, composition through libraries is the predominant approach
for building larger software. Thus, most programming environments and
languages contain string manipulation libraries, numerical libraries,
networking libraries, etc., all available in the same process. But
the sh / bash language still offers none of this: it still uses
subprocesses and pipes for even the most basic things.
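
A few examples (mine, not taken from gnulib-tool.sh): even such basic
operations as taking a string's length, changing its case, or picking
out a field usually mean spawning an external program:

   len=`printf %s "$s" | wc -c`                     # string length (bytes)
   upper=`echo "$s" | tr '[:lower:]' '[:upper:]'`   # case conversion
   field=`echo "$s" | cut -d: -f2`                  # field extraction

Each of these costs at least a fork(), an exec() and a pipe(), where
other languages make a single in-process library call.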

Bruno

[1] https://en.wikipedia.org/wiki/Unix_philosophy
