Re: Parallelization of shell scripts for 'configure' etc.

2022-06-18 Thread L A Walsh

On 2022/06/13 15:39, Paul Eggert wrote:
In many Gnu projects, the 'configure' script is the biggest barrier to 
building because it takes so long to run. Is there some way that we 
could improve its performance without completely reengineering it, by 
improving Bash so that it can parallelize 'configure' scripts?


I don't know what type of instrumentation you've done over configure,
but before investing much time in optimization, it might be interesting
to know where most of the time is being spent.

I.e., CPU or I/O -- and what types of I/O -- actual test I/O or executable
load time.


The reason I say that is that having run configure for the same projects
on linux and on cygwin -- I note that the cygwin version is MUCH slower
doing the same work on the same machine.

A big slowdown in cygwin is loading & starting a new executable/binary.
I.e. loading 100 programs 10x each will take a disproportionately higher time
on cygwin due to its exec-load penalty (since windows has no fork, all of the
memory-space duplication (and later copies-on-write) has to be done manually
in cygwin -- very painful).  But note that one of the big boosts in
shell scripts can come from using a parallel or batching option of a util vs.
feeding in file/pathnames one at a time, like using 'find ... -exec rm {} \;'
or similar.
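
To illustrate the difference (paths here are just examples; GNU find and
xargs assumed):

    # one 'rm' process per file -- N fork+exec round trips
    find build/tmp -name '*.o' -exec rm {} \;

    # batched: find packs as many names as fit onto each command line,
    # so 'rm' is exec'd only a handful of times
    find build/tmp -name '*.o' -exec rm {} +

    # same batching, plus optional parallelism, via xargs
    find build/tmp -name '*.o' -print0 | xargs -0 -P4 rm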

Similarly, a big speed up in configure might be to use the bundled version of
coreutils (all binaries in 1 image invoked via different command names), and
put that in the same binary as bash, perhaps via a loadable command, with
any following core-util calls being routed "in-binary" to the already loaded
version.  Of course it would likely not be trivial assuring all the commands
can be re-invoked so that their necessary initializations are redone on each
"in-image" launch, but keeping all the coreutil binaries "in-memory" would,
I think, be a big win even if it wasn't multi-threaded.

Of course it might be of benefit if the various utils were all thread safe,
so a more powerful dispatcher could use multi-threading w/o worries about
thread safety, but just eliminating most of the redundant util-loads might be
a huge win by itself.  That's sorta why I was wondering how much
perf-profiling had been done on config(.sh)...

Anyway -- just some random thoughts...


For ideas about this, please see PaSh-JIT:

Kallas K, Mustafa T, Bielak J, Karnikis D, Dang THY, Greenberg M, 
Vasilakis N. Practically correct, just-in-time shell script 
parallelization. Proc OSDI 22. July 2022. 
https://nikos.vasilak.is/p/pash:osdi:2022.pdf


I've wanted something like this for *years* (I assigned a simpler 
version to my undergraduates but of course it was too much to expect 
them to implement it) and I hope some sort of parallelization like this 
can get into production with Bash at some point (or some other shell if 
Bash can't use this idea).


  




Re: Parallelization of shell scripts for 'configure' etc.

2022-06-18 Thread Tim Rühsen

Hi all,

On 14.06.22 00:39, Paul Eggert wrote:
In many Gnu projects, the 'configure' script is the biggest barrier to 
building because it takes so long to run. Is there some way that we 
could improve its performance without completely reengineering it, by 
improving Bash so that it can parallelize 'configure' scripts?


A faster configure script execution indeed is something I'd love to see.
The title of this thread implies that we *only* want to discuss 
parallelization - maybe we can generalize this to "Making configure 
scripts run faster"?


[A little side-note: the invocation of gnulib-tool is *far* slower than 
running the configure scripts, for the projects that I work on.

But surely this is a problem of its own.]

I see two main setups when running configure scripts. Speeding up the 
execution has several possible solutions for each of the setups (with 
overlaps, of course).


a) The maintainer/contributor/hacker setup
This is when you re-run configure relatively often for the same project(s).
I do this normally and came up with 
https://gitlab.com/gnuwget/wget2/-/wikis/Developer-hints:-Increasing-speed-of-GNU-toolchain. 
It may be a bit outdated, but may help one or the other here.

Btw, I am down to 2.5s for a ./configure run from 25s originally.
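
For reference, the usual suspects are configure's own cache plus a compiler
cache; a sketch (not necessarily identical to what the wiki lists, and the
cache variable below is just an example):

    # reuse configure results between runs in the same tree
    ./configure -C                      # reads/writes config.cache

    # share precomputed answers via a site file
    # (see the cross-project caching caveat under b) below)
    export CONFIG_SITE=$HOME/.config/config.site
    echo 'ac_cv_func_malloc_0_nonnull=yes' >> "$CONFIG_SITE"

    # cache the compiler calls made by the compile-and-link tests
    ./configure CC="ccache gcc"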

b) The one-time build setup
This is people building + installing from a tarball, and automated build 
systems (e.g. CI) with regular OS updates. I also think of systems like 
Gentoo where you build everything from source.
As Alex Ameen pointed out, using a global configure cache across 
different projects may be insecure.

Also, people often want to use optimization in this case.
And installing ccache is not likely when people just want to 
build+install a single project.


I personally see a) as solved, at least for me.

b) is a problem because
1. People start to complain about the slow GNU build system (autotools), 
which drives new projects away from using autotools and possibly drives 
people away from GNU in general. Or in other words: let's not eat 
up people's precious time unnecessarily when building our software.


2. Building software at large scale eats tons of energy. If we could 
reduce the energy consumption, it gives us at least a better feeling.



What can we do to solve b)?
I guess we first need to analyze/profile the configure execution.
For this I wrote a little tool some years ago: 
https://gitlab.com/rockdaboot/librusage.
It's simple to build and use and gives some numbers on which (external) 
commands are executed - fork+exec are pretty heavy.
[Configure for wget2 runs 'rm' and 'cat' each roughly 2000x - so I came 
up with enabling plugins for those two commands (I had to write a plugin 
for 'rm'; not sure whether it was ever accepted by bash upstream).]
Maybe we can create plugins for other highly used commands as well, 
e.g. for sed!?
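
For reference, enabling the existing loadables looks roughly like this (the
path is what Debian's bash-builtins package uses; other distros differ, and
'rm' may only exist in bash's examples/loadables tree):

    # make 'cat' (and 'rm', if its loadable was built) shell builtins for
    # the rest of the script -- no fork+exec per invocation any more
    enable -f /usr/lib/bash/cat cat
    enable -f /usr/lib/bash/rm  rm 2>/dev/null || true

    type cat rm    # check: builtins now take precedence over the externals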


The output of the tool also roughly allows one to see where the time goes - 
it's beyond my spare time to go deeper into this right now.

Please test yourself and share some numbers.
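
If librusage isn't at hand, a rough count of spawned commands can also be
taken with strace (Linux assumed):

    # log every execve() done by configure and its children, then tally
    strace -f -e trace=execve -o /tmp/configure.trace ./configure >/dev/null 2>&1
    grep -o 'execve("[^"]*"' /tmp/configure.trace | sort | uniq -c | sort -rn | head -20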

Another option is to group tests, e.g. if test 1 is X, we also know the 
results for tests 2,3,4,... Or we group several tests into a single C 
file, if possible. Just an idea (sounds a bit tedious, though).


Parallelism... can't we do that with &, at least for well-known / 
often-used tests?
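
A hand-rolled sketch of what that could look like for independent checks
(illustrative helper and file names, not real autoconf output; assumes a
cc on PATH):

    # run independent compile tests in the background and collect the answers
    check_header() {   # $1 = header name, $2 = file to store yes/no in
        if printf '#include <%s>\nint main(void){return 0;}\n' "$1" \
           | cc -x c -c -o /dev/null - 2>/dev/null
        then echo yes > "$2"; else echo no > "$2"; fi
    }

    check_header stdint.h   has_stdint  &
    check_header pthread.h  has_pthread &
    check_header sys/mman.h has_mman    &
    wait    # the three tests ran concurrently

    printf 'stdint.h=%s pthread.h=%s sys/mman.h=%s\n' \
        "$(cat has_stdint)" "$(cat has_pthread)" "$(cat has_mman)"

The tricky part, of course, is knowing which tests really are independent.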


Family calls...

Regards, Tim

