Re: Parallelization of shell scripts for 'configure' etc.
On 2022/06/13 15:39, Paul Eggert wrote:
> In many GNU projects, the 'configure' script is the biggest barrier to
> building because it takes so long to run. Is there some way that we
> could improve its performance without completely reengineering it, by
> improving Bash so that it can parallelize 'configure' scripts?

I don't know what kind of instrumentation you've done on configure, but before investing much time in optimization, it would be interesting to know where most of the time is being spent: CPU or I/O, and if I/O, what kind -- actual test I/O or executable load time.

The reason I say that is that, having run configure for the same projects on Linux and on Cygwin, I note that the Cygwin run is MUCH slower doing the same work on the same machine. A big slowdown on Cygwin is loading and starting a new executable: loading 100 programs 10x each takes disproportionately longer there because of its exec-load penalty. (Since Windows has no fork, all of the memory-space duplication, and the later copy-on-write, has to be done manually in Cygwin -- very painful.)

Note, though, that one of the big boosts in shell scripts can come from using a utility's ability to take many pathnames per invocation rather than feeding in file/pathnames one at a time, as with 'find -exec rm {} \;' or similar (see the batching sketch at the end of this message).

Similarly, a big speedup in configure might come from using the bundled single-binary build of coreutils (all the utilities in one image, invoked via different command names) and putting that in the same binary as bash, perhaps via a loadable command, with any following coreutils calls being routed "in-binary" to the already-loaded version (see the loadable-builtin sketch at the end of this message). It would likely not be trivial to ensure that all the commands can be re-invoked, with their necessary initializations redone on each "in-image" launch, but keeping all the coreutils binaries in memory would, I think, be a big win even without multi-threading. It might also help if the various utilities were all thread-safe, so that a more powerful dispatcher could use multi-threading without worrying about thread safety -- but just eliminating most of the redundant utility loads might be a huge win by itself.

That's sort of why I was wondering how much perf-profiling had been done on config(.sh)... Anyway -- just some random thoughts...

For ideas about this, please see PaSh-JIT:

  Kallas K, Mustafa T, Bielak J, Karnikis D, Dang THY, Greenberg M,
  Vasilakis N. Practically correct, just-in-time shell script
  parallelization. Proc OSDI 22. July 2022.
  https://nikos.vasilak.is/p/pash:osdi:2022.pdf

I've wanted something like this for *years* (I assigned a simpler version to my undergraduates, but of course it was too much to expect them to implement it), and I hope some sort of parallelization like this can get into production with Bash at some point (or some other shell, if Bash can't use this idea).
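To illustrate the batching point: POSIX find can pass many pathnames per rm invocation when the -exec clause is terminated with '+' instead of ';', and GNU xargs can additionally spread the work over several processes. A minimal comparison (the paths are illustrative):

    # One fork+exec of rm per file -- slow:
    find build/ -name '*.o' -exec rm {} \;

    # One rm invocation per large batch of pathnames (POSIX):
    find build/ -name '*.o' -exec rm {} +

    # Batched and spread over 4 parallel processes (GNU find/xargs):
    find build/ -name '*.o' -print0 | xargs -0 -P4 rm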
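And on the "in-binary" idea: bash can already load individual utilities as builtins at runtime via 'enable -f', which skips fork+exec entirely for those commands. A rough sketch, assuming the example loadables shipped with bash are installed and bash is 4.4 or newer (on Debian/Ubuntu the bash-builtins package puts them under /usr/lib/bash; the path varies by distribution):

    # Tell bash where to find loadable builtins (path is illustrative):
    BASH_LOADABLES_PATH=/usr/lib/bash

    # Load 'sleep' and 'mkdir' as builtins; they now run inside the
    # shell process, with no fork+exec:
    enable -f sleep sleep
    enable -f mkdir mkdir

    type sleep    # reports that sleep is now a shell builtin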
Re: Parallelization of shell scripts for 'configure' etc.
Hi all,

On 14.06.22 00:39, Paul Eggert wrote:
> In many GNU projects, the 'configure' script is the biggest barrier to
> building because it takes so long to run. Is there some way that we
> could improve its performance without completely reengineering it, by
> improving Bash so that it can parallelize 'configure' scripts?

Faster configure script execution is indeed something I'd love to see. The title of this thread implies that we *only* want to discuss parallelization -- maybe we can generalize it to "Making configure scripts run faster"?

[A little side note: for the projects I work on, invoking gnulib-tool is *far* slower than running the configure scripts. But surely that is a problem of its own.]

I see two main setups for running configure scripts. Each has several possible ways to speed up execution (with overlaps, of course).

a) The maintainer/contributor/hacker setup

This is when you re-run configure relatively often for the same project(s). I do this regularly and came up with https://gitlab.com/gnuwget/wget2/-/wikis/Developer-hints:-Increasing-speed-of-GNU-toolchain. It may be a bit outdated, but it may help one or the other here. Btw, I am down to 2.5s for a ./configure run, from 25s originally. (A sketch of the caching part is at the end of this message.)

b) The one-time build setup

This is people building and installing from a tarball, plus automated build systems (e.g. CI) with regular OS updates. I also think of systems like Gentoo, where you build everything from source. As Alex Ameen pointed out, using a global configure cache across different projects may be insecure. Also, people often want compiler optimizations in this case, and installing ccache is not likely when people just want to build and install a single project.

I personally see a) as solved, at least for me. b) is a problem because:

1. People start to complain about the slow GNU build system (autotools), which drives new projects away from autotools and possibly drives people away from GNU in general. Or in other words: let's not eat up people's precious time unnecessarily when they build our software.

2. Building software at large scale eats tons of energy. If we could reduce the energy consumption, it would at least give us a better feeling.

What can we do to solve b)?

I guess we first need to analyze/profile the configure execution. For this I wrote a little tool some years ago: https://gitlab.com/rockdaboot/librusage. It is simple to build and use, and it reports how often each (external) command is executed -- fork+exec is pretty heavy. (A rough strace-based alternative is sketched at the end of this message.)

[Configure for wget2 runs 'rm' and 'cat' roughly 2000x each -- so I came up with enabling bash plugins (loadable builtins) for those two commands. I had to write the plugin for 'rm'; I am not sure whether it was ever accepted by bash upstream.]

Maybe we can create plugins for other heavily used commands as well, e.g. for sed!? The output of the tool also roughly shows where the time goes -- going deeper into this is beyond my spare time right now. Please test it yourself and share some numbers.

Another option is to group tests: e.g. if test 1 yields X, we also know the results for tests 2, 3, 4, ... Or we group several tests into a single C file, where possible. Just an idea (it sounds a bit tedious, though).

Parallelism... can't we do that with &, at least for well-known / often-used tests? (A minimal sketch is at the end of this message.) Family calls...

Regards, Tim
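The caching sketch mentioned under a): autoconf already supports caching test results, so re-runs skip tests that were answered before, and a site file can pre-seed answers across projects (with the trust caveats mentioned above). The paths are illustrative:

    # Cache results in config.cache; re-runs reuse them:
    ./configure -C

    # The equivalent long form, with an explicit cache file:
    ./configure --cache-file=config.cache

    # Pre-seed answers across projects via a site file -- only with
    # cache contents you trust:
    export CONFIG_SITE=$HOME/config.site
    ./configure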
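The strace-based alternative mentioned above, for anyone without librusage at hand -- it only gives rough numbers, but it shows how dominant fork+exec is:

    # Summarize process-related syscalls (clone, execve, wait4, ...)
    # of configure and all of its children:
    strace -f -c -e trace=process ./configure >/dev/null

    # Or log every execve to see which external commands dominate:
    strace -f -e trace=execve -o configure.trace ./configure >/dev/null
    grep -c execve configure.trace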
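And the minimal '&' sketch: two independent checks run in the background, each writing its result to its own file to avoid races. The check_header function, the file names, and the use of cc are made up for illustration; real configure tests would also need their cache-variable writes serialized:

    # $1 = header name, $2 = result file
    check_header() {
        echo "#include <$1>" | cc -x c -E - >/dev/null 2>&1 \
            && echo yes >"$2" || echo no >"$2"
    }

    check_header stdint.h res_stdint &
    check_header unistd.h res_unistd &
    wait    # block until both background checks have finished

    echo "stdint.h: $(cat res_stdint), unistd.h: $(cat res_unistd)"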