feature request: parallelize make builds.
current problem: make builds are serial in nature; there is room for making them 
series-parallel, running independent compiles in parallel while keeping dependent 
steps in order.


I have been toying with this idea of parallel builds for quite a while, to improve 
project compile speed (cutting build time to a fraction of what it is now).

compiles of large monolithic files seem to spend more time on CPU and memory than 
they do on disk. but for the typical small files on most projects, I should think 
the build would be more disk bound.

I have 2 machines I could test with (one 32-bit with 2 threads, 3GB RAM, and 38GB 
of virtual memory; one 64-bit with 12 threads, 64GB RAM, and 64GB of virtual 
memory), both with 50+ projects whose builds I could parallelize. but it would take 
some manual work to do this kind of testing if you want it done (I may do it 
anyway; I want to parallelize my builds somehow to save time and make use of this 
nice fast processor). right now I am transferring the files to my new machine, so 
it could be a month before I get anywhere, or less.

the problem with my current build systems is that they are batch files and I don't 
use make. I would have to convert the build systems for all of my 50+ projects, and 
also redo my build system somehow by making some sort of MinGW make template, and I 
am no make expert. I am just putting this idea out there for someone to grab onto 
and implement. should I just feed it to the GNU folks via a bug report?
on to the idea...

if you want to compile anything big without losing hair, you should start 
compiling individual items in parallel where possible. in fact, within that, the 
compilers themselves should be multithreaded where possible, letting you either set 
a limit on the number of threads (as long as it doesn't go over what the system has 
available) or, by default, auto-detect the number of threads. although compiling 
does seem to be very much a serial thing, from what little I remember of my 
compiler class 20 years ago...
one large-scale example: CPU-based HPC machines with EMC disk arrays would benefit 
from this kind of feature, since it would make provision for parallel builds for a 
given project. which brings me to my next point.

I am starting to parallelize my compiles BECAUSE IT MAKES THINGS FASTER, up to a 
limit that is probably some combination of disk speed and CPU threads.
this will afford more speed, since most processors are multithreaded/multicore and 
Windows treats each hardware thread like a CPU. Of course, this is not limited to 
Windows; you can bring this to the Mac and to Linux also, or to any platform that 
has a C++ compiler and compiles lots of individual files.
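to make that concrete, here is a minimal sketch in ordinary GNU make, assuming GCC; 
the file names and flags are placeholders I made up. each .o is an independent job, 
so invoking make with a job count (make -j12 for a 12-thread CPU, 
make -j%NUMBER_OF_PROCESSORS% on Windows, make -j$(nproc) on Linux) compiles them 
in parallel while the link step waits for all of them:

CC      := gcc
CFLAGS  := -O2
SRCS    := main.c parser.c codegen.c
OBJS    := $(SRCS:.c=.o)

all: myprog

# link step: only runs after every object file has been produced
myprog: $(OBJS)
	$(CC) -o $@ $(OBJS)

# pattern rule: one independent compile job per .c file
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
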
this will make builds disk bound very quickly, however, since usually all of these 
files reside on a single disk. this is where RAID comes in very handy and provides 
the extra jump in speed, whether EMC gear, small-scale RAID boxes for personal use, 
or 19" RAID racks for work use. build servers can make this work even better by 
using their processors more efficiently, and this is also where cloud build 
services could do things like sell a certain number of threads for faster build 
times.


an idea I have: you could have a fixed pool of worker threads assigned as compile 
job engines, each with its own spooler. the jobs would need to be synced up 
internally when a SERIALCOMPILE command finishes, so that for a SERIALCOMPILE job 
list (such as a list of .o/.obj files you want made from .c files) the command only 
completes once they are all done.
instead of the usual compile rule (.c.o: $(CC) etc., I think it was), you would 
write COMPILESERIAL 12 THREADS for a 12-threaded CPU, or COMPILESERIAL AUTO 
THREADS, followed by your list of .cpp files, what file extension you want them 
turned into (such as .obj), and what command you want to use to do it. maybe this 
would have to take up several lines of make.
or something like that.
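
to pin the syntax idea down, a rough sketch: the COMPILESERIAL block below is 
hypothetical syntax for the proposal (not anything make accepts today), and 
underneath it is an approximation of the same behavior in ordinary GNU make, where 
src/, g++, and the target name serialcompile are placeholders I made up:

# hypothetical directive, roughly as proposed:
#   COMPILESERIAL AUTO THREADS
#       SOURCES  src/*.cpp
#       PRODUCES .obj
#       COMMAND  g++ -c $< -o $@

# rough present-day equivalent: a target that depends on every object,
# so it only finishes once all of them are built; the job count passed
# to make decides how many compile at once.
CXX  := g++
SRCS := $(wildcard src/*.cpp)
OBJS := $(SRCS:.cpp=.obj)

serialcompile: $(OBJS)

%.obj: %.cpp
	$(CXX) -c $< -o $@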

think of it like a glorified make.

at some point I would hope to see a parallel build system in place, once software 
developers begin thinking in terms of parallel builds instead of serial builds. it 
would cut build time to a fraction, but you have to be careful HOW you do it.
some things just have to be done serially; those would just be regular make 
commands.
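
a minimal sketch of mixing the two, assuming GNU make; the generator script and 
file names are hypothetical. even when the build runs with a job count, the 
prerequisite chain keeps the serial parts serial: the generated header must exist 
before any compile starts, and the link must wait for every object.

OBJS := a.o b.o c.o

# serial: the link runs only after all objects are done
app: $(OBJS)
	$(CC) -o $@ $(OBJS)

# serial: every compile waits for the generated header
$(OBJS): version.h

# runs first, exactly once
version.h:
	./gen-version.sh > $@

# parallel: independent compile jobs
%.o: %.c
	$(CC) -c $< -o $@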


I am not sure whether I am bringing this idea to the right people. maybe I should 
be going to Intel or GNU or Microsoft or Apple, or all of the above, but I didn't 
really want one vendor hogging all of the benefits, so I thought I would bring the 
idea to you. should you want to bring the specification of a parallel build system 
into the language, I would appreciate it (because I could certainly use it!).

and for us software developers, it would reduce our compile times.


Note that on Windows machines, there is WaitForMultipleObjects() in the Win32 API 
to handle waiting on multiple [process, thread, event, whatever] HANDLEs without 
while+for loops and some sort of polling conditional.

personally, I would like to see this in every make and build system and see it 
made generally available. everyone has multicore processors now. let's make use 
of them!

 
-------------
Jim Michaels
jmich...@yahoo.com
j...@renewalcomputerservices.com
http://RenewalComputerServices.com
http://JesusnJim.com (my personal site, has software)