Andi Kleen <a...@linux.intel.com> wrote:
>On Wed, Aug 21, 2013 at 04:17:48PM +0200, Jan Hubicka wrote:
>> Hi,
>> this is my attempt to bring GCC into wonderful era of multicore CPUs :)
>> It is a hack, but it seems to help quite a lot.  About 50% of WPA time is spent
>> by streaming the individual ltrans .o files.  This can be easily parallelized
>> by fork - we do nothing afterwards, just exit and pass the list to the linker.
>
>One risk is if someone streams to a spinning disk it may add more seeks for
>the parallel IO. But I think it's a reasonable tradeoffs.

It'll also wreck all WPA dump files.

>We should also use a faster compressor

And we should avoid uncompressing the function sections...

That said, the patch is enough of a hack that I don't ever want to debug a bug in it....  I also fail to see why threads should not work here.  Maybe simply annotate gcc with openmp?

Richard.

>> For -flto=jobserver I simply fork all 32 processes.  It may not be a disaster,?
>> but perhaps we should figure out how to communicate with jobserver.  At first
>> glance on document on how it works, it seems easy to add.  Perhaps we can even
>> convince GNU Make folks to put simple helpers to libiberty?
>
>lto=jobserver is still broken and confuses tokens on large builds (ends
>with a 0 read) I did some debugging recently, and I suspect a Linux kernel
>bug now. Still haven't tracked it down.
>
>Any workarounds would need make changes unfortunately.
>
>>
>> We also may figure out number of CPUs (is it available i.e. from libgomp)
>
>sysconf(_SC_NPROCESSORS_ONLN) ?
>
>> and use it by default even if user do not care to pass number of processes.
>> Naturally these streaming forks should be cheap memory wise.  I hope Martin
>> will get me some actual numbers.
>>
>> With the patch the WPA time of firefox goes down to 2 minutes (4.8 needs about
>> 30 minutes and without the hack one needs about 5 minutes)
>
>Cool!
>
>I'll try it on my builds
>>
>> +fparallelism=
>> +LTO Joined
>> +Run the link-time optimizer in whole program analysis (WPA) mode.
>
>The description does not make sense
>
>Rest of patch looks good from a quick read, although I would prefer to
>do the waiting for children in the "parent", not the "last one"
>
>-Andi
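
For reference, the pattern discussed in this thread (fork one child per ltrans partition, cap the number of live children, and have the parent reap them all) could look roughly like the sketch below. This is not the code from the patch: stream_partition, default_jobs, reap_one and stream_all are hypothetical names made up for illustration, and the streaming step is a stub. It also assumes a system where sysconf (_SC_NPROCESSORS_ONLN) is available (glibc and the BSDs provide it, but it is not required by POSIX).

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for streaming one ltrans partition to disk.
   Returns 0 on success.  */
static int
stream_partition (int idx)
{
  (void) idx;
  return 0;
}

/* Default job count: number of online CPUs, or 1 if it cannot be
   determined.  */
static long
default_jobs (void)
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  return n > 0 ? n : 1;
}

/* Reap one child.  Returns 0 if it exited successfully, 1 if it
   failed, and -1 if there was no child left to wait for.  */
static int
reap_one (void)
{
  int status;
  pid_t pid;

  do
    pid = wait (&status);
  while (pid < 0 && errno == EINTR);

  if (pid < 0)
    return -1;
  return (WIFEXITED (status) && WEXITSTATUS (status) == 0) ? 0 : 1;
}

/* Stream N_PARTITIONS partitions using at most MAX_JOBS children at a
   time.  All waiting happens in the parent.  Returns the number of
   partitions that failed.  */
static int
stream_all (int n_partitions, long max_jobs)
{
  int running = 0, failed = 0;

  if (max_jobs < 1)
    max_jobs = 1;

  for (int i = 0; i < n_partitions; i++)
    {
      /* Throttle: wait for a free slot before forking again.  */
      if (running >= max_jobs)
        {
          int r = reap_one ();
          if (r < 0)
            running = 0;
          else
            {
              running--;
              failed += r;
            }
        }

      pid_t pid = fork ();
      if (pid == 0)
        /* Child: stream one partition and exit without running the
           parent's atexit handlers.  */
        _exit (stream_partition (i) == 0 ? 0 : 1);
      else if (pid < 0)
        /* fork failed: fall back to streaming in the parent.  */
        failed += stream_partition (i) != 0;
      else
        running++;
    }

  /* Parent collects every remaining child before returning.  */
  while (running > 0)
    {
      int r = reap_one ();
      if (r < 0)
        break;
      running--;
      failed += r;
    }

  return failed;
}
```

A caller would use something like stream_all (32, default_jobs ()) and treat a nonzero result as an error. Keeping the wait in the parent means the list of output files is only handed to the linker once every child has finished, which is the ordering Andi asks for above. It does not address the jobserver question or the dump-file problem Richard raises.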