Hi! Sorry for the late reply again :P

On Thu, Nov 15, 2018 at 8:29 AM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi
> <giuliano.belina...@usp.br> wrote:
> >
> > As a brief introduction, I am a graduate student who got interested
> > in the "Parallelize the compilation using threads" idea (GSoC 2018 [1]).
> > I am a newcomer to GCC, but I have already sent some patches, some of
> > which have been accepted [2].
> >
> > I brought this subject up on IRC, but maybe here is a proper place to
> > discuss this topic.
> >
> > From my point of view, parallelizing GCC itself will only speed up the
> > compilation of projects which have a big file that creates a
> > bottleneck in the whole project compilation (note: by big, I mean the
> > amount of code to generate).
>
> That's true. During GCC bootstrap there are some of those (see PR84402).
>
> One way to improve parallelism is to use link-time optimization where
> even single source files can be split up into multiple link-time units. But
> then there's the serial whole-program analysis part.

Did you mean this one: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ?
That is a lot of data :-) It seems that 'phase opt and generate' is the
most time-consuming part. Is that the 'GIMPLE optimization pipeline' you
were talking about in this thread:
https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html

> > Additionally, I know that GCC must not
> > change the project layout, but from the software engineering
> > perspective, this may be a bad smell that indicates that the file
> > should be broken into smaller files. Finally, the Makefiles will take
> > care of the parallelization task.
>
> What do you mean by GCC must not change the project layout? GCC
> happily re-orders functions and link-time optimization will reorder
> TUs (well, linking may as well).

That was a response to a comment made on IRC:

On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely <jwakely....@gmail.com> wrote:
> I think this is in response to a comment I made on IRC. Giuliano said
> that if a project has a very large file that dominates the total build
> time, the file should be split up into smaller pieces. I said "GCC
> can't restructure people's code. It can only try to compile it
> faster". We weren't referring to code transformations in the compiler
> like re-ordering functions, but physically refactoring the source
> code.

Yes. But judging from one of the attachments to PR84402, such files do
exist in GCC itself:
https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440

> > My questions are:
> >
> > 1. Is there any project whose compilation would be significantly
> > improved if GCC ran in parallel? Does anyone have data about anything
> > related to that? How about the Linux kernel? If not, I can try to
> > bring some.
>
> We do not have any data about this apart from experiments with
> splitting up source files for PR84402.
>
> > 2. Did I correctly understand the goal of the parallelization? Can
> > anyone provide extra details to me?
>
> You may want to search the mailing list archives since we had a
> student application (later revoked) for the task with some discussion.
>
> In my view (I proposed the thing) the most interesting parts are
> getting GCC's global state documented and reduced. The parallelization
> itself is an interesting experiment, but whether there will be any
> substantial improvement for builds that can already benefit from make
> parallelism remains a question.

While I agree that documenting GCC's global state is good for the
community and for the development of GCC, I really don't think that
alone is a good motivation for parallelizing a compiler from a research
standpoint. There must be something or someone that could take
advantage of the fine-grained parallelism. But the data from PR84402
seems to hold the answer to that. :-)

On Thu, Nov 15, 2018 at 4:07 PM Szabolcs Nagy <szabolcs.n...@arm.com> wrote:
>
> On 15/11/18 10:29, Richard Biener wrote:
> > In my view (I proposed the thing) the most interesting parts are
> > getting GCC's global state documented and reduced. The parallelization
> > itself is an interesting experiment, but whether there will be any
> > substantial improvement for builds that can already benefit from make
> > parallelism remains a question.
>
> in the common case (project with many small files, many more than the
> core count) I'd expect a regression:
>
> if gcc itself tries to parallelize, that introduces inter-thread
> synchronization and potential false sharing in gcc (e.g. malloc
> locks) that does not exist with make parallelism (glibc can avoid
> some atomic instructions when a process is single-threaded).

That is what I am mostly worried about: that, or that the most costly
part turns out not to be parallelizable at all. Also, I would expect a
regression on very small files; maybe that could be avoided by guarding
the feature behind a flag? (See the small benchmark sketch at the end
of this mail for the synchronization point.)

On Fri, Nov 16, 2018 at 11:05 AM Martin Jambor <mjam...@suse.cz> wrote:
>
> Hi Giuliano,
>
> On Thu, Nov 15 2018, Richard Biener wrote:
> > You may want to search the mailing list archives since we had a
> > student application (later revoked) for the task with some discussion.
>
> Specifically, the whole thread beginning with
> https://gcc.gnu.org/ml/gcc/2018-03/msg00179.html
>
> Martin

Yes, I will research this carefully ;-) Thank you
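P.S.: to make the malloc-lock point concrete, here is a minimal
benchmark sketch. This is my own illustration, not code from GCC or
glibc; the iteration count, allocation size, and thread count are
arbitrary, and the numbers will vary with the allocator and machine.
The same allocation loop runs once before any thread is created, where
glibc's malloc can use its single-threaded fast path, and once from
four concurrent threads, where every allocation may pay for
synchronization. Build with "gcc -O2 -pthread".

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define ITERS    10000000L
    #define NTHREADS 4

    /* Volatile sink so the compiler cannot delete the malloc/free pair. */
    static void *volatile sink;

    static void *alloc_loop(void *arg)
    {
        (void)arg;
        for (long i = 0; i < ITERS; i++) {
            void *p = malloc(64);   /* may take malloc's internal lock */
            sink = p;
            free(p);
        }
        return NULL;
    }

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void)
    {
        double t = now();
        alloc_loop(NULL);           /* single-threaded: fast path applies */
        printf("1 thread:   %.2fs\n", now() - t);

        pthread_t th[NTHREADS];
        t = now();
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&th[i], NULL, alloc_loop, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(th[i], NULL);
        printf("%d threads:  %.2fs\n", NTHREADS, now() - t);
        return 0;
    }

Each thread does the same amount of work as the single-threaded run, so
on a machine with at least four cores an overhead-free allocator would
keep the two wall times roughly equal; any gap beyond that is the kind
of synchronization cost that per-process make parallelism never pays.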