Hi, Richard

On 03/28, Richard Biener wrote:
> On Wed, Mar 27, 2019 at 2:55 PM Giuliano Belinassi
> <giuliano.belina...@usp.br> wrote:
> >
> > Hi,
> >
> > On 03/26, Richard Biener wrote:
> > > On Tue, 26 Mar 2019, David Malcolm wrote:
> > >
> > > > On Mon, 2019-03-25 at 19:51 -0400, nick wrote:
> > > > > Greetings All,
> > > > >
> > > > > I would like to take up parallelize compilation using threads, or
> > > > > make c++/c memory issues not automatically promote. I did ask
> > > > > about this before but did not get a reply. I'm just a little
> > > > > concerned, as my writing for proposals has never been great, so
> > > > > it would be fine if someone just reviews and double-checks it.
> > > > >
> > > > > As for the other things, building gcc and running the testsuite
> > > > > is fine. Plus I'm already working on gcc, so I'm pretty aware of
> > > > > most things, and this would be a great stepping stone into more
> > > > > serious gcc development work.
> > > > >
> > > > > If sample code is required that's in mainline gcc, I sent out a
> > > > > trial patch for this issue:
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88395
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Nick
> > > >
> > > > It's good to see that you've gotten as far as attaching a patch to
> > > > BZ [1]
> > > >
> > > > I think someone was going to attempt the "parallelize compilation
> > > > using threads" idea last year, but then pulled out before the
> > > > summer; you may want to check the archives (or was that you?)
> > >
> > > There's also Giuliano Belinassi who is interested in the same
> > > project (CCed).
> >
> > Yes, I will apply for this project, and I will submit the final
> > version of my proposal by the end of the week.
> >
> > Currently, my target is the `expand_all_functions` routine, as most
> > of the time is spent on it according to the experiments that I
> > performed as part of my Master's research on compiler
> > parallelization. (-O2, --disable-checking)
>
> Yes, more specifically I think the realistic target is the GIMPLE part
> of execute_pass_list (cfun, g->get_passes ()->all_passes); done in
> cgraph_node::expand. If you look at passes.def you'll see all_passes
> also contains RTL expansion (pass_expand) and the RTL optimization
> queue (pass_rest_of_compilation). The RTL part isn't a realistic
> target. Without changing the pass hierarchy, the obvious part that can
> be handled would be the pass_all_optimizations sub-queue of
> all_passes, since those are all passes that perform transforms on the
> GIMPLE IL, where we have all functions in this state at the same time
> and where no interactions between the functions happen anymore, and
> thus functions can be processed in parallel (much as make processes
> individual translation units in parallel).
>
Great. So if I understood correctly, I will need to split
cgraph_node::expand() into three parts: IPA, GIMPLE and RTL; then
refactor `expand_all_functions` so that the loop
`for (i = new_order_pos - 1; i >= 0; i--)` uses these three functions;
then partition g->get_passes()->all_passes into
get_passes()->gimple_passes and get_passes()->rtl_passes, so I can run
RTL after GIMPLE is finished; and finally start the parallelization of
the per-function GIMPLE passes.

> To simplify the task further, a useful constraint is to not have a
> single optimization pass executed multiple times at the same time
> (otherwise you have to look at pass-specific global states as well).
> Thus the parallel part could be coded in a way that keeps, per
> function, the state of what pass to execute next, and has a scheduler
> pick a function whose next pass is "free", scheduling that to a fixed
> set of worker threads. There are no dependences between functions for
> the scheduling, but each pass has only one execution resource in the
> pipeline. You can start processing an arbitrarily large number of
> functions, but slow functions will keep others from advancing across
> the pass they execute on.
>

Something like a pipeline? That is certainly a start, but if one pass
is very slow, wouldn't it bottleneck everything?

> Passes could of course be individually marked as thread-safe
> (multiple instances execute concurrently).
>
> Garbage collection is already in control of the pass manager, which
> would also be the thread scheduler. For GC the remaining issue is
> allocation, which passes occasionally do. Locking is the short-term
> solution for GSoC I guess; long-term, per-thread GC pools might be
> better (to not slow down non-threaded parts of the compiler).
>
> Richard.
>
> > Thank you,
> > Giuliano.
> >
> > >
> > > > IIRC Richard [CCed] was going to mentor, with me co-mentoring [2]
> > > > - but I don't know if he's still interested/able to spare the
> > > > cycles.
> > >
> > > I've offered mentoring to Giuliano, so yes.
> > >
> > > > That said, the parallel compilation one strikes me as very
> > > > ambitious; it's not clear to me what could realistically be done
> > > > as a GSoC project. I think a good proposal on that would come up
> > > > with some subset of the problem that's doable over a summer,
> > > > whilst also being useful to the project. The RTL infrastructure
> > > > has a lot of global state, so maybe either focus on the gimple
> > > > passes, or on fixing global state on the RTL side? (I'm not sure)
> > >
> > > That was the original intent for the experiment. There's also the
> > > already somewhat parallel WPA stage in LTO compilation mode (but it
> > > simply forks for the sake of simplicity...).
> > >
> > > > Or maybe a project to be more explicit about regions of the code
> > > > that assume that the garbage collector can't run within them? [3]
> > > > (since the GC is state that would be shared by the threads)
> > >
> > > The GC will be one obstacle. The original idea was to drive
> > > parallelization at the pass level by the pass manager for the
> > > GIMPLE passes, so serialization points would be in it.
> > >
> > > Richard.
> > >
> > > > Hope this is constructive/helpful
> > > > Dave
> > > >
> > > > [1] though typically our workflow involves sending patches to the
> > > > gcc-patches mailing list
> > > > [2] as libgccjit maintainer I have an interest in global state
> > > > within the compiler
> > > > [3] I posted some ideas about this back in 2013 IIRC; probably
> > > > massively bit-rotted since then. I also gave a talk at Cauldron
> > > > 2013 about global state in the compiler (with a view to
> > > > gcc-as-a-shared-library); likewise I expect much of the ideas
> > > > there to be out of date. For libgccjit I went with a different
> > > > approach.

Thank you,
Giuliano.