Hello,

Described here is the future plan for automatic parallelization in GCC.
The current autopar pass is based on the GOMP infrastructure; it distributes the iterations of a loop among several threads (the number is specified by the user) if it determines that the iterations are independent. The only cross-iteration dependence allowed is a reduction, which is handled as a special case (a small sketch of such a loop appears at the end of this mail). The pass was initially contributed to GCC 4.3 by Zdenek Dvorak and Sebastian Pop.

With the integration of Graphite (http://gcc.gnu.org/wiki/Graphite) into GCC 4.4, a strong loop nest analysis and transformation engine was introduced, and using the polyhedral model to expose loop parallelism in GCC has become feasible and relevant. Our prospective goal is to incrementally integrate autopar and Graphite. As in autopar, we will initially focus on synchronization-free parallelization.

The first step, as we see it, is to teach Graphite that parallel code needs to be produced. This means that Graphite will recognize simple parallel loops (using SCoP detection and data dependence analysis) and pass that information on. The information to be conveyed expresses that a loop is parallelizable, and may also include more detailed annotations, e.g., the shared/private variables.

There are two possible models for the code generation:

1. Graphite annotates parallel loops and passes that information all the way through CLOOG to the current autopar code generator, which produces the parallel, GOMP-based code.

2. Graphite annotates the parallel loops and CLOOG itself is responsible for generating the parallel code.

A point to note here is that scalars/reductions are currently not handled in Graphite. In the first model, where Graphite calls autopar's code generation, scalars can be handled: after Graphite finishes its analysis, it calls autopar's reduction analysis, and only then is the code generation invoked (provided, of course, that the scalar analysis determines that the loop is still parallelizable).

Once the first step is accomplished, the following steps will focus on teaching Graphite to find loop transformations (such as skewing, interchange, etc.) that expose coarse-grain, synchronization-free parallelism; a small interchange example is also sketched at the end of this mail. This work will rely heavily on the polyhedral data dependence and transformation infrastructures. We have not yet determined which algorithms/techniques we are going to use for this part.

Having synchronization-free parallelization integrated in Graphite will set the ground for handling parallelism that requires a small amount of synchronization.

This is a rough view of our planned work on autopar in GCC. Please feel free to ask questions or comment.

Thanks,
Razya
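
P.S. To make the description of the current autopar pass a bit more concrete, here is a minimal sketch of the kind of loop it handles today. The function and variable names are made up for illustration, and a and b are assumed not to alias:

    double
    sum_and_scale (double *a, double *b, int n)
    {
      double sum = 0.0;
      int i;

      for (i = 0; i < n; i++)
        {
          b[i] = 2.0 * a[i];  /* independent across iterations (a, b assumed not to alias) */
          sum += a[i];        /* reduction: the only cross-iteration dependence */
        }

      return sum;
    }

With -ftree-parallelize-loops=N, and assuming the alias analysis can prove that a and b do not overlap, the iterations of the i loop are distributed among N threads, and the sum reduction is recognized and handled as the special case mentioned above.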
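
To illustrate the kind of loop transformation mentioned for the later steps, here is a small interchange sketch. The function names and bounds are again made up; it only shows the effect of the transformation, not the algorithm that will select it:

    #define N 1024
    #define M 1024

    /* As written, the outer i-loop carries the dependence
       a[i][j] <- a[i-1][j], so only the inner j-loop is parallel
       (fine-grain parallelism).  */
    void
    sweep (double a[N][M])
    {
      int i, j;

      for (i = 1; i < N; i++)
        for (j = 0; j < M; j++)
          a[i][j] = a[i-1][j] + 1.0;
    }

    /* After interchange, the j-loop is outermost and its iterations are
       independent, so each column can be processed by a separate thread
       with no synchronization inside the loop nest (coarse-grain,
       synchronization-free parallelism).  */
    void
    sweep_interchanged (double a[N][M])
    {
      int i, j;

      for (j = 0; j < M; j++)
        for (i = 1; i < N; i++)
          a[i][j] = a[i-1][j] + 1.0;
    }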