I have started working on connecting Dmitry's OpenMP parser to
the middle-end so that we can start generating the basic runtime
calls, which Richard should be posting soon.  With any luck, we
should have some basic functionality in a few weeks.

Initially, we will be outlining parallel sections into their own
functions.  This is mostly for implementation convenience.
However, long term we are better off incorporating parallel
markers into the IL so that we can do a better job analyzing and
optimizing.

It may be marginally quicker to be able to launch threads that
execute the same body of code because it avoids the argument
passing overhead for shared stuff and the memory indirection in
the launched functions.  But mostly, I'm interested in the IL
elements for optimization and analysis.  Launching multiple
threads on the same function body may give us more headaches than
it's worth ATM.
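
To make the trade-off concrete, here is a rough hand-written C sketch
of what outlining amounts to.  Every name in it (omp_data,
outlined_body, fake_launch, sum_array) is made up for illustration,
and fake_launch just calls the body once per thread id, serially; it
is only a stand-in for whatever the runtime calls end up looking
like.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record holding the shared variables that the
   parallel body references.  */
struct omp_data
{
  int *sum;      /* shared scalar, reached through one extra indirection */
  const int *a;  /* shared array */
  size_t n;      /* number of elements in A */
};

/* The outlined parallel body.  Every access to shared data goes
   through DATA, which is the memory indirection mentioned above.  */
static void
outlined_body (void *arg, int thread_id, int num_threads)
{
  struct omp_data *data = arg;
  size_t i;

  assert (num_threads > 0);
  /* Each "thread" handles every NUM_THREADS-th element.  */
  for (i = (size_t) thread_id; i < data->n; i += (size_t) num_threads)
    *data->sum += data->a[i];
}

/* Stand-in for the runtime: runs the body once per thread id, one
   after the other, so this sketch needs no synchronization.  */
static void
fake_launch (void (*fn) (void *, int, int), void *arg, int num_threads)
{
  int t;

  for (t = 0; t < num_threads; t++)
    fn (arg, t, num_threads);
}

/* What the original function looks like once its parallel section
   has been replaced by a launch of the outlined body.  */
int
sum_array (const int *a, size_t n, int num_threads)
{
  int sum = 0;
  struct omp_data data;

  data.sum = &sum;
  data.a = a;
  data.n = n;
  fake_launch (outlined_body, &data, num_threads);
  return sum;
}
```

Packing and unpacking omp_data is the argument passing overhead;
keeping the region inside the original function would let the body
use SUM and A directly.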

Essentially, we will have an IL expression for every OpenMP
pragma.  These expressions are GENERIC and the gimplifier work is
mostly in the bodies.  With few exceptions, the controlling
predicates and clauses are required to be in more or less GIMPLE
form by the standard already.

The lowering will, for now, just create a new function and
replace the block of code along the lines of tree-nested.c.
However, in the future, the parallel sections will be
single-entry single-exit regions in the CFG with the controlling
GOMP_PARALLEL_... expression as the entry block and a latch-like
exit block.  The parallel region building can be modeled after
the loop structure, but there isn't as much nesting, so it
shouldn't be too complex.  As an aside, we do need CFG region
building and the ability to have the optimizers work on
sub-regions (currently being worked on, as I understand).

In fact, even if we don't end up launching threads on the same
function body, we can keep the parallel region inside the
function throughout the optimizers and outline it at a later
point (before RTL, perhaps).

Some runtime library calls (synchronization mostly) ought to be
recognizable as such by the optimizers.  I am not sure whether to
define them as builtins, provide an attribute or make them IL
expressions.  Any suggestions/ideas?

The IL constructs mostly mirror their #pragma counterparts.  Take
these as a design draft; I have only started working on the
implementation, so I expect the design to evolve as I implement
things.  There may also be several hidden assumptions that I
expect to become embarrassingly obvious in a few weeks.  Names
prefixed with "g_" below mean "the gimplified form of ...".


Parallel regions
----------------

#pragma omp parallel [clause1 ... clauseN]
------------------------------------------

  GENERIC
        GOMP_PARALLEL <parallel_clauses data_clauses, body>
  
  GIMPLE
        GOMP_PARALLEL <g_parallel_clauses g_data_clauses, L1, L2>
        L1:
          g_body
        L2:


#pragma omp for [clause1 ... clauseN]
-------------------------------------

  GENERIC
        GOMP_FOR <for_clauses data_clauses nowait_clause, init-expr,
                  incr-expr, body>

  GIMPLE
        GOMP_FOR <g_for_clauses g_data_clauses nowait_clause, init-expr,
                  incr-expr, L1, L2>
        L1:
          g_body
        L2:

        Both INIT-EXPR and INCR-EXPR are required to be in GIMPLE
        form by the standard already, so there's little that needs
        to be done there.  Keeping them in the header itself
        makes it easy to reference later when we're generating
        code.
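
        As an example of why having them handy helps, the sketch
        below computes the iteration count straight from the
        header.  It assumes a loop of the form
        'for (i = init; i < bound; i += incr)' with incr > 0.
        The loop_header structure, its BOUND field and
        trip_count are hypothetical names, not part of the
        proposed IL.

```c
#include <assert.h>

/* Hypothetical summary of a loop header of the form
   for (i = init; i < bound; i += incr), with incr > 0.  */
struct loop_header
{
  long init;   /* value assigned by INIT-EXPR */
  long bound;  /* exclusive upper bound from the loop condition */
  long incr;   /* step added by INCR-EXPR */
};

/* Number of iterations the loop executes.  */
long
trip_count (const struct loop_header *h)
{
  assert (h->incr > 0);
  if (h->bound <= h->init)
    return 0;
  /* Round up: a partial final step still counts as an iteration.  */
  return (h->bound - h->init + h->incr - 1) / h->incr;
}
```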


#pragma omp sections [clause1 ... clauseN]
------------------------------------------

  GENERIC
        GOMP_SECTIONS <data_clauses nowait_clause, body>

  GIMPLE
        GOMP_SECTIONS <g_data_clauses nowait_clause, L1, L2>
        L1:
          g_body
        L2:



#pragma omp section
-------------------

  GENERIC
        GOMP_SECTION <body>

  GIMPLE
        GOMP_SECTION <L1, L2>
        L1:
          g_body
        L2:



#pragma omp single [clause1 ... clauseN]
----------------------------------------

  GENERIC
        GOMP_SINGLE <data_clauses nowait_clause, body>

  GIMPLE
        GOMP_SINGLE <g_data_clauses nowait_clause, L1, L2>
        L1:
          g_body
        L2:



#pragma omp master
------------------

  GENERIC
        GOMP_MASTER <body>

  GIMPLE
        GOMP_MASTER <L1, L2>
        L1:
          g_body
        L2:


#pragma omp critical [name]
---------------------------

  GENERIC
        GOMP_CRITICAL <name, body>

  GIMPLE
        GOMP_CRITICAL <name, L1, L2>
        L1:
          g_body
        L2:

  Here, NAME is something the runtime needs to recognize.  It will
  essentially be the name of the lock to use when emitting the
  appropriate lock call.
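
  A minimal sketch of that idea, assuming one statically allocated
  lock per distinct NAME.  The C11 atomic_flag spin lock only stands
  in for the real runtime lock calls, and lock_for_update, counter
  and bump_counter are made-up names.

```c
#include <stdatomic.h>

/* Hypothetical lock emitted for '#pragma omp critical (update)'.  */
static atomic_flag lock_for_update = ATOMIC_FLAG_INIT;

/* Shared data protected by the critical section.  */
static int counter;

/* Lowered form of:
     #pragma omp critical (update)
       counter++;  */
int
bump_counter (void)
{
  int result;

  /* Lock call: spin until the flag was previously clear.  */
  while (atomic_flag_test_and_set (&lock_for_update))
    ;

  counter++;            /* g_body */
  result = counter;

  /* Unlock call.  */
  atomic_flag_clear (&lock_for_update);
  return result;
}
```

  Two critical sections with the same NAME would share
  lock_for_update; sections with different names would each get
  their own lock and could run concurrently.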
        

#pragma omp barrier
-------------------

  GENERIC
  GIMPLE
        GOMP_BARRIER


#pragma omp atomic
------------------

  GENERIC
  GIMPLE
        GOMP_ATOMIC <expression-statement>

  The standard is sufficiently strict that we don't need additional
  gimplification here.  EXPRESSION-STATEMENT can only be of the form
  'VAR binop= EXPR', where EXPR must be of scalar type.  ATM, it's not
  absolutely clear to me if EXPR needs to be a GIMPLE RHS already or
  if it could be more complex.  It certainly can't reference VAR.
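
  For reference, a sketch of the semantics of 'x += expr' under
  '#pragma omp atomic', written with C11 <stdatomic.h>.  As I read
  the standard, only the update of VAR is atomic; EXPR is evaluated
  separately.  This illustrates the meaning, not what GOMP_ATOMIC
  would expand to, and atomic_add_example is a made-up name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Models:
     #pragma omp atomic
       *x += a * b;
   and returns the value of *X observed afterwards.  */
int
atomic_add_example (atomic_int *x, int a, int b)
{
  int tmp;

  assert (x != NULL);
  tmp = a * b;                 /* EXPR: evaluated first, never reads *X */
  atomic_fetch_add (x, tmp);   /* the atomic read-modify-write of VAR */
  return atomic_load (x);
}
```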


#pragma omp flush (var-list)
----------------------------

  GENERIC
  GIMPLE
        GOMP_FLUSH <var-list>


#pragma omp ordered
-------------------

  GENERIC
        GOMP_ORDERED <body>

  GIMPLE
        GOMP_ORDERED <L1, L2>
        L1:
          g_body
        L2:


#pragma omp threadprivate
-------------------------

  This will just set an attribute in each affected _DECL.
  Accessible with GOMP_THREADPRIVATE.



for_clauses
-----------

* CLAUSE        ordered

  GENERIC       A boolean field in GOMP_FOR.  Accessible with
                GOMP_ORDERED.

  GIMPLE        Same.


* CLAUSE        schedule (kind, expr)

  GENERIC       A structure inside GOMP_FOR.  Accessible with
                GOMP_SCHEDULE:

                enum schedule_kind {
                  GOMP_SCHED_STATIC,
                  GOMP_SCHED_DYNAMIC,
                  GOMP_SCHED_GUIDED,
                  GOMP_SCHED_RUNTIME } kind;

                tree expr;


  GIMPLE        Same, with EXPR in GIMPLE form as per FE rules.
                If missing, it defaults to INTEGER_ONE_NODE for
                GOMP_SCHED_DYNAMIC and GOMP_SCHED_GUIDED.  It
                defaults to iteration-space / num-threads for
                GOMP_SCHED_STATIC.  For GOMP_SCHED_RUNTIME, we emit
                getenv calls to read it from the environment.
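
  The defaulting rules can be sketched in plain C as follows.  The
  enum mirrors the one above; default_chunk is a hypothetical
  helper, rounding the static chunk up is my assumption, and a
  return value of 0 is just this sketch's way of saying "decided at
  run time".

```c
#include <assert.h>

enum schedule_kind
{
  GOMP_SCHED_STATIC,
  GOMP_SCHED_DYNAMIC,
  GOMP_SCHED_GUIDED,
  GOMP_SCHED_RUNTIME
};

/* Chunk size to use when the schedule clause has no EXPR.
   Returns 0 for GOMP_SCHED_RUNTIME, where the value is only known
   once the environment has been read.  */
long
default_chunk (enum schedule_kind kind, long iterations, long num_threads)
{
  assert (num_threads > 0);
  switch (kind)
    {
    case GOMP_SCHED_DYNAMIC:
    case GOMP_SCHED_GUIDED:
      return 1;
    case GOMP_SCHED_STATIC:
      /* iteration-space / num-threads, rounded up (an assumption)
         so that NUM_THREADS chunks cover every iteration.  */
      return (iterations + num_threads - 1) / num_threads;
    case GOMP_SCHED_RUNTIME:
    default:
      return 0;
    }
}
```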



nowait_clause
-------------

* CLAUSE        nowait

  GENERIC       A boolean field in GOMP_FOR.  Accessible with
                GOMP_NOWAIT.

  GIMPLE        Same.



parallel_clauses
----------------

* CLAUSE        if (expr)

  GENERIC       GOMP_IF <expr>

  GIMPLE        if (g_expr) goto L1; else goto L2;
                L1:
                  GOMP_PARALLEL <g_parallel_clauses, L2, L3>
                L2:
                  g_body
                L3:



* CLAUSE        num_threads (expr)

  GENERIC       A tree field in the GOMP_PARALLEL expression
                accessed with GOMP_NUM_THREADS.

  GIMPLE        Same, with EXPR gimplified as per FE rules.





data_clauses
------------

* CLAUSE        private (variable_list)
                copyprivate (variable_list)
                firstprivate (variable_list)
                lastprivate (variable_list)
                shared (variable_list)
                copyin (variable_list)
        
  GENERIC       These are fields in the GOMP_PARALLEL expression.
                Accessed with:

                GOMP_PRIVATE
                GOMP_FIRSTPRIVATE
                GOMP_SHARED
                GOMP_COPYIN

  GIMPLE        Same, with variable_list gimplified as per FE
                rules.



* CLAUSE        default (shared | none)

  GENERIC       This is a boolean field in the GOMP_PARALLEL
                expression.

  GIMPLE        Same.



* CLAUSE        reduction (operator : variable_list)

  GENERIC       A structure inside GOMP_PARALLEL with two fields

                enum tree_code operator -> PLUS_EXPR,
                                           MULT_EXPR,
                                           MINUS_EXPR,
                                           BIT_AND_EXPR,
                                           BIT_XOR_EXPR,
                                           BIT_IOR_EXPR,
                                           AND_EXPR,
                                           OR_EXPR
                tree variable_list

  GIMPLE        Same, with variable_list gimplified as per FE
                rules.
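
  To show what the operator field would drive, here is a sketch for
  'int' variables: each thread's private copy starts at the
  operator's identity value, and the partial results are folded back
  into the shared variable at the end.  The red_op enum is a
  hypothetical stand-in for the tree codes above, and the initial
  values follow my reading of the OpenMP specification (including
  the rule that partial results of a '-' reduction are added).

```c
#include <assert.h>

/* Hypothetical stand-ins for the tree codes listed above.  */
enum red_op
{
  RED_PLUS, RED_MULT, RED_MINUS,
  RED_BIT_AND, RED_BIT_XOR, RED_BIT_IOR,
  RED_AND, RED_OR
};

/* Initial value of each thread's private copy.  */
int
reduction_init (enum red_op op)
{
  switch (op)
    {
    case RED_MULT:
    case RED_AND:
      return 1;
    case RED_BIT_AND:
      return ~0;
    default:
      return 0;  /* +, -, ^, | and || all start at zero */
    }
}

/* Fold one thread's partial result PRIV into SHARED.  */
int
reduction_combine (enum red_op op, int shared, int priv)
{
  switch (op)
    {
    case RED_PLUS:
    case RED_MINUS:  /* partial results of '-' are added */
      return shared + priv;
    case RED_MULT:
      return shared * priv;
    case RED_BIT_AND:
      return shared & priv;
    case RED_BIT_XOR:
      return shared ^ priv;
    case RED_BIT_IOR:
      return shared | priv;
    case RED_AND:
      return shared && priv;
    case RED_OR:
      return shared || priv;
    }
  assert (!"unknown reduction operator");
  return shared;
}
```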



Diego.
