I have started working on connecting Dmitry's OpenMP parser to the middle-end so that we can start generating the basic runtime calls, which Richard should be posting soon. With any luck, we should have some basic functionality in a few weeks.
Initially, we will be outlining parallel sections into their own functions. This is mostly for implementation convenience. However, long term we are better off incorporating parallel markers into the IL so that we can do a better job analyzing and optimizing. It may be marginally quicker to launch threads that execute the same body of code, because that avoids the argument passing overhead for shared data and the memory indirection in the launched functions. But mostly, I'm interested in the IL elements for optimization and analysis. Launching multiple threads on the same function body may give us more headaches than it's worth ATM.

Essentially, we will have an IL expression for every OpenMP pragma. These expressions are GENERIC, and the gimplifier work is mostly in the bodies. With few exceptions, the controlling predicates and clauses are already required by the standard to be in more or less GIMPLE form. The lowering will, for now, just create a new function and replace the block of code, along the lines of tree-nested.c.

However, in the future, the parallel sections will be single-entry single-exit regions in the CFG, with the controlling GOMP_PARALLEL_... expression as the entry block and a latch-like exit block. The parallel region building can be modeled after the loop structure, but there isn't as much nesting, so it shouldn't be too complex. As an aside, we do need CFG region building and the ability to have the optimizers work on sub-regions (currently being worked on, as I understand). In fact, even if we don't end up launching threads on the same function body, we can keep the parallel region inside the function throughout the optimizers and outline it at a later point (before RTL, perhaps).

Some runtime library calls (synchronization mostly) ought to be recognizable as such by the optimizers. I am not sure whether to define them as builtins, provide an attribute, or make them IL expressions. Any suggestions/ideas?
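
For illustration, here is a rough source-level sketch of what the outlining described above amounts to. It is not the planned implementation: the names shared_data, outlined_body, launch_team and sum_array are all made up for this example, and launch_team is only a sequential stand-in for the runtime entry point (the real runtime interface has not been posted yet). The point is just to show the block of code moved into its own function, with the shared variables reached through a pointer to a record, which is the argument passing and memory indirection mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the variables the parallel body shares with its
   parent.  The outlined function reaches them through one pointer, which
   is the extra memory indirection an outlined body has to pay for.  */
struct shared_data
{
  const int *a;   /* shared input array */
  size_t n;       /* number of elements in A */
  long *partial;  /* one result slot per team member */
};

/* Outlined body of the parallel region (hypothetical name).  ID and
   NTHREADS identify this team member; each member sums one contiguous
   slice of the array and stores the result in its own slot.  */
static void
outlined_body (struct shared_data *d, int id, int nthreads)
{
  size_t nt = (size_t) nthreads;
  size_t chunk = (d->n + nt - 1) / nt;
  size_t lo = (size_t) id * chunk;
  size_t hi = lo + chunk;
  long sum = 0;   /* private to this team member */

  if (lo > d->n)
    lo = d->n;
  if (hi > d->n)
    hi = d->n;
  for (size_t i = lo; i < hi; i++)
    sum += d->a[i];
  d->partial[id] = sum;
}

/* Stand-in for the runtime entry point (an assumption, not a real API).
   It runs the team members one after another; a real runtime would
   start threads here instead.  */
static void
launch_team (void (*fn) (struct shared_data *, int, int),
             struct shared_data *d, int nthreads)
{
  for (int id = 0; id < nthreads; id++)
    fn (d, id, nthreads);
}

/* The parent function after lowering: the original block of code has
   been replaced by a call that hands the outlined body to the launcher.  */
long
sum_array (const int *a, size_t n, int nthreads)
{
  long partial[16] = { 0 };
  struct shared_data d = { a, n, partial };
  long total = 0;

  assert (nthreads > 0 && nthreads <= 16);
  launch_team (outlined_body, &d, nthreads);
  for (int id = 0; id < nthreads; id++)
    total += partial[id];
  return total;
}
```

Calling sum_array on the integers 1 through 10 returns 55 for any team size from 1 to 16. Keeping the region inside the parent function instead would let the body refer to a, n and partial directly, without going through the record.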
The IL constructs mostly mirror their #pragma counterparts. Take these as a design draft; I have only started working on the implementation, so I expect the design to evolve as I implement things. There may also be several hidden assumptions that I expect to become embarrassingly obvious in a few weeks.

Names prefixed with "g_" below mean "the gimplified form of ...".


Parallel regions
----------------

#pragma omp parallel [clause1 ... clauseN]
------------------------------------------

    GENERIC     GOMP_PARALLEL <parallel_clauses data_clauses, body>
    GIMPLE      GOMP_PARALLEL <g_parallel_clauses g_data_clauses, L1, L2>
                L1:
                  g_body
                L2:

#pragma omp for [clause1 ... clauseN]
-------------------------------------

    GENERIC     GOMP_FOR <for_clauses data_clauses nowait_clause,
                          init-expr, incr-expr, body>
    GIMPLE      GOMP_FOR <g_for_clauses g_data_clauses nowait_clause,
                          init-expr, incr-expr, L1, L2>
                L1:
                  g_body
                L2:

    Both INIT-EXPR and INCR-EXPR are already required by the standard
    to be in GIMPLE form, so there's little that needs to be done
    there.  Keeping them in the header itself makes it easy to
    reference them later when we're generating code.

#pragma omp sections [clause1 ... clauseN]
------------------------------------------

    GENERIC     GOMP_SECTIONS <data_clauses nowait_clause, body>
    GIMPLE      GOMP_SECTIONS <g_data_clauses nowait_clause, L1, L2>
                L1:
                  g_body
                L2:

#pragma omp section
-------------------

    GENERIC     GOMP_SECTION <body>
    GIMPLE      GOMP_SECTION <L1, L2>
                L1:
                  g_body
                L2:

#pragma omp single [clause1 ... clauseN]
----------------------------------------

    GENERIC     GOMP_SINGLE <data_clauses nowait_clause, body>
    GIMPLE      GOMP_SINGLE <g_data_clauses nowait_clause, L1, L2>
                L1:
                  g_body
                L2:

#pragma omp master
------------------

    GENERIC     GOMP_MASTER <body>
    GIMPLE      GOMP_MASTER <L1, L2>
                L1:
                  g_body
                L2:

#pragma omp critical [name]
---------------------------

    GENERIC     GOMP_CRITICAL <name, block>
    GIMPLE      GOMP_CRITICAL <name, L1, L2>
                L1:
                  g_body
                L2:

    Here, NAME is something the runtime needs to recognize.
    It will essentially be the name of the lock to use when emitting
    the appropriate lock call.

#pragma omp barrier
-------------------

    GENERIC
    GIMPLE      GOMP_BARRIER

#pragma omp atomic
------------------

    GENERIC
    GIMPLE      GOMP_ATOMIC <expression-statement>

    The standard is sufficiently strict that we don't need additional
    gimplification here.  EXPRESSION-STATEMENT can only be of the form
    'VAR binop= EXPR', where EXPR must be of scalar type.  ATM, it's
    not absolutely clear to me whether EXPR needs to be a GIMPLE RHS
    already or whether it could be more complex.  It certainly can't
    reference VAR.

#pragma omp flush (var-list)
----------------------------

    GENERIC
    GIMPLE      GOMP_FLUSH <var-list>

#pragma omp ordered
-------------------

    GENERIC     GOMP_ORDERED <body>
    GIMPLE      GOMP_ORDERED <L1, L2>
                L1:
                  g_body
                L2:

#pragma omp threadprivate
-------------------------

    This will just set an attribute in each affected _DECL.
    Accessible with GOMP_THREADPRIVATE.


for_clauses
-----------

    * CLAUSE    ordered
      GENERIC   A boolean field in GOMP_FOR.  Accessible with
                GOMP_ORDERED.
      GIMPLE    Same.

    * CLAUSE    schedule (kind, expr)
      GENERIC   A structure inside GOMP_FOR.  Accessible with
                GOMP_SCHEDULE:

                    enum schedule_kind {
                      GOMP_SCHED_STATIC,
                      GOMP_SCHED_DYNAMIC,
                      GOMP_SCHED_GUIDED,
                      GOMP_SCHED_RUNTIME
                    } kind;
                    tree expr;

      GIMPLE    Same, with EXPR in GIMPLE form as per FE rules.  If
                EXPR is missing, it defaults to INTEGER_ONE_NODE for
                GOMP_SCHED_DYNAMIC and GOMP_SCHED_GUIDED, and to
                iteration-space / num-threads for GOMP_SCHED_STATIC.
                For GOMP_SCHED_RUNTIME, we emit getenv calls to read
                it from the environment.


nowait_clause
-------------

    * CLAUSE    nowait
      GENERIC   A boolean field in GOMP_FOR.  Accessible with
                GOMP_NOWAIT.
      GIMPLE    Same.


parallel_clauses
----------------

    * CLAUSE    if (expr)
      GENERIC   GOMP_IF <expr>
      GIMPLE    if (g_expr) goto L1; else goto L2;
                L1:
                  GOMP_PARALLEL <g_parallel_clauses, L2, L3>
                L2:
                  g_body
                L3:

    * CLAUSE    num_threads (expr)
      GENERIC   A tree field in the GOMP_PARALLEL expression, accessed
                with GOMP_NUM_THREADS.
      GIMPLE    Same, with EXPR gimplified as per FE rules.


data_clauses
------------

    * CLAUSE    private (variable_list)
                copyprivate (variable_list)
                firstprivate (variable_list)
                lastprivate (variable_list)
                shared (variable_list)
                copyin (variable_list)
      GENERIC   These are fields in the GOMP_PARALLEL expression.
                Accessed with:

                    GOMP_PRIVATE
                    GOMP_FIRSTPRIVATE
                    GOMP_SHARED
                    GOMP_COPYIN

      GIMPLE    Same, with variable_list gimplified as per FE rules.

    * CLAUSE    default (shared | none)
      GENERIC   This is a boolean field in the GOMP_PARALLEL expression.
      GIMPLE    Same.

    * CLAUSE    reduction (operator : variable_list)
      GENERIC   A structure inside GOMP_PARALLEL with two fields:

                    enum tree_code operator -> PLUS_EXPR, MULT_EXPR,
                                               MINUS_EXPR, BIT_AND_EXPR,
                                               BIT_XOR_EXPR, BIT_IOR_EXPR,
                                               AND_EXPR, OR_EXPR
                    tree variable_list

      GIMPLE    Same, with variable_list gimplified as per FE rules.


Diego.