Dear GCC developers. My name is José Manuel Andión Fernández and I have just started to work at University of A Coruña on porting XARK (eXtensible Compiler for Automatic Recognition of Computational Kernels, http://doi.acm.org/10.1145/1391956.1391959) to GCC. The main goal of this project is to be able to generate parallel code from a sequential code automatically.
One of the firsts tasks to accomplish is generating an OpenMP version from a loop which has previously been marked as parallelizable. So I have implemented a new pass named pass_xark_backend_omp (at tree-xark-backend-omp.c, enabled by -fxark-backend-omp) strongly based on pass_parallelize_loops (at tree-parloops.c). The main differences between them are: - tree-xark-backend-omp is executed in pass_early_local_passes group, while pass_parallelize_loops is executed in pass_tree_loop, in pass_all_optimizations group. We have chosen this location because SSA form has just been calculated and because we will carry out interprocedural anaysis of several functions at same time. - I don't create two versions of the loop. I manipulate the old loop, inserting OpenMP directives and another statements to achieve only one parallel loop. - The CFG which reaches my pass is different, so I make different modifications. But the intermediate representantion which reaches omp_expand_local is almost the same. - In create_loop_fn, I had to add a call to add_referenced_var(t); after t = build_decl (PARM_DECL, get_identifier (".paral_data_param"), ptr_type_node); (tree-xark-backend-omp.c:920) But when I apply my pass to the simple example code asig_reg.c, I obtain the following ICE: $ ../build/bin/gcc -v -Wall -fdump-tree-all-raw -fdump-ipa-all-raw -L../build/lib -Wl,-rpath -Wl,../build/lib -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o asig_reg.bin asig_reg.c Using built-in specs. Target: i686-pc-linux-gnu Configured with: ../src/configure --prefix=/home/jmandion/Programacion/gcc-gnu/build/ --disable-bootstrap --enable-languages=c,fortran Thread model: posix gcc version 4.5.0 20090521 (experimental) (GCC) COLLECT_GCC_OPTIONS='-v' '-Wall' '-fdump-tree-all-raw' '-fdump-ipa-all-raw' '-L../build/lib' '-fxark-debug' '-fxark-backend-omp' '-fverify-ssa-debug' '-o' 'asig_reg.bin' '-mtune=generic' /home/jmandion/Programacion/gcc-gnu/build/bin/../libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -quiet -v -iprefix /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/ asig_reg.c -quiet -dumpbase asig_reg.c -mtune=generic -auxbase asig_reg -Wall -version -fdump-tree-all-raw -fdump-ipa-all-raw -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o /tmp/ccgETP9F.s GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu) compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include" ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include" ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed" ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include" #include "..." search starts here: #include <...> search starts here: /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed /usr/local/include /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../include /usr/include End of search list. GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu) compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: ebc62f7555f88405c8bde4c5a1af27a7 asig_reg.c: In function ‘main._loopfn.0’: asig_reg.c:53: error: missing definition for SSA_NAME: .paral_data_param_2 in statement: .paral_data_load.5_3 = (struct .paral_data.3 *) .paral_data_param_2; asig_reg.c:53: internal compiler error: verify_ssa failed Please submit a full bug report, with preprocessed source if appropriate. See <http://gcc.gnu.org/bugs.html> for instructions. I have tried to solve this problem fixing the Call Graph (omp_expand_local don't do it) and printing all the functions across this graph (enabled with -fxark-debug), but it was unsuccesfull. So I have instrumented verify_ssa (tree-ssa.c:542 and 733, enabled by -fverify-ssa-debug) by dumping in file verify_ssa.dump the pass which invokes the function, the function associated to current_function_decl in that moment, the referenced_vars and the immediate uses. Furthermore, I write a line each time a basic block is processed. With this information and some dummy passes (pass_xark_print and pass_xark_ipa_print, also enabled by -fxark-debug), I have seen the compilation process ends before reaching the ipa_passes group because the referenced_vars and the immediate_uses are empty when analyzing main._loopfn.0. But just "a moment" before, verify_ssa checks succesfully this function too. I attach to this message a patch tree-xark-backend-omp.diff and the example asig_reg.c Thanks in advance, José Manuel Andión Fernández Computer Architecture Group Departement of Electronics and Systems University of A Coruña Spain http://gac.des.udc.es/index_en.html
#include <stdio.h> #include <unistd.h> #define MAX 1048576 int main() { int A[MAX]; int i; for (i = 0; i < MAX; ++i) { A[i] = 2; } printf("A[3] = %d\n", A[3]); return 0; }
Index: gcc/tree-pass.h =================================================================== --- gcc/tree-pass.h (revisión: 147760) +++ gcc/tree-pass.h (copia de trabajo) @@ -400,6 +400,11 @@ extern struct gimple_opt_pass pass_local_pure_const; extern struct gimple_opt_pass pass_tracer; +/* XARK Passes */ +extern struct gimple_opt_pass pass_xark_backend_omp; +extern struct gimple_opt_pass pass_xark_print; +extern struct simple_ipa_opt_pass pass_xark_ipa_print; + /* IPA Passes */ extern struct ipa_opt_pass_d pass_ipa_inline; extern struct ipa_opt_pass_d pass_ipa_cp; Index: gcc/tree-xark-backend-omp.c =================================================================== --- gcc/tree-xark-backend-omp.c (revisión: 0) +++ gcc/tree-xark-backend-omp.c (revisión: 0) @@ -0,0 +1,1485 @@ +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "tree-flow.h" +#include "cfgloop.h" +#include "ggc.h" +#include "tree-data-ref.h" +#include "diagnostic.h" +#include "tree-pass.h" +#include "tree-scalar-evolution.h" +#include "hashtab.h" +#include "langhooks.h" +#include "tree-vectorizer.h" +#include "cgraph.h" +#include "tree-dump.h" + +#define MAX_CHAR_LOOPFN_NAME 100 + +void +dump_dom_info_state(enum cdi_direction, FILE *); +void +dump_cgraph_state(FILE *); +void +dump_functions_across_cgraph(FILE*); +void +hello_from(const char *, FILE *); +void +bye_from(const char *, FILE *); +void +print_bbs(FILE *); +void +print_loops_info(FILE *); +void +my_print_loops(FILE *); +bool +is_parallelizable_loop(struct loop *); +void +parallelize_loop(struct loop * loop); + +void +dump_dom_info_state(enum cdi_direction dir, FILE * f) +{ + enum dom_state state = dom_info_state(dir); + fprintf(f, "dom_info_state("); + if (dir == CDI_DOMINATORS) + { + fprintf(f, "CDI_DOMINATORS"); + } + else if (dir == CDI_POST_DOMINATORS) + { + fprintf(f, "CDI_POST_DOMINATORS"); + } + else + { + fprintf(f, "Impossible!!!"); + } + fprintf(f, "): "); + if (state == DOM_NONE) + { + fprintf(f, "DOM_NONE"); + } + else if (state == DOM_NO_FAST_QUERY) + { + fprintf(f, "DOM_NO_FAST_QUERY"); + } + else if (state == DOM_OK) + { + fprintf(f, "DOM_OK"); + } + else + { + fprintf(f, "Impossible!!!"); + } + fprintf(f, "\n"); +} + +void +dump_cgraph_state(FILE * f) +{ + fprintf(f, "cgraph state is "); + switch (cgraph_state) + { + case CGRAPH_STATE_CONSTRUCTION: + fprintf(f, "CGRAPH_STATE_CONSTRUCTION"); + break; + case CGRAPH_STATE_IPA: + fprintf(f, "CGRAPH_STATE_IPA"); + break; + case CGRAPH_STATE_IPA_SSA: + fprintf(f, "CGRAPH_STATE_IPA_SSA"); + break; + case CGRAPH_STATE_EXPANSION: + fprintf(f, "CGRAPH_STATE_EXPANSION"); + break; + case CGRAPH_STATE_FINISHED: + fprintf(f, "CGRAPH_STATE_FINISHED"); + break; + default: + gcc_unreachable(); + } + fprintf(f, "\n"); + fflush(f); +} + +void +dump_functions_across_cgraph(FILE* f) +{ + struct cgraph_node *node; + + for (node = cgraph_nodes; node; node = node->next) + { + tree temp_fn; + + temp_fn = current_function_decl; + current_function_decl = node->decl; + push_cfun(DECL_STRUCT_FUNCTION (node->decl)); + + fprintf(f, "* Dump of function %s *\n", + lang_hooks.decl_printable_name(current_function_decl, 2)); + + dump_function_to_file(current_function_decl, f, 0); + + pop_cfun(); + current_function_decl = temp_fn; + } +} + +void +hello_from(const char * function_name, FILE * f) +{ + fprintf(f, "* Hello from %s!!! *\n", function_name); + fflush(f); +} + +void +bye_from(const char * function_name, FILE * f) +{ + fprintf(f, "* Bye from %s!!! *\n", function_name); + fflush(f); +} + +/* Adapted from tree-blocking */ + +void +print_bbs(FILE * f) +{ + basic_block bb, prevbb, nextbb; + unsigned int i; + + hello_from("print_bbs", f); + + FOR_EACH_BB (bb) + { + prevbb = bb->prev_bb; + nextbb = bb->next_bb; + fprintf(f, "-------- BASIC BLOCK %d ---------\n", bb->index); + if (get_immediate_dominator(CDI_DOMINATORS, bb) != NULL) + { + fprintf(f, "--- Immediate dominator is %d ---\n", + get_immediate_dominator(CDI_DOMINATORS, bb)->index); + } + gimple_dump_bb(bb, f, 2, TDF_RAW); + fprintf(f, "> EDGE_COUNT (bb->preds): %d basic blocks:\n", EDGE_COUNT (bb->preds)); + + for (i = 0; i < (EDGE_COUNT (bb->preds)); i++) + { + + prevbb = EDGE_PRED (bb, i)->src; + fprintf(f, " - Previous basic block number %d is bb %d\n", i, + prevbb->index); + } + + fprintf(f, "> EDGE_COUNT (bb->succs): %d basic blocks : \n", EDGE_COUNT (bb->succs)); + + for (i = 0; i < (EDGE_COUNT (bb->succs)); i++) + { + + nextbb = EDGE_SUCC (bb, i)->dest; + fprintf(f, " - Next basic block number %d is bb %d \n", i, + nextbb->index); + } + + fprintf(f, "\n"); + } + + fflush(f); + return; +} + +/* Adapted from tree-blocking */ + +void +print_loops_info(FILE * f) +{ + loop_iterator li; + struct loop *loop; + basic_block *bbs; + edge edge; + unsigned int j; + + hello_from("print_loops_info", f); + + verify_flow_info(); + + fprintf(f, "NUMBER OF LOOPS = %d \n\n", number_of_loops()); + + FOR_EACH_LOOP (li, loop, 0) + { + fprintf(f, "--------"); + fprintf(f, " LOOP NUMBER %d ", loop->num); + fprintf(f, "--------\n"); + if (loop->next != NULL) + { + /* Link to the next (sibling) loop. */ + fprintf(f, " Next loop: %d \n", loop->next->num); + } + else + { + fprintf(f, " There isn't next loop \n"); + } + if (loop->inner != NULL) + { + /* The first inner (child) loop or NULL if innermost loop. */ + fprintf(f, " Inner loop: %d \n", loop->inner->num); + } + else + { + fprintf(f, " There isn't inner loop \n"); + } + + if (loop_outer(loop) != NULL) + { + /* The outer (parent) loop or NULL if outermost loop. */ + fprintf(f, " Outer loop %d \n", loop_outer(loop)->num); + } + else + { + fprintf(f, " This is the outermost loop \n"); + } + + /* The loop nesting depth. */ + fprintf(f, " Loop depth %d \n", loop_depth(loop)); + fprintf(f, "--- Body in dom order --- \n"); + bbs = get_loop_body_in_dom_order(loop); + if (f) + { + for (j = 0; j < loop->num_nodes; j++) + { + gimple_dump_bb(bbs[j], f, 2, TDF_RAW); + } + fprintf(f, "--- Header --- \n"); + gimple_dump_bb(loop->header, f, 2, TDF_RAW); + + fprintf(f, "--- Latch --- \n"); + gimple_dump_bb(loop->latch, f, 2, TDF_RAW); + + fprintf(f, "\n"); + edge = BRANCH_EDGE (loop->header); + fprintf(f, " Flag of branch edge: %d \n", edge->flags); + fprintf(f, " Loop init: "); + print_gimple_stmt(f, last_stmt(edge->dest->prev_bb), 0, TDF_RAW); + fprintf(f, " Loop step: "); + print_gimple_stmt(f, last_stmt(loop->latch), 0, TDF_RAW); + fprintf(f, " Loop limit: "); + print_generic_expr(f, gimple_cond_rhs(last_stmt(loop->header)), + TDF_RAW); + fprintf(f, "\n"); + + free(bbs); + } + } + bye_from("print_loops_info", f); + + return; +} + +void +my_print_loops(FILE * file) +{ + hello_from("my_print_loops", file); + print_loops(file, 3); + bye_from("my_print_loops", file); +} + +/* Copied from tree-parloops.c + Element of hashtable of names to copy. */ + +struct name_to_copy_elt +{ + unsigned version; /* The version of the name to copy. */ + tree new_name; /* The new name used in the copy. */ + tree field; /* The field of the structure used to pass the + value. */ +}; + +/* Copied from tree-parloops.c + Equality and hash functions for hashtab code. */ + +static int +name_to_copy_elt_eq(const void *aa, const void *bb) +{ + const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa; + const struct name_to_copy_elt *b = (const struct name_to_copy_elt *) bb; + + return a->version == b->version; +} + +/* Copied from tree-parloops.c */ + +static hashval_t +name_to_copy_elt_hash(const void *aa) +{ + const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa; + + return (hashval_t) a->version; +} + +/* Copied from tree-parloops.c + Assigns the address of OBJ in TYPE to an ssa name, and returns this name. + The assignment statement is placed on edge ENTRY. DECL_ADDRESS maps decls + to their addresses that can be reused. The address of OBJ is known to + be invariant in the whole function. */ + +static tree +take_address_of(tree obj, tree type, edge entry, htab_t decl_address) +{ + int uid; + void **dslot; + struct int_tree_map ielt, *nielt; + tree *var_p, name, bvar, addr; + gimple stmt; + gimple_seq stmts; + + /* Since the address of OBJ is invariant, the trees may be shared. + Avoid rewriting unrelated parts of the code. */ + obj = unshare_expr(obj); + for (var_p = &obj; handled_component_p(*var_p); var_p = &TREE_OPERAND (*var_p, 0)) + continue; + uid = DECL_UID (*var_p); + + ielt.uid = uid; + dslot = htab_find_slot_with_hash(decl_address, &ielt, uid, INSERT); + if (!*dslot) + { + addr = build_addr(*var_p, current_function_decl); + bvar = create_tmp_var(TREE_TYPE (addr), get_name(*var_p)); + add_referenced_var(bvar); + stmt = gimple_build_assign (bvar, addr); + name = make_ssa_name(bvar, stmt); + gimple_assign_set_lhs(stmt, name); + gsi_insert_on_edge_immediate(entry, stmt); + + nielt = XNEW (struct int_tree_map); + nielt->uid = uid; + nielt->to = name; + *dslot = nielt; + } + else + name = ((struct int_tree_map *) *dslot)->to; + + if (var_p != &obj) + { + *var_p = build1 (INDIRECT_REF, TREE_TYPE (*var_p), name); + name = force_gimple_operand(build_addr(obj, current_function_decl), + &stmts, true, NULL_TREE); + if (!gimple_seq_empty_p (stmts)) + gsi_insert_seq_on_edge_immediate (entry, stmts); + } + + if (TREE_TYPE (name) != type) + { + name = force_gimple_operand (fold_convert (type, name), &stmts, true, + NULL_TREE); + if (!gimple_seq_empty_p (stmts)) + gsi_insert_seq_on_edge_immediate (entry, stmts); + } + + return name; + } + +/* Copied from tree-parloops.c */ + +struct elv_data +{ + struct walk_stmt_info info; + edge entry; + htab_t decl_address; + bool changed; +}; + +/* Copied from tree-parloops.c + Eliminates references to local variables in *TP out of the single + entry single exit region starting at DTA->ENTRY. + DECL_ADDRESS contains addresses of the references that had their + address taken already. If the expression is changed, CHANGED is + set to true. Callback for walk_tree. */ + +static tree +eliminate_local_variables_1(tree *tp, int *walk_subtrees, void *data) +{ + struct elv_data * const dta = (struct elv_data *) data; + tree t = *tp, var, addr, addr_type, type, obj; + + if (DECL_P (t)) + { + *walk_subtrees = 0; + + if (!SSA_VAR_P (t) || DECL_EXTERNAL (t)) + return NULL_TREE; + + type = TREE_TYPE (t); + addr_type = build_pointer_type(type); + addr = take_address_of(t, addr_type, dta->entry, dta->decl_address); + *tp = build1 (INDIRECT_REF, TREE_TYPE (*tp), addr); + + dta->changed = true; + return NULL_TREE; + } + + if (TREE_CODE (t) == ADDR_EXPR) + { + /* ADDR_EXPR may appear in two contexts: + -- as a gimple operand, when the address taken is a function invariant + -- as gimple rhs, when the resulting address in not a function + invariant + We do not need to do anything special in the latter case (the base of + the memory reference whose address is taken may be replaced in the + DECL_P case). The former case is more complicated, as we need to + ensure that the new address is still a gimple operand. Thus, it + is not sufficient to replace just the base of the memory reference -- + we need to move the whole computation of the address out of the + loop. */ + if (!is_gimple_val(t)) + return NULL_TREE; + + *walk_subtrees = 0; + obj = TREE_OPERAND (t, 0); + var = get_base_address(obj); + if (!var || !SSA_VAR_P (var) || DECL_EXTERNAL (var)) + return NULL_TREE; + + addr_type = TREE_TYPE (t); + addr = take_address_of(obj, addr_type, dta->entry, dta->decl_address); + *tp = addr; + + dta->changed = true; + return NULL_TREE; + } + + if (!EXPR_P (t)) + *walk_subtrees = 0; + + return NULL_TREE; +} + +/* Copied from tree-parloops.c + Moves the references to local variables in STMT out of the single + entry single exit region starting at ENTRY. DECL_ADDRESS contains + addresses of the references that had their address taken + already. */ + +static void +eliminate_local_variables_stmt(edge entry, gimple stmt, htab_t decl_address) +{ + struct elv_data dta; + + memset(&dta.info, '\0', sizeof(dta.info)); + dta.entry = entry; + dta.decl_address = decl_address; + dta.changed = false; + + walk_gimple_op(stmt, eliminate_local_variables_1, &dta.info); + + if (dta.changed) + update_stmt(stmt); + +} + +/* Copied from tree-parloops.c + Eliminates the references to local variables from the single entry + single exit region between the ENTRY and EXIT edges. + + This includes: + 1) Taking address of a local variable -- these are moved out of the + region (and temporary variable is created to hold the address if + necessary). + + 2) Dereferencing a local variable -- these are replaced with indirect + references. */ + +static void +eliminate_local_variables(edge entry, edge exit) +{ + basic_block bb; + VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3); + unsigned i; + gimple_stmt_iterator gsi; + htab_t decl_address = htab_create(10, int_tree_map_hash, int_tree_map_eq, + free); + basic_block entry_bb = entry->src; + basic_block exit_bb = exit->dest; + + gather_blocks_in_sese_region(entry_bb, exit_bb, &body); + + for (i = 0; VEC_iterate (basic_block, body, i, bb); i++) + if (bb != entry_bb && bb != exit_bb) + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) + eliminate_local_variables_stmt(entry, gsi_stmt(gsi), decl_address); + + htab_delete(decl_address); + VEC_free (basic_block, heap, body); +} + +/* Copied from tree-parloops.c + Returns true if expression EXPR is not defined between ENTRY and + EXIT, i.e. if all its operands are defined outside of the region. */ + +static bool +expr_invariant_in_region_p(edge entry, edge exit, tree expr) +{ + + basic_block entry_bb = entry->src; + basic_block exit_bb = exit->dest; + basic_block def_bb; + + if (is_gimple_min_invariant (expr)) + return true; + + if (TREE_CODE (expr) == SSA_NAME) + { + def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr)); + if (def_bb + && dominated_by_p (CDI_DOMINATORS, def_bb, entry_bb) + && !dominated_by_p (CDI_DOMINATORS, def_bb, exit_bb)) + return false; + + return true; + } + + return false; +} + +/* Copied from tree-parloops.c + If COPY_NAME_P is true, creates and returns a duplicate of NAME. + The copies are stored to NAME_COPIES, if NAME was already duplicated, + its duplicate stored in NAME_COPIES is returned. + + Regardless of COPY_NAME_P, the decl used as a base of the ssa name is also + duplicated, storing the copies in DECL_COPIES. */ + +static tree +separate_decls_in_region_name(tree name, htab_t name_copies, + htab_t decl_copies, bool copy_name_p) +{ + tree copy, var, var_copy; + unsigned idx, uid, nuid; + struct int_tree_map ielt, *nielt; + struct name_to_copy_elt elt, *nelt; + void **slot, **dslot; + + if (TREE_CODE (name) != SSA_NAME) + return name; + + idx = SSA_NAME_VERSION (name); + elt.version = idx; + slot = htab_find_slot_with_hash(name_copies, &elt, idx, copy_name_p ? INSERT + : NO_INSERT); + if (slot && *slot) + return ((struct name_to_copy_elt *) *slot)->new_name; + + var = SSA_NAME_VAR (name); + uid = DECL_UID (var); + ielt.uid = uid; + dslot = htab_find_slot_with_hash(decl_copies, &ielt, uid, INSERT); + if (!*dslot) + { + var_copy = create_tmp_var(TREE_TYPE (var), get_name(var)); + DECL_GIMPLE_REG_P (var_copy) = DECL_GIMPLE_REG_P (var); + add_referenced_var(var_copy); + nielt = XNEW (struct int_tree_map); + nielt->uid = uid; + nielt->to = var_copy; + *dslot = nielt; + + /* Ensure that when we meet this decl next time, we won't duplicate + it again. */ + nuid = DECL_UID (var_copy); + ielt.uid = nuid; + dslot = htab_find_slot_with_hash(decl_copies, &ielt, nuid, INSERT); + gcc_assert (!*dslot); + nielt = XNEW (struct int_tree_map); + nielt->uid = nuid; + nielt->to = var_copy; + *dslot = nielt; + } + else + var_copy = ((struct int_tree_map *) *dslot)->to; + + if (copy_name_p) + { + copy = duplicate_ssa_name(name, NULL); + nelt = XNEW (struct name_to_copy_elt); + nelt->version = idx; + nelt->new_name = copy; + nelt->field = NULL_TREE; + *slot = nelt; + } + else + { + gcc_assert (!slot); + copy = name; + } + + SSA_NAME_VAR (copy) = var_copy; + + return copy; +} + +/* Copied from tree-parloops.c + Finds the ssa names used in STMT that are defined outside the + region between ENTRY and EXIT and replaces such ssa names with + their duplicates. The duplicates are stored to NAME_COPIES. Base + decls of all ssa names used in STMT (including those defined in + LOOP) are replaced with the new temporary variables; the + replacement decls are stored in DECL_COPIES. */ + +static void +separate_decls_in_region_stmt(edge entry, edge exit, gimple stmt, + htab_t name_copies, htab_t decl_copies) +{ + use_operand_p use; + def_operand_p def; + ssa_op_iter oi; + tree name, copy; + bool copy_name_p; + + mark_virtual_ops_for_renaming(stmt); + + FOR_EACH_PHI_OR_STMT_DEF (def, stmt, oi, SSA_OP_DEF) + { + name = DEF_FROM_PTR (def); + gcc_assert (TREE_CODE (name) == SSA_NAME); + copy = separate_decls_in_region_name(name, name_copies, decl_copies, + false); + gcc_assert (copy == name); + } + + FOR_EACH_PHI_OR_STMT_USE (use, stmt, oi, SSA_OP_USE) + { + name = USE_FROM_PTR (use); + if (TREE_CODE (name) != SSA_NAME) + continue; + + copy_name_p = expr_invariant_in_region_p(entry, exit, name); + copy = separate_decls_in_region_name(name, name_copies, decl_copies, + copy_name_p); + SET_USE (use, copy); + } + +} + +/* Copied from tree-parloops.c + Callback for htab_traverse. Adds a field corresponding to a ssa name + described in SLOT. The type is passed in DATA. */ + +static int +add_field_for_name(void **slot, void *data) +{ + struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot; + tree type = (tree) data; + tree name = ssa_name (elt->version); + tree var = SSA_NAME_VAR (name); + tree field = build_decl (FIELD_DECL, DECL_NAME (var), TREE_TYPE (var)); + + insert_field_into_struct(type, field); + elt->field = field; + + return 1; +} + +/* Copied from tree-parloops.c */ + +struct clsn_data +{ + tree store; + tree load; + + basic_block store_bb; + basic_block load_bb; +}; + +/* Copied from tree-parloops.c + Callback for htab_traverse. Creates loads to a field of LOAD in LOAD_BB and + store to a field of STORE in STORE_BB for the ssa name and its duplicate + specified in SLOT. */ + +static int +create_loads_and_stores_for_name(void **slot, void *data) +{ + struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot; + struct clsn_data * const clsn_data = (struct clsn_data *) data; + tree t; + gimple stmt; + gimple_stmt_iterator gsi; + tree type = TREE_TYPE (elt->new_name); + tree struct_type = TREE_TYPE (TREE_TYPE (clsn_data->load)); + tree load_struct; + + gsi = gsi_last_bb(clsn_data->store_bb); + t = build3 (COMPONENT_REF, type, clsn_data->store, elt->field, NULL_TREE); + stmt = gimple_build_assign (t, ssa_name (elt->version)); + mark_virtual_ops_for_renaming(stmt); + gsi_insert_after(&gsi, stmt, GSI_NEW_STMT); + + gsi = gsi_last_bb(clsn_data->load_bb); + load_struct = fold_build1 (INDIRECT_REF, struct_type, clsn_data->load); + t = build3 (COMPONENT_REF, type, load_struct, elt->field, NULL_TREE); + stmt = gimple_build_assign (elt->new_name, t); + SSA_NAME_DEF_STMT (elt->new_name) = stmt; + gsi_insert_after(&gsi, stmt, GSI_NEW_STMT); + + return 1; +} + +/* Copied from tree-parloops.c + Moves all the variables used in LOOP and defined outside of it (including + the initial values of loop phi nodes, and *PER_THREAD if it is a ssa + name) to a structure created for this purpose. The code + + while (1) + { + use (a); + use (b); + } + + is transformed this way: + + bb0: + old.a = a; + old.b = b; + + bb1: + a' = new->a; + b' = new->b; + while (1) + { + use (a'); + use (b'); + } + + `old' is stored to *ARG_STRUCT and `new' is stored to NEW_ARG_STRUCT. The + pointer `new' is intentionally not initialized (the loop will be split to a + separate function later, and `new' will be initialized from its arguments). + LD_ST_DATA holds information about the shared data structure used to pass + information among the threads. It is initialized here, and + gen_parallel_loop will pass it to create_call_for_reduction that + needs this information. REDUCTION_LIST describes the reductions + in LOOP. */ + +static void +separate_decls_in_region(edge entry, edge exit, tree *arg_struct, + tree *new_arg_struct) + +{ + struct clsn_data clsn_data_struct; + struct clsn_data *ld_st_data = &clsn_data_struct; + basic_block bb1 = split_edge(entry); + basic_block bb0 = single_pred(bb1); + htab_t name_copies = htab_create(10, name_to_copy_elt_hash, + name_to_copy_elt_eq, free); + htab_t decl_copies = + htab_create(10, int_tree_map_hash, int_tree_map_eq, free); + unsigned i; + tree type, type_name, nvar; + gimple_stmt_iterator gsi; + VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3); + basic_block bb; + basic_block entry_bb = bb1; + basic_block exit_bb = exit->dest; + + entry = single_succ_edge(entry_bb); + gather_blocks_in_sese_region(entry_bb, exit_bb, &body); + + for (i = 0; VEC_iterate (basic_block, body, i, bb); i++) + { + if (bb != entry_bb && bb != exit_bb) + { + for (gsi = gsi_start_phis(bb); !gsi_end_p(gsi); gsi_next(&gsi)) + separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi), + name_copies, decl_copies); + + for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) + separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi), + name_copies, decl_copies); + } + } + + VEC_free (basic_block, heap, body); + + if (htab_elements(name_copies) == 0) + { + /* It may happen that there is nothing to copy (if there are only + loop carried and external variables in the loop). */ + *arg_struct = NULL; + *new_arg_struct = NULL; + } + else + { + /* Create the type for the structure to store the ssa names to. */ + type = lang_hooks.types.make_type(RECORD_TYPE); + type_name = build_decl (TYPE_DECL, create_tmp_var_name (".paral_data"), + type); + TYPE_NAME (type) = type_name; + + htab_traverse(name_copies, add_field_for_name, type); + layout_type(type); + + /* Create the loads and stores. */ + *arg_struct = create_tmp_var(type, ".paral_data_store"); + add_referenced_var(*arg_struct); + nvar = create_tmp_var(build_pointer_type(type), ".paral_data_load"); + add_referenced_var(nvar); + *new_arg_struct = make_ssa_name(nvar, NULL); + + ld_st_data->store = *arg_struct; + ld_st_data->load = *new_arg_struct; + ld_st_data->store_bb = bb0; + ld_st_data->load_bb = bb1; + + htab_traverse(name_copies, create_loads_and_stores_for_name, ld_st_data); + + } + + htab_delete(decl_copies); + htab_delete(name_copies); +} + +/* Copied from tree-parloops.c + Bitmap containing uids of functions created by parallelization. We cannot + allocate it from the default obstack, as it must live across compilation + of several functions; we make it gc allocated instead. */ + +static GTY(()) bitmap parallelized_functions; + +/* Adapted from tree-parloops.c: added comments and add_referenced_var + Creates and returns an empty function that will receive the body of + a parallelized loop. */ + +static tree +create_loop_fn(void) +{ + char buf[MAX_CHAR_LOOPFN_NAME]; + char *tname; + tree decl, type, name, t; + struct function *act_cfun = cfun; + static unsigned loopfn_num; + + snprintf(buf, MAX_CHAR_LOOPFN_NAME, "%s.$loopfn", current_function_name()); + ASM_FORMAT_PRIVATE_NAME(tname, buf, loopfn_num++); + clean_symbol_name(tname); + name = get_identifier(tname); + type = build_function_type_list(void_type_node,ptr_type_node, NULL_TREE); + + decl = build_decl (FUNCTION_DECL, name, type); + + if (!parallelized_functions) + parallelized_functions = BITMAP_GGC_ALLOC (); + bitmap_set_bit(parallelized_functions, DECL_UID (decl)); + + /* Nonzero if function has been defined */ + TREE_STATIC (decl) = 1; + + /* Nonzero if the name is used in its scope */ + TREE_USED (decl) = 1; + + /* Represents a compiler-generated entity. */ + DECL_ARTIFICIAL (decl) = 1; + + /* Nonzero for a given ..._DECL node means that the name of this node + should be ignored for symbolic debug purposes. */ + DECL_IGNORED_P (decl) = 0; + + /* Nonzero if name is accessible from outside this module */ + TREE_PUBLIC (decl) = 0; + + /* In a FUNCTION_DECL, nonzero if the function cannot be inlined. */ + DECL_UNINLINABLE (decl) = 1; + + /* In a VAR_DECL or FUNCTION_DECL, nonzero means external reference: + do not allocate storage, and refer to a definition elsewhere. Note that + this does not necessarily imply the entity represented by NODE + has no program source-level definition in this translation unit. For + example, for a FUNCTION_DECL, DECL_SAVED_TREE may be non-NULL and + DECL_EXTERNAL may be true simultaneously; that can be the case for + a C99 "extern inline" function. */ + DECL_EXTERNAL (decl) = 0; + + /* This points to either the FUNCTION_DECL for the containing function + the RECORD_TYPE or UNION_TYPE for the containing type, or + NULL_TREE or a TRANSLATION_UNIT_DECL if the given decl has "file + scope". */ + DECL_CONTEXT (decl) = NULL_TREE; + + /* Holds the tree of BINDINGs. */ + DECL_INITIAL (decl) = make_node (BLOCK); + + t = build_decl (RESULT_DECL, NULL_TREE, void_type_node); + /* TODO: Is add_referenced_var(t) needed? */ + DECL_ARTIFICIAL (t) = 1; + DECL_IGNORED_P (t) = 1; + /* In FUNCTION_DECL, holds the decl for the return value. */ + DECL_RESULT (decl) = t; + + t = build_decl (PARM_DECL, get_identifier (".paral_data_param"), + ptr_type_node); + /* TODO: Why add_referenced_var(t) is needed? */ + add_referenced_var(t); + DECL_ARTIFICIAL (t) = 1; + /* For a PARM_DECL, records the data type used to pass the argument, + which may be different from the type seen in the program. */ + DECL_ARG_TYPE (t) = ptr_type_node; + + DECL_CONTEXT (t) = decl; + TREE_USED (t) = 1; + DECL_ARGUMENTS (decl) = t; + + allocate_struct_function(decl, false); + + /* The call to allocate_struct_function clobbers CFUN, so we need to restore + it. */ + set_cfun(act_cfun); + + /* + * ALTERNATIVE CODE: + * + * push_struct_function (decl); + * pop_cfun (); + */ + + return decl; +} + +/* Dummy analysis */ +bool +is_parallelizable_loop(struct loop * loop) +{ + bool analysis_result = false; + + if (flag_xark_debug && dump_file) + { + hello_from("is_parallelizable_loop", dump_file); + fprintf(dump_file, "flag_openmp: %d\n", flag_openmp); + fprintf(dump_file, "flag_tree_parallelize_loops: %d\n", + flag_tree_parallelize_loops); + fprintf(dump_file, + "IDENTIFIER_POINTER(DECL_NAME(current_function_decl)): %s\n", + IDENTIFIER_POINTER(DECL_NAME(current_function_decl))); + fprintf(dump_file, "loop->num: %d\n", loop->num); + } + + if (!(flag_openmp) && (flag_tree_parallelize_loops <= 1) && (strcmp( + IDENTIFIER_POINTER(DECL_NAME(current_function_decl)), "main") == 0) && (loop->num == 1)) + { + analysis_result = true; + } + else + { + analysis_result = false; + } + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "analysis_result: %d\n", analysis_result); + bye_from("is_parallelizable_loop", dump_file); + } + + return analysis_result; +} + +/* Adapted from tree-parloops.c:gen_parallel_loop and tree-parloops.c:create_parallel_loop */ +void +parallelize_loop(struct loop * loop) +{ + + basic_block bb_latch, bb_header; + VEC (edge, heap) *exits; + edge exit; + unsigned i; + edge true_edge, false_edge; + edge lpe, lle; + edge start; + edge region_entry, region_exit; + tree arg_struct, new_arg_struct; + static tree loop_fn; + basic_block bb_omp_parallel_head, bb_omp_parallel_return; + gimple_stmt_iterator gsi; + gimple stmt_omp_parallel_head, cond_stmt, stmt; + tree param; + tree cvar, cvar_init, initvar, cvar_next, cvar_base, type; + gimple phi; + basic_block for_bb, ex_bb; + edge guard, end, e; + tree t; + gimple for_stmt; + tree old_current_function_decl; + struct cgraph_node * node; + + bb_latch = loop->latch; + bb_header = loop->header; + + if (flag_xark_debug && dump_file) + { + hello_from("parallelize_loop", dump_file); + fprintf(dump_file, "Loop number %d (header = %d, latch = %d)\n", + loop->num, bb_header->index, bb_latch->index); + + exits = get_loop_exit_edges(loop); + for (i = 0; VEC_iterate (edge, exits, i, exit); i++) + { + fprintf(dump_file, + "Loop edge exit number %d -> src: bb%d dest: bb%d\n", i, + exit->src->index, exit->dest->index); + } + VEC_free (edge, heap, exits); + } + + extract_true_false_edges_from_block(loop->header, &true_edge, &false_edge); + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "Edges from block loop->header:\n"); + fprintf(dump_file, " True: src: bb%d dest: bb%d\n", + true_edge->src->index, true_edge->dest->index); + fprintf(dump_file, " False: src: bb%d dest: bb%d\n", + false_edge->src->index, false_edge->dest->index); + } + + lpe = loop_preheader_edge(loop); + lle = loop_latch_edge(loop); + + if (flag_xark_debug && dump_file) + { + my_print_loops(dump_file); + fprintf(dump_file, "Loop %d is going to be cancelled\n", loop->num); + } + /* FIXME: Do I want to cancel all loop tree or only outermost? */ + cancel_loop_tree(loop); + + if (flag_xark_debug && dump_file) + { + my_print_loops(dump_file); + dump_dom_info_state(CDI_DOMINATORS, dump_file); + dump_dom_info_state(CDI_POST_DOMINATORS, dump_file); + } + + /* FIXME: It will fail if lpe->src has various predecessors + * create_empty_bb()? */ + start = single_pred_edge(lpe->src); + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "start edge src: bb%d dest: bb%d\n", + start->src->index, start->dest->index); + } + split_edge(start); + if (flag_xark_debug && dump_file) + { + + fprintf(dump_file, "start_edge (after splitting) src: bb%d dest: bb%d\n", + start->src->index, start->dest->index); + } + + region_entry = single_succ_edge(start->dest); + region_exit = false_edge; + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, + "region_entry edge (after splitting) src: bb%d dest: bb%d\n", + region_entry->src->index, region_entry->dest->index); + my_print_loops(dump_file); + } + + eliminate_local_variables(start, region_exit); + + separate_decls_in_region(region_entry, region_exit, &arg_struct, + &new_arg_struct); + + if (flag_xark_debug && dump_file) + { + my_print_loops(dump_file); + + fprintf(dump_file, "true edge: src: bb%d dest: bb%d\n", + true_edge->src->index, true_edge->dest->index); + fprintf(dump_file, "false edge: src: bb%d dest: bb%d\n", + false_edge->src->index, false_edge->dest->index); + fprintf(dump_file, "lpe edge: src: bb%d dest: bb%d\n", lpe->src->index, + lpe->dest->index); + fprintf(dump_file, "lle edge: src: bb%d dest: bb%d\n", lle->src->index, + lle->dest->index); + fprintf(dump_file, "start edge: src: bb%d dest: bb%d\n", + start->src->index, start->dest->index); + fprintf(dump_file, "region_entry edge: src: bb%d dest: bb%d\n", + region_entry->src->index, region_entry->dest->index); + fprintf(dump_file, "region_exit edge: src: bb%d dest: bb%d\n", + region_exit->src->index, region_exit->dest->index); + } + + loop_fn = create_loop_fn(); + bb_omp_parallel_head = region_entry->src; + gsi = gsi_last_bb(bb_omp_parallel_head); + stmt_omp_parallel_head = gimple_build_omp_parallel(NULL, NULL, loop_fn, + arg_struct); + gsi_insert_after(&gsi, stmt_omp_parallel_head, GSI_NEW_STMT); + + if (arg_struct) + { + gsi = gsi_after_labels(region_entry->dest); + + param = make_ssa_name(DECL_ARGUMENTS (loop_fn), NULL); + stmt = gimple_build_assign (param, build_fold_addr_expr (arg_struct)); + gsi_insert_before(&gsi, stmt, GSI_SAME_STMT); + SSA_NAME_DEF_STMT (param) = stmt; + + stmt = gimple_build_assign (new_arg_struct, + fold_convert (TREE_TYPE (new_arg_struct), param)); + gsi_insert_before(&gsi, stmt, GSI_SAME_STMT); + SSA_NAME_DEF_STMT (new_arg_struct) = stmt; + } + + /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_PARALLEL. */ + /* TODO: Would it be enough with split_edge()? + * At this moment, I don't have memory tags */ + bb_omp_parallel_return = split_loop_exit_edge(region_exit); + verify_loop_closed_ssa(); + gsi = gsi_last_bb(bb_omp_parallel_return); + gsi_insert_after(&gsi, gimple_build_omp_return(false), GSI_NEW_STMT); + + /* Extract data for GIMPLE_OMP_FOR */ + gcc_assert(bb_header == region_exit->src); + cond_stmt = last_stmt(bb_header); + + /* Iteration var */ + cvar = gimple_cond_lhs(cond_stmt); + + cvar_base = SSA_NAME_VAR (cvar); + /* Phi which defines cvar*/ + phi = SSA_NAME_DEF_STMT (cvar); + /* We get the initial value from lpe + * FIXME: I'm not propagating the value of initial assignment. */ + cvar_init = PHI_ARG_DEF_FROM_EDGE (phi, lpe); + + initvar = make_ssa_name(cvar_base, NULL); + SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, lpe), + initvar); + /* We get the new value from lle */ + cvar_next = PHI_ARG_DEF_FROM_EDGE (phi, lle); + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "\n"); + fprintf(dump_file, "cvar: "); + print_generic_expr(dump_file, cvar, 0); + fprintf(dump_file, "\n"); + fprintf(dump_file, "cvar: "); + print_generic_expr(dump_file, cvar, 0); + fprintf(dump_file, "\n"); + fprintf(dump_file, "cvar_init: "); + print_generic_expr(dump_file, cvar_init, 0); + fprintf(dump_file, "\n"); + fprintf(dump_file, "initvar: "); + print_generic_expr(dump_file, initvar, 0); + fprintf(dump_file, "\n"); + fprintf(dump_file, "cvar_next: "); + print_generic_expr(dump_file, cvar_next, 0); + fprintf(dump_file, "\n"); + } + + /* Removes incr stmt */ + gsi = gsi_last_bb(bb_latch); + gcc_assert (gsi_stmt (gsi) == SSA_NAME_DEF_STMT (cvar_next)); + gsi_remove(&gsi, true); + + /* Prepare cfg. */ + if (flag_xark_debug && dump_file) + { + my_print_loops(dump_file); + } + for_bb = split_edge(lpe); + ex_bb = split_loop_exit_edge(region_exit); + gcc_assert (false_edge == region_exit); + + guard = make_edge(for_bb, ex_bb, 0); + single_succ_edge(bb_latch)->flags = 0; + /* Edge between latch and ex_bb */ + end = make_edge(bb_latch, ex_bb, EDGE_FALLTHRU); + + if (flag_xark_debug && dump_file) + { + my_print_loops(dump_file); + } + + for (gsi = gsi_start_phis(ex_bb); !gsi_end_p(gsi); gsi_next(&gsi)) + { + phi = gsi_stmt(gsi); + /* From header */ + stmt = SSA_NAME_DEF_STMT (PHI_ARG_DEF_FROM_EDGE (phi, false_edge)); + /* From for_bb if it doesn't enter in loop */ + add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lpe), guard); + /* From latch */ + add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lle), + end); + } + + e = redirect_edge_and_branch(false_edge, true_edge->dest); + PENDING_STMT (e) = NULL; + + /* Emit GIMPLE_OMP_FOR. */ + gimple_cond_set_lhs(cond_stmt, cvar_base); + type = TREE_TYPE (cvar); + t = build_omp_clause(OMP_CLAUSE_SCHEDULE); + /* TODO: Isn't it the default? */ + OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC; + + for_stmt = gimple_build_omp_for(NULL, t, 1, NULL); + gimple_omp_for_set_index(for_stmt, 0, initvar); + gimple_omp_for_set_initial(for_stmt, 0, cvar_init); + gimple_omp_for_set_final(for_stmt, 0, gimple_cond_rhs(cond_stmt)); + gimple_omp_for_set_cond(for_stmt, 0, gimple_cond_code(cond_stmt)); + gimple_omp_for_set_incr(for_stmt, 0, build2 (PLUS_EXPR, type, + cvar_base, + build_int_cst (type, 1))); + + gsi = gsi_last_bb(for_bb); + gsi_insert_after(&gsi, for_stmt, GSI_NEW_STMT); + SSA_NAME_DEF_STMT (initvar) = for_stmt; + + /* Emit GIMPLE_OMP_CONTINUE. */ + gsi = gsi_last_bb(bb_latch); + stmt = gimple_build_omp_continue(cvar_next, cvar); + gsi_insert_after(&gsi, stmt, GSI_NEW_STMT); + SSA_NAME_DEF_STMT (cvar_next) = stmt; + + /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_FOR. */ + gsi = gsi_last_bb(ex_bb); + gsi_insert_after(&gsi, gimple_build_omp_return(true), GSI_NEW_STMT); + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "\nBefore omp_expand_local\n"); + my_print_loops(dump_file); + + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + + } + + /* OpenMP expansion */ + omp_expand_local(bb_omp_parallel_head); + + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "\nAfter omp_expand_local\n"); + my_print_loops(dump_file); + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + } + + /* At this point, cgraph is innaccurate. + * But it doesn't seem to be necessary. */ + if (flag_xark_debug && dump_file) + { + fprintf(dump_file, "\nRebuilding cgraph\n"); + } + + /* Rebuild cgraph_edges with cfun */ + rebuild_cgraph_edges(); + + if (flag_xark_debug && dump_file) + { + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + fprintf(dump_file, "*--*\n"); + } + + /* Rebuild cgraph_edges with loop_fn */ + push_cfun(DECL_STRUCT_FUNCTION (loop_fn)); + old_current_function_decl = current_function_decl; + current_function_decl = loop_fn; + + node = cgraph_node(loop_fn); + node->analyzed = true; + node->lowered = true; + if (flag_xark_debug && dump_file) + { + dump_cgraph_node(dump_file, cgraph_node(loop_fn)); + fprintf(dump_file, "*--*\n"); + } + rebuild_cgraph_edges(); + current_function_decl = old_current_function_decl; + pop_cfun(); + + if (flag_xark_debug && dump_file) + { + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + } + verify_cgraph(); + + /* Check for unreachable nodes */ + cgraph_remove_unreachable_nodes(false, dump_file); + if (flag_xark_debug && dump_file) + { + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + fprintf(dump_file, "*--*\n"); + dump_functions_across_cgraph(dump_file); + } + + update_ssa(TODO_update_ssa); + verify_ssa(true); + + verify_flow_info(); + verify_dominators(CDI_DOMINATORS); + verify_loop_structure(); + verify_loop_closed_ssa(); + +} + +unsigned int +execute_xark_backend_omp(void) +{ + + loop_iterator li; + struct loop * loop; + + if (flag_xark_debug && dump_file) + { + hello_from("execute_xark_backend_omp", dump_file); + /* print_bbs(dump_file); */ + } + + loop_optimizer_init(LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); + if (flag_xark_debug && dump_file) + { + /* print_loops_info(dump_file); */ + my_print_loops(dump_file); + } + + FOR_EACH_LOOP(li, loop, 0) + { + if (is_parallelizable_loop(loop)) + { + parallelize_loop(loop); + } + } + + loop_optimizer_finalize(); + + update_ssa(TODO_update_ssa); + verify_ssa(true); + if (flag_xark_debug && dump_file) + { + bye_from("execute_xark_backend_omp", dump_file); + } + + /* + * In tree-parloops.c + * + * return TODO_cleanup_cfg | TODO_rebuild_alias; + */ + return 0; +} + +unsigned int +execute_xark_print(void) +{ + + if (dump_file) + { + hello_from("execute_xark_print", dump_file); + + verify_ssa(true); + + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + dump_functions_across_cgraph(dump_file); + + bye_from("execute_xark_print", dump_file); + } + + return 0; +} + +unsigned int +execute_xark_ipa_print(void) +{ + + if (dump_file) + { + hello_from("execute_xark_ipa_print", dump_file); + + dump_cgraph(dump_file); + dump_cgraph_state(dump_file); + dump_functions_across_cgraph(dump_file); + + bye_from("execute_xark_ipa_print", dump_file); + } + + return 0; + +} + +static bool +gate_xark_backend_omp(void) +{ + return flag_xark_backend_omp; +} + +static bool +gate_xark_print(void) +{ + return flag_xark_debug; +} + +static bool +gate_xark_ipa_print(void) +{ + return flag_xark_debug; +} + +struct gimple_opt_pass pass_xark_backend_omp = + { + { GIMPLE_PASS, "xark-backend-omp", /* name */ + gate_xark_backend_omp, /* gate */ + execute_xark_backend_omp, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + 0, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0 + /* | TODO_cleanup_cfg | TODO_update_ssa | TODO_rebuild_alias */ + | TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts | TODO_dump_func /* todo_flags_finish */ + } }; + +struct gimple_opt_pass pass_xark_print = + { + { GIMPLE_PASS, "xark-print", /* name */ + gate_xark_print, /* gate */ + execute_xark_print, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + 0, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0 | TODO_dump_func /* todo_flags_finish */ + } }; + +struct simple_ipa_opt_pass pass_xark_ipa_print = + { + { SIMPLE_IPA_PASS, "xark-ipa-print", /* name */ + gate_xark_ipa_print, /* gate */ + execute_xark_ipa_print, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + 0, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0 /* todo_flags_finish */ + } }; + +#include "gt-tree-xark-backend-omp.h" Index: gcc/common.opt =================================================================== --- gcc/common.opt (revisión: 147760) +++ gcc/common.opt (copia de trabajo) @@ -1274,6 +1274,18 @@ Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees +fxark-debug +Common Report Var(flag_xark_debug) +XARK debug information + +fxark-backend-omp +Common Report Var(flag_xark_backend_omp) +Generates OpenMP code based on XARK recognition + +fverify-ssa-debug +Common Report Var(flag_verify_ssa_debug) +verify_ssa debug information + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Optimization Compile whole compilation unit at a time Index: gcc/tree-ssa.c =================================================================== --- gcc/tree-ssa.c (revisión: 147760) +++ gcc/tree-ssa.c (copia de trabajo) @@ -539,7 +539,30 @@ tree op; enum dom_state orig_dom_state = dom_info_state (CDI_DOMINATORS); bitmap names_defined_in_bb = BITMAP_ALLOC (NULL); + FILE *verify_ssa_dump_file = NULL; + + if (flag_verify_ssa_debug) + { + verify_ssa_dump_file = fopen("verify_ssa.dump", "a"); + fprintf(verify_ssa_dump_file, + "*-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-*\n"); + fprintf(verify_ssa_dump_file, + "*-* verify_ssa:print_current_pass *-*\n"); + print_current_pass(verify_ssa_dump_file); + fprintf(verify_ssa_dump_file, + "*-* verify_ssa:dump_current_function_decl *-*\n"); + dump_function_to_file(current_function_decl, verify_ssa_dump_file, + TDF_RAW); + fprintf(verify_ssa_dump_file, + "*-* verify_ssa:dump_referenced_vars *-*\n"); + dump_referenced_vars(verify_ssa_dump_file); + fprintf(verify_ssa_dump_file, + "*-* verify_ssa:dump_immediate_uses *-*\n"); + dump_immediate_uses(verify_ssa_dump_file); + fflush(verify_ssa_dump_file); + } + gcc_assert (!need_ssa_update_p (cfun)); verify_stmts (); @@ -706,6 +729,16 @@ } bitmap_clear (names_defined_in_bb); + + if (flag_verify_ssa_debug) + { + fprintf(verify_ssa_dump_file, "*-* verify_ssa:"); + fprintf(verify_ssa_dump_file, "bb%d ", bb->index); + fprintf(verify_ssa_dump_file, "of %s ", current_function_name()); + fprintf(verify_ssa_dump_file, "funtion processed *-*\n"); + fflush(verify_ssa_dump_file); + } + } free (definition_block); Index: gcc/tree-flow.h =================================================================== --- gcc/tree-flow.h (revisión: 147760) +++ gcc/tree-flow.h (copia de trabajo) @@ -586,6 +586,12 @@ extern void phinodes_print_statistics (void); #endif +/* XARK Passes */ +/* In tree-xark-backend-omp.c */ +extern unsigned int execute_xark_backend_omp(void); +extern unsigned int execute_xark_print(void); +extern unsigned int execute_xark_ipa_print(void); + /* In gimple-low.c */ extern void record_vars_into (tree, tree); extern void record_vars (tree); Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in (revisión: 147760) +++ gcc/Makefile.in (copia de trabajo) @@ -1286,6 +1286,7 @@ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + tree-xark-backend-omp.o \ tree.o \ value-prof.o \ var-tracking.o \ @@ -2199,6 +2200,13 @@ $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) $(GGC_H) \ $(BASIC_BLOCK_H) tree-ssa-propagate.h $(FLAGS_H) $(TREE_DUMP_H) \ $(CFGLOOP_H) tree-chrec.h $(TIMEVAR_H) $(TOPLEV_H) intl.h + +tree-xark-backend-omp.o : tree-xark-backend-omp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \ + $(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_H) \ + $(TREE_FLOW_H) $(TIMEVAR_H) $(FLAGS_H) $(EXPR_H) $(TOPLEV_H) tree-pass.h \ + $(GGC_H) except.h $(SPLAY_TREE_H) $(OPTABS_H) $(CFGLOOP_H) $(CGRAPH_H)\ + tree-iterator.h gt-tree-xark-backend-omp.h + tree-cfg.o : tree-cfg.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(FLAGS_H) output.h \ $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \ @@ -3328,6 +3336,7 @@ $(srcdir)/gimple.h $(srcdir)/gimple.c \ $(srcdir)/tree-mudflap.c $(srcdir)/tree-flow.h \ $(srcdir)/tree-ssanames.c $(srcdir)/tree-eh.c $(srcdir)/tree-ssa-address.c \ + $(srcdir)/tree-xark-backend-omp.c \ $(srcdir)/tree-cfg.c \ $(srcdir)/tree-dfa.c \ $(srcdir)/tree-iterator.c $(srcdir)/gimplify.c \ Index: gcc/passes.c =================================================================== --- gcc/passes.c (revisión: 147760) +++ gcc/passes.c (copia de trabajo) @@ -544,6 +544,11 @@ NEXT_PASS (pass_referenced_vars); NEXT_PASS (pass_build_ssa); + /* XARK starts here */ + NEXT_PASS (pass_xark_print); + NEXT_PASS (pass_xark_backend_omp); + NEXT_PASS (pass_xark_print); + NEXT_PASS (pass_early_warn_uninitialized); NEXT_PASS (pass_all_early_optimizations); { @@ -568,7 +573,9 @@ NEXT_PASS (pass_release_ssa_names); NEXT_PASS (pass_rebuild_cgraph_edges); NEXT_PASS (pass_inline_parameters); + NEXT_PASS (pass_xark_print); } + NEXT_PASS (pass_xark_ipa_print); NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_matrix_reorg); NEXT_PASS (pass_ipa_cp);