On Tue, 14 May 2013, Jakub Jelinek wrote: > Hi! > > This patch adds safelen field to struct loop, teaches expand_omp_simd > to set it on the simd loops and then uses it in a few places: > 1) because the loops are explicitly marked for vectorization by the user, > we'll try to ifconvert them and vectorize even without -O3, -Ofast or > -ftree-vectorize (but explicit -fno-tree-vectorize will still disable > that behavior) > 2) the data dependency analysis uses it to decide about unknown and bad > data dependencies > 3) unrolling is disabled for those loops, I think we don't want to unroll > those loops until vectorization, and after vectorization we just clear > the safelen, so that it can be unrolled afterwards > > In the end we'll want to do much more on the vectorizer side, handle calls > to elemental functions, handle conditionalized calls to elemental functions, > or even vectorize loops where some part of the loop isn't really > vectorizable and needs to be sequential, but other parts of the loop are > vectorizable. for (...) { vectorizable_bb; non-vectorizable_bb; > vectorizable_bb; } > can be turned into for (...) { vectorized_bb; for (temp = 0; temp < vf; > temp++) non-vectorizable_bb; vectorized_bb; } etc. > > Does this look ok? > > 2013-05-14 Jakub Jelinek <ja...@redhat.com> > > * cfgloop.h (struct loop): Add safelen field. > * omp-low.c (expand_omp_simd): If !broken_loop, fix_loop_structure > to create loop for the simd region and set safelen field. > * tree-vectorizer.c (vectorize_loops): If loop has safelen set, > vectorize it even if flag_vectorize isn't set. Clear loop->safelen > after vectorization. > * tree-ssa-loop.c (gate_tree_vectorize): Return true even for > flag_openmp if -fno-tree-vectorize hasn't been specified. > * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Don't > unroll loops with non-NULL loop->safelen. > * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): For unknown > or bad data dependency, if loop->safelen is non-NULL, just decrease > *max_vf to loop->safelen if needed and return false. > * tree-if-conv.c (main_tree_if_conversion): If-convert also loops with > non-NULL loop->safelen. > (gate_tree_if_conversion): Return true even for > flag_openmp if -fno-tree-vectorize hasn't been specified. > > --- gcc/cfgloop.h.jj 2013-05-13 16:49:44.000000000 +0200 > +++ gcc/cfgloop.h 2013-05-13 17:30:18.630883633 +0200 > @@ -176,6 +176,12 @@ struct GTY ((chain_next ("%h.next"))) lo > > /* Number of iteration analysis data for RTL. */ > struct niter_desc *simple_loop_desc; > + > + /* If non-NULL, an INTEGER_CST, where the user asserted that for any > + I in [ 0, nb_iterations ) and for any J in > + [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations > + of the loop can be safely evaluated concurrently. */ > + tree safelen;
Can you make this a double_int (or a HOST_WIDE_INT or an int) instead please? It should map to data-dependence analysis distance vectors which currently is a vector of 'int'. Is there a magic value to tell safelen is "infinity"? As I read above safelen == 0 would mean all iterations are dependent. Are negative safelen values well-defined? The comment doesn't seem to disallow them. Also make sure to copy the field in copy_loop_info and stream it in output/input_cfg in lto-streamer-in/out.c. > }; > > /* Flags for state of loop structure. */ > --- gcc/omp-low.c.jj 2013-05-13 16:37:05.000000000 +0200 > +++ gcc/omp-low.c 2013-05-13 18:46:18.310405585 +0200 > @@ -4960,6 +4960,8 @@ expand_omp_simd (struct omp_region *regi > edge e, ne; > tree *counts = NULL; > int i; > + tree safelen = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt), > + OMP_CLAUSE_SAFELEN); > type = TREE_TYPE (fd->loop.v); > entry_bb = region->entry; > @@ -5157,6 +5159,22 @@ expand_omp_simd (struct omp_region *regi > set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb); > set_immediate_dominator (CDI_DOMINATORS, l2_bb, l1_bb); > set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb); > + > + if (!broken_loop) > + { > + struct loop *loop; > + calculate_dominance_info (CDI_DOMINATORS); > + fix_loop_structure (NULL); Ick. Didn't I properly add loops everywhere? > + loop = l1_bb->loop_father; > + if (safelen == NULL_TREE) > + { > + safelen = build_nonstandard_integer_type (TYPE_PRECISION (type), 1); > + safelen = TYPE_MAX_VALUE (safelen); > + } > + else > + safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen); > + loop->safelen = safelen; > + } > } > > > --- gcc/tree-vectorizer.c.jj 2013-05-13 16:49:03.000000000 +0200 > +++ gcc/tree-vectorizer.c 2013-05-13 20:44:58.721863725 +0200 > @@ -101,7 +101,8 @@ vectorize_loops (void) > than all previously defined loops. This fact allows us to run > only over initial loops skipping newly generated ones. */ > FOR_EACH_LOOP (li, loop, 0) > - if (optimize_loop_nest_for_speed_p (loop)) > + if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop)) > + || loop->safelen) So you vectorize all loops with a safelen? I'd say this warrants an extra flag in struct loop, force_vect. > { > loop_vec_info loop_vinfo; > vect_location = find_loop_location (loop); > @@ -122,6 +123,9 @@ vectorize_loops (void) > LOC_FILE (vect_location), LOC_LINE (vect_location)); > vect_transform_loop (loop_vinfo); > num_vectorized_loops++; > + /* Now that the loop has been vectorized, allow it to be unrolled > + etc. */ > + loop->safelen = NULL_TREE; > } > > vect_location = UNKNOWN_LOC; > --- gcc/tree-ssa-loop.c.jj 2013-05-13 16:46:36.000000000 +0200 > +++ gcc/tree-ssa-loop.c 2013-05-13 19:12:57.301538324 +0200 > @@ -225,7 +225,8 @@ tree_vectorize (void) > static bool > gate_tree_vectorize (void) > { > - return flag_tree_vectorize; > + return flag_tree_vectorize > + || (flag_openmp && !global_options_set.x_flag_tree_vectorize); And a flag in cfun here, whether any loop has force_vect set (or a flag in current_loops) > } > > struct gimple_opt_pass pass_vectorize = > --- gcc/tree-ssa-loop-ivcanon.c.jj 2013-05-13 16:46:36.000000000 +0200 > +++ gcc/tree-ssa-loop-ivcanon.c 2013-05-13 20:06:44.176519188 +0200 > @@ -1123,6 +1123,11 @@ tree_unroll_loops_completely_1 (bool may > if (changed) > return true; > > + /* Don't unroll #pragma omp simd loops until the vectorizer > + attempts to vectorize those. */ > + if (loop->safelen) > + return false; > + > /* Try to unroll this loop. */ > loop_father = loop_outer (loop); > if (!loop_father) > --- gcc/tree-vect-data-refs.c.jj 2013-05-13 16:49:08.000000000 +0200 > +++ gcc/tree-vect-data-refs.c 2013-05-13 20:41:51.579889330 +0200 > @@ -255,6 +255,16 @@ vect_analyze_data_ref_dependence (struct > /* Unknown data dependence. */ > if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know) > { > + /* If user asserted there safelen consecutive iterations can be > + executed concurrently, and safelen >= *max_vf, assume > + independence. */ > + if (loop->safelen) > + { > + if (compare_tree_int (loop->safelen, *max_vf) < 0) > + *max_vf = tree_low_cst (loop->safelen, 0); > + return false; > + } > + > if (STMT_VINFO_GATHER_P (stmtinfo_a) > || STMT_VINFO_GATHER_P (stmtinfo_b)) > { > @@ -291,6 +301,16 @@ vect_analyze_data_ref_dependence (struct > /* Known data dependence. */ > if (DDR_NUM_DIST_VECTS (ddr) == 0) > { > + /* If user asserted there safelen consecutive iterations can be > + executed concurrently, and safelen >= *max_vf, assume > + independence. */ > + if (loop->safelen) > + { > + if (compare_tree_int (loop->safelen, *max_vf) < 0) > + *max_vf = tree_low_cst (loop->safelen, 0); > + return false; > + } > + Rather than looking at safelen from data-dependence analysis consumers data-dependence analysis itself should use the information. Which is why I'd like the 'safelen' thing to map to the distance vector representation of dependence analysis. Thanks, Richard.