On Tue, 14 May 2013, Jakub Jelinek wrote:
> Hi!
>
> This patch adds safelen field to struct loop, teaches expand_omp_simd
> to set it on the simd loops and then uses it in a few places:
> 1) because the loops are explicitly marked for vectorization by the user,
> we'll try to ifconvert them and vectorize even without -O3, -Ofast or
> -ftree-vectorize (but explicit -fno-tree-vectorize will still disable
> that behavior)
> 2) the data dependency analysis uses it to decide about unknown and bad
> data dependencies
> 3) unrolling is disabled for those loops, I think we don't want to unroll
> those loops until vectorization, and after vectorization we just clear
> the safelen, so that it can be unrolled afterwards
>
> In the end we'll want to do much more on the vectorizer side, handle calls
> to elemental functions, handle conditionalized calls to elemental functions,
> or even vectorize loops where some part of the loop isn't really
> vectorizable and needs to be sequential, but other parts of the loop are
> vectorizable. for (...) { vectorizable_bb; non-vectorizable_bb;
> vectorizable_bb; }
> can be turned into for (...) { vectorized_bb; for (temp = 0; temp < vf;
> temp++) non-vectorizable_bb; vectorized_bb; } etc.
>
> Does this look ok?
>
> 2013-05-14 Jakub Jelinek <[email protected]>
>
> * cfgloop.h (struct loop): Add safelen field.
> * omp-low.c (expand_omp_simd): If !broken_loop, fix_loop_structure
> to create loop for the simd region and set safelen field.
> * tree-vectorizer.c (vectorize_loops): If loop has safelen set,
> vectorize it even if flag_vectorize isn't set. Clear loop->safelen
> after vectorization.
> * tree-ssa-loop.c (gate_tree_vectorize): Return true even for
> flag_openmp if -fno-tree-vectorize hasn't been specified.
> * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Don't
> unroll loops with non-NULL loop->safelen.
> * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): For unknown
> or bad data dependency, if loop->safelen is non-NULL, just decrease
> *max_vf to loop->safelen if needed and return false.
> * tree-if-conv.c (main_tree_if_conversion): If-convert also loops with
> non-NULL loop->safelen.
> (gate_tree_if_conversion): Return true even for
> flag_openmp if -fno-tree-vectorize hasn't been specified.
>
> --- gcc/cfgloop.h.jj 2013-05-13 16:49:44.000000000 +0200
> +++ gcc/cfgloop.h 2013-05-13 17:30:18.630883633 +0200
> @@ -176,6 +176,12 @@ struct GTY ((chain_next ("%h.next"))) lo
>
> /* Number of iteration analysis data for RTL. */
> struct niter_desc *simple_loop_desc;
> +
> + /* If non-NULL, an INTEGER_CST, where the user asserted that for any
> + I in [ 0, nb_iterations ) and for any J in
> + [ I, min ( I + safelen, nb_iterations ) ), the Ith and Jth iterations
> + of the loop can be safely evaluated concurrently. */
> + tree safelen;
Can you make this a double_int (or a HOST_WIDE_INT or an int) instead
please? It should map to data-dependence analysis distance vectors
which currently is a vector of 'int'.
Is there a magic value to tell safelen is "infinity"? As I read
above safelen == 0 would mean all iterations are dependent. Are
negative safelen values well-defined? The comment doesn't seem
to disallow them.
Also make sure to copy the field in copy_loop_info and stream
it in output/input_cfg in lto-streamer-in/out.c.
> };
>
> /* Flags for state of loop structure. */
> --- gcc/omp-low.c.jj 2013-05-13 16:37:05.000000000 +0200
> +++ gcc/omp-low.c 2013-05-13 18:46:18.310405585 +0200
> @@ -4960,6 +4960,8 @@ expand_omp_simd (struct omp_region *regi
> edge e, ne;
> tree *counts = NULL;
> int i;
> + tree safelen = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
> + OMP_CLAUSE_SAFELEN);
> type = TREE_TYPE (fd->loop.v);
> entry_bb = region->entry;
> @@ -5157,6 +5159,22 @@ expand_omp_simd (struct omp_region *regi
> set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb);
> set_immediate_dominator (CDI_DOMINATORS, l2_bb, l1_bb);
> set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb);
> +
> + if (!broken_loop)
> + {
> + struct loop *loop;
> + calculate_dominance_info (CDI_DOMINATORS);
> + fix_loop_structure (NULL);
Ick. Didn't I properly add loops everywhere?
> + loop = l1_bb->loop_father;
> + if (safelen == NULL_TREE)
> + {
> + safelen = build_nonstandard_integer_type (TYPE_PRECISION (type), 1);
> + safelen = TYPE_MAX_VALUE (safelen);
> + }
> + else
> + safelen = OMP_CLAUSE_SAFELEN_EXPR (safelen);
> + loop->safelen = safelen;
> + }
> }
>
>
> --- gcc/tree-vectorizer.c.jj 2013-05-13 16:49:03.000000000 +0200
> +++ gcc/tree-vectorizer.c 2013-05-13 20:44:58.721863725 +0200
> @@ -101,7 +101,8 @@ vectorize_loops (void)
> than all previously defined loops. This fact allows us to run
> only over initial loops skipping newly generated ones. */
> FOR_EACH_LOOP (li, loop, 0)
> - if (optimize_loop_nest_for_speed_p (loop))
> + if ((flag_tree_vectorize && optimize_loop_nest_for_speed_p (loop))
> + || loop->safelen)
So you vectorize all loops with a safelen? I'd say this warrants an
extra flag in struct loop, force_vect.
> {
> loop_vec_info loop_vinfo;
> vect_location = find_loop_location (loop);
> @@ -122,6 +123,9 @@ vectorize_loops (void)
> LOC_FILE (vect_location), LOC_LINE (vect_location));
> vect_transform_loop (loop_vinfo);
> num_vectorized_loops++;
> + /* Now that the loop has been vectorized, allow it to be unrolled
> + etc. */
> + loop->safelen = NULL_TREE;
> }
>
> vect_location = UNKNOWN_LOC;
> --- gcc/tree-ssa-loop.c.jj 2013-05-13 16:46:36.000000000 +0200
> +++ gcc/tree-ssa-loop.c 2013-05-13 19:12:57.301538324 +0200
> @@ -225,7 +225,8 @@ tree_vectorize (void)
> static bool
> gate_tree_vectorize (void)
> {
> - return flag_tree_vectorize;
> + return flag_tree_vectorize
> + || (flag_openmp && !global_options_set.x_flag_tree_vectorize);
And a flag in cfun here, whether any loop has force_vect set (or
a flag in current_loops)
> }
>
> struct gimple_opt_pass pass_vectorize =
> --- gcc/tree-ssa-loop-ivcanon.c.jj 2013-05-13 16:46:36.000000000 +0200
> +++ gcc/tree-ssa-loop-ivcanon.c 2013-05-13 20:06:44.176519188 +0200
> @@ -1123,6 +1123,11 @@ tree_unroll_loops_completely_1 (bool may
> if (changed)
> return true;
>
> + /* Don't unroll #pragma omp simd loops until the vectorizer
> + attempts to vectorize those. */
> + if (loop->safelen)
> + return false;
> +
> /* Try to unroll this loop. */
> loop_father = loop_outer (loop);
> if (!loop_father)
> --- gcc/tree-vect-data-refs.c.jj 2013-05-13 16:49:08.000000000 +0200
> +++ gcc/tree-vect-data-refs.c 2013-05-13 20:41:51.579889330 +0200
> @@ -255,6 +255,16 @@ vect_analyze_data_ref_dependence (struct
> /* Unknown data dependence. */
> if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
> {
> + /* If user asserted there safelen consecutive iterations can be
> + executed concurrently, and safelen >= *max_vf, assume
> + independence. */
> + if (loop->safelen)
> + {
> + if (compare_tree_int (loop->safelen, *max_vf) < 0)
> + *max_vf = tree_low_cst (loop->safelen, 0);
> + return false;
> + }
> +
> if (STMT_VINFO_GATHER_P (stmtinfo_a)
> || STMT_VINFO_GATHER_P (stmtinfo_b))
> {
> @@ -291,6 +301,16 @@ vect_analyze_data_ref_dependence (struct
> /* Known data dependence. */
> if (DDR_NUM_DIST_VECTS (ddr) == 0)
> {
> + /* If user asserted there safelen consecutive iterations can be
> + executed concurrently, and safelen >= *max_vf, assume
> + independence. */
> + if (loop->safelen)
> + {
> + if (compare_tree_int (loop->safelen, *max_vf) < 0)
> + *max_vf = tree_low_cst (loop->safelen, 0);
> + return false;
> + }
> +
Rather than looking at safelen from data-dependence analysis consumers
data-dependence analysis itself should use the information. Which
is why I'd like the 'safelen' thing to map to the distance vector
representation of dependence analysis.
Thanks,
Richard.